Archive | AI/ML

How Azure AI Search Works as a Vector DB

7 Apr

Azure AI Search can function as a vector database when configured with vectorization and semantic search capabilities. Here’s how it works and where the vectors are stored:

✅ How Azure AI Search Works as a Vector DB

When you set up Azure AI Search to import and vectorize data (e.g., documents from a selected folder in a Blob Storage container), it performs the following:

  1. Data Ingestion:
    • You define a data source (Blob Storage).
    • Azure Search pulls documents from the selected folder.
  2. Vectorization:
    • You can use built-in vectorization (via Azure OpenAI embedding models) or bring your own embeddings.
    • Each document (or chunk of text) is converted into a vector representation.
  3. Indexing:
    • These vectors are stored in a vector field within the Azure Search index.
    • You define this field in your index schema (e.g., contentVector).
  4. Search:
    • You can perform vector similarity search using cosine similarity or other metrics.
    • Combine it with a keyword search for hybrid search scenarios.

📍 Where Are Vectors Stored?

The vectors are stored inside the Azure AI Search index itself. Specifically:

  • Each document in the index has a field (e.g., vector) that holds the vector embedding.
  • Azure Search indexes these vectors and uses them for similarity search.
  • You can configure the vector field with parameters like dimensions, vectorSearchAlgorithmConfiguration, etc.

🧠 Example Use Case

If you’re indexing PDFs or text files from Blob Storage:

  • Azure Search will chunk the documents.
  • Each chunk gets vectorized.
  • Vectors are stored in the index.
  • You can then query using a vector (e.g., from a user query) to retrieve semantically similar chunks.

 

What is RAG (Retrieval-Augmented Generation)?

30 Mar

RAG = Search + LLM generation

Retrieval-Augmented Generation (RAG) is an LLM architecture pattern that combines information retrieval with text generation.

Instead of relying only on a model’s frozen training data, RAG:

  1. Retrieves relevant documents from an external knowledge source at query time
  2. Injects that context into the prompt
  3. Generates an answer grounded in retrieved data

Core Components

  • Embedding model → converts documents and queries into vectors
  • Vector store → performs semantic similarity search
  • Retriever → fetches top-K relevant chunks
  • Generator (LLM) → produces the final response using retrieved context

High‑level idea of RAG

RAG = Search + LLM generation

Instead of asking an LLM to answer purely from its training data, RAG:

  1. Retrieves relevant knowledge from your private data (documents, PDFs, wikis, etc.)
  2. Augments the prompt with that knowledge
  3. Generates a grounded, accurate answer

This reduces hallucinations and enables enterprise knowledge Q&A.

1️⃣ Indexing Phase (Offline / Preprocessing)

This happens before users ask questions.

Goal

Convert raw documents into a searchable vector index.


Step 1: Document ingestion & parsing

PDF / DOCX / HTML

        ↓ parse

      Plain text

  • PDFs, Word files, web pages, etc. are parsed into text
  • Parsing removes layout noise (headers, footers, images)
  • Output is clean text

✅ Why this matters
LLMs and embedding models operate on text, not binary formats.


Step 2: Chunking (critical design decision)

Text → Chunks (small passages)

  • Large documents are split into chunks (e.g., 300–1,000 tokens)
  • Often overlapping chunks (e.g., 20–30%) to preserve context

✅ Why chunking is needed

  • Embedding models have token limits
  • Smaller chunks improve retrieval precision
  • You retrieve just the relevant part, not the entire document

✅ Typical chunk strategies

  • Fixed-size tokens
  • Semantic chunking (paragraph / heading based)
  • Sliding window with overlap
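
As a rough illustration of the sliding-window strategy, here is a minimal sketch. It assumes the text has already been split into tokens (plain words stand in for real tokenizer output here), and `chunk_tokens` is a hypothetical helper name, not part of any library.

```python
def chunk_tokens(tokens, chunk_size=300, overlap=60):
    """Split a token list into fixed-size windows that overlap by `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# Made-up "tokens" for illustration; a real pipeline would use a tokenizer.
words = [f"w{i}" for i in range(10)]
chunks = chunk_tokens(words, chunk_size=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

With a chunk size of 4 and an overlap of 1, each window starts on the last token of the previous one, which is how overlap preserves context across chunk boundaries.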

Step 3: Embedding (semantic encoding)

Chunk → Azure Embedding Model → Vector

  • Each chunk is passed to an Azure OpenAI embedding model
  • Output is a high‑dimensional vector (e.g., 1,536 dimensions)

✅ What embeddings represent
They capture semantic meaning, not keywords.

Example:

  • “How to reset password”
  • “Steps to change login credentials”
    → very similar vectors

Step 4: Store in Vector Database

Embeddings → Azure Vector Store

Stored items typically include:

  • Vector embedding
  • Chunk text
  • Metadata (document name, page, section, timestamp)

✅ Azure options

  • Azure AI Search (vector + hybrid)
  • Cosmos DB with vector search
  • PostgreSQL + pgvector

✅ Outcome
You now have a semantic index of your enterprise knowledge.


2️⃣ Retrieval Phase (R – Runtime)

This happens when a user asks a question.


Step 1: User query

User → “How does leave approval work?”

Raw natural language question.


Step 2: Query embedding

Query → Azure Embedding Model → Query Vector

  • The same embedding model used during indexing must be used here
  • This ensures vector space consistency

✅ Important
Using different embedding models breaks similarity search.


Step 3: Semantic search in vector DB

Query Vector → Similarity Search → Top‑K chunks

  • Cosine similarity / dot product used
  • Returns most semantically relevant chunks

✅ Often combined with:

  • Metadata filters (department, date, access level)
  • Hybrid search (vector + keyword)

✅ Output
A small set of relevant chunks, not whole documents.
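
The similarity search step can be sketched in a few lines. This is only a toy: the three-dimensional vectors and chunk texts are invented for illustration (real embeddings have far more dimensions), and `top_k` is a hypothetical helper, not a vector database API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, chunks, k=2):
    """Return the k chunks whose vectors are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vector, c["vector"]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors, invented for illustration only.
chunks = [
    {"text": "Leave requests are approved by the line manager.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Expense claims need a receipt.", "vector": [0.0, 0.2, 0.9]},
    {"text": "Managers approve leave within five days.", "vector": [0.8, 0.3, 0.1]},
]
results = top_k([1.0, 0.2, 0.0], chunks, k=2)
for chunk in results:
    print(chunk["text"])
```

A production vector store uses approximate nearest-neighbour indexes rather than sorting every chunk, but the ranking idea is the same.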


3️⃣ Augmentation Phase (A)

This is the bridge between search and generation.


Step 1: Combine retrieved chunks

Relevant Chunks → Context

  • Chunks are:
    • Deduplicated
    • Ordered
    • Truncated to token limits

✅ Typical structure

Context:

[Chunk 1]

[Chunk 2]

[Chunk 3]


Step 2: Augment the user query

Prompt = System Instructions

       + User Query

       + Retrieved Context

Example:

You are an HR policy assistant.

Answer ONLY using the context below.

Context:

<retrieved chunks>

Question:

How does leave approval work?
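
Assembling that prompt is plain string formatting. The sketch below assumes the retrieved chunks arrive as a list of strings; `build_prompt` and the two sample chunks are hypothetical and only illustrate the layout shown above.

```python
def build_prompt(system_instructions, question, chunks):
    """Combine instructions, retrieved context, and the user question."""
    context = "\n\n".join(
        f"[Chunk {i}] {text}" for i, text in enumerate(chunks, start=1)
    )
    return (
        f"{system_instructions}\n\n"
        f"Context:\n{context}\n\n"
        f"Question:\n{question}"
    )


prompt = build_prompt(
    "You are an HR policy assistant.\nAnswer ONLY using the context below.",
    "How does leave approval work?",
    # Invented chunk texts, standing in for real retrieval results.
    ["Leave requests are approved by the line manager.",
     "Approvals take up to five days."],
)
print(prompt)
```

Numbering the chunks makes it easier to ask the model for citations and to trace an answer back to its sources.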

✅ Why this is powerful

  • LLM is forced to ground answers
  • No reliance on model’s internal memory
  • Enables citations & traceability

4️⃣ Generation Phase (G)

This is where the LLM produces the final answer.


Step 1: Feed augmented prompt to LLM

Prompt → Azure OpenAI GPT Model

  • Model sees:
    • The question
    • The retrieved enterprise knowledge
  • It does reasoning + language generation

Step 2: Generate response

LLM → Final Answer

✅ Characteristics of RAG responses

  • Grounded in provided data
  • Up‑to‑date (depends on index)
  • Enterprise‑safe
  • Explainable (can show source chunks)

🔁 Why RAG is superior to fine‑tuning for enterprise data

Aspect | Fine‑tuning | RAG
Data freshness | Static | Real‑time
Cost | High | Low
Hallucination risk | Medium | Low
Source citations | Hard | Easy
Compliance | Risky | Strong

🧠 Key architectural best practices

  1. Chunk size matters more than model choice
  2. Use hybrid search (vector + keyword) in production
  3. Add metadata filtering for access control
  4. Keep prompt instructions strict
  5. Log retrieved chunks for observability

✅ Final mental model

Think of RAG as:

“Search engine + LLM reasoning layer”

  • Vector DB = semantic memory
  • Embeddings = meaning encoder
  • LLM = language + reasoning engine

Why RAG exists

Problem | RAG Solution
LLM hallucinations | Ground answers in real data
Stale knowledge | Fetch live or frequently updated content
Private data | Keep proprietary data outside model training
Cost of fine-tuning | Avoid retraining models

Production Use Cases Commonly Implemented with RAG

Below are real, production-grade RAG use cases that teams deploy—not demos.


1. Enterprise Knowledge Assistant

Use case

  • Internal chatbot for policies, SOPs, wikis, Confluence, PDFs

How RAG helps

  • Retrieves policy clauses or documents
  • Answers with citations and source links

Production details

  • Chunking by semantic sections
  • Role-based access filtering at retrieval time
  • Caching frequent queries

2. Customer Support & Helpdesk Automation

Use case

  • Support bot answering FAQs, troubleshooting guides, and manuals

How RAG helps

  • Grounds answers in official docs
  • Reduces hallucinated instructions

Enhancements

  • Confidence thresholds → fallback to human agent
  • Query rewriting for vague user questions

3. Code & Developer Assistants

Use case

  • Query internal repositories, APIs, and design docs

How RAG helps

  • Retrieves relevant code snippets
  • Explains logic using actual implementation

Key technique

  • Repository-aware chunking (functions, classes)
  • Metadata filters (language, repo, branch)

4. Legal / Compliance Search

Use case

  • Contract analysis, regulation Q&A, audit prep

Why RAG is critical

  • Exact wording matters
  • Answers must be traceable

Production safeguards

  • Source citation mandatory
  • Retrieval-only mode for sensitive answers

5. Analytics & BI Natural Language Interface

Use case

  • Ask questions over dashboards, metrics definitions, and data catalogs

RAG role

  • Retrieves metric definitions before generating explanations
  • Prevents semantic drift (“revenue” vs “net revenue”)

6. Healthcare / Scientific Literature Assistants

Use case

  • Search clinical guidelines, research papers
    (for example, care plans: both the person receiving care and the care staff can ask how to cope with or manage specific situations)

Why RAG

  • Models must not invent facts
  • Must cite authoritative sources

Controls

  • Strict context window limits
  • Generation constrained to retrieved text

Typical Production RAG Stack

Common tooling used in real systems:

  • Frameworks
    • LangChain
    • LlamaIndex
  • Vector Databases
    • Pinecone
    • Weaviate
    • FAISS
  • LLMs
    • OpenAI
    • Anthropic

What Makes a RAG System “Production-Ready”?

Key differences from toy implementations:

  • Advanced chunking (semantic, hierarchical)
  • Hybrid retrieval (vector + keyword/BM25)
  • Re-ranking models for precision
  • Observability (retrieval quality, answer grounding)
  • Security (PII filtering, ACL-aware retrieval)
  • Evaluation pipelines (faithfulness, relevance, latency)

Summary (Executive View)

  • RAG = Retrieval + Generation
  • It grounds LLM outputs in trusted, up-to-date data
  • It’s the default architecture for enterprise LLM applications
  • Most real-world LLM products today are RAG-based
  • Re-ranking is a second-pass ranking that improves the quality of retrieved documents. Initial retrieval is fast but approximate; re-ranking is slower but more accurate. You use both together: retrieve the top 100 quickly with vectors, then re-rank those 100 accurately with a stronger model.

=====================================================================

1️⃣ End-to-End Production RAG Architecture

High-Level Flow

A production RAG system has two pipelines:

  • Offline indexing pipeline
  • Online query pipeline

A. Offline Pipeline (Indexing Phase)

Step 1: Data Ingestion

Sources:

  • PDFs
  • Confluence / SharePoint
  • Databases
  • S3
  • Git repos
  • APIs

Step 2: Preprocessing

  • Cleaning
  • Deduplication
  • PII masking (if required)
  • Metadata enrichment (doc type, department, ACL tags)

Step 3: Chunking Strategy

This is critical.

Common strategies:

  • Fixed token windows (e.g., 512 tokens + overlap)
  • Semantic chunking (split by section headers)
  • Recursive chunking (hierarchical)

Poor chunking = poor retrieval.


Step 4: Embeddings

Use embedding models from:

  • OpenAI
  • Cohere
  • Google

Each chunk → converted into a dense vector.


Step 5: Vector Store

Stored in:

  • Pinecone
  • Weaviate
  • FAISS
  • Milvus

Metadata indexing:

  • department
  • document version
  • access control tags
  • timestamps

B. Online Pipeline (Query Time)

Step 1 — User Query

Example:

“What’s the data retention policy for EU customers?”


Step 2 — Query Processing

  • Query rewriting
  • Expansion
  • Intent detection
  • Metadata filters (e.g., EU region only)

Step 3: Retrieval

Modern systems use Hybrid Retrieval:

  • Dense vector similarity
  • BM25 keyword search
  • Metadata filtering

Then:

  • Re-ranking using cross-encoder models
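
One common way to merge the dense and keyword result lists before re-ranking is reciprocal rank fusion (RRF). The post does not say which fusion method to use, so treat this as one possible option rather than the method. The document ids are made up, and `k=60` is the constant conventionally used with RRF.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each id scores 1 / (k + rank) in every list it appears in;
    the scores are summed across lists.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


dense_ranking = ["c3", "c1", "c7"]  # from vector similarity (made-up ids)
bm25_ranking = ["c1", "c9", "c3"]   # from keyword search (made-up ids)
fused = reciprocal_rank_fusion([dense_ranking, bm25_ranking])
print(fused)
```

Chunks that rank well in both lists rise to the top, without having to put vector scores and BM25 scores on a common scale.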

Step 4 — Context Construction

Top-K chunks (e.g., 5–20) are:

  • Deduplicated
  • Ordered
  • Compressed (if needed)

Inserted into prompt template:

You must answer using ONLY the context below.

If answer not found, say “Not found”.

Context:

[retrieved chunks]
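
The deduplicate, order, and compress steps can be sketched as follows. This is a simplified illustration: word counts stand in for real token counts, "compression" is reduced to stopping at a budget, and the scored chunks are invented examples.

```python
def build_context(scored_chunks, max_tokens=50):
    """Deduplicate, order by score, and stop once the token budget is spent."""
    seen = set()
    selected = []
    used = 0
    ordered = sorted(scored_chunks, key=lambda pair: pair[0], reverse=True)
    for _score, text in ordered:
        normalized = " ".join(text.lower().split())
        if normalized in seen:
            continue  # skip duplicates (ignoring case and spacing)
        cost = len(text.split())  # word count as a rough stand-in for tokens
        if used + cost > max_tokens:
            break
        seen.add(normalized)
        selected.append(text)
        used += cost
    return "\n\n".join(selected)


context = build_context(
    [
        (0.91, "EU customer data is retained for 24 months."),
        (0.88, "eu customer data is retained for 24 months."),
        (0.75, "Deletion requests are processed within 30 days."),
        (0.40, "Office hours are 9 to 5."),
    ],
    max_tokens=16,
)
print(context)
```

The returned string is what would replace the `[retrieved chunks]` placeholder in the template.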


Step 5: Generation

LLM examples:

  • OpenAI
  • Anthropic

Output:

  • Answer
  • Citations
  • Confidence score (optional)

Step 6: Observability Layer

Track:

  • Retrieval latency
  • Answer faithfulness
  • Token cost
  • Query success rate
  • Hallucination rate

Production systems ALWAYS include logging + evaluation.


2️⃣ RAG vs Fine-Tuning

Here’s the decision framework.

Dimension | RAG | Fine-Tuning
Knowledge updates | Real-time | Requires retraining
Private data | Stays external | Embedded into weights
Hallucination control | High (if good retrieval) | Lower
Cost | Cheaper long term | Expensive training
Personalization | Metadata-based | Style-based
Domain knowledge depth | Moderate | Very deep possible

When to Use RAG

  • Dynamic knowledge
  • Large document corpora (a corpus is a collection of written or spoken texts)
  • Need citations
  • Compliance-heavy domains
  • Enterprise data access control

When to Fine-Tune

  • Tone/style control
  • Structured output consistency
  • Domain language modeling
  • Classification tasks
  • Reducing prompt length

Hybrid Approach (Common in Production)

Most serious systems:

  • Use RAG for knowledge
  • Fine-tune for behavior

Example:
Fine-tuned LLM + RAG retrieval backend.


3️⃣ Common Production Failure Modes

Now we move into real problems teams face.


1. Retrieval Miss (Most Common)

Problem:
Correct answer exists in corpus but not retrieved.

Causes:

  • Bad chunking
  • Embedding mismatch
  • Query phrasing mismatch
  • Top-K too small

Fix:

  • Hybrid retrieval
  • Query rewriting
  • Better chunk granularity

2. Context Overload

Too many chunks → LLM confusion.

Symptoms:

  • Blended answers
  • Irrelevant info included
  • Long but low-quality responses

Fix:

  • Re-ranking
  • Context compression
  • Smaller top-K

3. Hallucination Despite Retrieval

LLM ignores context and fabricates.

Fix:

  • Strict prompting
  • Answer-only-from-context instructions
  • Retrieval-only fallback mode

4. Access Control Leakage

User retrieves documents they shouldn’t.

Fix:

  • Metadata-based ACL filtering before retrieval
  • Zero trust design
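
A minimal sketch of ACL filtering before retrieval, assuming each chunk carries a list of ACL tags in its metadata. The field name `acl`, the group names, and `allowed_chunks` are all hypothetical.

```python
def allowed_chunks(chunks, user_groups):
    """Keep only chunks whose ACL tags intersect the user's groups."""
    groups = set(user_groups)
    return [chunk for chunk in chunks if groups & set(chunk["acl"])]


# Made-up chunks and groups for illustration.
chunks = [
    {"id": "hr-1", "acl": ["hr", "managers"]},
    {"id": "fin-1", "acl": ["finance"]},
    {"id": "pub-1", "acl": ["everyone"]},
]
visible = allowed_chunks(chunks, user_groups=["everyone", "hr"])
visible_ids = [chunk["id"] for chunk in visible]
print(visible_ids)
```

Similarity search then runs only over `visible`, so a restricted chunk can never reach the prompt, however relevant it is.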

5. Latency Explosion

Vector search + rerank + LLM = slow.

Fix:

  • Cache embeddings
  • Smaller embedding models
  • Asynchronous re-ranking
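
Caching embeddings can be as simple as memoizing the embedding call for repeated queries. In this sketch, `expensive_embed` is a stand-in for a remote embedding request (it returns a fake vector), and the call counter exists only to show the cache working.

```python
from functools import lru_cache

calls = {"count": 0}


def expensive_embed(text):
    """Stand-in for a slow remote embedding call (hypothetical)."""
    calls["count"] += 1
    # Fake "embedding": character codes of the first four characters.
    return tuple(float(ord(ch)) for ch in text[:4])


@lru_cache(maxsize=10_000)
def cached_embed(text):
    """Return the embedding, computing it only on the first request."""
    return expensive_embed(text)


first = cached_embed("leave policy")
second = cached_embed("leave policy")  # served from the cache
print(calls["count"])
```

Returning a tuple rather than a list keeps the cached value immutable, so one caller cannot accidentally modify what another caller receives.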

6. Embedding Drift

Switching embedding models breaks retrieval quality.

Always re-embed full corpus if model changes.


4️⃣ Evaluation Metrics for RAG Systems

Evaluation must measure:

  • Retrieval quality
  • Generation quality
  • End-to-end performance

A. Retrieval Metrics

Measured against labeled dataset.

  • Recall@K: % of queries where the correct doc appears in the top K
  • Precision@K: % of the top-K retrieved docs that are relevant
  • MRR (Mean Reciprocal Rank)
  • nDCG (ranking quality metric)
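
Three of these metrics are easy to compute by hand. The sketch below follows the definitions above (Recall@K as the share of queries with a correct doc in the top K). The labeled queries are invented, and the function names are illustrative.

```python
def recall_at_k(retrieved, relevant, k):
    """1.0 if any relevant id appears in the top k, else 0.0 (for one query)."""
    return 1.0 if set(retrieved[:k]) & set(relevant) else 0.0


def precision_at_k(retrieved, relevant, k):
    """Share of the top-k retrieved ids that are relevant (for one query)."""
    return sum(1 for doc_id in retrieved[:k] if doc_id in relevant) / k


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant id, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean(values):
    return sum(values) / len(values)


# Made-up labeled dataset: retrieved ids in rank order, plus the relevant ids.
queries = [
    {"retrieved": ["d2", "d5", "d1"], "relevant": {"d1"}},
    {"retrieved": ["d4", "d3", "d9"], "relevant": {"d4", "d9"}},
]
recall_2 = mean([recall_at_k(q["retrieved"], q["relevant"], 2) for q in queries])
precision_2 = mean([precision_at_k(q["retrieved"], q["relevant"], 2) for q in queries])
mrr = mean([reciprocal_rank(q["retrieved"], q["relevant"]) for q in queries])
print(recall_2, precision_2, mrr)
```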

B. Generation Metrics

Key metric:

1. Faithfulness (Groundedness)

Is the answer supported by retrieved context?

Measured via:

  • LLM-as-judge
  • Fact overlap scoring

2. Answer Relevance

Does answer match question?


3. Hallucination Rate

% answers containing unsupported claims.


C. End-to-End Business Metrics

Most important in production:

  • Task completion rate
  • Escalation rate (to human)
  • CSAT (if support bot): customer satisfaction score, which measures how pleased customers are with their chatbot interactions
  • Cost per query
  • Latency

D. Automated RAG Evaluation Frameworks

Used in production:

  • LangChain
  • LlamaIndex
  • Weights & Biases

They help measure:

  • Retrieval recall
  • Groundedness
  • Regression testing after updates

Final Executive Summary

Production RAG is NOT:

“Embed PDFs → call LLM → done.”

It is:

  • Carefully designed ingestion
  • Advanced retrieval strategies
  • Context optimization
  • Strict evaluation loops
  • Continuous monitoring

Machine Learning and Deep Learning Basics

10 Sep

1. Introduction

    Artificial Intelligence is a broad field, but most of its modern breakthroughs stem from Machine Learning (ML) and its subfield Deep Learning (DL).

      • Machine Learning focuses on algorithms that learn patterns from data and improve with experience.

      • Deep Learning is a specialized subset of ML that uses neural networks with many layers to process large, complex data like images, speech, and text.

      2. Concepts & Explanations

      Machine Learning Paradigms

      Machine Learning (ML) is a core subfield of Artificial Intelligence that enables systems to learn from data and improve over time without being explicitly programmed for every task. In ML, models identify patterns and make decisions based on training data.

      Machine learning is the scientific study of algorithms and statistical models that computer systems use to effectively perform a specific task without using explicit instructions, relying on patterns and inference instead.

      In short, we build a model by learning the patterns and relationships in historical data, then use it to make data-driven predictions.

      General Architecture of Machine Learning: 

      Business understanding: Understand the given use case, and also, it’s good to know more about the domain for which the use cases are built.

      Data Acquisition and Understanding: Gather data from different sources and understand it. This includes cleaning the data, handling any missing data, data wrangling, and EDA (exploratory data analysis).

      Modeling: Feature engineering covers scaling the data and feature selection, since not all features are important. We use backward elimination, correlation factors, PCA, and domain knowledge to select the features.

      Model training: Based on trial and error or on experience, we select an algorithm and train it with the selected features.

      Model evaluation: We assess the model using accuracy, the confusion matrix, and cross-validation. If accuracy is not high enough, we tune the model by changing the algorithm, revisiting feature selection, gathering more data, and so on.

      Deployment: Once the model has good accuracy, we deploy it in the cloud, on a Raspberry Pi, or anywhere else. After deployment, we monitor the model’s performance. If it is good, we go live with the model; otherwise, we repeat the whole process until performance is good.

      It’s not done yet! If, after a few days, the model performs badly because of new data, we repeat the process: collect new data, retrain, and redeploy the model.

      ML can be classified into 3 main paradigms:

      Supervised Learning: Here the machine learns from labeled data.

      Learn from labeled data (input + output).

      Example: Predicting house prices from square footage.

      Algorithms: Linear regression, decision trees, support vector machines.

      In supervised learning, the model is trained on a labeled dataset — where each input is paired with the correct output. The goal is to learn a mapping from inputs to outputs, enabling the model to make accurate predictions on new, unseen data.

      Supervised learning is classified into two categories of algorithms:
      Classification: A classification problem is when the output variable is a category, such as “Red” or “blue”, “disease” or “no disease”.
      Regression: A regression problem is when the output variable is a real value, such as “dollars” or “weight”.

      Examples:

      •  Predicting house prices (input: size, location; output: price)
      • Email spam detection (input: email content; output: spam/not spam)
      •  Image classification (input: image pixels; output: object label)

      Common Algorithms:

      • Linear Regression
      • Logistic Regression
      • Decision Trees
      • Support Vector Machines (SVM)
      • Neural Networks

      Use Cases:
                              1. Fraud detection
                              2. Sentiment analysis
                              3. Medical diagnosis

      Unsupervised Learning
        • Learn from unlabeled data, finding patterns and structure.
        • Example: Grouping customers into segments based on shopping behavior.
        • Algorithms: K-means clustering, PCA (Principal Component Analysis).

      In unsupervised learning, the model is given input data without labels. The goal is to find hidden patterns, groupings, or structures within the data.

      Unsupervised learning, in contrast, gives the algorithm unlabeled data, which it tries to make sense of by extracting features, co-occurrences, and underlying patterns on its own.

      Examples:

      •  Clustering users based on browsing behavior
      •  Dimensionality reduction for data visualization
      •   Anomaly detection in network traffic

      Common Algorithms:

      • K-Means Clustering
      • Hierarchical Clustering
      • Principal Component Analysis (PCA)
      • Autoencoders

      Use Cases:

      • Market segmentation
      • Recommender systems
      • Data compression

      Reinforcement Learning (RL)

      An agent learns by interacting with an environment, receiving rewards or penalties.

      Example: Training a robot to walk, or AI to play chess.

      Reinforcement Learning (RL) is a goal-directed learning paradigm in which an agent learns to make decisions by interacting with an environment. It receives feedback in the form of rewards or penalties based on its actions and aims to maximize cumulative reward over time.

      Reinforcement learning is less supervised: the learning agent determines its own outputs by trying different possible actions to arrive at the best solution.

      Examples:

      Teaching a robot to walk

      Training an AI to play chess or Go

      Optimizing delivery routes

      Key Concepts:

      Agent: The learner or decision-maker

      Environment: Everything the agent interacts with

      Reward Signal: Feedback for good or bad behavior

      Policy: The strategy the agent uses to make decisions
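
To make these four concepts concrete, here is a deliberately tiny sketch: a two-action "bandit" problem with an epsilon-greedy policy. The reward values, `environment`, and `run_agent` are invented for illustration; real RL problems also involve states and delayed rewards, which this toy leaves out.

```python
import random


def environment(action):
    """Environment: returns a fixed reward per action (made-up values)."""
    return [0.2, 1.0][action]


def run_agent(steps=200, epsilon=0.1, seed=0):
    """Agent: learns reward estimates and acts with an epsilon-greedy policy."""
    rng = random.Random(seed)
    estimates = [0.0, 0.0]  # running estimate of each action's reward
    counts = [0, 0]
    total_reward = 0.0
    for step in range(steps):
        if step < 2:
            action = step  # try each action once to start
        elif rng.random() < epsilon:
            action = rng.randrange(2)  # explore: pick a random action
        else:
            action = max(range(2), key=lambda a: estimates[a])  # exploit
        reward = environment(action)  # reward signal
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, total_reward


estimates, total_reward = run_agent()
best_action = max(range(2), key=lambda a: estimates[a])
print(estimates, best_action)
```

The policy here is the explore-or-exploit rule inside the loop: mostly choose the action with the highest estimated reward, and occasionally try another one.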

      Use Cases:

      • Game AI
      • Robotics control
      • Dynamic pricing
      • Personalized recommendations

      💡 Analogy:

      • Supervised → A teacher provides answers to all practice problems.
      • Unsupervised → A student tries to find patterns in problems without answers.
      • Semi-supervised → Some answers are given, the rest must be figured out.

      • Reinforcement → Learning by trial and error, like a baby learning to walk.

      Traditional ML vs. Deep Learning

      • Traditional ML
        • Relies on hand-crafted features (e.g., edge detectors in images).
        • Works well for small-to-medium datasets.
        • Examples: Decision Trees, Random Forests, SVMs.
      • Deep Learning
        • Learns features automatically from raw data using neural networks.
        • Requires large datasets and high computational power.
        • Excels in complex tasks (e.g., speech recognition, image generation).

      📘 Diagram (text description):

      • Traditional ML pipeline: Raw Data → Feature Engineering → Model Training → Prediction.
      • Deep Learning pipeline: Raw Data → Neural Network (learns features + model) → Prediction.

      2.3 Neural Networks: Architecture & Learning

      A neural network is inspired by the human brain:

      • Neurons → simple units that take input, apply a function, and pass output.
      • Layers
        • Input layer (data features).
        • Hidden layers (transformations).
        • Output layer (prediction/classification).

      💡 Analogy: Imagine a bakery:

      • Input layer → Ingredients.
      • Hidden layers → Baking process (mixing, heating, decorating).
      • Output layer → Final cake.

      2.4 Core Concepts

      • Activation Functions: Decide how much signal passes through a neuron.
        • Examples: Sigmoid, ReLU, Tanh.
      • Loss Function: Measures how far predictions are from true values.
        • Example: Mean Squared Error for regression, Cross-Entropy Loss for classification.
      • Optimizers: Algorithms that adjust model parameters to minimize loss.
        • Example: Gradient Descent, Adam optimizer.
      • Backpropagation: The process of propagating errors backward through the network to update weights.
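
The interplay of loss function, gradient, and optimizer can be seen in the smallest possible case: a single linear neuron with one weight, no activation function, trained by plain gradient descent on mean squared error. This is a sketch under those simplifying assumptions; with only one weight, "backpropagation" reduces to a single application of the chain rule.

```python
def mse(w, xs, ys):
    """Loss function: mean squared error of the predictions w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_linear_neuron(xs, ys, learning_rate=0.01, epochs=200):
    """Fit y = w * x with gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the loss with respect to w:
        # d(MSE)/dw = (2/n) * sum((w*x - y) * x)
        grad = (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))
        # Optimizer step: move the weight against the gradient.
        w -= learning_rate * grad
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated by the rule y = 2x
w = train_linear_neuron(xs, ys)
print(w, mse(w, xs, ys))
```

The learned weight approaches 2, the slope that generated the data, and the loss shrinks towards zero as it does.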

      2.5 Evaluation Metrics

      Different tasks require different evaluation metrics:

      • Classification: Accuracy, Precision, Recall, F1-score.
      • Regression: Mean Squared Error (MSE), R² score.
      • Generative models (later chapters): BLEU score, Perplexity, Fréchet Inception Distance (FID).
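
For the classification metrics, a from-scratch sketch shows what each one counts. The labels and predictions are made up, and in practice a library such as scikit-learn provides these metrics ready-made.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels and predictions (1 = positive class).
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.75 while accuracy is 4 out of 6.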

      2.6 Bias, Fairness & Ethics

      AI models can inherit bias from data:

      • Example: A recruitment model trained on biased data may unfairly reject female candidates.
      • Fairness techniques: Data balancing, bias detection, and fairness-aware algorithms.
      • Ethics: Transparency, accountability, and ensuring AI benefits society.

      3. Use Cases & Applications

      • Healthcare: Predict disease risks, detect cancer from medical scans.
      • Finance: Credit scoring, fraud detection.
      • Education: Personalized learning systems.
      • Retail: Customer segmentation, demand forecasting.
      • Transportation: Autonomous driving using deep learning for object detection.

      4. Algorithms & Techniques

      Let’s explore two practical ML approaches:

      4.1 Supervised Learning Example (Classification)

      # Classifying iris flowers using scikit-learn
      from sklearn.datasets import load_iris
      from sklearn.model_selection import train_test_split
      from sklearn.tree import DecisionTreeClassifier
      from sklearn.metrics import accuracy_score

      # Load dataset
      iris = load_iris()
      X, y = iris.data, iris.target

      # Split into train and test sets
      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

      # Train decision tree (fixed seed so results are reproducible)
      clf = DecisionTreeClassifier(random_state=42)
      clf.fit(X_train, y_train)

      # Predictions
      y_pred = clf.predict(X_test)
      print("Accuracy:", accuracy_score(y_test, y_pred))

      4.2 Deep Learning Example (Neural Network for Digit Recognition)

      import tensorflow as tf
      from tensorflow.keras.datasets import mnist
      from tensorflow.keras.models import Sequential
      from tensorflow.keras.layers import Dense, Flatten
      from tensorflow.keras.utils import to_categorical

      # Load dataset
      (X_train, y_train), (X_test, y_test) = mnist.load_data()
      X_train, X_test = X_train / 255.0, X_test / 255.0  # normalize pixels to [0, 1]
      y_train, y_test = to_categorical(y_train), to_categorical(y_test)  # one-hot labels

      # Build neural network
      model = Sequential([
          Flatten(input_shape=(28, 28)),    # 28x28 image -> 784-element vector
          Dense(128, activation='relu'),    # hidden layer
          Dense(10, activation='softmax')   # one output per digit class
      ])

      # Compile and train
      model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
      model.fit(X_train, y_train, epochs=5, validation_data=(X_test, y_test))

      # Evaluate
      loss, accuracy = model.evaluate(X_test, y_test)
      print("Test Accuracy:", accuracy)


      5. Case Study / Mini-Project

      Mini-Project: Spam Email Classifier

      We’ll build a simple spam detection model using Naive Bayes.

      from sklearn.model_selection import train_test_split
      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.naive_bayes import MultinomialNB
      from sklearn.metrics import classification_report

      # Example dataset (tiny, for illustration only)
      emails = ["Win money now!!!", "Meeting at 3 pm", "Get cheap loans instantly", "Lunch tomorrow?"]
      labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

      # Vectorize text into word counts
      vectorizer = CountVectorizer()
      X = vectorizer.fit_transform(emails)

      # Train-test split (with 4 emails, the test set holds a single email)
      X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.25, random_state=42)

      # Train Naive Bayes classifier
      clf = MultinomialNB()
      clf.fit(X_train, y_train)

      # Predictions
      y_pred = clf.predict(X_test)

      # zero_division=0 avoids warnings for classes absent from the tiny test set
      print(classification_report(y_test, y_pred, zero_division=0))

      ➡️ This shows how supervised learning can classify emails as spam or not spam.


      6. Summary

      We explored practical examples: iris classification, digit recognition, and spam detection.
      Machine Learning enables computers to learn patterns from data.

      Main paradigms: supervised, unsupervised, semi-supervised, reinforcement learning.
      Traditional ML relies on feature engineering; Deep Learning learns features automatically.
      Key concepts: activation functions, loss functions, optimizers, and backpropagation.
      Evaluation metrics help measure model performance.
      Ethical challenges (bias, fairness) must be addressed.

      ➡️ (The above content is taken from my best-selling book Generative AI & Machine Learning)


      FastAPI Project Structure

      14 Aug

      This FastAPI project structure incorporates multiple design patterns, including:

      • Dependency Injection
      • Repository Pattern
      • Service Layer
      • Strategy Pattern
      • Factory Pattern
      • Observer Pattern
      • Builder Pattern
      • Domain-Driven Design (DDD) principles

      🧱 Project Structure

      fastapi_project/
      │
      ├── app/
      │   ├── main.py                  # FastAPI app entry point
      │   ├── config.py                # App configuration
      │   ├── dependencies/            # DI providers
      │   │   ├── db.py
      │   │   ├── auth.py
      │   │   └── __init__.py
      │   │
      │   ├── models/                  # Pydantic models
      │   │   ├── user.py
      │   │   ├── request.py
      │   │   └── __init__.py
      │   │
      │   ├── domain/                  # Domain models (DDD)
      │   │   ├── entities/
      │   │   │   ├── user_entity.py
      │   │   │   └── __init__.py
      │   │   └── __init__.py
      │   │
      │   ├── repositories/            # Repository pattern
      │   │   ├── user_repository.py
      │   │   └── __init__.py
      │   │
      │   ├── services/                # Business logic (Service Layer)
      │   │   ├── user_service.py
      │   │   └── __init__.py
      │   │
      │   ├── strategies/              # Strategy pattern
      │   │   ├── auth/
      │   │   │   ├── jwt_strategy.py
      │   │   │   ├── oauth_strategy.py
      │   │   │   └── __init__.py
      │   │   └── __init__.py
      │   │
      │   ├── factories/               # Factory pattern
      │   │   ├── service_factory.py
      │   │   └── __init__.py
      │   │
      │   ├── observers/               # Observer pattern
      │   │   ├── event_manager.py
      │   │   └── __init__.py
      │   │
      │   ├── builders/                # Builder pattern
      │   │   ├── report_builder.py
      │   │   └── __init__.py
      │   │
      │   ├── middleware/              # Custom middleware
      │   │   ├── logging_middleware.py
      │   │   └── __init__.py
      │   │
      │   ├── routes/                  # API routes
      │   │   ├── user_routes.py
      │   │   └── __init__.py
      │   │
      │   └── utils/                   # Utility functions
      │       ├── helpers.py
      │       └── __init__.py
      │
      ├── requirements.txt
      └── README.md
      

      🧩 How Patterns Fit Together

      Pattern              | Folder        | Purpose
      Dependency Injection | dependencies/ | Inject DB, auth, config
      Repository           | repositories/ | Abstract DB access
      Service Layer        | services/     | Business logic
      Strategy             | strategies/   | Pluggable auth or processing logic
      Factory              | factories/    | Create services based on config
      Observer             | observers/    | Event-driven notifications
      Builder              | builders/     | Construct complex objects
      DDD                  | domain/       | Domain entities and aggregates
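
To make the table more concrete, here is a minimal sketch of how the Repository, Service Layer, Strategy, and Factory pieces might be wired together behind a single DI provider. Every name in it (`User`, `UserRepository`, `InMemoryUserRepository`, `JwtStrategy`, `OAuthStrategy`, `make_auth_strategy`, `UserService`, `get_user_service`) is a hypothetical illustration and not part of FastAPI, and the token checks are placeholders rather than real authentication. It uses only the standard library so that it runs as-is.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class User:
    id: int
    name: str


class UserRepository(Protocol):
    """Repository pattern: the service depends on this interface, not on a DB."""

    def get(self, user_id: int) -> Optional[User]: ...


class InMemoryUserRepository:
    """Stand-in for a database-backed repository (assumption for this demo)."""

    def __init__(self) -> None:
        self._users = {1: User(id=1, name="Ada")}

    def get(self, user_id: int) -> Optional[User]:
        return self._users.get(user_id)


class AuthStrategy(Protocol):
    """Strategy pattern: interchangeable ways to validate a token."""

    def authenticate(self, token: str) -> bool: ...


class JwtStrategy:
    def authenticate(self, token: str) -> bool:
        # Placeholder check only; a real strategy would verify a signature.
        return token.startswith("jwt:")


class OAuthStrategy:
    def authenticate(self, token: str) -> bool:
        # Placeholder check only; a real strategy would call the provider.
        return token.startswith("oauth:")


def make_auth_strategy(kind: str) -> AuthStrategy:
    """Factory pattern: choose an implementation from configuration."""
    strategies = {"jwt": JwtStrategy, "oauth": OAuthStrategy}
    if kind not in strategies:
        raise ValueError(f"unknown auth strategy: {kind}")
    return strategies[kind]()


class UserService:
    """Service layer: business logic combining the repository and strategy."""

    def __init__(self, repo: UserRepository, auth: AuthStrategy) -> None:
        self._repo = repo
        self._auth = auth

    def get_user_name(self, user_id: int, token: str) -> str:
        if not self._auth.authenticate(token):
            raise PermissionError("invalid token")
        user = self._repo.get(user_id)
        if user is None:
            raise LookupError(f"user {user_id} not found")
        return user.name


def get_user_service() -> UserService:
    """DI provider that assembles the service from its collaborators."""
    return UserService(InMemoryUserRepository(), make_auth_strategy("jwt"))
```

In a FastAPI application, a provider like `get_user_service` would typically be passed to `Depends(...)` in a route function's signature, so the route never constructs the repository or strategy itself. Swapping `"jwt"` for `"oauth"` in the factory call changes the auth behavior without touching `UserService`.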

      Conclusion: In a FastAPI project, you can implement several software design patterns to improve modularity, scalability, and maintainability. The project structure and table above give a categorized overview of the most relevant patterns and how they apply to FastAPI.

      Understanding the Evolution: Generative AI, AI Agents, and Agentic AI

      12 Aug

      Introduction

      As artificial intelligence continues to evolve, it’s essential to distinguish between three foundational yet distinct paradigms: Generative AI, AI Agents, and Agentic AI. While these concepts are closely related, each represents a different level of autonomy, complexity, and capability. This guide breaks down their core differences, practical applications, and how they build upon one another.

       


      1. Generative AI: Creating Content on Demand

      What It Is

      Generative AI refers to models, such as large language models (LLMs) and image generators, that produce new content based on patterns learned from massive datasets. These models are trained on diverse data (text, images, audio, video) and often contain billions of parameters.

      How It Works

      Generative AI is reactive: it responds to user prompts without initiating actions or managing tasks. For example, when prompted to “write a poem about data science,” the model generates a poem but doesn’t decide to write one on its own.

      Key Features

      • Trained on large, multimodal datasets

      • Generates text, images, audio, or video

      • Requires prompt engineering to guide output

      • Examples: OpenAI’s GPT-4, Meta’s Llama 3, xAI’s Grok

      • Supported by frameworks like LangChain and LlamaIndex


      2. AI Agents: Task-Oriented Intelligence

      What They Are

      AI agents extend generative AI by adding autonomy and interactivity. They can perform specific tasks by integrating with external tools and APIs, making them more dynamic and useful in real-world applications.

      Why They Matter

      LLMs alone can’t access real-time or private data. AI agents solve this by making tool calls—requests to external systems—to fetch current or specialized information.

      Example Workflow

      1. User asks a question.

      2. Agent checks if the LLM can answer.

      3. If not, it calls an external API (e.g., for today’s news).

      4. It processes the response.

      5. It delivers a final answer to the user.
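
The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `stub_llm` and `news_tool` are hypothetical stand-ins for a real model and a real external API, and the tool choice is hard-coded, whereas real agents usually let the model decide which tool to call.

```python
from typing import Callable, Dict, Optional


def stub_llm(question: str) -> Optional[str]:
    """Hypothetical stand-in for an LLM; returns None when it cannot answer."""
    known = {"What is 2 + 2?": "4"}
    return known.get(question)


def news_tool(query: str) -> str:
    """Hypothetical external API; a real tool would make a network request."""
    return f"(stub headline for: {query})"


def run_agent(
    question: str,
    llm: Callable[[str], Optional[str]],
    tools: Dict[str, Callable[[str], str]],
) -> str:
    # Steps 1-2: receive the question and check whether the LLM can answer.
    answer = llm(question)
    if answer is not None:
        return answer
    # Step 3: otherwise call an external tool (hard-coded to "news" here).
    raw = tools["news"](question)
    # Steps 4-5: process the tool response and deliver the final answer.
    return f"According to the news tool: {raw}"
```

Calling `run_agent("What is 2 + 2?", stub_llm, {"news": news_tool})` returns the model's own answer, while a question the stub does not know falls through to the tool call.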

      Key Features

      • Built on LLMs with external tool integration

      • Can retrieve real-time or private data

      • Perform single, well-defined tasks

      • Still reactive, but with enhanced capabilities

      • Act autonomously within defined boundaries


      3. Agentic AI: Orchestrating Complex Workflows

      What It Is

      Agentic AI represents the next level—multi-agent systems that collaborate to complete complex, multi-step workflows. Each agent specializes in a subtask, and together they operate like a coordinated team.

      Use Case: YouTube to Blog

      An agentic AI system might:

      1. Extract a transcript from a YouTube video

      2. Generate a blog title

      3. Write a summary and description

      4. Compose a conclusion

      Each step is handled by a different agent, and outputs are passed between them to produce a polished blog post.
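
A minimal sketch of that hand-off, assuming each agent is just a function that reads a shared state dictionary and returns an extended copy. The agent functions, the `State` shape, and the stub strings are all hypothetical; real agents would call a transcript service and an LLM instead of returning fixed text, and only sequential execution is shown.

```python
from typing import Callable, Dict, List

State = Dict[str, str]
Agent = Callable[[State], State]


def transcript_agent(state: State) -> State:
    # A real agent would fetch the transcript for state["video_url"].
    return {**state, "transcript": "stub transcript about vector search"}


def title_agent(state: State) -> State:
    # Uses the previous agent's output to produce a title.
    return {**state, "title": f"Notes on: {state['transcript'][:15]}"}


def summary_agent(state: State) -> State:
    return {**state, "summary": f"Summary of '{state['title']}'"}


def conclusion_agent(state: State) -> State:
    return {**state, "conclusion": "Stub conclusion."}


def run_pipeline(agents: List[Agent], state: State) -> State:
    # Each agent receives the accumulated outputs of the agents before it.
    for agent in agents:
        state = agent(state)
    return state
```

Running `run_pipeline([transcript_agent, title_agent, summary_agent, conclusion_agent], {"video_url": "https://example.com/video"})` yields a dictionary holding every intermediate output, which a final step could assemble into the blog post. Because each agent only depends on keys in the state, one agent can be replaced or reordered without changing the others.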

      Key Features

      • Multiple agents working in sequence or parallel

      • Each agent handles a specific subtask

      • Enables end-to-end automation of complex workflows

      • Supports human feedback for refinement

      • Adds adaptability and robustness through collaboration


      4. Comparative Summary

      [Image: comparison of Generative AI, AI Agents, and Agentic AI]


      5. Strategic Implications

      Generative AI

      Ideal for creative content generation, but limited by its reactive nature. Success depends heavily on prompt quality.

      AI Agents

      Bridge the gap between static models and dynamic applications. Useful in domains like customer service, analytics, and decision support.

      Agentic AI

      Best suited for automating complex, multi-step processes. Aligns with real-world workflows and supports scalability, adaptability, and human oversight.


      Conclusion

      Understanding the distinctions between generative AI, AI agents, and agentic AI is essential for anyone working with modern AI systems. From content creation to autonomous task execution and workflow orchestration, these paradigms represent a clear evolution in capability and complexity. By choosing the right approach, organizations can unlock new levels of efficiency, creativity, and intelligence in their AI-driven solutions.