Semantic Kernel and the Microsoft Agent Framework: what they are and how they power AI agents

September 27, 2025
9 min read

Why this matters

Note: This post was updated to include the Microsoft Agent Framework and how it relates to Semantic Kernel and Azure AI Foundry.

If you’re building apps with large language models, you’ll quickly need a way to organise prompts, call tools, track state, and work with more than one “agent”. Semantic Kernel (SK) is a practical SDK from Microsoft that helps you do exactly that.

Below is a quick, hands-on guide.


How Semantic Kernel is structured (at a glance)

At the centre is the Kernel, which your app code uses directly, optionally through one or more agents (single or multi-agent). The Kernel brings together:

  • Model services: chat and embedding models
  • Tools / plugins: code and prompt functions
  • State & memory: a database or vector store
  • Processes / workflows: optional, for longer-running flows

From there it calls out to external systems such as HTTP APIs (imported via OpenAPI specs) and datastores.

More context: why Semantic Kernel, and what else could you use?

If you just want to call a single model with a fixed prompt, you don’t need a framework. But as soon as you add tools, multiple roles, state, and safety checks, an orchestration layer helps.

Why Semantic Kernel

  • First‑class tool calling (functions as code or prompts)
  • Simple plugin model; easy to wrap your own APIs
  • Works with OpenAI and Azure OpenAI out of the box
  • Import OpenAPI specs to turn HTTP APIs into tools quickly
  • Designed to be used as “library code” inside your app (not a hosted platform)

Other options (at a glance)

  • LangChain: very rich ecosystem of integrations and chains; heavier abstractions
  • LlamaIndex: great for retrieval/RAG pipelines and document loaders
  • AutoGen: focus on multi‑agent conversations with explicit role scripting
  • OpenAI Assistants: hosted orchestration with tools and vector store built‑in
  • CrewAI/LangGraph: graph‑style workflows and agent teams

Where the Microsoft Agent Framework and Azure AI Foundry fit

There are two separate questions people often mix together:

  • How do I write the agent logic (tools, routing, orchestration)?
  • Where do I run and operate the agent (deployment, auth, tracing, evaluation)?

Semantic Kernel mostly answers the first question.

Azure AI Foundry is closer to the second question: models, agent runtimes/services, evaluation, and operational tooling.

The Microsoft Agent Framework sits in the same “agent logic” space as Semantic Kernel, but it focuses on providing a consistent way to build agent applications, patterns, and samples. In practice, teams often mix these:

  • Use an SDK (Semantic Kernel or Agent Framework) for orchestration in code
  • Use Azure AI Foundry for model deployments, evaluation, and operations

High-level map:

  • Azure AI Foundry (platform): model deployments (OpenAI / Azure OpenAI), agent services/runtime, evaluation and safety checks, tracing/monitoring
  • Your codebase (SDKs): Semantic Kernel (plugins, agents, orchestration) and the Microsoft Agent Framework (app patterns, samples, orchestration)

What changed in 2025 (and why it matters in practice)

If you learned SK earlier and haven’t looked at it in a while, a few things are worth knowing:

Why Microsoft introduced the Agent Framework (alongside Semantic Kernel)

The simplest way to think about it: Semantic Kernel is an SDK you use inside your app to connect models to tools and orchestrate work. The Microsoft Agent Framework is a more opinionated set of agent application patterns and samples.

Microsoft’s own SK roadmap also signals a broader direction: integrations across agent runtimes and frameworks, and smoother interoperability (for example with services such as Azure AI Foundry).
https://devblogs.microsoft.com/semantic-kernel/semantic-kernel-roadmap-h1-2025-accelerating-agents-processes-and-integration/

Technical angle (ML/AI/data)

  • Tool calling: expose capabilities with clear schemas. Models decide when to call them
  • Structured outputs: prefer JSON schemas or typed validators (e.g. Pydantic) to keep outputs reliable (see the sketch after this list)
  • State: keep turn history short and persist long‑term facts in a store (DB, vector index)
  • Observability: log prompts, tool calls, latency, and cost per turn. Needed for debugging and safety
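
As a small illustration of the structured-outputs point, here is a minimal sketch that validates a model reply against a typed schema before it is used; the AccountSummary model and its fields are made up for this example.

import json
from pydantic import BaseModel, ValidationError

# Hypothetical schema for a structured reply about an account
class AccountSummary(BaseModel):
    account_id: str
    balance: float
    currency: str

def parse_reply(raw: str) -> AccountSummary | None:
    """Validate the model's JSON output; return None so the caller can retry or fall back."""
    try:
        return AccountSummary(**json.loads(raw))
    except (json.JSONDecodeError, ValidationError):
        return None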

1) Semantic Kernel fundamentals

At heart, SK gives you:

  • A Kernel: where you plug in your model(s) and your tools (called plugins)
  • Functions: either prompt-based or “native” code that the model can call
  • Memory and planners: optional helpers for context and task breakdown

Think of the Kernel as your app’s brain. You register a chat model, add a few useful functions (e.g. search, maths, time), and then ask the model to solve tasks using those functions.

Minimal set‑up (Python) — banking assistant

import asyncio
import os

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    OpenAIChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory


async def main() -> None:
    kernel = Kernel()

    # Pick one provider
    # chat = OpenAIChatCompletion(ai_model_id="gpt-4o", api_key=os.environ["OPENAI_API_KEY"])
    chat = AzureChatCompletion(
        deployment_name="gpt-4o",
        endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
    )
    kernel.add_service(chat)

    history = ChatHistory()
    history.add_user_message("Hello, I'd like to check my account options.")

    # Ask the chat service to respond (argument names can vary slightly between SK versions)
    reply = await chat.get_chat_message_content(history, OpenAIChatPromptExecutionSettings())
    print(reply)


asyncio.run(main())

Notes

  • Keep secrets in env vars: AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY (or OPENAI_API_KEY).
  • For banking assistants, avoid echoing personal data back to the user.

2) Plugins and “auto function calling”

Plugins are simply collections of functions. A function can be:

  • Native: plain Python function (great for calling an API, reading a file, etc.)
  • Prompt: an LLM prompt wrapped as a callable function

With tool/function calling, the model can decide when to call your functions. “Auto function calling” lets the model pick and run the right function by itself.

Example: a tiny native plugin (Python) — banking tools

import asyncio
import os

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory
from semantic_kernel.functions import kernel_function

# A fake in-memory ledger just for demo purposes
_ACCOUNTS = {
    "ACC123": {"currency": "GBP", "balance": 1284.55, "owner": "J. Doe"},
    "ACC456": {"currency": "GBP", "balance": 72.10, "owner": "J. Doe"},
}
_TX = {
    "ACC123": [
        {"date": "2025-09-20", "desc": "Coffee", "amount": -3.4},
        {"date": "2025-09-19", "desc": "Salary", "amount": 2100.0},
    ],
    "ACC456": [
        {"date": "2025-09-21", "desc": "Transport", "amount": -6.2},
    ],
}


class BankingPlugin:
    @kernel_function(description="Get the available balance for an account id")
    def get_balance(self, account_id: str) -> str:
        if account_id not in _ACCOUNTS:
            return "Account not found"
        acct = _ACCOUNTS[account_id]
        return f"{acct['balance']} {acct['currency']}"

    @kernel_function(description="List the last N transactions for an account id")
    def list_transactions(self, account_id: str, n: int = 5) -> str:
        items = _TX.get(account_id, [])[:n]
        if not items:
            return "No transactions found"
        lines = [f"{t['date']} | {t['desc']} | {t['amount']:+.2f}" for t in items]
        return "\n".join(lines)


async def main() -> None:
    kernel = Kernel()
    chat = AzureChatCompletion(
        deployment_name="gpt-4o",
        endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
    )
    kernel.add_service(chat)

    # Register our banking plugin under a friendly name
    kernel.add_plugin(BankingPlugin(), plugin_name="bank")

    history = ChatHistory()
    history.add_system_message(
        "You are a helpful banking assistant. Use tools to answer precisely."
    )
    history.add_user_message(
        "What's the balance of account ACC123, and show my last transaction?"
    )

    # Enable automatic function calling so the model can call bank.get_balance and
    # bank.list_transactions. Recent SK Python versions use FunctionChoiceBehavior;
    # older releases used tool_choice="auto" on the execution settings.
    settings = OpenAIChatPromptExecutionSettings(
        function_choice_behavior=FunctionChoiceBehavior.Auto()
    )
    reply = await chat.get_chat_message_content(history, settings, kernel=kernel)
    print(reply)


asyncio.run(main())

What’s happening?

  • We declared two tool functions with @kernel_function.
  • We enabled automatic function calling (FunctionChoiceBehavior.Auto(); older SK releases used tool_choice="auto"), so the model can call them when it needs to.

3) Import an API using an OpenAPI spec

You don’t need to hand‑code every client. Given an OpenAPI (Swagger) document, SK can import it as a plugin so the model can call those endpoints.

Tips

  • Start by importing read‑only endpoints; then add writes if you trust the agent.
  • Describe the plugin purpose in your system message so the model knows when to use it.

Example: import an OpenAPI plugin (Python) — core banking API

import asyncio
import os

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory


async def main() -> None:
    kernel = Kernel()
    chat = AzureChatCompletion(
        deployment_name="gpt-4o",
        endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
    )
    kernel.add_service(chat)

    # Import the OpenAPI document as a plugin (can also point at a local YAML/JSON file).
    # Recent SK Python releases expose this as kernel.add_plugin_from_openapi; the exact
    # helper name can vary by version, and the URL below is a placeholder for your own spec.
    kernel.add_plugin_from_openapi(
        plugin_name="corebank",
        openapi_document_path="https://api.examplebank.com/openapi.yaml",
    )

    history = ChatHistory()
    history.add_system_message(
        "You can call 'corebank' to fetch balances and transactions. Avoid returning raw PII."
    )
    history.add_user_message("Show my latest 2 transactions for ACC123, please.")

    settings = OpenAIChatPromptExecutionSettings(
        function_choice_behavior=FunctionChoiceBehavior.Auto()
    )
    answer = await chat.get_chat_message_content(history, settings, kernel=kernel)
    print(answer)


asyncio.run(main())

Notes

  • Start with endpoints like GET /accounts/{id} and GET /accounts/{id}/transactions.
  • Add guardrails in your system message (e.g. don’t expose full PANs; redact PII).

4) A simple multi‑agent conversation (with Azure AI Foundry)

Often, you’ll want more than one agent. For example: a Researcher that gathers facts, and a Writer that turns those facts into a short post. Azure AI Foundry supplies the model and deployment; SK coordinates the chat.

This is deliberately simple: no memory store, no planner, just roles handing text back and forth. You can grow it by adding plugins (search, data), guardrails, or a router that decides who speaks next.

A simple multi‑agent banking assistant (Python)

Roles

  • Orchestrator: decides which specialist should act.
  • Teller: answers balance/transactions using the banking plugin.
  • Risk: flags unusual patterns and suggests limits.
  • Compliance: checks responses for sensitive data before sending to user.

import asyncio
import os
from typing import Optional

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory

# Reuse the BankingPlugin from the earlier example


async def main() -> None:
    kernel = Kernel()
    chat = AzureChatCompletion(
        deployment_name="gpt-4o",
        endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
    )
    kernel.add_service(chat)
    kernel.add_plugin(BankingPlugin(), plugin_name="bank")

    plain = OpenAIChatPromptExecutionSettings()
    auto = OpenAIChatPromptExecutionSettings(
        function_choice_behavior=FunctionChoiceBehavior.Auto()
    )

    # Define specialist system prompts
    teller_sys = "You are a bank teller. Answer with concise facts. Use tools to get balances and transactions."
    risk_sys = "You are a risk analyst. Identify anomalies or affordability risks. Keep it factual."
    comp_sys = "You are compliance. Redact PII (card numbers, full addresses). Ensure tone is professional and warm."
    orch_sys = (
        "You decide which specialist acts next based on the user request and conversation state. "
        "Return one of: TELLER, RISK, COMPLIANCE."
    )

    teller = ChatHistory(); teller.add_system_message(teller_sys)
    risk = ChatHistory(); risk.add_system_message(risk_sys)
    comp = ChatHistory(); comp.add_system_message(comp_sys)
    orch = ChatHistory(); orch.add_system_message(orch_sys)

    user_query = "Could you show the last two transactions for ACC123 and check if there's anything unusual?"
    orch.add_user_message(f"User asked: {user_query}\nConversation just started.")

    # Initialise so no branch ever references an unset value
    teller_answer = ""
    risk_answer = ""
    final_answer: Optional[str] = None

    for turn in range(4):
        step = await chat.get_chat_message_content(orch, plain)
        decision = (step.content or "").strip().upper()
        if "TELLER" in decision:
            teller.add_user_message(user_query)
            teller_answer = await chat.get_chat_message_content(teller, auto, kernel=kernel)
            orch.add_user_message(f"Teller replied: {teller_answer}")
        elif "RISK" in decision:
            risk.add_user_message(f"Data to assess: {teller_answer}")
            risk_answer = await chat.get_chat_message_content(risk, plain)
            orch.add_user_message(f"Risk replied: {risk_answer}")
        elif "COMPLIANCE" in decision:
            comp.add_user_message(
                f"Draft to review and redact if needed.\n{teller_answer}\n{risk_answer}"
            )
            comp_answer = await chat.get_chat_message_content(comp, plain)
            final_answer = comp_answer.content
            break
        else:
            # Fallback to the teller if the routing answer is unclear
            teller.add_user_message(user_query)
            teller_answer = await chat.get_chat_message_content(teller, auto, kernel=kernel)
            orch.add_user_message(f"Teller (fallback) replied: {teller_answer}")

    print("\nFinal (to user):\n", final_answer or teller_answer)


asyncio.run(main())

This pattern is pragmatic and extendable. Add routing rules, memory, or an approval step before the final message is sent.
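
For instance, an approval step can be a small gate in front of the final message. The sketch below is illustrative only: requires_approval and queue_for_review are hypothetical stand-ins for whatever rules and review process you use.

def queue_for_review(draft: str) -> None:
    """Stand-in for your review process (ticket, queue message, dashboard, ...)."""
    print("Queued for human review:", draft[:80])

def requires_approval(draft: str) -> bool:
    """Hypothetical rule: flag drafts that mention money movement or limit changes."""
    sensitive = ("transfer", "payment", "limit increase")
    return any(word in draft.lower() for word in sensitive)

def finalise(draft: str) -> str:
    """Gate the assistant's final message behind an approval check."""
    if requires_approval(draft):
        queue_for_review(draft)
        return "A colleague will confirm this action with you shortly."
    return draft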

Agent orchestration: who talks to whom?

  1. User → Orchestrator: request (last 2 transactions + unusual activity check)
  2. Orchestrator → Teller: route request
  3. Teller → BankingPlugin: get_balance(ACC123), list_transactions(ACC123, 2)
  4. BankingPlugin → Teller: balances + transaction list
  5. Teller → Orchestrator: draft answer with facts
  6. Orchestrator → Risk: "Assess anomalies"
  7. Risk → Orchestrator: risk notes
  8. Orchestrator → Compliance: "Review and redact PII"
  9. Compliance → Orchestrator: final compliant answer
  10. Orchestrator → User: reply

Agent orchestration patterns (quick primer)

These are the patterns you’ll see in practice; pick the simplest one that solves your need:

  • Router (dispatcher): one orchestrator routes turns to the right specialist. Great default for banking assistants.
  • Supervisor–worker (hub and spoke): a manager assigns tasks to workers and reviews outputs.
  • Plan and execute: a planner drafts steps; an executor runs them (often calling tools) and reports back (see the sketch below)
  • Critic/editor (debate/reflect): a “writer” drafts, a “critic” reviews, possibly a judge picks a final.
  • Blackboard (shared memory): agents read/write to a common scratchpad and act when relevant facts appear.
  • Graph/DAG workflow: deterministic nodes with guards and retries (nice for approvals and audits).
A typical layout: a router/orchestrator in front of the Teller, Risk and Compliance specialists, with an optional supervisor and critic review step, all reading and writing shared memory/state.
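
As a concrete example of the plan-and-execute pattern, here is a minimal, framework-agnostic sketch of the executor half; the plan itself would come from a planner prompt that returns JSON, and the tool names are hypothetical.

from typing import Callable

def execute_plan(
    plan: list[dict],                      # e.g. [{"tool": "bank.get_balance", "args": {"account_id": "ACC123"}}]
    tools: dict[str, Callable[..., str]],  # tool name -> callable
) -> list[str]:
    """Run each planned step in order and collect results for a final summary prompt."""
    results = []
    for step in plan:
        tool = tools.get(step["tool"])
        if tool is None:
            results.append(f"Unknown tool: {step['tool']}")
            continue
        results.append(tool(**step.get("args", {})))
    return results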

Banking safety tips

  • Treat all account identifiers as sensitive; avoid printing full names or IDs.
  • Keep tools read‑only at first. Introduce money‑moving endpoints only with strong safeguards.
  • Log tool calls and redact logs.

Practical notes

  • Keep prompts short and specific. State the role and how the agent should answer.
  • Prefer native plugins for anything that touches your systems (APIs, files, DBs).
  • Start with read‑only permissions. Add writes only after you trust behaviour.
  • Log tool calls and responses for debugging and safety.

Using SK with MCP and Azure services

Model Context Protocol (MCP) defines a standard way for capabilities (tools, resources, and prompts) to be discovered and called by models/agents. You can expose your internal systems as MCP servers and let SK (or another orchestrator) call them via a thin client plugin.

How it fits

  • MCP server: wraps a capability (e.g. Accounts API, Payments API)
  • SK plugin: a client that forwards tool calls to the MCP server
  • Agent: decides when to call which MCP tool via SK’s auto function calling

Azure examples

  • Azure Key Vault: credentials/secret access for downstream tools
  • Azure Functions/APIM: host MCP servers or REST endpoints for bank services
  • Azure Cognitive Search: document and product catalogue search as a tool
  • Azure Storage/Table/SQL: state and logs for conversations and tool outputs
  • Azure Event Grid/Service Bus: async workflows (e.g. payment approvals)

Sketch (Python)

# Pseudo-client that forwards calls to an MCP server
from semantic_kernel.functions import kernel_function

class MCPClientPlugin:
    @kernel_function(description="Run an MCP tool by name with JSON args")
    def call(self, tool: str, args_json: str) -> str:
        # Send to the MCP server (over stdio/websocket/http, depending on your setup)
        # and return the JSON result string for the model
        ...

# Register on a kernel set up as in the earlier examples
kernel.add_plugin(MCPClientPlugin(), plugin_name="mcp")
# Now the agent can do: mcp.call(tool="accounts.get_balance", args_json="{...}")

Tip: use Azure API Management to front internal services with policies (auth, quotas, masking) before exposing them to agents.


Where RAG fits (and alternatives)

Retrieval‑Augmented Generation (RAG) fetches relevant context from a store (often a vector index) and feeds it to the model. With SK you can:

  • Add a retrieval tool that queries Azure Cognitive Search or a vector DB (see the sketch after this list)
  • Keep answers grounded: cite sources in tool outputs and prompts
  • Trim history: rely on retrieval instead of long chat context
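
A retrieval tool is just another plugin. The sketch below assumes a hypothetical search_client (for example an Azure AI Search client) whose search method returns documents with content and source fields.

from semantic_kernel.functions import kernel_function

class RetrievalPlugin:
    def __init__(self, search_client):
        self._client = search_client  # hypothetical client (e.g. Azure AI Search)

    @kernel_function(description="Search policy and product documents for a query")
    def search_docs(self, query: str, top: int = 3) -> str:
        """Return the top matching chunks with their sources so answers can cite them."""
        results = self._client.search(query, top=top)  # assumed client API
        return "\n\n".join(f"[{r['source']}] {r['content']}" for r in results)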

Alternatives/complements

  • Function/tool calling only: if data is in structured systems, skip vectors and query APIs directly
  • Fine‑tuning: train a smaller domain model for style/format; still combine with tools
  • Structured pipelines: use LangGraph/CrewAI for complex branching; call SK tools from nodes

When to prefer RAG

  • You have lots of unstructured text (policies, product sheets)
  • You need citations and up‑to‑date facts
  • You want to minimise hallucinations without heavy fine‑tuning

Visual comparison: RAG vs fine‑tuning vs tools

RAG

User question → Retriever → Context chunks → Generator → Answer + citations

Fine‑tuning

Domain data → Fine-tune base model → Specialised model, then: User question → Answer in trained style

Tool‑calling only (no vector store)

User question → Orchestrator → Tool calls (API, DB query) → Grounded answer

Evaluation and benchmarking

Why: agent systems evolve. You need repeatable checks for accuracy, safety, latency and cost.

What to measure

  • Task success rate: does the agent produce the expected structured output?
  • Groundedness: are claims supported by retrieved or tool data?
  • Safety/PII: no leakage of sensitive fields
  • Latency and cost: per turn and end‑to‑end

Useful tools

  • OpenAI Evals / custom judge prompts for pairwise comparisons
  • Ragas / Ragas‑like metrics for RAG (context precision/recall, faithfulness)
  • DeepEval, Promptfoo: define tests as YAML and run in CI
  • Azure AI Foundry evals: managed runs and dashboards
  • Tracing: OpenTelemetry, Arize Phoenix, Langfuse for spans and prompt/tool logs

How to run it

  1. Define tasks and gold outputs (or a judge prompt) for your banking flows
  2. Build a skinny harness that calls your SK agents with fixed seeds and inputs
  3. Record tool calls, deltas to gold, and judge scores
  4. Fail the build on regressions (accuracy drops, safety violations)
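
A minimal harness along those lines might look like this sketch, where run_agent is a placeholder for whatever invokes your SK agent and the golden cases are made up.

import asyncio

# Hypothetical golden set: input -> expected substring in the answer
GOLDEN = [
    {"input": "Balance for ACC123?", "expect": "1284.55"},
    {"input": "Last transaction for ACC456?", "expect": "Transport"},
]

async def evaluate(run_agent) -> float:
    """Run each golden case and return the pass rate."""
    passed = 0
    for case in GOLDEN:
        answer = await run_agent(case["input"])  # your SK agent entry point
        passed += case["expect"] in answer
    return passed / len(GOLDEN)

# Example gate in CI:
# score = asyncio.run(evaluate(run_agent))
# assert score >= 0.95, f"Eval regression: {score:.0%}"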

Where it sits in the workflow

  • Dev: unit tests for tools and prompts; focused golden sets
  • Pre‑prod: sandbox end‑to‑end evals with synthetic and real‑like data
  • Prod: shadow evaluation and tracing, weekly scorecards

Machine Learning workflow: 7 stages and 6 practical steps

Seven stages (simple view)

  1. Problem definition — what’s the outcome and constraints?
  2. Data collection — sources, access, consent
  3. Data preparation — cleaning, joins, labelling
  4. Data visualisation — explore patterns and leakage risks
  5. ML modelling — pick a baseline and iterate
  6. Feature engineering — derive signals from raw data
  7. Model deployment — ship, observe, and improve

Six practical steps we’ll actually run

  1. Problem definition — turn the business question into an ML/agent task and success metric
  2. Data — list what you have (structured/unstructured; batch/stream); map to the task
  3. Evaluation — choose a metric and sign‑off threshold (e.g. 95% exact match on statements)
  4. Features — decide which fields matter; add derived features carefully
  5. Modelling — pick a model or agent pattern; compare baselines vs tool‑augmented agents
  6. Experimentation — try variants, measure, and feed results back into the loop

How agents and SK fit

  • SK handles orchestration, tool calling, and structured outputs.
  • Retrieval (RAG) supplies fresh facts; tools fetch system truth; outputs are validated.
  • You still need a proper evaluation loop and deployment hygiene around the agent.

Azure mapping (one way to wire it)

  • Problem definition: Azure Boards or design docs
  • Data collection: Data Factory, Event Hubs, ADLS Gen2 or Blob Storage
  • Data preparation: Azure Databricks, Azure Synapse, AML Pipelines
  • Data visualisation: Power BI, Databricks
  • Feature engineering: Feature Store (AML or Databricks)
  • ML modelling: Azure Machine Learning, Azure AI Foundry model deployments
  • Model deployment: Container Apps or AKS, API Management, Key Vault, Monitor/Insights
  • Orchestration and evaluation: Semantic Kernel, plugins/OpenAPI, Cognitive Search, Azure AI Foundry (prompt flow / evaluation)

Guardrails, validation, and traceability

Guardrails

  • System prompts that forbid PII echoing, risky actions and unapproved tools
  • Tool scopes: start read‑only; require approvals for payments or data export
  • Rate limits/quotas: per user, per agent, per tool
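
A per-user, per-tool quota can be as simple as a counter over a rolling window; the limits below are made up for illustration.

import time
from collections import defaultdict

# Hypothetical quota: max calls per (user, tool) within a rolling window
LIMIT = 20
WINDOW_SECONDS = 60
_calls: dict[tuple[str, str], list[float]] = defaultdict(list)

def allow_call(user_id: str, tool: str) -> bool:
    """Return True if this user may call this tool now, False once the quota is used up."""
    now = time.time()
    recent = [t for t in _calls[(user_id, tool)] if now - t < WINDOW_SECONDS]
    _calls[(user_id, tool)] = recent
    if len(recent) >= LIMIT:
        return False
    recent.append(now)
    return True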

Validation

  • Pydantic/typed schemas for tool inputs and model outputs
  • Redaction filters for logs and responses (e.g. mask IBAN/PAN patterns; see the sketch after this list)
  • Policy checks: allow/deny lists by user role and time of day
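
For the redaction point, a simple filter can mask card-number-like and IBAN-like strings before anything is logged or returned; the patterns below are illustrative, not exhaustive.

import re

# Illustrative patterns only: 13-19 digit card numbers (PANs) and IBAN-like strings
_PAN = re.compile(r"\b\d{13,19}\b")
_IBAN = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{10,30}\b")

def redact(text: str) -> str:
    """Mask likely PANs and IBANs before logging or sending a response."""
    text = _PAN.sub("[REDACTED-PAN]", text)
    return _IBAN.sub("[REDACTED-IBAN]", text)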

Traceability

  • Correlate every turn with a trace id; include model, temperature, tool list (see the sketch after this list)
  • Store prompts, tool I/O, and decisions; keep only what you must, hashed where needed
  • Add human‑in‑the‑loop approvals for sensitive actions (limit increases, transfers)
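
Correlating turns is mostly a matter of attaching one id to everything; here is a minimal sketch, with arbitrary field names.

import json
import logging
import uuid

logger = logging.getLogger("agent")

def log_turn(model: str, temperature: float, tools: list[str], prompt: str, trace_id: str | None = None) -> str:
    """Log one turn under a correlating trace id and return the id for downstream calls."""
    trace_id = trace_id or str(uuid.uuid4())
    logger.info(json.dumps({
        "trace_id": trace_id,
        "model": model,
        "temperature": temperature,
        "tools": tools,
        "prompt_chars": len(prompt),  # log sizes rather than raw content where PII is a concern
    }))
    return trace_id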

Best practices, challenges, and tips

Best practices

  • Keep prompts short; prefer explicit formats and JSON outputs
  • Treat tools like APIs: version, monitor, and test them
  • Separate orchestration from business logic; keep plugins focused
  • Log everything important; sample if needed to control cost

Common challenges

  • Hallucinations when no tool fits: return “I don’t know” with a helpful next step
  • Context bloat: use retrieval and summaries; trim aggressively
  • Flaky tool calling: add retries with jitter and idempotency keys (see the sketch after this list)
  • Compliance: design for redaction and approvals from the start
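
For the flaky-tool-calling point, a small retry helper with jitter is usually enough; this is a generic sketch, not an SK API, and it assumes the downstream API accepts an idempotency_key argument.

import random
import time
import uuid

def call_with_retries(fn, *args, attempts: int = 3, base_delay: float = 0.5, **kwargs):
    """Retry a tool call with exponential backoff and jitter, reusing one idempotency key."""
    kwargs.setdefault("idempotency_key", str(uuid.uuid4()))  # assumed to be accepted by the API
    for attempt in range(attempts):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))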

Tips

  • Start with one or two high‑value tools and one agent role; grow from there
  • Use staging sandboxes and synthetic accounts to test safely
  • Prefer Azure services you already trust for identity, secrets, and networking

Keep exploring

Azure AI Foundry vs Microsoft Agent Framework (practical comparison)

This is not a “winner” table. It’s a choice of where you want the complexity to live.

Area | Azure AI Foundry | Microsoft Agent Framework
Primary job | Platform for deploying/operating AI apps and agents | SDK/patterns for building agent apps in code
Where your agent logic lives | Often split: some logic in code, some in managed services/config | Mostly in your repo (code-first), with your own tests and CI
Model access | Managed model deployments and governance (platform-led) | You bring the model client and credentials (code-led)
Tools/integrations | First-class platform connectors + managed patterns | You implement tools as code, wrap APIs, and choose your own integrations
Evaluation | Built for eval runs, dashboards, and safety checks | Usually you wire your own eval harness (or use platform tools)
Observability | Platform-level tracing/monitoring integrations | You choose tracing/logging (OpenTelemetry, vendor tools, etc.)
Best when | You want a managed path from prototype to operated agent | You want maximum control, portability, and code ownership
Trade-offs | You adopt platform conventions and services | You own more operational wiring unless you add a platform layer

In many real systems:

  • Foundry is the operational layer (models, evaluation, monitoring)
  • Agent Framework and/or Semantic Kernel are the orchestration layer (tools, routing, multi-agent logic)
