
Note: This post was updated to include the Microsoft Agent Framework and how it relates to Semantic Kernel and Azure AI Foundry.
If you’re building apps with large language models, you’ll quickly need a way to organise prompts, call tools, track state, and work with more than one “agent”. Semantic Kernel (SK) is a practical SDK from Microsoft that helps you do exactly that.
Below is a quick, hands-on guide.
If you just want to call a single model with a fixed prompt, you don’t need a framework. But as soon as you add tools, multiple roles, state, and safety checks, an orchestration layer helps.
Why Semantic Kernel
Other options (at a glance)
There are two separate questions people often mix together: how you write the agent logic, and where the models, runtime, and operational tooling live.
Semantic Kernel mostly answers the first question.
Azure AI Foundry is closer to the second question: models, agent runtimes/services, evaluation, and operational tooling.
The Microsoft Agent Framework sits in the same "agent logic" space as Semantic Kernel, but it focuses on providing a consistent way to build agent applications, patterns, and samples. In practice, teams often mix these.
High-level map:
- Semantic Kernel: an SDK you use inside your app to orchestrate models, tools, and state
- Microsoft Agent Framework: opinionated agent application patterns and samples
- Azure AI Foundry: the platform side, including models, deployments, evaluation, and operational tooling
If you learned SK earlier and haven’t looked at it in a while, a few things are worth knowing:
The simplest way to think about it: Semantic Kernel is an SDK you use inside your app to connect models to tools and orchestrate work. The Microsoft Agent Framework is a more opinionated set of agent application patterns and samples.
Microsoft’s own SK roadmap also signals a broader direction: integrations across agent runtimes and frameworks, and smoother interoperability (for example with services such as Azure AI Foundry).
https://devblogs.microsoft.com/semantic-kernel/semantic-kernel-roadmap-h1-2025-accelerating-agents-processes-and-integration/
Technical angle (ML/AI/data)
At heart, SK gives you:
- a Kernel that registers model services and plugins
- tool/function calling, including automatic invocation
- chat history and state that stay under your control
- building blocks for multi-agent orchestration
Think of the Kernel as your app’s brain. You register a chat model, add a few useful functions (e.g. search, maths, time), and then ask the model to solve tasks using those functions.
```python
import asyncio
import os

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    OpenAIChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory

kernel = Kernel()

# Pick one provider
# kernel.add_service(OpenAIChatCompletion(ai_model_id="gpt-4o", api_key=os.environ["OPENAI_API_KEY"]))
kernel.add_service(
    AzureChatCompletion(
        deployment_name="gpt-4o",
        endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
    )
)


async def main() -> None:
    history = ChatHistory()
    history.add_user_message("Hello, I'd like to check my account options.")

    # Ask the registered chat service to respond
    chat = kernel.get_service(type=AzureChatCompletion)  # or OpenAIChatCompletion
    reply = await chat.get_chat_message_content(history, OpenAIChatPromptExecutionSettings())
    print(reply)


asyncio.run(main())
```
Notes
- Set AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY (or OPENAI_API_KEY) in your environment before running.

Plugins are simply collections of functions. A function can be a native Python method decorated with @kernel_function, or a prompt-based function whose template is the implementation.
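For illustration, here is a minimal prompt-based function registered on the kernel from the first example. The writing/summarise names and the template are invented for this sketch:

```python
from semantic_kernel.functions import KernelArguments

# A prompt-based function: the template itself is the implementation
summarise = kernel.add_function(
    plugin_name="writing",
    function_name="summarise",
    prompt="Summarise in one sentence: {{$input}}",
)

# Run inside an async function, as in the first example
result = await kernel.invoke(summarise, KernelArguments(input="...a long customer email..."))
print(result)
```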
With tool/function calling, the model can decide when to call your functions. “Auto function calling” lets the model pick and run the right function by itself.
```python
import os

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory
from semantic_kernel.functions import kernel_function

# A fake in-memory ledger just for demo purposes
_ACCOUNTS = {
    "ACC123": {"currency": "GBP", "balance": 1284.55, "owner": "J. Doe"},
    "ACC456": {"currency": "GBP", "balance": 72.10, "owner": "J. Doe"},
}
_TX = {
    "ACC123": [
        {"date": "2025-09-20", "desc": "Coffee", "amount": -3.4},
        {"date": "2025-09-19", "desc": "Salary", "amount": 2100.0},
    ],
    "ACC456": [
        {"date": "2025-09-21", "desc": "Transport", "amount": -6.2},
    ],
}


class BankingPlugin:
    @kernel_function(description="Get the available balance for an account id")
    def get_balance(self, account_id: str) -> str:
        if account_id not in _ACCOUNTS:
            return "Account not found"
        acct = _ACCOUNTS[account_id]
        return f"{acct['balance']} {acct['currency']}"

    @kernel_function(description="List the last N transactions for an account id")
    def list_transactions(self, account_id: str, n: int = 5) -> str:
        items = _TX.get(account_id, [])[:n]
        if not items:
            return "No transactions found"
        lines = [f"{t['date']} | {t['desc']} | {t['amount']:+.2f}" for t in items]
        return "\n".join(lines)


kernel = Kernel()
kernel.add_service(
    AzureChatCompletion(
        deployment_name="gpt-4o",
        endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
    )
)

# Register our banking plugin under a friendly name
kernel.add_plugin(BankingPlugin(), plugin_name="bank")

history = ChatHistory()
history.add_system_message("You are a helpful banking assistant. Use tools to answer precisely.")
history.add_user_message("What's the balance of account ACC123, and show my last transaction?")

# Enable auto function calling so the model can call bank.get_balance and bank.list_transactions
settings = OpenAIChatPromptExecutionSettings(function_choice_behavior=FunctionChoiceBehavior.Auto())

chat = kernel.get_service(type=AzureChatCompletion)
# Run inside an async function, as in the first example
reply = await chat.get_chat_message_content(history, settings, kernel=kernel)
print(reply)
```
What’s happening?
- The banking functions are decorated with @kernel_function, so SK can describe them to the model.
- Function calling is set to automatic (FunctionChoiceBehavior.Auto()), so the model can call them when it needs to.

You don't need to hand-code every client. Given an OpenAPI (Swagger) document, SK can import it as a plugin so the model can call those endpoints.
Tips
```python
import os

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory

kernel = Kernel()
kernel.add_service(
    AzureChatCompletion(
        deployment_name="gpt-4o",
        endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
    )
)

# SK can import an OpenAPI document as a plugin, from a URL or a local YAML/JSON file path
kernel.add_plugin_from_openapi(
    plugin_name="corebank",
    openapi_document_path="https://api.examplebank.com/openapi.yaml",
)

history = ChatHistory()
history.add_system_message("You can call 'corebank' to fetch balances and transactions. Avoid returning raw PII.")
history.add_user_message("Show my latest 2 transactions for ACC123, please.")

settings = OpenAIChatPromptExecutionSettings(function_choice_behavior=FunctionChoiceBehavior.Auto())

chat = kernel.get_service(type=AzureChatCompletion)
# Run inside an async function, as in the first example
answer = await chat.get_chat_message_content(history, settings, kernel=kernel)
print(answer)
```
Notes
- This assumes the OpenAPI document exposes operations such as GET /accounts/{id} and GET /accounts/{id}/transactions.

Often, you'll want more than one agent. For example: a Researcher that gathers facts, and a Writer that turns those facts into a short post. Azure AI Foundry supplies the model and deployment; SK coordinates the chat.
This is deliberately simple: no memory store, no planner, just roles handing text back and forth. You can grow it by adding plugins (search, data), guardrails, or a router that decides who speaks next.
Roles
- Teller: fetches facts with tools (balances, transactions)
- Risk: flags anomalies or affordability concerns
- Compliance: redacts PII and checks tone
- Orchestrator: decides which specialist acts next
```python
import os
from typing import Optional

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory

# Reuse the BankingPlugin defined in the earlier example

kernel = Kernel()
kernel.add_service(
    AzureChatCompletion(
        deployment_name="gpt-4o",
        endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
    )
)
kernel.add_plugin(BankingPlugin(), plugin_name="bank")

chat = kernel.get_service(type=AzureChatCompletion)
auto = OpenAIChatPromptExecutionSettings(function_choice_behavior=FunctionChoiceBehavior.Auto())
plain = OpenAIChatPromptExecutionSettings()  # no tools for the purely conversational roles

# Define specialist system prompts
teller_sys = "You are a bank teller. Answer with concise facts. Use tools to get balances and transactions."
risk_sys = "You are a risk analyst. Identify anomalies or affordability risks. Keep it factual."
comp_sys = "You are compliance. Redact PII (card numbers, full addresses). Ensure tone is professional and warm."
orch_sys = "You decide which specialist acts next based on the user request and conversation state. Return one of: TELLER, RISK, COMPLIANCE."

teller = ChatHistory(); teller.add_system_message(teller_sys)
risk = ChatHistory(); risk.add_system_message(risk_sys)
comp = ChatHistory(); comp.add_system_message(comp_sys)
orch = ChatHistory(); orch.add_system_message(orch_sys)

user_query = "Could you show the last two transactions for ACC123 and check if there's anything unusual?"
orch.add_user_message(f"User asked: {user_query}\nConversation just started.")

final_answer: Optional[str] = None
teller_answer, risk_answer = None, ""  # initialised so any routing order is safe

# Run inside an async function, as in the first example
for turn in range(4):
    step = await chat.get_chat_message_content(orch, plain)
    decision = (step.content or "").strip().upper()
    if "TELLER" in decision:
        teller.add_user_message(user_query)
        teller_answer = await chat.get_chat_message_content(teller, auto, kernel=kernel)
        orch.add_user_message(f"Teller replied: {teller_answer}")
    elif "RISK" in decision:
        risk.add_user_message(f"Data to assess: {teller_answer}")
        risk_answer = await chat.get_chat_message_content(risk, plain)
        orch.add_user_message(f"Risk replied: {risk_answer}")
    elif "COMPLIANCE" in decision:
        comp.add_user_message(f"Draft to review and redact if needed.\n{teller_answer}\n{risk_answer}")
        comp_answer = await chat.get_chat_message_content(comp, plain)
        final_answer = comp_answer.content
        break
    else:
        # Fallback to teller if unsure
        teller.add_user_message(user_query)
        teller_answer = await chat.get_chat_message_content(teller, auto, kernel=kernel)
        orch.add_user_message(f"Teller (fallback) replied: {teller_answer}")

print("\nFinal (to user):\n", final_answer or teller_answer)
```
This pattern is pragmatic and extendable. Add routing rules, memory, or an approval step before the final message is sent.
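For example, an approval step can be a plain human-in-the-loop gate between the agents and the customer. The require_approval helper below is ordinary Python, not an SK API, and simply wraps the final print from the previous snippet:

```python
def require_approval(draft: str) -> str:
    """Ask a human operator to approve or amend the draft before it reaches the customer."""
    print("\n--- Draft awaiting approval ---\n", draft)
    if input("Send as-is? [y/N] ").strip().lower() == "y":
        return draft
    return input("Enter the amended reply: ")


# Gate the multi-agent output from the previous example
print("\nFinal (to user):\n", require_approval(str(final_answer or teller_answer)))
```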
You'll see several recurring patterns in practice. Pick the simplest one that solves your need.
Banking safety tips
Model Context Protocol (MCP) defines a standard way for tools (“resources” and “capabilities”) to be discovered and called by models/agents. You can expose your internal systems as MCP servers and let SK (or another orchestrator) call them via a thin client plugin.
How it fits
Azure examples
Sketch (Python)
```python
from semantic_kernel.functions import kernel_function


# Pseudo-client that forwards calls to an MCP server
class MCPClientPlugin:
    @kernel_function(description="Run an MCP tool by name with JSON args")
    def call(self, tool: str, args_json: str) -> str:
        # send to the MCP server (over stdio/websocket/http depending on your setup)
        # and return the JSON result string for the model
        ...


kernel.add_plugin(MCPClientPlugin(), plugin_name="mcp")
# Now the agent can do: mcp.call(tool="accounts.get_balance", args_json="{...}")
```
Tip: use Azure API Management to front internal services with policies (auth, quotas, masking) before exposing them to agents.
Retrieval‑Augmented Generation (RAG) fetches relevant context from a store (often a vector index) and feeds it to the model. With SK you can register an embedding service and a memory store, index your documents, and retrieve the top matches at question time to ground the answer; a minimal sketch follows.
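This sketch reuses the chat service from the earlier examples and SK's volatile in-memory store. The embedding deployment name, collection name, and policy snippets are assumptions for illustration, and newer SK releases are moving towards dedicated vector-store connectors:

```python
import os

from semantic_kernel.connectors.ai.open_ai import (
    AzureTextEmbedding,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory
from semantic_kernel.memory import SemanticTextMemory, VolatileMemoryStore

embeddings = AzureTextEmbedding(
    deployment_name="text-embedding-3-small",  # assumed deployment name
    endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
)
memory = SemanticTextMemory(storage=VolatileMemoryStore(), embeddings_generator=embeddings)

# Index a few policy snippets (in production this is your document pipeline)
await memory.save_information(collection="policies", text="Overdraft fees are 5 GBP per day, capped at 50 GBP per month.", id="p1")
await memory.save_information(collection="policies", text="International transfers take 1-3 business days.", id="p2")

# Retrieve the best matches and hand them to the model as grounding context
question = "How much are overdraft fees?"
results = await memory.search(collection="policies", query=question, limit=2)
context = "\n".join(r.text for r in results)

history = ChatHistory()
history.add_system_message(f"Answer using only this context:\n{context}")
history.add_user_message(question)
reply = await chat.get_chat_message_content(history, OpenAIChatPromptExecutionSettings())
print(reply)
```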
Alternatives/complements
When to prefer RAG
RAG
Fine‑tuning
Tool‑calling only (no vector store)
Why: agent systems evolve. You need repeatable checks for accuracy, safety, latency and cost.
What to measure
Useful tools
How to run it
1) Define tasks and gold outputs (or a judge prompt) for your banking flows
2) Build a skinny harness that calls your SK agents with fixed seeds and inputs
3) Record tool calls, deltas to gold, and judge scores
4) Fail the build on regressions (accuracy drops, safety violations)
A minimal harness sketch is shown below.
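The sketch reuses kernel, chat, and the auto settings from the function-calling example. The GOLD cases and the substring pass criterion are invented for illustration (they match the fake ledger); a judge prompt or richer scoring slots into the same place:

```python
import asyncio

# Hypothetical gold set for the banking flows
GOLD = [
    {"input": "What is the balance of account ACC123?", "must_contain": "1284.55"},
    {"input": "Show the last transaction for ACC456.", "must_contain": "Transport"},
]


async def run_case(case: dict) -> bool:
    history = ChatHistory()
    history.add_system_message("You are a helpful banking assistant. Use tools to answer precisely.")
    history.add_user_message(case["input"])
    reply = await chat.get_chat_message_content(history, auto, kernel=kernel)
    ok = case["must_contain"] in str(reply)
    print("PASS" if ok else "FAIL", "-", case["input"])
    return ok


async def main() -> None:
    results = [await run_case(c) for c in GOLD]  # sequential keeps logs readable
    if not all(results):
        raise SystemExit(1)  # fail the build on any regression


asyncio.run(main())
```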
Where it sits in the workflow
Seven stages (simple view)
Six practical steps we’ll actually run
How agents and SK fit
Guardrails
Validation
Traceability
Best practices
Common challenges
Tips
If you want a walkthrough with visuals, start here:
And this one focuses on differences in approaches:
This is not a “winner” table. It’s a choice of where you want the complexity to live.
| Area | Azure AI Foundry | Microsoft Agent Framework |
|---|---|---|
| Primary job | Platform for deploying/operating AI apps and agents | SDK/patterns for building agent apps in code |
| Where your agent logic lives | Often split: some logic in code, some in managed services/config | Mostly in your repo (code-first), with your own tests and CI |
| Model access | Managed model deployments and governance (platform-led) | You bring the model client and credentials (code-led) |
| Tools/integrations | First-class platform connectors + managed patterns | You implement tools as code, wrap APIs, and choose your own integrations |
| Evaluation | Built for eval runs, dashboards, and safety checks | Usually you wire your own eval harness (or use platform tools) |
| Observability | Platform-level tracing/monitoring integrations | You choose tracing/logging (OpenTelemetry, vendor tools, etc.) |
| Best when | You want a managed path from prototype to operated agent | You want maximum control, portability, and code ownership |
| Trade-offs | You adopt platform conventions and services | You own more operational wiring unless you add a platform layer |
In many real systems, the agent logic lives in code (Semantic Kernel or the Agent Framework) while Azure AI Foundry supplies the model deployments, evaluation runs, and monitoring around it.