localmTUTS
FollowFollowSubscribe
💻Video + Code Examples·6 mins

A2A with Claude Style Agents

Build an agent using no framework at all — a manual agentic loop with JSON-schema tool definitions, an explicit dispatch table, and conversation memory. Shows what every framework does under the hood. Powered by Kimi-K2-Thinking via Azure AI Foundry.

A2A with Claude Style Agents · 6 mins
Instructor:Welcome back to LocalM Tuts. I am Nilay Parikh. This is Lesson 13 of 16, A2A with Claude Agent SDK. Last time we built the task agent with OpenAI Agents SDK: tool use, A2A wrapping. If you are watching this as a standalone video. Find the complete course playlist linked below in the description. Also you can find example links via the GitHub repository and interactive page links in the description below. Clone the repo, open the lesson folder and just follow. Loan validation agent, built from scratch. No
Instructor:framework. Structured JSON-schema tool definitions, explicit tool-call dispatch, and a manual agentic loop using the AsyncAzureOpenAI with K2 Thinking. Multi-turn is a big unlock. Most A2A interactions so far have been single shot send query get response with the conversation memory. The agent remembers what we said before, enabling the follow-up questions and interactive workflows. Let's see this in action. Install the dependencies, configure the credentials, then start the server on port
Instructor:10,006. Watch the manual tool call loop iterate. You will see each tool call result in the console. Try sending a follow up question. Test the multi-turn memory. Pause the video and try it yourself. Lesson 13 practical implementation using Claude-style agent patterns. So by the way, I got a one typer there. Claude agent. Okay, it is wrong. I don't know how I missed that. It's basically a Claude-style agent pattern implementation. So what I did in this example is what I learned from Claude, what I read
Instructor:from Claude, covered Claude Code, their blogs, papers, how they are writing their own agents. And I really like the approach they are doing it. So what I did is I actually copied their approach into this example. Of course it is not as detailed as they do, but it will give you some idea. And I want to explore that how compliant it is with A2A. So I did this JSON-schema tool definition the way Claude does. The more importantly what I did is the system instruction because there are couple of. System prompt
Instructor:widely, then I was aware of what those big prompts look like, so I haven't copied it, but I basically used the style and then manual tool called loop with conversation memory. This is the one of the most powerful features that Claude, why Claude Code makes Claude broad. Conversation memory. This is very important and unlike other framework with which they explicitly manage, I am actually manages everything manually here. And this is what this particular example make a difference. You
Instructor:can definitely take this example and try to run it. I have just implemented one or 2 features means there are hundreds of features like that what they got. So I think let's not compare that. I have created their own work with something very similar and same. A2A to connect, and server to host gateway. So lets see lets run the server first, python server, and we are running it. Um, and then we just wrapped. python client. Let's see, server up and running. Yep, it's running, and then let's run the Python client.
Instructor:I think they got it there. Yeah. Sorry, Exit 13, isn't it? Yeah. Claude-style, that's it. Just generated that 5 and we got. the whole loop running. Solution sorted so. That's all probably. See you in the next part of the video. And all 6 frameworks are complete. You have agents running A2A with the same Microsoft Agent Framework, ADK, LangGraph, CrewAI, OpenAI, and
Instructor:Claude patterns. All discoverable, interoperable and Next up the most important the capstone example, which is a very real life example of a loan approval pipeline using 5 different agents orchestrating and having the human in loop. Thanks for watching this lesson on LocalM Tuts. Next is the lesson of the production grade loan approval system with 5 specialized agents and a human in loop 6th with a React dashboard with full observability. You can find the next video in the A2A Protocol course
Instructor:playlist in the description. I see you there.
Learning Objectives5
  • Define tools as JSON schemas with explicit parameter specifications
  • Build a dispatch table mapping tool names to handler functions
  • Implement the manual agentic loop: send → check tool_calls → execute → repeat
  • Manage conversation memory with explicit message accumulation
  • Wrap the manual agent as an A2A server with per-task state

Run the Lesson 13 Example

0/5

Clone the examples

git clone https://github.com/nilayparikh/tuts-agentic-ai-examples.git
cd tuts-agentic-ai-examples/a2a/lessons/13-claude-agent-sdk

Set up virtual environment

cd ../../
python -m venv .venv
.venv/Scripts/Activate.ps1
pip install -r requirements.txt

Configure credentials

# _examples/.env
AZURE_OPENAI_ENDPOINT, AZURE_AI_API_KEY, AZURE_AI_MODEL_DEPLOYMENT_NAME=Kimi-K2-Thinking

Start the A2A server

cd lessons/13-claude-agent-sdk/src
python server.py

Run the A2A client

cd lessons/13-claude-agent-sdk/src
python client.py
nilayparikh/tuts-agentic-ai-examples/tree/main/a2a/lessons/13-claude-agent-sdkGitHub

Complete source code for this lesson.

github.com/nilayparikh/tuts-agentic-ai-examples/tree/main/a2a/lessons/13-claude-agent-sdk
Q&A

Q & A

Q

Why call it 'Claude Style Agents' if the model is Kimi-K2-Thinking?

The lesson applies Anthropic-inspired agent-building patterns — explicit JSON-schema tools, manual dispatch, and conversation-level state. Using Kimi-K2-Thinking proves those agent patterns are model-agnostic.

Q

What is the safety limit (max iterations) for?

The for _ in range(6) loop prevents infinite loops. Most tasks resolve in 2-4 iterations. Production systems may use token budgets or time limits instead.

Q

Why maintain per-task agent instances in the executor?

Each OrchestratorAgent holds its own conversation history. Keying by task_id means multi-turn A2A conversations reuse the same history. New tasks get fresh instances.

Q

Is there a performance difference between the manual loop and frameworks?

Negligible. LLM inference dominates execution time. The manual loop adds microseconds of Python overhead vs seconds of model inference. Framework overhead is invisible relative to API latency.