💻Video + Code Examples·5 mins

The Parallel Agent Pattern — Deep Dive

Concurrent execution for independent subtasks using fan-out/fan-in architecture. Covers parallel dispatch, session state collection, and sequential aggregation.

Instructor: Three independent research tasks: run sequentially they take 12 seconds; run in parallel they might take 4. The parallel pattern trades strict ordering for concurrent speed, and you're about to see exactly how fan-out and fan-in work. There is also one constraint I'd like us to remember when processing in parallel. Let's understand it with a very simple example. You send a market research query that requires three analyses: cloud computing trends, competitor positioning, and developer sentiment.
Instructor: So that's three independent research tasks kicked off from one query. The ParallelAgent fires all three branches at once, without waiting for any of them. They all run concurrently, so the time that matters is the slowest branch, not the sum of all three. The Trends Analyst focuses on market trajectory: adoption rates, growth segments, emerging technologies, and any other trends. Whatever it finds goes into session state under an output key, let's say trends_output.
Instructor: At the exact same time, the Competitor Analyst is mapping the competitive landscape: market share, features, pricing. It doesn't need anything from the trends branch; it's completely independent. And the Sentiment Analyst is evaluating developer sentiment: community activity, satisfaction, pain points. All three branches and their prompts run at once. Now here is the fan-in. The SequentialAgent reads all three output keys from session state, which are trends_output, competitor_output, and sentiment_output, and synthesizes them into one report.
Instructor: The ParallelAgent gives you the speed; the SequentialAgent gives you the merge. The final report synthesizes all the tracks into one structure, and the total latency is the slowest branch plus the merge. With three branches, let's say each takes four seconds: with the merge, that's eight seconds instead of twelve if we process sequentially.
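
The fan-out/fan-in flow described above can be sketched with plain asyncio. This is a minimal illustration, not the ParallelAgent or SequentialAgent classes themselves: `run_branch`, `fan_out`, `synthesize`, and the sleep durations are hypothetical stand-ins, and a plain dict plays the role of session state. Only the three output key names come from the transcript.

```python
import asyncio

# Hypothetical stand-in for a sub-agent: sleeps to simulate work, then
# writes its result into shared state under its own output key.
async def run_branch(state: dict, output_key: str, seconds: float) -> None:
    await asyncio.sleep(seconds)
    state[output_key] = f"findings for {output_key}"

# Fan-out: start every branch at once and wait until all of them finish.
async def fan_out(state: dict) -> None:
    await asyncio.gather(
        run_branch(state, "trends_output", 0.03),
        run_branch(state, "competitor_output", 0.02),
        run_branch(state, "sentiment_output", 0.01),
    )

# Fan-in: reads the three output keys and merges them into one report.
def synthesize(state: dict) -> str:
    keys = ["trends_output", "competitor_output", "sentiment_output"]
    return "\n".join(f"{key}: {state[key]}" for key in keys)

async def pipeline() -> str:
    state: dict = {}
    await fan_out(state)      # parallel step
    return synthesize(state)  # sequential step, runs only after fan_out returns

report = asyncio.run(pipeline())
print(report)
```

Each branch writes to a different key, so the branches never overwrite one another, and the merge step can rely on all three keys being present because it runs after the fan-out has returned.
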
Instructor: It's very simple math: three branches at four seconds each is twelve seconds sequentially, while parallel is the slowest branch plus the merge, which is eight seconds. Now bump it to five branches, and the saving is even larger. So why parallel? Speed: your fan-out time is just the slowest branch, not the sum. Scalability: you can keep adding branches without adding their run times together. And predictable latency: the slowest branch plus the merge is the part that stays constant.
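
The arithmetic above can be written out as two small functions. This is a sketch under one assumption: the four-second merge is inferred from the eight-second total quoted in the transcript and is not stated there directly. The function names are illustrative.

```python
def sequential_branch_time(branch_seconds: list) -> float:
    # Branches run one after another, so their times add up.
    return float(sum(branch_seconds))

def parallel_latency(branch_seconds: list, merge_seconds: float) -> float:
    # Branches run concurrently, so only the slowest one counts, plus the merge.
    return float(max(branch_seconds)) + merge_seconds

MERGE_SECONDS = 4.0  # assumption: inferred from "eight seconds" in the transcript

three = [4.0, 4.0, 4.0]
five = [4.0] * 5

print(sequential_branch_time(three))           # 12.0
print(parallel_latency(three, MERGE_SECONDS))  # 8.0
print(sequential_branch_time(five))            # 20.0
print(parallel_latency(five, MERGE_SECONDS))   # 8.0
```

Note that the twelve-second figure counts only the three branches; a sequential run that also performs the same merge would take longer still, which only widens the gap.
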
Instructor: And more importantly, isolation: one branch failing does not take the other branches down. That's one of the core benefits of parallel. And here's the one hard design rule that parallel processing cannot break: all the branches must be independent. If Branch B depends in any way on Branch A, you don't have a parallel problem; you've got a sequential dependency. Beyond that, watch out for resource contention. Rate-limited APIs can also be difficult when a large number of branches fan out in quick succession.
Instructor: Merge complexity is another area to watch in parallel processing, especially when outputs conflict and you have to decide how the conflict resolution should take place. Debugging is also likely to be more complex than with sequential processing. But despite that, parallel has its own value and merit. So now let's see it in action in Visual Studio Code.
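
One common way to soften the rate-limit problem mentioned above is to cap how many branches call the API at the same moment. The sketch below uses `asyncio.Semaphore` for that; the limit of 2, the `call_api` coroutine, and the `stats` counters are hypothetical and exist only to make the cap observable.

```python
import asyncio

MAX_CONCURRENT_CALLS = 2  # hypothetical limit; use your provider's real quota

# Hypothetical stand-in for a branch that calls a rate-limited API.
async def call_api(name: str, limiter: asyncio.Semaphore, stats: dict) -> str:
    async with limiter:  # at most MAX_CONCURRENT_CALLS branches get past here
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # simulated request
        stats["active"] -= 1
    return f"{name}: done"

async def fan_out_with_limit(names: list):
    limiter = asyncio.Semaphore(MAX_CONCURRENT_CALLS)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(call_api(name, limiter, stats) for name in names)
    )
    return list(results), stats["peak"]

results, peak = asyncio.run(
    fan_out_with_limit([f"branch_{i}" for i in range(6)])
)
print(len(results), peak)  # 6 2
```

All six branches are still dispatched together, but only two are inside the API call at any time, so the fan-out trades some speed for staying under the limit.
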
Instructor: So I'm going to move to Visual Studio Code. I already have the server up and running, so all I'm going to do is type out the prompt for the client and fire the client. It says we've got an Ollama local model running, and we're continuing with our example package, the trip planner. We are trying to plan a trip with the concert, museum, and restaurant finders, and we're using the A2A protocol to connect with all three agents.
Instructor: There are three agents and an orchestrator running inside the server, and it's now synthesizing. So let's see. We need the San Francisco one, if I'm not wrong. Yes, San Francisco is here, and it built our itinerary based on what we asked for. We asked for restaurants, and it gives dinner and lunch; we asked for a museum and for a concert, and it did pretty well on those too. We also sent another query, so I'm going to the Tokyo one, and that gives me something extra.
Instructor: So it works pretty much as anticipated. That's the parallel pattern: independent branches fan in through the merger, and that lets you see why it can do more. Together with single and sequential, you now have three building blocks that cover most real-world agent systems. Before you ever need coordinators and loops, just one prototype can probably do more. In my experience, most real-world agentic AI systems fall within the realm of the single agent, the parallel agent, and the sequential agent.
Instructor: Together they provide a complete DAG architecture, and only very specialized use cases require coordinators and loops. The simpler the solution, the better for the manageability and overall performance of AI systems. Unless we need it, there is no point in taking on a complex AI architecture, which generally tends to be more difficult to maintain and manage for the desired results and outcomes in the future.
Instructor: That's my overall observation from the last couple of years of working with agentic AI. So thank you very much for tuning in, and I expect to see you in the next videos. Please make sure you subscribe and also save the playlist, because we have more agentic AI pattern videos coming up in this playlist. Thank you very much, and I'll see you in the next video.
Learning Objectives
  • Define the parallel agent pattern and explain how concurrent execution reduces latency
  • Describe the fan-out/fan-in architecture: parallel dispatch followed by sequential aggregation
  • Explain how sub-agents write to shared session state independently
  • Identify when parallel execution is the correct choice versus sequential or single agent
  • Evaluate trade-offs: latency versus cost, aggregation complexity, branch independence

Setup Instructions

Clone the repository

Clone the course repository to your local machine to follow along with the code examples.

bash
git clone https://github.com/nilayparikh/tuts-agentic-ai-examples.git
cd tuts-agentic-ai-examples/agents/mono/agent-design-patterns-1/03-parallel-agents

Create a virtual environment

Create an isolated Python environment for the project dependencies.

bash
python -m venv .venv
source .venv/bin/activate  # or .venv\Scripts\activate on Windows

Install dependencies

Install all required packages from the requirements file.

bash
pip install -r requirements.txt

Complete source code for this lesson:

https://github.com/nilayparikh/tuts-agentic-ai-examples/tree/main/agents/mono/agent-design-patterns-1/03-parallel-agents

Q & A

Q

Can parallel sub-agents communicate with each other during execution?

No. Sub-agents within a ParallelAgent run independently in their own execution branches. There is no automatic sharing of state or data during execution. If you need inter-branch communication, restructure: make the dependent part sequential, then parallelize the truly independent work.
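
The restructuring suggested in this answer can be sketched as two stages: the dependent step runs first, and only the independent work fans out afterwards. This is a plain asyncio illustration with hypothetical names (`resolve_scope`, `analyze`) and made-up return values; it does not use the ParallelAgent class.

```python
import asyncio

# Dependent step: both analyses need its result, so it must run first.
async def resolve_scope(query: str) -> str:
    await asyncio.sleep(0.01)  # simulated work
    return query.strip().lower()

# Independent step: each call needs only the scope, never another branch.
async def analyze(topic: str, scope: str) -> str:
    await asyncio.sleep(0.01)  # simulated work
    return f"{topic} analysis for {scope}"

async def pipeline(query: str) -> list:
    scope = await resolve_scope(query)  # sequential part
    results = await asyncio.gather(     # parallel part
        analyze("trends", scope),
        analyze("competitors", scope),
    )
    return list(results)

outputs = asyncio.run(pipeline("  Cloud Computing "))
print(outputs)
```

The branches share an input that was computed before the fan-out, which is different from exchanging data while they run, so they remain independent of one another.
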

Q

What happens if one parallel branch fails?

The other branches still complete successfully. However, the aggregator will receive incomplete data for the failed branch. You need explicit handling: skip the missing data with a disclaimer, retry the failed branch, or report partial results.
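
The "report partial results with a disclaimer" option can be sketched with `asyncio.gather(..., return_exceptions=True)`, which returns a failed branch's exception as a value instead of raising it. This is an illustration in plain asyncio, not the behavior of the ParallelAgent class; `run_branch`, `fan_out_tolerant`, `synthesize`, and the `plan` dict that forces one failure are hypothetical.

```python
import asyncio

# Hypothetical branch: raises when told to, to simulate a failure.
async def run_branch(output_key: str, fail: bool) -> str:
    await asyncio.sleep(0.01)
    if fail:
        raise RuntimeError(f"{output_key} branch failed")
    return f"findings for {output_key}"

async def fan_out_tolerant(plan: dict):
    keys = list(plan)
    # return_exceptions=True keeps one failure from cancelling the others.
    outcomes = await asyncio.gather(
        *(run_branch(key, plan[key]) for key in keys),
        return_exceptions=True,
    )
    state, failed = {}, []
    for key, outcome in zip(keys, outcomes):
        if isinstance(outcome, Exception):
            failed.append(key)    # remember which branch is missing
        else:
            state[key] = outcome  # keep successful results
    return state, failed

# Aggregator that reports partial results with an explicit disclaimer.
def synthesize(state: dict, failed: list) -> str:
    lines = [f"{key}: {value}" for key, value in state.items()]
    if failed:
        lines.append("Note: no data for " + ", ".join(failed))
    return "\n".join(lines)

plan = {"trends_output": False, "competitor_output": True, "sentiment_output": False}
state, failed = asyncio.run(fan_out_tolerant(plan))
report = synthesize(state, failed)
print(report)
```

Retrying the failed branch instead would mean calling `run_branch` again for each key in `failed` before synthesizing.
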

Q

Why wrap ParallelAgent inside a SequentialAgent?

To guarantee the aggregation step (synthesizer) runs only after all parallel branches have completed. Without the sequential wrapper, there is no guarantee that results are available when the synthesizer tries to read them. The wrapper enforces: parallel first, then aggregate.
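
The ordering problem this answer describes can be reproduced in plain asyncio. In the sketch below, reading state right after the branches are started finds nothing, while reading after waiting for them finds all three keys. The names are hypothetical and the code only mirrors the ordering guarantee; it is not how the SequentialAgent class is implemented.

```python
import asyncio

KEYS = ["trends_output", "competitor_output", "sentiment_output"]

# Hypothetical branch that writes its result into shared state.
async def run_branch(state: dict, key: str) -> None:
    await asyncio.sleep(0.01)
    state[key] = f"findings for {key}"

def available_keys(state: dict) -> list:
    return [key for key in KEYS if key in state]

async def demo():
    state: dict = {}
    tasks = [asyncio.create_task(run_branch(state, key)) for key in KEYS]
    too_early = available_keys(state)   # read before the branches have run
    await asyncio.gather(*tasks)        # what the sequential wrapper enforces
    after_wait = available_keys(state)  # read after every branch has finished
    return too_early, after_wait

too_early, after_wait = asyncio.run(demo())
print(too_early)   # []
print(after_wait)  # ['trends_output', 'competitor_output', 'sentiment_output']
```
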