💻Video + Code Examples·9 mins

Defining the Pipeline Genome

Nilay Parikh

Define the one mutable pipeline genome for CleanLoop. This lesson reconnects to the Lesson 01 contract, shows why one file and one fixed judge keep mutation auditable, and walks the runtime surface where deterministic cleanup hands off to the mutation playbook.

Transcript · 34 entries

Instructor:Most self-improving systems don't fail because the idea is wrong. They fail because they are mutating the wrong thing. If the mutation surface is too broad, too narrow, or too vague, the loop gets noisy very quickly: it loses clarity, and rollback turns into guesswork. So this lesson focuses on one thing: defining the genome. Once the mutation surface is clearly scoped, every decision that follows becomes easier to make and easier to reason about.

Instructor:So here is the first thing. At a high level, think of this course as a one-boundary system. Each lesson adds some mechanism, but the contract from the earlier lesson is always running underneath. So if you haven't watched Lesson 01, I recommend going through it as well, so that you have connected, in-depth context. If you already understand the concept, that's fine; we will have a short recap in the hands-on session as well.

Instructor:In Lesson 01 we established the core contract: the mutation surface, one fixture, the evidence trail, and an end-to-end run that shows what the overall flow looks like. The choice of framework sits at the orchestration seam, not on the correctness boundary. That's important. This lesson builds directly on top of that foundation. We are not restarting. We are picking up where we left off and extending the same mutation engine by defining the pipeline genome more precisely.

Instructor:So keep Lesson 01 in mind; we have a quick recap in the hands-on as well. Everything from the earlier lesson is still active, and this lesson only adds to it, because the structure is already in place. Here is the thing to remember: the full genome fits in a single file. That's what makes autonomous mutation actually reviewable instead of just looking impressive. However, I must tell you that in some real cases you might have multiple genome files in one scope.

Instructor:We do have such cases, but that depends on the architecture and is decided case by case. So don't be afraid if there are multiple genome files, and don't be afraid if there is a single-scope genome. Both work as long as the case justifies the call. A failure signal is only useful if it maps clearly to something fixable, and that's what this lesson is all about: how to identify what is fixable. We don't want to hand the LLM an infinite wish list to fix,

Instructor:because that will most likely lead to hallucination, incorrect output, and overloading LLM inference across the data. So we first need to understand what is fixable. A bounded genome turns a vague issue into concrete diffs that the loop can actually act upon, and that's what we saw in the Lesson 01 hands-on as well, in how it picks up each of these cases.

Instructor:Keeping the genome in one file isn't just machine-friendly, it's human-friendly as well. But as I said, there could be multiple files. One file is easier to inspect and easier to diff in git, because git sits underneath as part of our solution and part of the rollback strategy, and it's easier to reset. As I said, trust comes from tying every diff to a measurable outcome. That's where I think the majority of self-mutating agents fail. How do we measure a mutation?

Instructor:Is it just binary success or failure, or is it the quality of a success or failure? We'll cover that in a later lesson, but it's important to understand now, because it is one of the core principles behind the choices we make. The outcome banner and the trust signal make it possible to evaluate changes based on evidence, not intuition. So the core idea is simple. Failures land in a single file and the diffs stay small.

Instructor:The results are measurable, and that's what makes a genome safe to evolve. Now let's open the genome file in the hands-on lab. Make sure you have the GitHub repository checked out; the link is in the description or the comments, wherever you're watching this lesson. It helps to have everything ready before we start the hands-on lab. So that's it. Let's go to the tutorial code and work it out from there.
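
The question raised above, whether a mutation is measured as binary success or failure or by the quality of the result, can be made concrete with a small sketch. This is an illustration only: `Outcome`, `pass_rate`, and `improved` are hypothetical names, not CleanLoop's outcome banner or trust signal, and the numbers in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Result of one evaluation run (hypothetical shape, for illustration)."""
    passed: int  # checks that passed
    total: int   # checks that were run

    @property
    def success(self) -> bool:
        # Binary view: the run counts only if every check passed.
        return self.total > 0 and self.passed == self.total

    @property
    def pass_rate(self) -> float:
        # Graded view: how close the run came, even when it failed.
        return self.passed / self.total if self.total else 0.0


def improved(before: Outcome, after: Outcome) -> bool:
    """A mutation is evidence of progress if the graded score went up."""
    return after.pass_rate > before.pass_rate
```

With `before = Outcome(20, 25)` and `after = Outcome(23, 25)`, both runs are binary failures, yet `improved(before, after)` reports progress. A purely binary signal would hide that difference.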

Instructor:So we are in the tutorial code. Last time I suggested starting with the reading. This time, because Lesson 01 gave you a good understanding of what a genome looks like, you can start from Lesson 02 directly. Here is the preview of Lesson 02 that I have created, so it's easier for everyone to go through it quickly. And if you are coming straight from this video, you can go to Lesson 01 first for a quick look.

Instructor:Now, this was the whole example from Lesson 01, and this is where we left off. You can see the trigger and what it says about the failing signals. At that time I said to leave it there and that we would pick it up in the next chapter. This is the flow from the first lesson, and I'm sure you have finished those exercises as well, give or take some of the advanced ones, to make sure the learning has actually landed. Now let's go back to Lesson 02.

Instructor:Here you see the failing signals coming in through the focus lens, which I already discussed, so I won't spend time on it. This is the execution flow. You can open that file, which is under architecture. Let's load it; it covers the execution flow across our whole codebase. So this is the master execution flow. It shows what the full run looks like for the complete CleanLoop example.

Instructor:Lesson 01 has much the same execution flow as Lesson 02, so it will help you understand this one. The process starts on the input side, where the data is read by this particular function; you can go and look at the function as well. Then it normalizes the numeric amounts. This is where the data analysis happens, trying to understand what is fixable and what is not fixable, because we want to keep the mutation bounded, and that's

Instructor:not just about the limit of the physical file, but also about understanding what we are trying to solve with it. We are not trying to build a universal data engineer out of a harness or AI agents. We are looking at a specific use case where finance CSVs have format discrepancies, shuffled rows, and inconsistencies, and we are focusing on that problem. So boundedness, the bounded context or bounded mutation, is not just about bounding the physical limit of a file,

Instructor:of course that's part of it. But it's also about bounding it to the functionality and the problem we are trying to solve. That's a very important distinction, and everyone needs to understand it when they use the term bounded mutation. It is bounded in the physical sense, and it is bounded in the problem-and-solution sense. So we are looking at specific problems and solutions. Now, what about fixing a bug with a deterministic algorithm?

Instructor:Yes, where we can fix things deterministically, the pipeline goes and fixes them deterministically. But some areas are non-deterministic and very difficult to fix with if/else or complex conditional rules. To solve that problem we need to leave a little room in between. As I have said about Software 3.0, it is like a cavity wall: two layers of brick, and between them a hollow middle space where things can actually shuffle,

Instructor:and make the right decision about how to handle it. For example, if it is very hot outside, the insulating cavity ensures that heat does not pass directly to the other side. With a single brick wall, it would. In the same way, if there is a noisy party going on inside the house, the cavity also absorbs the sound waves as they reflect between the layers.

Instructor:Outside, people won't be able to hear much of it. That's the benefit: the cavity absorbs the mess in between. Data pipelines can use the same principle. Whenever data comes through, a limited amount of contract flexibility is available as that regulating space, and that regulating space, that hollow middle space, is filled with AI, where the AI can make those decisions,

Instructor:and that's where we choose the scope of the problem and of those mutations. The fixes are generated by the LLM itself based on the mutations we have identified, and then those mutations get fixed. Here, twenty-five mutations are fixed by the playbook, which was generated using LLMs, and the results are added to the master CSV, which is our output file. It looks pretty simple, or let's say pretty well structured.

Instructor:You can also see which mutations failed and generated signals, but I have already shared a better way to access that: the Streamlit dashboard, which we will open later. There is also some baseline code for the functions we will run, and I would suggest reading through it before we go further. I have left some code anchors here. You can follow them and see what the code behind each anchor looks like.

Instructor:Sorry, this one; I used the wrong shortcut. Anyway, here is clean_data_runtime. In this file you can find the learner-facing point, and this is where the deterministic loop starts. The normalization is where the mutation playbook comes in. However, normalization starts deterministically: it first tries to normalize using a deterministic approach. If the amount text does not normalize, it then tries to resolve a mutation rule.

Instructor:It then resolves against the mutation rules, and those rules are available in a data set, so we have already defined what the mutation rules are. That is what the boundary does. We provide the mutation rules along with the prompt and the skills, and of course you can add more as context if you want; nothing stops you from using other approaches. But I'm keeping it simple, using a mutation hint and an action.
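
The handoff described above, deterministic normalization first and a mutation rule only when the amount text does not normalize, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in clean_data_runtime.py: `MutationRule`, `normalize_deterministic`, `normalize_amount`, and the two example rules are hypothetical, and the only detail taken from the lesson is that a rule carries a hint and an action.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class MutationRule:
    """Hypothetical rule shape: a readable hint plus an action to apply."""
    hint: str                     # what kind of mess this rule addresses
    pattern: str                  # regex deciding whether the rule applies
    action: Callable[[str], str]  # rewrites the text into a parseable form


def normalize_deterministic(text: str) -> Optional[float]:
    """Rule-free parse: trim whitespace and drop thousands separators."""
    try:
        return float(text.strip().replace(",", ""))
    except ValueError:
        return None


def normalize_amount(text: str, playbook: list[MutationRule]) -> Optional[float]:
    # Step 1: deterministic cleanup handles everything it can.
    value = normalize_deterministic(text)
    if value is not None:
        return value
    # Step 2: only unresolved text is handed to the playbook rules.
    for rule in playbook:
        if re.search(rule.pattern, text):
            value = normalize_deterministic(rule.action(text))
            if value is not None:
                return value
    return None  # still unresolved: this is a failure signal, not a guess


# Illustrative rules only; the real playbook is generated, not hand-written.
PLAYBOOK = [
    MutationRule("currency symbol prefix", r"^\s*[$£€]",
                 lambda t: re.sub(r"[$£€]", "", t)),
    MutationRule("parenthesised negative", r"^\s*\(.*\)\s*$",
                 lambda t: "-" + t.strip().strip("()")),
]
```

Returning `None` for text that no rule resolves keeps the failure visible to the judge instead of masking it with a default value.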

Instructor:So that's how the mutation playbook gets generated. It is what we saw here in the execution flow: the mutation playbook, and then the mutations got fixed. So that's where we are. Look at the code, read it, and understand what is happening. The anchors follow an order; if you go in that order, the export writer explains how we generate the exported values as well. So let's go and check whether the status is valid.

Instructor:The first milestone is to check that everything is configured. Yes, my model is Phi-4, and there is a reason I'm using a lower-end LLM: this approach is very useful if you can host the LLM locally or cost-effectively. However, if you have a use case that needs a more advanced LLM, prefer whichever fits your case. You don't have to assume up front that it's better to go for small, medium, or large models.

Instructor:It's case by case. I use a small model because I'm not doing complex operations that require additional intelligence I would have to pay for. The same goes for my organization: ninety percent of our self-correcting loops and Software 3.0 workloads run on SLMs, and those SLMs are hosted by us on our own infrastructure in the cloud, which is the most cost-effective way to handle it. So let's see; it's working fine.

Instructor:Brilliant. Let me verify it, so we make sure that our LLM actually connects. Oops, sorry, wrong one. Let's see, everything is fine, so it's connecting. Now let's reset. The reset is done, so let's call evaluate. Last time we called loop, which runs everything end to end. Evaluate brings us to the point in the example that we just saw. We can see here what it did: it passed through the mutation planning and everything else.

Instructor:What it found matched what was expected, based on the twenty-five from the validation, and then it generated the genome file. One thing to note here: the genome actually runs dynamically, so the code is not saved. We are not doing that in this particular example. But you can always use worktrees. Worktrees are the better option for a production implementation, and we use them. The loop always creates a worktree,

Instructor:runs in the worktree, and rolls back the worktree whenever it needs to. Don't go with branches, otherwise you will inflate the git history a lot. Worktrees are better: the mutation runs locally and then the worktree is simply destroyed. If we need to check the change in, we bring it in and keep the mutation alive. Or we can adopt that mutation into the permanent starting point, as a starter genome. That is called habitual learning.
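
A throwaway worktree of the kind described above can be sketched with two standard git commands, `git worktree add --detach` and `git worktree remove --force`. This is a sketch under assumptions, not the production setup mentioned in the recording: the helper names and the `trial` callback are hypothetical, and `--detach` is used here so that no new branch is created for the trial.

```python
import subprocess
from pathlib import Path
from typing import Callable


def worktree_add_cmd(repo: Path, workdir: Path, ref: str = "HEAD") -> list[str]:
    # --detach checks out `ref` without creating a branch for the trial.
    return ["git", "-C", str(repo), "worktree", "add", "--detach", str(workdir), ref]


def worktree_remove_cmd(repo: Path, workdir: Path) -> list[str]:
    # --force discards uncommitted changes, which is the rollback path.
    return ["git", "-C", str(repo), "worktree", "remove", "--force", str(workdir)]


def run_in_worktree(repo: Path, workdir: Path,
                    trial: Callable[[Path], bool]) -> bool:
    """Run `trial` (hypothetical callback) inside a disposable worktree."""
    subprocess.run(worktree_add_cmd(repo, workdir), check=True)
    try:
        return bool(trial(workdir))  # mutate and evaluate inside the worktree
    finally:
        # The worktree is always destroyed; keeping a mutation is a separate,
        # explicit step that happens elsewhere.
        subprocess.run(worktree_remove_cmd(repo, workdir), check=True)
```

Because cleanup sits in a `finally` block, a crashed trial still leaves the main checkout untouched. Note that a relative `workdir` is resolved against `repo`, since `-C` changes directory before git runs.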

Instructor:Once we understand habitual learning from the mutations, the loop can also make autonomous decisions about how to change the genome's starting point. Currently the starting point is very simple: it is a given, static point. But in future, if you find that certain failures are repeat offenders, then behavioral learning becomes very important as well. It's slightly different from reinforcement learning.

Instructor:We'll pick it up at a later stage, because there are many different approaches and strategies that apply. In this course we'll stay with code mutation. But I'm going to do more courses where we cover behavioral mutation using prompt mutation and skill mutation, and also adopting behavioral mutation alongside code mutation. So there are three or four different elements to this pattern,

Instructor:and we'll take them one by one as we work through the courses. But this is what our hands-on looks like. I would also recommend doing the exercises: start simple, go as far as you can, and reset as many times as you want. For the dashboard, just run `python util.py dashboard` and that will bring it up. It opens in a supported browser, and this is the dashboard, so you can always explore from here.

Instructor:It shows all the data quality validations that were found and gives you everything you need. So that's all for this hands-on session. Let's go back and see which remaining parts we need to discuss. Welcome back. At this point, the genome is clearly defined. The loop now has a concrete place to apply some pressure, and you have a boundary that you can define.

Instructor:And as I said, the boundary is not just physical but also functional, especially in terms of the problem domain. In the next lesson, we will bring in the orchestrator, which takes the evidence and turns it into candidate mutations, verification, and selection, because it looks at the judge and everything else as well. Once a candidate has been validated, we decide whether to accept it or reject it, that is, revert it. We'll discuss that in detail with the orchestrator.

Instructor:But thank you very much for joining, and I'll see you in the next lesson.

Define the Genome Boundary

Open the genome and the fixed judge side by side

Read the mutable genome next to the immutable referee so the mutation boundary is visible before you run anything.

bash

cd _examples/self-improving-agent/cleanloop
code clean_data.py prepare.py
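
To see why the boundary matters before opening the real files, here is a minimal sketch of the split between a mutable genome and a fixed judge. It does not reproduce the contents of clean_data.py or prepare.py: `clean`, `judge`, and the `Row` type are hypothetical stand-ins chosen for illustration.

```python
from typing import Callable

Row = dict[str, str]


# --- Mutable genome: the only code the loop would be allowed to rewrite ---
def clean(rows: list[Row]) -> list[Row]:
    """Starter genome (hypothetical): only trims whitespace from amounts."""
    return [{**row, "amount": row["amount"].strip()} for row in rows]


# --- Fixed judge: never edited by the loop, so scores stay comparable ---
def judge(genome: Callable[[list[Row]], list[Row]],
          raw: list[Row], expected: list[Row]) -> tuple[int, int]:
    """Return (rows matching the expected output, total expected rows)."""
    cleaned = genome(raw)
    passed = sum(1 for got, want in zip(cleaned, expected) if got == want)
    return passed, len(expected)
```

If the loop could also edit `judge`, a rising score would stop meaning anything. Keeping the referee fixed is what lets a diff in the genome be tied to a measurable change.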

Trace the runtime handoff

Inspect the runtime and loop files so you can see where deterministic cleanup ends and the mutation playbook takes over.

bash

code clean_data_runtime.py loop.py

Run the evaluation and inspect the dashboard

Validate the setup, run the bounded evaluation flow, and inspect the evidence surface from the dashboard.

bash

python util.py status
python util.py verify
python util.py evaluate
python util.py dashboard

nilayparikh/tuts-agentic-ai-examples/tree/main/self-improving-agent/cleanloop (GitHub)

Complete source code for this lesson.

github.com/nilayparikh/tuts-agentic-ai-examples/tree/main/self-improving-agent/cleanloop

Q&A

Why insist on one mutable file if a repo has many moving parts?

Because the loop becomes reviewable only when failures map back to a small diff surface. One file is not a universal rule, but it is the safest default when the problem can be bounded that tightly.

What does bounded mutation actually bound in this lesson?

It bounds both the physical file surface and the business problem surface. CleanLoop is not trying to become a universal data engineer. It is trying to repair a narrow class of finance CSV inconsistencies.

Why bring up worktrees before the orchestrator lesson?

Because the recording wants the learner to see early that mutation should run in an auditable workspace with a clean rollback path, not as unchecked branch churn.