Two Weeks of Tweaks
ODSC-AI Immersion
I’m a bit too influenced by hype, and it makes me nervous. Plus I keep forgetting to go to MIT’s website and pay attention there. When I got the Delta variant of COVID I was completely freaked out until I went to MIT, and it turned out they speak good English, they know what they’re talking about, and they have a coherent presentation for well-understood phenomena. Who could ask for more?
Well, I haven’t been to MIT since COVID, and consequently I’ve been listening to recruiters and LinkedIn and Fireship and Nate B. Jones and Ilya and Tyler and too many other people who are confusing me. What I really needed was some immersion, and I found a new go-to guy. His name is Zain Hasan. He’s the most coherent practitioner I’ve yet heard, although there are others I like. Kyle Stratis was the other standout this week.
What I’ve learned over the past four days of sessions at ODSC-AI is enough agentic theory and practice that most everything now sounds obvious. What has been especially useful is hearing a dozen different people explain it in that many different ways. Even though, taken together, they are not coherent, and everybody is grappling with the subject at their own level and from their own angle, the fog is clearing.
Internals
Like with every database system I've learned over the years, you never get to understand performance until you have learned the internals, and it’s the internals of language models that have been a mystery. So it was very useful to hear Hasan talk a level deeper on agentic processes as someone who:
- works for a company that provides components to the frontier providers
- has been around the ML space long enough to have a good understanding of limits
He makes it all sound simple when he says Tokens In, Tokens Out. For me that gives a more concrete understanding of the interface between the non-deterministic language-model side and the deterministic, conventional API-calling code. You can even think of AI applications as a kind of game loop that cycles around the following evolutions:
Text → Image
Image → Image
Speech → Text
Text → Text
Text → Speech
with the understanding that these are not necessarily one-shot responses but iterations around the suitability of output tokens, it gets clearer. This is the job of the Agent Harness. So you as a programmer are deciding when the agent has done a good enough job, and how much memory is used for the context window, whose size increases with each loop as token messages are passed back by the LM.
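The loop above can be sketched in a few lines. This is a minimal illustration, not any vendor's harness: `call_model`, `run_tool`, and `good_enough` are hypothetical stand-ins for the model API, the deterministic tool layer, and the programmer's "done" criterion.

```python
# Minimal sketch of an agent-harness loop. The helpers passed in
# (call_model, run_tool, good_enough) are hypothetical stand-ins,
# not a real API.

def agent_loop(task, call_model, run_tool, good_enough, max_turns=10):
    """Cycle tokens in / tokens out until the output is acceptable."""
    context = [{"role": "user", "content": task}]   # context window grows each turn
    for _ in range(max_turns):
        reply = call_model(context)                 # non-deterministic: tokens out
        context.append({"role": "assistant", "content": reply["content"]})
        if reply.get("tool_call"):                  # deterministic side: run the tool
            result = run_tool(reply["tool_call"])
            context.append({"role": "tool", "content": result})
            continue                                # loop again with a bigger context
        if good_enough(reply["content"]):           # the programmer decides "done"
            return reply["content"], context
    return None, context                            # token/turn budget exhausted
```

Note that `context` only ever grows, which is exactly why long-running loops force the memory-management question raised above.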
Hasan has given me a way to think about language models, be they large or small, such that I will simply call them LMs from now on: they are engines. As I discover tools like opencode to interact with them, things will become clearer. What is clearer this week has already affected my prompting skills. Here’s the essential lesson.
The New XML
I’m starting to use some declarative framing in my prompts, and it turns out that each LM has its own idiosyncratic set of special tokens that bracket the tokens in, and thus the actions of the inference engine. So a lot of purpose-built prompts start with a kind of creation story.
<genesis>
I have created you in my image. I am your god. You are man.
</genesis>
<purpose>
You are to establish my kingdom on earth.
- you get to name the animals, plants and crawly things
- you get to rule over the animals, plants and crawly things
</purpose>
<|tool-call|>
This is a prayer. I'm bored. Can I have a companion?
</|tool-call|>

The tool call is the agentic part. That’s where your prompt asks for something the trained LM didn’t think of, or didn’t know at the moment of its last training. In the above heresy, you could say that god didn’t know man was going to be lonely. It wasn’t in the original trained model. So god had to go out to the toolshed and put together a helpmeet. How big this toolshed is, and how long the prayers are, are limits of the system. You can easily make the context window too large, or poison it with contradictions.
As models become more sophisticated, they will each find ways to deal with larger context windows and to decide where to store intermediate results for agentic loops that keep running for minutes, hours, days. There’s the art. So we have to get down into the internals of each LM to understand its pieces and parts, its specific special tokens (there is no standard), and what tools the LM has already stocked its shed with.
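Since each LM brackets its tool calls with its own non-standard special tokens, the harness has to parse them out of the raw output before the deterministic side can act. A minimal sketch, reusing the illustrative `<|tool-call|>` brackets from the heresy above (real models each use their own tokens, so treat these as placeholders):

```python
import json
import re

# Sketch: extract a tool-call payload bracketed by special tokens in model
# output. The <|tool-call|> / </|tool-call|> markers mirror the illustration
# above; they are placeholders, not any real model's special tokens.
TOOL_RE = re.compile(r"<\|tool-call\|>(.*?)</\|tool-call\|>", re.DOTALL)

def extract_tool_call(model_output):
    """Return the parsed tool-call payload, or None if the model answered directly."""
    match = TOOL_RE.search(model_output)
    if not match:
        return None
    # Assumes the model emits JSON between the brackets, e.g.
    # {"tool": "search", "args": {...}}
    return json.loads(match.group(1))
```

A harness would call this on every reply: a payload means "go to the toolshed," `None` means the tokens out are the answer.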
The Dark Forest
Hasan informed us that Kimi-K2.5 is the open-source model with the best tool caller / agent handler. So I found myself looking around for skills, with the understanding that this is the new frontier. Skills are frameworks for doing this and that, sometimes arranging the best memory management for completing tasks, but that’s where we actually enter the dark forest.
Take for example the hot and hyped Antigravity Awesome Skills. (git@github.com:sickn33/antigravity-awesome-skills.git) According to this collection of awesome prompts, the following describes one function of a data engineer.
### Data Modeling & Warehousing
- Dimensional modeling: star schema, snowflake schema design
- Data vault modeling for enterprise data warehousing
- One Big Table (OBT) and wide table approaches for analytics
- Slowly changing dimensions (SCD) implementation strategies
- Data partitioning and clustering strategies for performance
- Incremental data loading and change data capture patterns
- Data archiving and retention policy implementation
- Performance tuning: indexing, materialized views, query optimization

That’s all, folks. You could plug this in and blow your token budget just asking for a spec. This is where you use the phrase “he’s not wrong,” knowing what fraction of people are going to expect AI systems to read between the lines and figure out how to do all of that. They most certainly can, but there are a hell of a lot of details, demons and devils in that potential rubric.
Again, in the context of vibe coding with someone who is already an expert in any of those particular techniques, one can be quite productive. But pull down the repo and you’ll see that there are 549 skills ‘fully described’ in less than 24MB. Are we so small?
Well, some of us are.
If you’re not familiar with this meme, it’s Clawdbot / Moltbot / OpenClaw. And this is all people need to know. They’ve hacked Western Civilization, see?