Erik Craddock (@eriklink)

Cybersecurity Looks Like Proof of Work Now

Code remains cheap, unless it needs to be secure. Even if costs go down as inference is optimized, you still need to buy more tokens than attackers do, unless models reach the point of diminishing security returns. The cost is fixed by the market value of an exploit.

by Drew Breunig
Is security spending more tokens than your attacker?
Erik Craddock (@eriklink)

The 2nd Phase of Agentic Development

I think we’re going to see a lot more reimaginings, where people attack old problems with modern tactics. Coding agents lower the costs of taking on stalwarts and raise our ability to rapidly harden our software. I can think of many software tools that people rely on but don’t like. Those are the prime targets for reimagining.

by Drew Breunig
Moving from clones to reimaginings.
Erik Craddock (@eriklink)

The Cathedral, the Bazaar, and the Winchester Mystery House

This certainly describes the primary way I use AI agents.

There is only one source of feedback that moves at the speed of AI-generated code: yourself. You're there to prompt, you're there to review. You don't need to recruit testers, run surveys, or manage design partners. You just build what you want, and use what you build.

And that's what many developers are doing with cheap code: building idiosyncratic tools for ourselves, guided by our passions, taste, and needs.

by Drew Breunig
Welcome to the era of sprawling, idiosyncratic tooling.
Erik Craddock (@eriklink)

Learnings from a No-Code Library: Keeping the Spec Driven Development Triangle in Sync

I think he is on to something here. I've believed for a while that the best way to learn to use agents and LLMs for writing code is to focus on changing the spec rather than fixing mistakes in the code directly. This works for smaller projects but can be problematic, to say the least. This is a realistic solution to the problem.

Code implementation clarifies and communicates intent. I could stop there and walk out of the room. I missed this with whenwords.

The job is to keep specs, code, and tests in sync as they move forward. The system for managing that has to stay simple. If it creates developer mental overhead, it just moves the problem somewhere else.

The act of writing code improves the spec and the tests. Just like software doesn’t truly work until it meets the real world, a spec doesn’t truly work until it’s implemented.

No-code libraries are toys because they are unproven.

Even if you aren’t the one making decisions during implementation, decisions are being made. We should leverage LLMs to extract and structure those decisions.

And finally: we’ve been here before. The answer then was process. The answer now is also process. And just as we leverage cloud compute to enable CI/CD for agile, we should leverage LLMs to build something lightweight enough that we can fit in our heads, doesn’t slow us down, and helps us make sense of our software.

by Drew Breunig
The following is a write up of a talk I delivered at MLOps Community’s “Coding Agents” conference, on March 3rd. There’s a video version of the talk available on YouTube.
Erik Craddock (@eriklink)

The Potential of RLMs

The key attribute of RLMs is that they maintain two distinct pools of context: tokenized context (which fills the LLM's context window) and programmatic context (information that exists in the coding environment). By giving the LLM access to the REPL, where the programmatic context is managed, the LLM controls what moves from programmatic space to token space.

And it turns out modern LLMs are quite good at this!
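The two-pool idea can be sketched in a few lines. This is a minimal illustration, not Drew Breunig's implementation: the variable names, the `peek` tool, and the data are all invented. Large data lives in the REPL's variables (programmatic context), and only the slices the model explicitly requests get rendered into the prompt (tokenized context).

```python
# Programmatic context: lives in the Python environment, never sent wholesale.
programmatic_context = {
    "server.log": "\n".join(f"req {i} ok" for i in range(100_000)),
    "notes.md": "deploy window is 02:00 UTC",
}

token_context: list[str] = []  # what actually enters the LLM's window


def peek(name: str, start: int = 0, length: int = 200) -> str:
    """Hypothetical REPL tool the model calls to move a small slice
    of programmatic context into token space."""
    snippet = programmatic_context[name][start:start + length]
    token_context.append(f"{name}[{start}:{start + length}] = {snippet!r}")
    return snippet


# The model inspects a 40-character window instead of ingesting the whole log.
snippet = peek("server.log", 0, 40)
print(len(programmatic_context["server.log"]))  # huge, stays out of the prompt
print(snippet)                                  # tiny, enters the prompt
```

The point of the sketch: the full log never touches the context window; the model decides, call by call, what crosses from programmatic space to token space.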

by Drew Breunig
Handling Your Long Context Today &amp; Designing Your Agent Tomorrow
Erik Craddock (@eriklink)

Don't Fight the Weights

Today, in-context learning is a standard trick in any context engineer’s toolkit. Provide a few examples illustrating what you want back, given an input, and trickier tasks tend to get more reliable. They’re especially helpful when we need to induce a specific format or style or convey a pattern that’s difficult to explain.

When you’re not providing examples, you’re relying on the model’s inherent knowledge base and weights to accomplish your task. We sometimes call this “zero-shot prompting” (as opposed to few-shot) or “instruction-only prompting”.
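The zero-shot vs. few-shot distinction is just a difference in how the prompt string is assembled. A small sketch, with an invented sentiment task and example labels (none of this is from the post):

```python
# Zero-shot: instruction only, relying entirely on the model's weights.
zero_shot = (
    "Classify the sentiment of this review as POS or NEG.\n"
    "Review: 'Great battery life.'\nSentiment:"
)

# Few-shot: labeled examples prepended so the model can induce the
# expected format and labeling convention from the context itself.
few_shot_examples = [
    ("Terrible screen, returned it.", "NEG"),
    ("Fast shipping and works perfectly.", "POS"),
]


def build_few_shot(review: str) -> str:
    """Assemble a few-shot prompt: instruction, examples, then the query."""
    lines = ["Classify the sentiment of this review as POS or NEG."]
    for text, label in few_shot_examples:
        lines.append(f"Review: {text!r}\nSentiment: {label}")
    lines.append(f"Review: {review!r}\nSentiment:")
    return "\n\n".join(lines)


prompt = build_few_shot("Great battery life.")
print(prompt)
```

Ending the prompt at "Sentiment:" invites the model to complete the established pattern, which is where the format-inducing effect of the examples comes from.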

by Drew Breunig
When your context goes against a model’s training, you struggle to get the output you need. Learn to recognize when you’re fighting the weights so you can do something different.
Erik Craddock (@eriklink)

Initial Thoughts on GPT-OSS

There’s two schools of thought when it comes to agent building.

Some people think you should shove your entire task into a giant model and let it sort it out, with plenty of thinking. It’s expensive, it’s slow, but it (allegedly) requires less upfront work.

Others think you should design your task, in composable steps, where you can measure the accuracy of each step. For most steps, you only need a small model! You don’t need o3 to churn through 3 minutes of tokens to summarize an email body or detect sentiment.
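The composable-steps approach can be sketched as a pipeline of small, individually measurable stages. The stages and data below are hypothetical stand-ins: each function represents a call to a small model (a local 20B model would slot in where the heuristics are), and each stage gets its own accuracy score against a handful of labeled cases.

```python
def detect_sentiment(text: str) -> str:
    """Stand-in for a small-model call; a cheap local model would go here."""
    return "NEG" if any(w in text.lower() for w in ("bad", "broken")) else "POS"


def summarize(text: str) -> str:
    """Stand-in for another small, cheap step in the pipeline."""
    return text.split(".")[0] + "."


# Per-step evaluation: label a few cases and score each stage in isolation,
# instead of judging one opaque end-to-end call from a giant model.
labeled = [
    ("The update is broken. Nothing loads.", "NEG"),
    ("Works great. Love the redesign.", "POS"),
]
accuracy = sum(detect_sentiment(t) == y for t, y in labeled) / len(labeled)
print(f"sentiment step accuracy: {accuracy:.0%}")
```

Because each step is measured on its own, you can swap a bigger model into only the step that underperforms, rather than paying for a frontier model across the whole task.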

by Drew Breunig
OpenAI released its open-weight model, gpt-oss, today. It comes in two sizes, 120B and 20B, the latter of which runs briskly on my Mac Studio. I’m sure I’ll have more impressions as I use it in anger over the next few weeks, but here’s my initial thoughts:
Erik Craddock (@eriklink)

Delegation is the AI Metric that Matters

The gap between social acceptance and expert acceptance of delegating a given task to AI is a point of negotiation that will occur more often, in more domains over the coming years. Watch these points of friction to better understand the distribution of AI. First as a sign that AI is performing a task sufficiently against expert standards. And second, as a sign that either regulation will arrive or cultural innovations are needed to enable the technical ones.

by Drew Breunig
Forget the benchmarks – the best way to track AI’s capabilities is to watch which decisions experts delegate to AI.
Erik Craddock (@eriklink)

How Long Contexts Fail

The arrival of million-token context windows felt transformative. The ability to throw everything an agent might need into the prompt inspired visions of superintelligent assistants that could access any document, connect to every tool, and maintain perfect memory.

But as we’ve seen, bigger contexts create new failure modes. Context poisoning embeds errors that compound over time. Context distraction causes agents to lean heavily on their context and repeat past actions rather than push forward. Context confusion leads to irrelevant tool or document usage. Context clash creates internal contradictions that derail reasoning.

by Drew Breunig
Taking care of your context is the key to building successful agents. Just because there’s a 1 million token context window doesn’t mean you should fill it.