
Snowflake & Builders CTO Network Recap: Spec-Driven Development

CTO

6 min

07 May 2026


When a senior AI engineer tells a room of 40 CTOs that his team stopped writing code in January and hasn't looked back, the questions come fast. That's exactly how the morning went at Snowflake's Amsterdam HQ, where Builders CTO Network gathered for a deep dive into spec-driven development, and the conversation never really settled.

Setting the room

The Snowflake office filled up quickly on a Tuesday morning. About forty CTOs and engineering leaders from across the Netherlands, a handful from further afield. The usual formula: strong coffee, no pitches, no suits pretending to talk shop. Just the building part. As always, titles stayed outside.

Too many rooms full of people who manage technology. Not enough rooms full of people who build it. That gap is why CTO Network exists.

The featured talk came from Martijn, Senior AI Engineer at Altura, who walked the room through a fully spec-driven development workflow his team has been running since January 2026, one in which, as he put it plainly, nobody writes code by hand anymore. It took about ninety seconds before the first hand went up.

What spec-driven actually means in practice

The concept is simple on paper. Before any code is touched, engineers generate markdown files describing the changes they intend to make. Those specs get reviewed and approved. Only then does an AI coding tool execute the changes. Each feature produces two pull requests: one for the spec, one for the generated code.
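The two-stage gate described above can be sketched as a tiny state machine. This is an illustrative model of the process, not Altura's actual tooling; the class and method names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Tracks a feature through the two-PR, spec-first pipeline."""
    name: str
    spec_approved: bool = False
    code_pr_open: bool = False

    def approve_spec(self) -> None:
        # Reviewers sign off on the markdown spec PR first.
        self.spec_approved = True

    def generate_code(self) -> None:
        # The AI coding tool only runs once the spec PR is approved.
        if not self.spec_approved:
            raise RuntimeError("spec PR must be approved before code generation")
        self.code_pr_open = True


f = Feature("export-to-csv")
f.approve_spec()   # PR #1: the spec
f.generate_code()  # PR #2: the generated code
```

The point of the gate is that the second PR cannot exist without an approved first one, which is what makes the spec reviewable as a decision rather than a post-hoc description.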

Simple enough that half the room had experimented with some version of it. Different enough in execution that almost no two teams had landed in the same place.

Some had tried it on greenfield projects and seen their velocity double. Others had hit walls almost immediately: legacy APIs that behaved differently from their documentation, boundary code where the spec-to-output gap was too wide to trust. A CTO who'd implemented it at a large media company described the process working smoothly on the application layer while quietly abandoning it for anything that touched their telco integrations, dozens of proprietary protocols that even senior engineers had to reverse-engineer by trial and error. The pragmatic call: spec-driven where it works, hand-crafted where it doesn't, and no ideology about it.

The room's relationship with the approach was somewhere between committed and skeptical, which made for a better morning than consensus would have.

Where does the engineering judgment go?

The sharpest early question came from someone who'd been watching their own team's behavior shift: if engineers aren't writing code, where does the craft actually live?

The answer the room kept returning to was decomposition. Breaking a feature into the right-sized issues — not so broad that the AI produces a tangle, not so narrow that the overhead kills the point — is where senior engineering experience now earns its keep. It's less legible than writing elegant code, harder to teach, and almost impossible to evaluate in a hiring process. But it's the bottleneck. Get it wrong and the spec-to-code pipeline produces things that technically work but are architecturally wrong. Get it right and you can have a frontend engineer shipping confident specs for .NET services they've never written before.

Which led to the question nobody had a clean answer for: what happens to junior engineers in this model? Several people in the room had already made the same quiet call: stop hiring juniors, double down on seniors. The decomposition skill comes from years of owning production software, from having been the person responsible when something breaks at 2am. If you've never had that, spec-driven development doesn't build it. The room flagged this as a real industry-level risk, not just a hiring preference.

Sacred code: what the AI should never touch

One of the roundtable sessions organized itself quickly around a single question: are there parts of your codebase that are simply off-limits?

Almost every team had them. Custom authorization modules built before modern auth frameworks existed. Integration layers with enterprise systems that had never been properly documented. Legacy services where even the engineers who wrote them were no longer sure what the edge cases did. The common thread wasn't that these parts of the codebase were particularly complex, it was that they were fragile in ways that weren't visible until something touched them.

The emerging view was that spec-driven development doesn't create this problem; it exposes it. If the AI generates code for a legacy interface and it breaks in unexpected ways, that interface was already fragile. The AI just made the fragility cheaper to discover. That reframe shifted how several people in the room were thinking about their own guardrails. Not as permanent fences around dangerous code, but as honest signals about technical debt that was already there.

One team described their rule simply: the AI gets full read access everywhere, including infrastructure, logs, and environment variables up to staging. Write access stops at production. Not because they don't trust the model, but because the human review loop isn't fast enough yet to catch mistakes at production speed.
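That rule reduces to a small policy table: full read everywhere, write access that stops before production. A minimal sketch of the idea follows; the environment names and function are illustrative assumptions, not the team's actual configuration.

```python
# Read access spans every environment, including production.
READ_ENVS = {"dev", "staging", "production"}
# Write access deliberately stops at staging.
WRITE_ENVS = {"dev", "staging"}


def allowed(action: str, env: str) -> bool:
    """Return True if the AI agent may perform `action` in `env`."""
    if action == "read":
        return env in READ_ENVS
    if action == "write":
        return env in WRITE_ENVS
    return False  # deny anything unrecognised by default
```

Encoding the boundary as data rather than convention makes it auditable, and easy to loosen later if the review loop ever does get fast enough.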

The cost question

At some point the conversation landed on the number that makes finance teams uncomfortable: what does running a fully AI-assisted engineering team actually cost in tokens?

The answer from teams who'd done the math was consistently less alarming than expected. One team put it at roughly €200 per engineer per week, meaning a twelve-person team runs on approximately one junior hire's worth of AI budget. The room let that comparison sit for a moment.
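For scale, the arithmetic behind that figure is simple multiplication. The €200/week number is the one quoted in the room; the rest is a back-of-envelope sketch.

```python
# Back-of-envelope maths for the token budget quoted above.
COST_PER_ENGINEER_PER_WEEK = 200  # EUR, the figure cited in the room
TEAM_SIZE = 12
WEEKS_PER_YEAR = 52

weekly_team_cost = COST_PER_ENGINEER_PER_WEEK * TEAM_SIZE
annual_team_cost = weekly_team_cost * WEEKS_PER_YEAR

print(f"€{weekly_team_cost:,}/week → €{annual_team_cost:,}/year")
# → €2,400/week → €124,800/year
```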

That said, the teams tracking costs carefully had found redundancy quickly. One example: a cloud bot scanning repositories for issues was consuming a fifth of the monthly AI budget while the local Claude setup was already doing the same thing. The fix wasn't to cap engineers, it was to stop paying for the same work twice.

Specs as a product communication layer

Something that came up in Martijn's talk and kept resurfacing in the roundtables: spec-driven development turns out to solve a problem that has nothing to do with code generation.

In the old model, product disagreements surface after something is already built. A UX flow is live; changing it is expensive; the conversation should have happened six weeks earlier but didn't, because the decisions were invisible until they were in production. With a two-PR system — spec first, code after — non-technical team members are reviewing decisions before a line of code exists. Product managers and designers are in the same cycle as engineers, not waiting for a demo.

Several leaders in the room had noticed the same thing independently: when the spec is the artifact, the conversation moves earlier. Not because anyone mandated it, but because the spec is readable. It's a document, not a diff.

Memory, context, and keeping the model on the rails

The roundtables circled back repeatedly to the practical question of how you keep an AI coding tool consistent across a large, evolving codebase over weeks and months.

The pattern that multiple engineers had arrived at independently: a set of markdown files in the monorepo — separate documents for testing conventions, architecture decisions, coding practices — that get updated whenever the model makes a mistake. Not just corrected in the moment, but codified. Ask the model to write down what it learned. Put that in the repo. The next session inherits it.

One engineer is going a step further. They're building a post-mortem tool that runs at the end of each session and asks the model to review its own transcript, extract what it learned, and add it back to the appropriate file. The goal is a feedback loop that compounds without anyone needing to manually distill it.

On context window management, the room had converged on a similar workaround: don't let the window auto-compact mid-task. Switch to a fresh session before it collapses, and pass a clean status summary as the handoff. It adds a small amount of toil; it avoids the larger problem of a model that's lost the thread of what it was doing.
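The hand-off rule sketches out roughly like this. The 80% threshold, the context limit, and the four-characters-per-token estimate are all illustrative assumptions, not values any team at the event reported.

```python
CONTEXT_LIMIT_TOKENS = 200_000  # assumed context window size
HANDOFF_THRESHOLD = 0.8         # switch sessions well before auto-compaction


def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly four characters per token.
    return len(text) // 4


def needs_handoff(transcript: str) -> bool:
    """True once the session should be abandoned for a fresh one."""
    return estimate_tokens(transcript) > CONTEXT_LIMIT_TOKENS * HANDOFF_THRESHOLD


def handoff_prompt(status_summary: str) -> str:
    """Seed message for the fresh session: current state, not full history."""
    return f"Continuing an in-progress task. Current status:\n{status_summary}"
```

The design choice is to carry state forward rather than history: the fresh session gets a deliberate summary written at a good stopping point, instead of whatever an automatic compaction happened to keep.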

The compiler analogy

The closing exchange was the one that stuck.

One participant framed the whole transition this way: we are in the middle of a next-generation compiler revolution. Engineers who wrote assembly code rejected the first high-level compiler because the abstraction felt wrong, the output untrustworthy, the determinism gone. The same resistance is playing out now, just at a higher level of the stack. And the same answer applies: the guardrails we have now are calibrated to the current maturity of the tools, not the tools that are coming.

The room broadly agreed, while also noting that "the model will get better" is not an argument for removing guardrails today. Several people had run into model updates that changed behavior in ways that broke things they thought were stable. The practical response most teams had landed on: treat model upgrades the way you treat dependency upgrades. Test before you trust. Don't assume the new version does what the old one did.

What loosens the guardrails? Better evals. Better observability. Longer track records. And, gradually, models that follow the rules more reliably than the engineers reviewing them.

Until next time

The next CTO Network event is on 20 May in Rotterdam — same subject, more depth, and a room full of people who will have had another month to find out where their own spec experiments break.

Hosted by Builders, with thanks to Snowflake, Inkef and Altura for making the morning possible.

OUR UNIVERSE

BUILD WITH US

Backing bold founders from day zero — pairing ambition with deep technical leverage and operational firepower to create category-defining companies.

Build with us ↗
CTO NETWORK

A curated network for Europe’s top builders — from AI to deep tech. Private events, shared signals, and deep conversations for those who are in it.

Explore CTO Network ↗

