
I don't think we ever get away from the code being the source of truth. There has to be one source of truth.

If you want to go all in on specs, you must fully commit to allowing the AI to regenerate the codebase from scratch at any point. I'm an AI optimist, but this is a laughable stance with current tools.

That said, the idea of operating on the codebase as a mutable, complex entity, at arm's length, makes a TON of sense to me. I love touching and feeling the code, but as soon as there's 1) schedule pressure and 2) a company's worth of code, operating at a systems level of understanding just makes way more sense. Defining what you want done, using a mix of user-centric intent and architecture constraints, seems like a super high-leverage way to work.

The feedback mechanisms are still pretty tough, because you need to understand what the AI is implicitly doing as it works through your spec. There are decisions you didn't realize you needed to make, until you get there.

We're thinking a lot about this at https://tern.sh, and I'm currently excited about the idea of throwing an agentic loop around the implementation itself. Adversarially have an AI read through that huge implementation log and surface where it's struggling. It's a model that gives real leverage, especially over the "watch Claude flail" mode that's common in bigger projects/codebases.



> There are decisions you didn't realize you needed to make, until you get there.

Is the key insight and biggest stumbling block for me at the moment.

At the moment (encouraged by my company) I'm experimenting with Agent usage for coding that's as hands-off as possible. And it is _unbelievably_ frustrating to see the Agent get 99% of the code right in the first pass, only to misunderstand why a test is now failing and then completely mangle both its own code and the existing tests as it tries to "fix" the "problem". And if I'd just given it a better spec to start with, it probably wouldn't have started producing garbage.

But I didn't know that before working with the code! So to develop a good spec I either have to stop the agent all the time so I can intervene, or dive into the code myself to begin with, and at that point I may as well write the code anyway, since writing the code is not the slow bit.


For sure. One of our first posts was called "You Have To Decide" -- https://tern.sh/blog/you-have-to-decide/

And my process now (and what we're baking into the product) is:

- Make a prompt

- Run it in a loop over N files. Full agentic toolkit, but don't be wasteful (no "full typecheck, run the test suite" on every file).

- Have an agent check the output. Look for repeated exploration, look for failures. Those imply confusion.

- Iterate the prompt to remove the confusion.

First pass on the current project (a Vue 3 migration) went from 45 min of agentic time on 5 files to 10 min on 50 files, and the latter passed tests/typecheck/my own scrolling through it.
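A minimal sketch of that loop, to make the shape concrete. The agent call and the "confusion" checks are placeholders for whatever agent CLI/API and review prompt you actually use; the prompt text and heuristics are illustrative, not what Tern ships:

    # Sketch of "prompt -> run over N files -> review the logs -> iterate".
    # run_agent() and review_transcript() are stand-ins, not any real API.
    from pathlib import Path

    PROMPT = "Migrate this component from the Vue 2 Options API to Vue 3 <script setup>."

    def run_agent(prompt: str, file: Path) -> str:
        """Placeholder: invoke your coding agent on one file, return its transcript."""
        raise NotImplementedError("wire this to your agent of choice")

    def review_transcript(transcript: str) -> list[str]:
        """Placeholder for the reviewer agent. A cheap non-LLM approximation
        of 'look for repeated exploration and failures' is shown here."""
        findings = []
        if transcript.lower().count("reading file") > 5:
            findings.append("repeated exploration: agent keeps re-reading files")
        if "error" in transcript.lower() or "failed" in transcript.lower():
            findings.append("failures in transcript")
        return findings

    def run_batch(files: list[Path]) -> dict[Path, list[str]]:
        """Run the prompt over each file and collect confusion signals per file."""
        report = {}
        for f in files:
            transcript = run_agent(PROMPT, f)
            report[f] = review_transcript(transcript)
        return report

    if __name__ == "__main__":
        files = sorted(Path("src/components").glob("*.vue"))[:50]
        for f, findings in run_batch(files).items():
            if findings:
                print(f, "->", "; ".join(findings))
        # Wherever findings cluster, tighten the prompt and rerun the batch.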


>Adversarially have an AI read through that huge implementation log and surface where it's struggling.

That's a good idea: have a specification, divide it into chunks, have an army of agents each implement a chunk, have an agent identify weak points, incomplete implementations, and bugs, and then have an army of agents fix the issues.
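Purely as an illustration of that fan-out/fan-in shape (all three agent calls are placeholders, not any real API):

    # Illustrative fan-out/fan-in: chunk a spec, implement chunks in parallel,
    # have a reviewer agent surface issues, then fan out again to fix them.
    from concurrent.futures import ThreadPoolExecutor

    def implement_chunk(chunk: str) -> str:
        """Placeholder: one agent implements one spec chunk, returns its work log."""
        raise NotImplementedError

    def review_logs(logs: list[str]) -> list[str]:
        """Placeholder: a reviewer agent reads every log and lists weak points,
        incomplete implementations, and suspected bugs."""
        raise NotImplementedError

    def fix_issue(issue: str) -> str:
        """Placeholder: one agent per issue, returns a fix log."""
        raise NotImplementedError

    def run(spec_chunks: list[str]) -> list[str]:
        with ThreadPoolExecutor() as pool:
            logs = list(pool.map(implement_chunk, spec_chunks))  # army of implementers
        issues = review_logs(logs)                               # adversarial reviewer
        with ThreadPoolExecutor() as pool:
            return list(pool.map(fix_issue, issues))             # army of fixers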


The reason code can serve as the source of truth is that it’s precise enough to describe intent, since programming languages are well-specified. Compilers have freedom in how they translate code into assembly, and two different compilers (or even different optimization flags) will produce distinct binaries. Yet all of them preserve the same intent and observable behaviour that the programmer cares about. Runtime performance or instruction order may vary, but the semantics remain consistent.

For spec-driven development to truly work, perhaps what’s needed is a higher-level spec language that can express user intent precisely, at the level of abstraction where the human understanding lives, while ensuring that the lower-level implementation is generated correctly.

A programmer could then use LLMs to translate plain English into this “spec language,” which would then become the real source of truth.
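No such language exists today, but a rough stand-in for the idea is a property-based spec: you state what must hold, not how it's achieved, and the implementation is free to be regenerated as long as the properties pass. A sketch using Hypothesis, where my_sort is a hypothetical LLM-generated implementation:

    # Intent stated as properties; the implementation underneath can change
    # freely as long as these hold. my_sort / generated_code are hypothetical.
    from hypothesis import given, strategies as st

    from generated_code import my_sort  # hypothetical LLM-generated implementation

    @given(st.lists(st.integers()))
    def test_output_is_ordered(xs):
        out = my_sort(xs)
        assert all(a <= b for a, b in zip(out, out[1:]))

    @given(st.lists(st.integers()))
    def test_output_is_a_permutation(xs):
        assert sorted(my_sort(xs)) == sorted(xs)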


What about pseudocode? It is high level enough.


Right, but it needs to be formalized.


Tern looks very interesting.

On your homepage there is a mention that Tern “writes its own tools”; could you give an example of how this works?


If you're thinking about, e.g., upgrading to Django 5, there are a bunch of changes that are sort of codemod-shaped. It's possible that there's not a codemod for it that works for you.

Tern can write that tool for you, then use it. That gives you more control, in certain cases, than simply asking the AI to make a change that might appear hundreds of times in your code.
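Not a real Tern tool, just a hypothetical example of the shape: a throwaway script that rewrites one mechanical pattern across a repo (the module names here are made up; a real upgrade codemod would target whatever the framework actually renamed), which is easier to audit than hundreds of hand-made AI edits.

    # Hypothetical throwaway codemod: rewrite `from old_pkg.utils import ...`
    # to `from new_pkg.utils import ...` across every .py file under a root.
    import re
    import sys
    from pathlib import Path

    OLD = re.compile(r"\bfrom old_pkg\.utils import\b")
    NEW = "from new_pkg.utils import"

    def rewrite(root: str) -> None:
        for path in Path(root).rglob("*.py"):
            text = path.read_text()
            new_text = OLD.sub(NEW, text)
            if new_text != text:
                path.write_text(new_text)
                print(f"rewrote {path}")

    if __name__ == "__main__":
        rewrite(sys.argv[1] if len(sys.argv) > 1 else ".")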



