
Great achievement, but what a horrible future we are facing.

Instead of progressing towards more powerful programming interfaces with less cause for misinterpretation, we are going to automate the silly process of writing redundant unit tests to check if the behavior that we wanted was encoded properly.

Why not skip this nonsense and have the code generated from the behavior in the first place?



Because tests define the behaviour. The hardest part of requirements is describing exactly what you want. Tests are a great way of doing that.

I think AI software development is going to involve writing tests, which the AI agents then make pass. Or some other requirements language that allows for exactness where English can fall down.


My experience is that the more complex a loss function you write for optimisation, the lower the likelihood of a "natural" or robust result. So the software is highly likely to work for all the test cases, and for nothing else.

This is kinda the problem Tesla has with FSD; they are endlessly patching it and it's endlessly finding ways to go wrong.


You have to describe tests in a way that doesn't use example-based testing.

Think property-based testing. That way it can't overfit.
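
For example (a rough sketch using Python's Hypothesis library; the sorting function here is just an illustration I'm adding, not anything from the article):

    # Properties must hold for *any* generated input, so the agent can't
    # just hard-code answers for a handful of hand-picked examples.
    from collections import Counter
    from hypothesis import given, strategies as st

    @given(st.lists(st.integers()))
    def test_output_is_ordered(xs):
        out = sorted(xs)
        assert all(a <= b for a, b in zip(out, out[1:]))

    @given(st.lists(st.integers()))
    def test_output_is_a_permutation_of_the_input(xs):
        assert Counter(sorted(xs)) == Counter(xs)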


It all depends, of course, but I tend to disagree.

A requirement such as "A web interface to play the game of chess, but with all the pieces replaced by photos of my family members" is fairly adequate if I were to make a Christmas family game.

I am totally uninterested in specifying whether it is possible to have two white bishops on black squares. Also, I don't want to test whether my uncle's moustache is the proper size.

I'd much prefer to iteratively build the game by prompting than by specifying thousands of details. It just seems the wrong way around.


So we get to do the shitty part.


If it makes you feel better, the AI will write all the code for the tests and come up with all the variations and do all the fuzzing to try to break things. You’ll just have to do a good job of explaining the requirements to it and adjusting them as it becomes clear you didn’t fully describe the outcome you wanted the first time around.


It's shitty for the same reasons that AI will be bad at it: it involves talking to people and trying to resolve ambiguous requirements.


Code _is_ the description of the behavior. Turns out describing behavior is really hard. If you cover all the cases, it’s not that far from code. Furthermore, the data structures are a huge decision space which depends on context that is hard to communicate.


Technically, the future does not have to be so gloomy. Soon the AI may build for itself the better tools we failed to build (or adopt). One can expect centralised intelligence to surpass collective intelligence, after all.


I would say it's debatable whether centralised intelligence can outperform collective intelligence. Centralised intelligence can be very efficient, but by definition it will lack the perspectives that a diverse collective intelligence can offer. In the long run, if the search for global optima is the ultimate goal, a diverse collective intelligence will have a higher chance of success than a centralised intelligence, especially in multidimensional spaces.



