I am still greatly disappointed by the insistence on end-to-end black-box models.
In the end, as impressive as these results are, they are fundamentally trending in the wrong direction. All the benefits and certainty (e.g. security, correctness, and reproducibility) provided by theorem provers and model-driven systems are thrown out the window in favour of fast but potentially wrong or insecure results.
The worst part of this development is the psychological aspect - humans have a tendency to rely on machine-generated results and view them as superior. The disconnect between working code and correct or secure code is only going to widen with this approach.
A glaring example is found in the blog post: the image manipulation example (7.) contains an error that the author failed to even recognise or mention. Instead of turning the uploaded image into a mosaic as intended, the generated code simply creates a fixed-size black-and-white checkerboard pattern. This is clearly neither a mosaic nor image manipulation.
It is a very impressive tech demo, but generating actual software that can be trusted and rigorously checked against requirements will end up using a formal description (e.g. a programming language, theorems, or modelling akin to UML) anyway.
>in favour of fast but potentially wrong or insecure results.
Formal methods don't eliminate wrong or insecure results. Formal methods tell you that a program matches the specification when certain conditions hold, e.g. no bit flips occur, the computer does not crash, the kernel doesn't kill your process, allocations succeed, and your program can make progress (the kernel can decide to never schedule it). You can have bugs in writing the specification, where the specification does not match your intention. Even your intended design may have vulnerabilities in it. And if the code is not generated from the specification, the code might not match the specification.
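To make the specification-bug point concrete, here is a tiny made-up Lean sketch (my own example, not from the article): a "sort" specification that only demands a sorted output and forgets to demand a permutation of the input, so an obviously wrong function satisfies it.

```lean
-- Hypothetical spec: the output must be sorted. Nothing here requires the
-- output to contain the input's elements.
def Sorted : List Nat → Prop
  | [] => True
  | [_] => True
  | a :: b :: rest => a ≤ b ∧ Sorted (b :: rest)

-- "Sorts" by throwing the input away entirely.
def badSort (_ : List Nat) : List Nat := []

-- The proof goes through: the specification is met, the intent is not.
theorem badSort_meets_spec (xs : List Nat) : Sorted (badSort xs) := by
  simp [badSort, Sorted]
```

Any prover will happily certify badSort here; the proof is sound, the specification is the bug.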
Using formal methods slows you down compared to things like testing, which can get us most of the way there in less time. Systems can be designed to be robust such that if a machine fails, the system keeps running. If an end-to-end black-box model can get you most of the way there with a sufficiently low number of bugs, it may be worth using. Time is a limited resource, and being 100% correct is not necessarily better than being 99% correct and having extra features.
>The disconnect between working code and correct or secure code is only going to widen with this approach.
Not really. People are not going to just start ignoring bugs when they run into them because the software they are using happened to be machine-generated.
>Instead of turning the uploaded image into a mosaic as intended, the generated code simply creates a fixed-size black-and-white checkerboard pattern
It worked fine for an image I just tried. Just as the prompt asked, it "convert[ed] the image to a 32x32 mosaic." There was no checkerboard, but it may be worth noting that it converted transparent pixels to black.
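For context, a minimal Pillow sketch of how such a filter is typically built (my own reconstruction, not the demo's actual code): downscale to 32x32 to average each tile, upscale with nearest-neighbour so the tiles show as flat blocks, and flatten the alpha channel onto black, which would explain the transparent-pixels-turn-black behaviour.

```python
# Hypothetical reconstruction of a "32x32 mosaic" filter, not the demo's code.
from PIL import Image

def mosaic_32x32(in_path: str, out_path: str, display_size: int = 512) -> None:
    img = Image.open(in_path).convert("RGBA")
    # Average the image down to 32x32 "tiles"...
    small = img.resize((32, 32), Image.Resampling.BILINEAR)
    # ...then blow it back up without smoothing so each tile is a flat block.
    big = small.resize((display_size, display_size), Image.Resampling.NEAREST)
    # Flattening transparency onto a black background would explain why
    # transparent pixels come out black.
    background = Image.new("RGBA", big.size, (0, 0, 0, 255))
    Image.alpha_composite(background, big).convert("RGB").save(out_path)

mosaic_32x32("input.png", "mosaic.png")
```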
> Formal methods don't eliminate wrong or insecure results.
Yes, they do. That's the entire purpose of a proof. Of course formal methods cannot prevent or even detect wrong specifications, but generated code is no different in that regard.
> Using formal methods slows you down compared to things like testing, which can get us most of the way there in less time.
But that's the point - formal methods are slow if people have to apply them. Automated theorem provers exist and can work on generated ASTs, so why not add the step and create a hybrid system that verifies the generated result?
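As a toy sketch of such a hybrid loop (the "generated" snippet and the spec here are invented for illustration; a real system would extract the formula from the generated AST automatically), one could hand the logic to an SMT solver like Z3 and ask for a counterexample:

```python
# Toy hybrid check: model the "generated" code in Z3 and ask the solver
# whether any input violates the specification. Invented example, not a
# real pipeline.
from z3 import If, Int, Not, Solver, unsat

x = Int("x")
generated_abs = If(x >= 0, x, -x)   # logic of the machine-generated code
spec = generated_abs >= 0           # the property the spec demands

solver = Solver()
solver.add(Not(spec))               # search for a counterexample
if solver.check() == unsat:
    print("verified: no input violates the spec")
else:
    print("counterexample:", solver.model())
```

unsat means no input violates the property; anything else yields a concrete counterexample that could be fed back into regeneration.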
>> Instead of turning the uploaded image into a mosaic as intended, the generated code simply creates a fixed-size black-and-white checkerboard pattern
> It worked fine for an image I just tried. Just as the prompt asked, it "convert[ed] the image to a 32x32 mosaic." There was no checkerboard, but it may be worth noting that it converted transparent pixels to black.
The code may have been changed - it works for me now, too, whereas before it definitely didn't.
>Of course formal methods cannot prevent or even detect wrong specifications
A wrong specification can give you a wrong or insecure result. That was my point. Formal methods aren't a silver bullet, and your system still needs to be robust to failures.
>so why not add the step and create a hybrid system that verifies the generated result?
Because the time spent writing a specification is time wasted if the generated code turns out to have no issues.
People are going to generate black-box code, launch it as an MVP, and then hire engineers to iron out bugs or even do whole rewrites if the product gets traction. It's all going to just fit back into the same standard model.
There's no way to un-black-box it. There are just too many parameters. Call it a black box or a lack of model explainability; it's effectively the same thing.