There have been many documented cases of supply chain attacks of varying degrees of sophistication, some of them successful, some of them almost successful.
May I remind you that the recent xz backdoor was discovered by a single developer by mere chance.
As an end user it is nearly impossible to guard against such an attack.
It can be problematic to run something like `curl foo.com | bash` without inspecting the script first.
But even then it makes a difference whether you are curling from a project like brew.sh, which delivers such a script from a TLS-protected endpoint, or running some random script you found in a gist.
The same goes for output from an LLM. You can simply inspect the generated command before executing it.
Another strategy might be to only generate the parameters and pass just those to the ffmpeg executable.
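The "parameters only" strategy can be sketched roughly like this, assuming a small Python wrapper (the function name and flag values here are illustrative, not anything from the thread). The key point is that the argv list goes straight to the executable with no shell in between:

```python
import shlex

def build_ffmpeg_argv(llm_args: str, input_file: str, output_file: str) -> list[str]:
    # Split the LLM-generated parameter string without invoking a shell;
    # shlex.split handles quoting but performs no expansion or substitution.
    args = shlex.split(llm_args)
    # Build the full argv ourselves, so the executable name and the
    # input/output paths are never under the LLM's control.
    return ["ffmpeg", "-i", input_file, *args, output_file]

# With a list argv and shell=False (subprocess.run's default for lists),
# shell metacharacters in the generated text stay literal arguments:
# a ';' or '$(...)' reaches ffmpeg as text and never spawns a command.
argv = build_ffmpeg_argv("-vf scale=1280:-1; rm -rf ~", "in.mp4", "out.mp4")
print(argv)  # the ';' and 'rm' are just (invalid) ffmpeg arguments
# subprocess.run(argv, check=True)  # uncomment to actually execute
```

This doesn't make the command *correct*, it only ensures a malformed or malicious parameter string can fail loudly inside ffmpeg instead of being interpreted by a shell.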
This is the crux of our disagreement. It does not go the same. You have no idea what the LLM is going to write, neither does the LLM, nor the people who created the LLM.
At no point did the people who created the LLM actually think about your use-case, nor did the LLM, and there is no promise of anything you ask getting a correct, or even consistent answer. The creators don't know how the answers got there, and can't easily fix them if they're wrong. You'd be a fool to trust it for anything other than dog and pony shows.
There is an observable and testable probability that an LLM's output is correct or incorrect.
There is an observable and testable probability that code created by humans is erroneous.
There is a third category where human code is malicious.
You will need to guard against all three cases.
Guarding against possibly faulty LLM output that is passed to ffmpeg is significantly easier to achieve, through defensive prompting and simple sanitization techniques, than protecting against a malicious state actor capable of crafting sophisticated supply chain attacks that took years to develop and roll out (again, see the xz backdoor).
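As one sketch of what "simple sanitization" could mean here (this is my illustration, not a complete defense, and the allowlist contents are assumptions): only accept generated flags you have already reviewed in the ffmpeg manual, and reject everything else.

```python
import shlex

# Hypothetical allowlist: only flags we have personally reviewed.
# Deliberately excludes options that can read or write arbitrary
# files or protocols (e.g. -f, -protocol_whitelist, filter scripts).
ALLOWED_FLAGS = {"-vf", "-c:v", "-c:a", "-b:v", "-b:a", "-r", "-t", "-ss", "-crf", "-preset"}

def sanitize(llm_args: str) -> list[str]:
    """Reject any generated flag not on the allowlist.

    Values are still passed through unchecked, so the allowlist should
    only contain flags whose values cannot cause side effects.
    """
    tokens = shlex.split(llm_args)
    for tok in tokens:
        if tok.startswith("-") and tok not in ALLOWED_FLAGS:
            raise ValueError(f"disallowed flag: {tok}")
    return tokens

print(sanitize("-vf scale=640:-1 -crf 23"))  # accepted
```

An allowlist like this bounds what a faulty or adversarial completion can do, which is a far smaller problem than auditing an entire dependency tree for a hidden backdoor.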
The idea that you can blindly trust an open source project just because "their development happens collaboratively and in the open" is naive.
Some open source projects people are building their whole infrastructure on are maintained by a single developer in their spare time.
The attack surface is huge and has been exploited time and time again.
The fact that a project like Debian has signed packages and a security team doesn't guarantee that the underlying code is free of a malicious backdoor or some grave bug that creates a large attack surface.
Nobody said anything about blind trust, but that's what's being exhibited here, in trusting the output of a stochastic parrot that can't even reason.
You've _really_ shifted the goalposts here.
Do you trust _your_ ability to make what you think requires no more than "defensive prompting and simple sanitization techniques" robust against, say, "a malicious state actor that is capable of crafting sophisticated supply chain attacks"?
You know there's not just one, right? And you know if you're consuming a closed product, you can't even verify its correctness for yourself, let alone tell whether a "malicious state actor" is actually the one sending you LLM answers. You can't follow along with the development process. Its actors don't make their changes in public. You can't look back over a history of all their actions.
A "malicious state actor" would laugh at your "simple sanitization" and use logic and reason to know where your code is vulnerable and change what you think will be an ffmpeg command into something that actually probes your network, downloads all the files, encrypts them and posts you a ransom note from your own mailservers.
When both scenarios have bad actors and attack surfaces, which would you rather do:
1. Look up the ffmpeg manual, or ask a search engine and find a StackOverflow answer, or heck, even ask an LLM... but then go through the manual and _understand the command_ you're running, and what its human authors have written about its capabilities. Ensure you use the correct settings, you know what they do, and you've reasoned as to why they're correct - and put that in your script.
2. Make no attempt to understand ffmpeg. Put a command in your script that makes a network call to a proprietary service you don't control, and 100% put your faith that it _always_ returns the correct command for the same prompt - each time you run it. And that service never gets interrupted. And that service is never hacked, nor its staff compromised, nor its models poisoned, etc.
Honestly, this is as braindead as people using PHP fopen() to access URLs for files they could host locally.
EDIT: bonus question. Would you ask an LLM "please send me an ffmpeg binary for linux x86_64 that automatically splits the output from /dev/video0 into timeslices" ? If it gave you a binary back, would you run it in preference to the normal ffmpeg binary with a provenance of where it came from?
I'd say "discovered by a single dev" is not just mere chance, but the system working as designed.
- Everyone was getting the same package, so one person could warn others
- There were well-established procedures for code updates (Andres Freund _knew_ that xz was recently updated, and could go back and forth in previous versions)
- There was access to all steps of the process - git repo with commit history, binary releases, build scripts, extensive version info
None of this is true for LLMs (and only some of it is true for curl|bash, sometimes) - it's an opaque binary service for which you have no version info, no history, and everyone gets a highly customized output. Moreover, there have been documented examples of LLMs giving flawed code with security issues, and (unlike Debian!) everyone basically says "that person got unlucky, this won't happen to me" and keeps using the very same version.
So please don't compare traditional large open-source projects with LLMs - their risk profiles are very different, and LLMs are way more dangerous.