This depends a lot on what you're doing with LLVM. If you are just using LLVM as a code generation backend for your language frontend, you generally do not need an LLVM fork.
For example, while Rust does have an LLVM fork, it just exists for tighter control over backports, not because there are any modifications to LLVM.
Is it? I think Rust is a great showcase for why it isn't. Of course it depends somewhat on your compiler implementation approach, but actual codegen-to-LLVM tends to only be a tiny part of the compiler, and it is not particularly hard to replace it with codegen-to-something-else if you so desire. Which is why there is now codegen_cranelift, codegen_gcc, etc.
The main "vendor lock-in" LLVM has is if you are exposing the tens of thousands of vendor SIMD intrinsics, but I think that's inherent to the problem space.
Of course, whether you're going to find another codegen backend (or are willing to write one yourself) that provides similar capabilities to LLVM is another question...
> You bootstrap extra fast, you get all sorts of optimization passes and platforms for free, but you lose out on the ability to tune the final optimization passes and performance of the linking stages.
You can tune the pass pipeline when using LLVM. If your language is C/C++-like, the default pipeline is good enough that many such users of LLVM don't bother, but languages that differ more substantially will usually use fully custom pipelines.
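For illustration, here is roughly what driving a custom pipeline looks like through the C API (a minimal sketch, assuming a recent LLVM where the new pass manager is exposed via LLVMRunPasses; the pass list itself is arbitrary, not a recommendation):

    /* Sketch: parse a tiny module and run a hand-picked pass pipeline over it. */
    #include <stdio.h>
    #include <string.h>
    #include <llvm-c/Core.h>
    #include <llvm-c/Error.h>
    #include <llvm-c/IRReader.h>
    #include <llvm-c/Transforms/PassBuilder.h>

    int main(void) {
        const char *ir =
            "define i32 @square(i32 %x) {\n"
            "  %y = mul i32 %x, %x\n"
            "  ret i32 %y\n"
            "}\n";

        LLVMContextRef ctx = LLVMContextCreate();
        LLVMMemoryBufferRef buf = LLVMCreateMemoryBufferWithMemoryRangeCopy(
            ir, strlen(ir), "example");
        LLVMModuleRef mod;
        char *parse_err = NULL;
        if (LLVMParseIRInContext(ctx, buf, &mod, &parse_err)) {
            fprintf(stderr, "parse error: %s\n", parse_err);
            return 1;
        }

        /* Same pipeline syntax as `opt -passes=...`; a language that isn't
           C-like might want a very different ordering here. */
        LLVMPassBuilderOptionsRef opts = LLVMCreatePassBuilderOptions();
        LLVMErrorRef err = LLVMRunPasses(mod, "mem2reg,instcombine,simplifycfg",
                                         /*TargetMachine*/ NULL, opts);
        if (err) {
            char *msg = LLVMGetErrorMessage(err);
            fprintf(stderr, "pass error: %s\n", msg);
            LLVMDisposeErrorMessage(msg);
        }

        LLVMDumpModule(mod);
        LLVMDisposePassBuilderOptions(opts);
        LLVMDisposeModule(mod);
        LLVMContextDispose(ctx);
        return 0;
    }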
> I think we'll see cranelift take off in Rust quite soon, though I also think it wouldn't be the juggernaut of a language if they hadn't stuck with LLVM those early years.
I'd expect that most (compiled) languages do well to start with an LLVM backend. Having a tightly integrated custom backend can certainly be worthwhile (and Go is a great success story in that space), but it's usually not the defining feature of the language, and there is a great opportunity cost to implementing one.
It is a “trap” in the sense that it limits the design space for your language by forcing some of the choices implicit in C++. Rust went with slow build times like C++ and so was not held back in this dimension by LLVM.
Nothing about LLVM is a trap for C++ as that is what it was designed for.
At least at a skim, what this specifies for exposure/synthesis on reads/writes of the object representation is concerning. One of the consequences is that dead integer loads cannot be eliminated, as they may have an exposure side effect. I guess C might be able to get away with it due to the interaction with strict aliasing rules. Still, I'm quite surprised that they are going against consensus here (and this reduces the likelihood that these semantics will get adopted by implementers).
That type punning through memory does not expose or synthesize provenance. There are some possible variations on this, but the most straightforward is that pointer-to-integer transmutes just return the address (without exposure) and integer-to-pointer transmutes return a pointer with nullary provenance.
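To make that concrete, a sketch (the comments describe the intended semantics, not what any current compiler promises):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        int x = 42;
        int *p = &x;

        /* Explicit cast: exposes the address of x. */
        uintptr_t a = (uintptr_t)p;

        /* Type punning through memory (a transmute): same numeric value,
           but under this model it does NOT expose x. */
        uintptr_t b;
        memcpy(&b, &p, sizeof b);

        /* Integer-to-pointer cast may reacquire x's provenance, because x
           was exposed via the cast above. */
        int *q = (int *)a;
        printf("%d\n", *q);

        /* Transmuting an integer back into a pointer yields a pointer with
           nullary provenance: numerically the right address, but
           dereferencing it would be UB under this model. */
        int *r;
        memcpy(&r, &b, sizeof r);
        (void)r; /* deliberately not dereferenced */

        return 0;
    }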
> I guess C might be able to get away with it due to the interaction with strict aliasing rules.
But not for char-typed accesses. And even for larger types, I think you would have to worry about the combo of first memcpying from pointer-typed memory to integer-typed memory, then loading the integer. If you eliminate dead integer loads, then you would have to not eliminate the memcpy.
That's a great point. I initially thought we could assume no exposure for loads with non-pointer-compatible TBAA, but you are right that this is not correct if the memory has been laundered through memcpy.
I don't imagine that the exposed state would need to be represented in the final compiler output, so the optimiser could mark the pointer as exposed, but still eliminate the dead integer load.
Or from a pragmatic viewpoint, perhaps if the optimiser eliminates a dead load, then don't mark the pointer as exposed? After all, the whole point is to keep track of whether a synthesised pointer might potentially refer to the exposed pointer's storage. There's zero danger of that happening if the integer load never actually occurs.
I guess the internal exposure state would be “wrong” if the compiler removes the dead load (e.g. in a pass that runs before provenance analysis).
However, if all of the program paths from that point onward behave the same as if the pointer was marked as exposed, that would be fine. It’s only “wrong” to track the incorrect abstract machine state when that would lead to a different behaviour in the abstract machine.
In that sense I suppose it’s no different from things like removing a variable initialisation if the variable is never used. That also has a side effect in the abstract machine, but it can still be optimised out if that abstract machine side effect is not observable.
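For reference, the scenario we're going back and forth about, sketched out (a real optimizer would of course collapse all of this):

    #include <stdint.h>
    #include <string.h>

    int f(void) {
        int x = 1;
        int *p = &x;

        /* Launder the pointer's bytes into integer-typed storage... */
        uintptr_t rep;
        memcpy(&rep, &p, sizeof rep);

        /* ...then an integer load whose result is never used. Under the
           proposed semantics this read of the representation still counts
           as exposing x. The question is whether the optimizer may delete
           it anyway: either by conservatively keeping x marked as exposed,
           or by observing that no synthesized pointer can tell the
           difference on any path from here. */
        uintptr_t dead = rep;
        (void)dead;

        return x;
    }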
(Never mind, I misread your comment at first.) Yes, the representation access needs to be discussed... I took a couple of years to publish this document. More important would be whether the ptr2int exposure can actually be implemented.
One peculiar thing about the benchmark results is that disabling individual UB seems to fairly consistently reduce performance without LTO, but improve it with LTO. I could see how the UB may be less useful with LTO, but it's not obvious to me why reducing UB would actually help LTO. As far as I can tell, the paper does not attempt to explain this effect.
Another interesting thing is that there is clearly synergy between different UB. For the LTO results, disabling each individual UB seems to be either neutral or an improvement, but if you disable all of them at once, then you get a significant regression.
They do a bit more than that. One of the options (-disable-object-based-analysis under AA2) disables the assumption that distinct identified objects do not alias, which is disabling pointer provenance in at least one key place.
So I think this option very roughly approximates a kind of "no provenance, but with address non-determinism" model, which still permits optimizations like SROA on non-escaping objects.
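For example, something like this still gets optimized under that model (a sketch of mine, not an example from the paper):

    /* tmp's address never escapes, so SROA can split it into scalars and the
       memory traffic disappears; that only needs "nobody else can have this
       address", not full provenance-based alias analysis. */
    struct pair { int a, b; };

    int sum(int x, int y) {
        struct pair tmp;
        tmp.a = x;
        tmp.b = y;
        return tmp.a + tmp.b;   /* folds to x + y */
    }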
Fun fact: GCC decided to adopt Clang's (old) behavior at the same time Clang decided to adopt GCC's (old) behavior.
So now you have this matrix of behaviors:
* Old GCC: Initializes whole union.
* New GCC: Initializes first member only.
* Old Clang: Initializes first member only.
* New Clang: Initializes whole union.
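Concretely, the case the matrix is about looks something like this (sketch; the member sizes are just typical values):

    #include <stddef.h>
    #include <stdio.h>

    union u {
        char c;   /* first member, 1 byte */
        long l;   /* largest member, typically 8 bytes */
    };

    int main(void) {
        union u v = { 0 };   /* initializer only covers .c */

        /* "First member only": the bytes beyond .c may be indeterminate.
           "Whole union": every byte of v is zero. */
        const unsigned char *bytes = (const unsigned char *)&v;
        for (size_t i = 0; i < sizeof v; i++)
            printf("%02x ", bytes[i]);
        printf("\n");
        return 0;
    }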
And it shows a deeper problem: even though they are willing to align behavior with each other, they failed to communicate and discuss what the best approach would be. That's a bit tragic, IMO.
I would argue the even deeper problem is that it's implementation-defined. It should be in the spec, and they should conform to the spec. That's why I'm so paranoid and zeroize things myself. Too much hassle to remember what is or isn't zero.
I wouldn't depend on that too much either though, or at least not depend on padding bytes being zeroed. The compiler is free to replace the memset call with code that only zeroes the struct members, but leaves junk in the padding bytes (and the same is true when copying/assigning a struct).
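Roughly this (sketch; the padding layout is typical, not guaranteed):

    #include <string.h>

    struct s {
        char c;   /* typically followed by padding bytes */
        long l;
    };

    struct s make(long l) {
        struct s out;
        memset(&out, 0, sizeof out);  /* padding is zero at this point... */
        out.c = 'x';
        out.l = l;
        return out;  /* ...but the copy out of the function only has to
                        preserve the members, and the compiler may also turn
                        the memset into per-member stores, so the caller can
                        still see junk in the padding bytes. */
    }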
Since having multiple compilers is often touted as an advantage, how often do situations like the one you're describing happen, compared to the opposite: a second compiler surfacing bugs in one's application or in the other compiler?
Where possible, undefined behavior at the LLVM IR level is always optional, controlled by various flags, attributes and metadata. Things like signed integer overflow, or even the forward progress guarantee are controlled that way, and admit a wide range of possible language semantics.
Provenance (and to some degree the entire memory model) is the big exception to this. It's a very core part of the IR semantics, and an optimizing compiler without provenance would have to use an entirely different methodology to justify many optimizations. I don't believe that anyone has yet been able to demonstrate the feasibility of a highly optimizing compiler without provenance.
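As a concrete example of the first point (sketch; the flags are clang's, and other frontends make their own per-operation choices):

    /* With default flags clang emits `add nsw`, signed overflow is UB, and
       this can fold to 1. With -fwrapv it emits a plain wrapping add and the
       comparison has to stay, since x + 1 may wrap to INT_MIN. */
    int never_smaller(int x) {
        return x + 1 > x;
    }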
Huh, this is interesting. Normally the reason to become a CNA is to reduce the amount of bogus CVEs that are issued for your project due to security researchers trying to pad their portfolio.
Linux seems to have taken the reverse approach, by just filing their own bogus CVEs instead. One for every bug fix going into the kernel, rendering the CVE system useless.
> One for every bug fix going into the kernel, rendering the CVE system useless.
That is not what they're doing at all: CVEs are assigned by a small three-person committee that judges whether a fix may reasonably have security impact.
If this is rendering CVEs useless to you, then you were misusing CVEs to begin with. CVEs are identifiers. The fact that an identifier is assigned does not mean anything about whether the security issue is real and/or its severity. Assigning an ID was meant to help discuss things, including determining whether it is in fact a security issue.
k so I'm writing a blog post on the whole #Linux Kernel #CVE/#CNA thing. And I actually looked at the data. For those of you complaining about the Linux Kernel issuing improper CVEs my response is "cool. If they're not security vulns get them rejected".
So far in 2024 the Linux Kernel error rate is 3.21%.
Is that bad or good?
Let's compare to the top 25 CNA's by error rate for 2024:
f5 49.32%
atlassian 44.44%
Esri 43.75%
freebsd 40.00%
canonical 32.61%
Gallagher 25.00%
SNPS 25.00%
intel 19.74%
Anolis 18.75%
Dragos 18.18%
rapid7 14.29%
@huntr_ai 12.27%
Google 10.00%
directcyber 8.33%
CERTVDE 8.11%
Go 7.69%
lenovo 6.25%
mitre 5.53%
schneider 4.35%
GitHub_P 4.35%
Fluid Attacks 4.35%
Wordfence 3.56%
Linux 3.21%
snyk 2.94%
So... Linux is in at 24th place for error rate. But wait, surely those numbers are skewed towards some smaller CNAs that reject a handful of issues driving up their error rate?
Nope. Several of the mature CNAs like F5, Atlassian, Canonical, Google, Intel, Red Hat, Lenovo, MITRE all issue tens to hundreds to thousands of CVEs a year and have much higher error rates. Actually the worst CNA by raw numbers is MITRE (159).
Spamming this multiple times since people don't seem to read.