Bit-Stealing Made Legal

greatfilter251 · on July 26, 2023

> In this article, we present a new approach to specify the data layout of rich data types based on a dual view: > a source type, providing a high-level description available in the rest of the code, along with a memory type, > providing full control over the memory layout.

Oh wow, I've been thinking along these exact lines for a few years; looks like I should have acted sooner! Not that I have any realistic route to be published, arxiv notwithstanding, so it's exciting to see the concept taken up institutionally, and in prestigious ones at that.

I have elsewhere called the concept "programmable pattern matching": when you declare an ADT constructor/variant, you are essentially simultaneously defining a function and a pattern. We already understand how to make functions with custom logic (duh) so the novel contribution is to give the pattern custom logic. And indeed, one possible implementation, which seems very different from the ideas pursued by the article, is to define a function whose signature matches that of the ADT's destructor a la dependently-typed languages (e.g. Lean), except the input is an integer/bitvector of some size instead of an element of the ADT.

To get a better sense of why this is a very exciting idea, check out the development of niche support in Rust enums over the years (e.g. sizeof::<Option<NonNullU8>>() == sizeof::<u8>(), sizeof::<Option<&T>>() == sizeof::<&T>()). The basic concept is there: we want to simultaneously enjoy ADTs which are the best type-safety/memory-safety invention since the invention of types themselves, while enabling every possible trick to save memory as cache lines (or, indeed, elements in the vector/SIMD register file) become a more and more valuable commodity for performance.

eamann · on July 26, 2023

Should be more clear this is a direct PDF download ...