We use Textual as an interactive way to explore compilation pipelines in xDSL [0]. As our compiler is written in Python, it was the perfect tool for building a UI in the same language as the existing codebase. After starting the `xdsl-gui` project, we found Marimo [1], a reactive notebook for Python that also lets users build apps in Python. It's been interesting to compare the two, especially in how they handle state and propagate updates. For now we're using both tools, but it might make sense to centralise at some point. Both frameworks favour immutable data structures, which I find is a positive incentive to use immutability throughout the code, and that has been good for the rest of the project.
Here's a keynote talk by Andrea Liu, the lead of the project; it's a much better resource on one of the most exciting things going on in ML right now:
This is basically my PhD thesis proposal. I don't think there's any fundamental technological problem here; it's just that for a graph to be efficient to process, you need high-level optimisations that can take the mathematical properties of graphs into account. For that you need to either reimplement a compiler inside your framework or integrate with an existing compiler, and both obviously take a lot of work.
Some comments here mention GraphBLAS, which is the big breakthrough in decoupling the layout of the graph from an efficient implementation of an algorithm, but none mention MLIR-GraphBLAS [0] which is the most promising integration into a compiler that I've seen.
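To make the decoupling concrete, here's a minimal sketch of the core GraphBLAS idea, written with scipy rather than GraphBLAS itself (the graph and node names are made up for illustration): a traversal is expressed as sparse linear algebra, so the algorithm never touches the adjacency matrix's storage layout. BFS levels fall out of repeated sparse matrix-vector products.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Adjacency matrix of a tiny directed graph: 0->1, 1->2, 0->3.
rows, cols = [0, 1, 0], [1, 2, 3]
A = csr_matrix((np.ones(3), (rows, cols)), shape=(4, 4))

level = np.full(4, -1)
level[0] = 0                       # BFS from node 0
frontier = np.zeros(4)
frontier[0] = 1.0

depth = 0
while frontier.any():
    depth += 1
    # One sparse mat-vec step: unvisited nodes with an in-edge
    # from the current frontier.
    reached = (A.T.dot(frontier) > 0) & (level == -1)
    level[reached] = depth
    frontier = reached.astype(float)

print(level.tolist())  # → [0, 1, 2, 1]
```

The point is that swapping `csr_matrix` for any other sparse (or dense) layout leaves the algorithm untouched; in GraphBLAS proper, the semiring is also a parameter, which is what MLIR-GraphBLAS aims to optimise at the compiler level.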
I still think it's early days; I wouldn't throw in the towel quite yet :)
How can I have any expectations of run-time behavior if I have to hope that my graph will fuse, or fail to fuse, at run time?
Reminds me of the issues Haskell programmers face when an innocuous change causes list fusion to fail, tanking performance. To know how to coax the compiler to fuse again, you need intimate knowledge of how that fusion process works, which isn't visible in the API; you need knowledge of the compiler's implementation and behavior.
The same could be said of a lot of things. For example, with hash maps, you might hit a performance cliff if your hash function is a poor fit for your data distribution, but you'll still happily use the default one until you're really sure it's not the right tool for the job. I also feel this depends a lot on language philosophy: some languages generally accept the cliffs, while others try to expose enough of an API (custom allocators, etc.) for you to work around a cliff when you know what you're doing.
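As a hypothetical illustration of the hash-map cliff (the class names here are made up): the exact same dict API behaves very differently when a key's `__hash__` collapses everything into one bucket, degrading lookups from expected O(1) to O(n).

```python
import time

class GoodKey:
    """Key with a well-distributed hash (delegates to int's hash)."""
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return hash(self.n)
    def __eq__(self, other):
        return self.n == other.n

class BadKey(GoodKey):
    """Same key, but every instance lands in one bucket."""
    def __hash__(self):
        return 0

def build(key_type, n=2000):
    # Insert n keys and time it; with BadKey each insert must
    # compare against every existing key, so this is O(n^2).
    start = time.perf_counter()
    d = {key_type(i): i for i in range(n)}
    return d, time.perf_counter() - start

_, t_good = build(GoodKey)
_, t_bad = build(BadKey)
print(t_bad > t_good)  # → True: same code, wildly different cost
```

No API surface changed between the two runs; the cliff lives entirely in a property of the data (or here, the key type) that the interface doesn't advertise.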
I have some personal hunches about how to get better guarantees for these properties, but I feel it's OK for this not to be solved in v1.
I work on a reimplementation of MLIR in Python called xDSL, and ported part of MLIR's Toy tutorial. You can try it interactively online at xdsl.dev/notebooks, or by cloning the GitHub repo and running the notebooks locally: github.com/xdslproject/xdsl
I'm fairly certain that this is false, and I'm working on proving it. In the cases Numba is optimised for, it's already faster than plausible C++ implementations of the same kernels.