Hi, I'm the developer of crdt-richtext. I'm humbly grateful for everyone's support and quite surprised to see the level of interest this project has generated. It's still just an experimental first step for us and not mature for production yet. As some of the comments have pointed out, performance may not be the main bottleneck for adopting CRDTs anymore. The potential bottlenecks in the user path might lie in the DX and the integration with existing tool stacks. Our primary project, Loro, which is still in closed development, aims to address these issues in the near future. We look forward to sharing updates with you as we progress.
I'm happy to see even further performance improvements towards rich-text CRDTs but at this point, I think the barrier to adoption isn't speed or compactness but instead integration with existing backends and databases.
I have a hunch that we're reinventing the wheel by creating new B-Tree implementations in Rust when we could be figuring out how to make an existing database do the hard work of storing and retrieving characters in the correct order. I know Martin Kleppmann has looked at Datalog being a potential solution to this problem but until we have a good full-stack solution to collaborative text editing, I don't think we'll see major adoption of these CRDT solutions.
I agree that full stack support is the missing pice that's need to make use explode. But I do think the current implementations will get there.
CRDTs are fairly unique in that you need them to be exposed very close to the front of your stack in order to capture user intent, but you also need support further back in your stack for merging, replication, and querying.
We have good front end support and there are multiple exciting projects building collaboration servers (PartyKit in particular looks awesome) for them. What I think is missing is support in database, that's what I've been experimenting with (below). If you are building an offline enabled app, having the ability to generate diffs and merge in database enables easy multi document sync.
Also most general purpose CRDTs are a combination of JSON and XML like data structures, it's useful to be able to query the structures in your database. For example if you build a notes app that supports inline tags, it's useful to be able to query and index those from within the XML like structure without having to dump the whole thing out at another layer of your stack.
Essentially, you need to capture user intent in the CRDT on the front end, but then have support through your whole backend.
But, in general, I think searching in and across documents and database storage is going to a thorny problem even with existing, hyper-optimized CRDT algorithms. For example, if you just store your Yjs document as a binary blob in a Postgres column, Postgres becomes the bottleneck and all the fancy range-tree or b-tree optimizations are no longer that helpful on the server. Plus, it'd be great to have an document edit just be a tiny insert into the database that can be streamed to other clients rather than a big row update.
This may because the focus of CRDTs seems to be rooted in a fully decentralized, P2P system but many developers want collaborative text editing in a more traditional client-server model.
Hey Matt, thanks. I will drop my thoughts into that thread tomorrow. Cr-SQLite is awesome, I found that thread a few weeks ago and actually had it on my list to get in touch.
> we're reinventing the wheel by creating new B-Tree implementations in Rust when we could be figuring out how to make an existing database do the hard work of storing and retrieving characters in the correct order
That is liable to produce a significantly less efficient result in both computational complexity and storage requirements. The b-tree implementations used in existing databases are optimised for managing data in fixed(ish) size pages which I don't think matches the CRDT use case.
Heck, there are projects looking at whether current implementations in DBs which were designed when spinning disks were the best online storage we had (or all but the very monied could afford in significant size) are sub-optimal enough in our current RAM-is-relatively-cheap and random-access-IO-is-massively-less-latent-than-it-used-to-be world that the compatibility hump might be worth making significant changes to pages sizes and other considerations in those structures for even the DB work they are designed for.
Yeah, I think a good step would be to integrate a rich CRDT into sqlite, whether that's through CTEs, a custom extension, or library code that manages the tables.
Layering a CRDT on top of an existing schema might not be very efficient though. You'd likely need to modify the db's internals. I have no evidence for this statement though :)
Diamond types author here! Congratulations on getting your crdt working! It’s lovely to see a new generation of CRDTs which have decent performance.
And nice stuff implementing peritext! I’d love to do the same in diamond types at some point. You beat me to it!
Im building a little repository of real world collaborative editing traces to use when benchmarking, comparing and optimising text based CRDTs[1]. The automerge-perf editing trace isn’t enough on its own. And we’re increasingly converging on a format for multi user concurrent editing traces too[2]. It’d be great to add some rich text editing traces in the mix if you’re interested in recording something, so we can also compare how peritext performs in different systems.
Anyway, welcome to the community! Love to have more implementations around!
Hi! I really like your work, and I've learned a lot from your code ;)
I totally agree that we need more real-world datasets to make better benchmarks and avoid 'overfitting'. It'd be nice to have benchmarks for rich text.
I've also noticed that certain scenarios, like drawing, table editing, and whiteboard editing, aren't well covered. What are your thoughts on this?
Thanks!! I was a bit disappointed diamond types isn't in the table of benchmarking results. Mind if I send you a PR adding it?
I also think it'd be nice to have a native rust benchmarking suite for this stuff. All of the CRDT libraries are either in rust or are moving to rust. Some benchmarks with criterion would be nice to see.
> I've also noticed that certain scenarios, like drawing, table editing, and whiteboard editing, aren't well covered. What are your thoughts on this?
Thats been on my todo list for awhile. I love performance optimization but you can't optimize what you can't measure. I'm just not sure where to get editing traces at the moment given not much software works like this. We could reverse engineer figma enough to record editing traces out of it (or use its plugin API?). Or record some traces from tldraw. Ideally we'd have a few traces from different applications - since I wouldn't be surprised if different apps depend on this stuff differently.
Are you intending to add JSON-style editing support to your CRDT?
If anyone reading this is working on a collaborative app, and wants your app to be exceptionally well supported by CRDTs - please get in touch and we can talk about getting some editing traces. We can't properly optimize collaborative editing systems without them!
PRs are welcome! I wasn't aware that Diamond types also provided a WASM version. I've included it in the native benchmark with Loro though. You can see it here: https://www.loro.dev/docs/performance/native
> Are you intending to add JSON-style editing support to your CRDT?
Loro is another CRDT library I'm working on. It's a superset of crdt-richtext and supports JSON-style editing.
> We can't properly optimize collaborative editing systems without them!
Really interested to see how this turns out. In addition to solving the number of problems that CRDTs solve when it comes to text, I would like to see examples of integrating a CRDT system that doesn't solely have text as its stored asset.
For example, say you're storing events alongside your CRDT datastructure (the text). Assuming the events can be blindly merged, what does a P2P sync look like? If it's something that's handled by the CRDT framework (as it looks like Loro will do), do you have to maintain two different sync processes?
Simple CRDTs are surprisingly simple. If you don't need text the complexity drops off quite quickly.
I just ripped yjs out of a personal project and replaced it with something I wrote from scratch. Getting exactly the semantics I want is nice. For example it's easier to make compound properties update atomically. A good place to get started is one of the figma founders' blogs: https://madebyevan.com/algos/crdt-fractional-indexing/
You'll notice if you use Figma that it's mostly last-writer-wins fields. That seems to often turn out to be the ideal UI and it's the simplest to implement. If you the text to green and I set it to red the right answer isn't yellow.
Sure, there's plenty of complexity there too. For simplicity, I was just assuming timestamps are fine, and we're not concerned about apparent concurrent events or incorrect timestamps from a host.
Your crdt library should be maintaining some kind of logical timestamp. I'd put that in your event and use it to order. That way you get a consistent view of time.
(For example yjs calls this a state vector. Might also be called a Lamport timestamp or a vector clock.)
Naively, literally logs/timeseries events (i.e. stuff that can literally be inserted and queried in order of their timestamp).
This scenario can be made increasingly complex as you add user interaction, such as a settings JSON blob that you probably do want to intelligently sync.
----
In general, particularly for the timeseries case, I just want to know what the intended P2P flow looks like for different datatypes, including those that probably aren't managed by the CRDT algorithms. It seems like these systems beeline for the hard text case (which makes sense, because it's the interesting and challenging problem), but forget that in the real world, you need to sync more than just text.
I'm hoping to find a literature review that situates CRDT text algorithms along with text differencing and source control. Are there some key "contours" where certain algorithms shine with regards to (a) number of concurrent edits; (b) length of source text; (c) length of changes; (d) time difference (is this a reasonable metric?) between synchronization; (e) reconciliation / merging / verification.
Apologies if this sounds like a professor without funding asking for free work. I can assure you I am not a professor, but it is true that I have no funding.
This looks super interesting, always excited for a new CRDT implementation.
I'm slightly confused by the benchmarks, they look incredible, somewhat surprised by its performance over Yjs/Yrs and Automerge. Going to have to dig in more to understand.
Has anybody done any work or research integrating CRDT-ish text conflict resolution with a database transaction process? E.g. in MVCC when detection of version conflict (e.g. via timestamps) happens, the approach is generally to abort and retry; but if the conflict in the tuple/row is at textual level and can be resolved by this kind of algorithm, it seems this could handle some conflict resolution without full retry or abort?
I am working toward that goal of integrating text diffing or CRDT diffing into a database.
I wanted extremely high performance multimaster scalability of writes with linear scalability and semiautomatic text diffing and merging. I think this calls for an eventually consistent architecture.
The idea is to implement the GUI so that diffs are exposed to the user if there is a conflict but otherwise use the Myers algorithm.
Not quite the answer to your question, but there are quite a few people looking at doing eventually consistent database syncing with CRDTs. Two interesting ones are:
I've been experimenting with Yjs and merging documents in database via SQL, for both SQLite and Postgres. And allowing you to query the contents of those documents, similar to how you can query JSON. Early days, these are just tests:
I hadn't heard of Electric-SQL before. Was potentially interested, but the lack of native language bindings is a dealbreaker. I don't particularly like a centralized server being required for the sync either, but it's ok.
VLCN is very interesting to me, and I really need to play around with it. It's not incredibly obvious to me how these extensions to SQLite play out in actual use.
> Peritext is a novel rich-text CRDT algorithm. It can merge concurrent edits of rich text without losing the intended edits.
The page lost some foundational credibility with me by making this claim without immediately following it [1] with citations, clarifications, or caveats.
Why? Because "intention" is a very big tent. At the very least, the project would benefit by characterizing the mapping between intention and edits.
Perhaps this project's notion of intention could be summarized very compactly. Let's see what we find and share it here?
P.S. I'm feeling like philosophy, which clarifies so many concepts -- or at least explodes them by asking a bunch of specific questions -- is more and more relevant when it comes to designing computational systems. The idea of intention is so key to so many human interactions. It would be a travesty for computer scientists and software engineers to gloss over its impacts.
In reading your comment, I don’t see the connection to my concerns about intent. This leaves open the question of “is it worth clicking?”
In this case it was, so thank you. Here is the quote:
> Before we can implement an algorithm to perform this automatic merge, we first need to define a clear model for user intent, and define the expected results when concurrent edits are merged. In this section we develop such a model through a series of examples.
I only go to the trouble of mentioning this to help remind people that it is a win-win to make a clear connection when commenting. In this case, simply saying “This link talks about intent in detail” would have worked.
In this case, and I'm not joking, my wife called me through as she had just put dinner was on the table, clearly that took priority. So I hit send before adding any more context or comment, which I would otherwise have done.
I'm pleased you clicked, that page is such a brilliant overview of CRDTs, rich text editing with them and capturing intent.
The specific definition of user intent in the context of concurrent rich text editing can't be clearly explained in a few words. it's best understood through specific examples. I've provided some examples in my brief introduction to Peritext.
There's a link to everything in the blog post, including crates.io in the sentence it talks about the crate the post is about, but no link to the crate itself :/
So both Yjs and Automerge support rich text types and JSON like structures. They also have bindings to various rich text editor frameworks such a ProseMirror. It's completely possible to build an editor that combines rich text and typographic tables.
Your reference to "structured data" however makes me thing your referring to data tables. Again with both Yjs and Automerge these can be represented as just a list of maps. So you have one CRDT document that combines a rich text section and the data section, or even nests them.
It would even be possible with Prosemirror, for example, to build a rich text and spreadsheet like editor backed by CRDTs.
Their site points to another repo that gives a 404 and the you linked to says its a subset of the larger project. It says the larger project isn't open source yet, which I guess is the answer.