
>The solution is to treat these two concepts separately. One subsystem should be used for representing the source of truth, and another should be used for materializing any number of indexed stores off of that source of truth. Once again, this is event sourcing plus materialized views.

At work we decouple the read model from the write model: the write model ("source of truth") consists of traditional relational domain models with invariants/constraints and all (which, I think, is not difficult to reason about for most devs who are already used to ORMs), and almost every command also produces an event which is published to the shared domain event queue(s). The read model(s) are constructed by workers consuming events and building views however they see fit (and they can be rebuilt, too). For example, we have a service which manages users (the "source of truth" service), and another service which is just a view service (backing a complex UI) that builds its own read model/index from the events of the user service (and other services). Without it, we'd have tons of joins or slow cross-service API calls.
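Roughly, one of those view-building workers looks like this (a sketch in Go; the event fields and table names are made up for illustration, not our real schema):

    package views

    import (
        "context"
        "database/sql"
        "time"
    )

    // Hypothetical integration event published by the user service.
    type UserUpdated struct {
        UserID    int64
        Name      string
        CompanyID int64
        At        time.Time
    }

    // Upserts a denormalized row into the view service's own local MySQL table,
    // so the UI can read it without joins or cross-service API calls.
    func handleUserUpdated(ctx context.Context, db *sql.DB, e UserUpdated) error {
        _, err := db.ExecContext(ctx, `
            INSERT INTO user_view (user_id, name, company_id, updated_at)
            VALUES (?, ?, ?, ?)
            ON DUPLICATE KEY UPDATE
                name = VALUES(name), company_id = VALUES(company_id), updated_at = VALUES(updated_at)`,
            e.UserID, e.Name, e.CompanyID, e.At)
        return err
    }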

Technically we can replay events (in fact, we once did it accidentally due to a bug in our platform code, when we started replaying ALL events for the last 3 years), but I don't think we've ever really needed it. Sometimes we need to rebuild views due to bugs, but we usually do it programmatically in an ad hoc manner (special scripts, or a SQL migration). I don't know what our architecture is properly called (I've never heard anyone call it "event sourcing").

It's just good old MySQL + RabbitMQ and a bit of glue on top (although not super-trivial to do properly, I admit: things like transactional outboxes, an at-least-once delivery guarantee, eventual consistency, maintaining correct event processing order, event data batching, DB management, what to do if an event handler crashes, etc.). So I wonder what we're missing with this setup without Rama, what problems it solves and how (from the list above), provided we already have our battle-tested setup and it's language-agnostic (we have producers/consumers in both PHP and Go), while Rama seems to be more geared towards Java.



Sounds like you've engineered a great way to manage complexity while using an RDBMS. A few things that Rama provides above this:

* Rama's indexing is much more flexible. For example, if you need a nested set with 100M elements, that's trivial. An index like that is common for a social graph (user ID -> set of follower IDs). If you need a time-series index split by granularity, that's equally trivial (entity -> granularity -> time bucket -> stat).

* There are no restrictions on data types stored in Rama.

* Rama queries are exceptionally powerful. Real-time, on-demand, distributed queries across any or all of your indexes are trivial.

* Rama has deep and detailed telemetry built in across all aspects of an application. This doesn't need to be separately built/managed.

* Deployment is also built-in. With your approach, an application update may span multiple systems – e.g. worker code, schema migrations – and this can be a non-trivial engineering task especially if you want zero downtime. Since Rama integrates computation and storage end-to-end, application launches, updates, and scaling are all just one-liners at the terminal.

* Rama is much more scalable.

This is looking at Rama from a feature point of view, and it's harder to express how much of a difference the lack of impedance mismatches makes when coding with Rama. That's something you learn through usage.

Rama is for the JVM, so any JVM language can be used with it. Currently we expose Java and Clojure APIs.


So does the “command” (say, update customer address) perform the SQL, with some RDBMS trigger then sending the event onto RabbitMQ, or is it an ORM that sends the SQL and also posts to RabbitMQ?

Plus where do you store events, in what format?

Tell me more please :-)

What you are missing is a cool name for the whole ecosystem


We explicitly generate events (in pseudocode):

   this.eventDispatcher.dispatch(new SomethingChanged())
It can be anything: just a property change ("AddressChanged"), or something more abstract (i.e. part of business logic), such as "UserBanned".

Internally, the dispatcher serializes the event as JSON (because of custom event-specific payloads) and stores it in a special SQL table in the same local DB of the service as the original model. Both the original model and the event are committed as part of the same unit of work (transaction) -- so the operation is atomic (we guarantee that the model and the event are always stored together, and in case of a failure everything is rolled back together). This is the "transactional outbox" pattern. It's required because just pushing to RabbitMQ directly does not guarantee atomicity of the operation and previously resulted in various nasty data corruption bugs.
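In Go terms the outbox write looks roughly like this (a sketch; the table and event names are illustrative, not our actual schema):

    package users

    import (
        "context"
        "database/sql"
        "encoding/json"
    )

    // The domain row and the serialized event are written in the same local-DB
    // transaction, so either both are committed or neither is.
    func changeAddress(ctx context.Context, db *sql.DB, userID int64, addr string) error {
        tx, err := db.BeginTx(ctx, nil)
        if err != nil {
            return err
        }
        defer tx.Rollback() // safe to call after a successful Commit

        if _, err := tx.ExecContext(ctx,
            `UPDATE users SET address = ? WHERE id = ?`, addr, userID); err != nil {
            return err
        }

        payload, err := json.Marshal(map[string]any{"user_id": userID, "address": addr})
        if err != nil {
            return err
        }
        if _, err := tx.ExecContext(ctx,
            `INSERT INTO outbox (event_type, payload, created_at) VALUES (?, ?, NOW())`,
            "AddressChanged", payload); err != nil {
            return err
        }

        return tx.Commit()
    }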

Each service has a worker which finds newly committed events in the local DB and publishes them to RabbitMQ (whose exchanges are globally visible to all services). Consumers (other services/apps) subscribe to the integration events they are interested in and react to them. There are two types of handlers: some do actual business logic in response to events; others (like view services) just fill their view tables/indexes for faster retrieval from their own local DB, which solves some of the problems listed in the OP.
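The relay worker is conceptually something like this (again a sketch with made-up names; the real one does batching, error handling, etc.):

    package relay

    import (
        "context"
        "database/sql"

        amqp "github.com/rabbitmq/amqp091-go"
    )

    // Poll the outbox for rows that haven't been published yet, publish them to
    // a RabbitMQ exchange, then mark them as published. If marking fails after a
    // publish, the event is simply published again -- hence "at least once".
    func relayOnce(ctx context.Context, db *sql.DB, ch *amqp.Channel) error {
        rows, err := db.QueryContext(ctx,
            `SELECT id, event_type, payload FROM outbox WHERE published_at IS NULL ORDER BY id LIMIT 100`)
        if err != nil {
            return err
        }
        defer rows.Close()

        for rows.Next() {
            var id int64
            var eventType string
            var payload []byte
            if err := rows.Scan(&id, &eventType, &payload); err != nil {
                return err
            }
            // Routing key = event type, so consumers bind only to what they care about.
            if err := ch.PublishWithContext(ctx, "domain-events", eventType, false, false,
                amqp.Publishing{ContentType: "application/json", Body: payload}); err != nil {
                return err
            }
            if _, err := db.ExecContext(ctx,
                `UPDATE outbox SET published_at = NOW() WHERE id = ?`, id); err != nil {
                return err
            }
        }
        return rows.Err()
    }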

In our services, the order of processing events is very important (to avoid accumulation of errors), so if an event handler crashes, the event is retried indefinitely until it succeeds (we can't skip a failed event because later events may expect the data to be in a certain state, so skipping would introduce inconsistencies). When a failed event gets "stuck" in the queue (we keep trying to re-process the same event over and over), on-call engineers have to apply hotfixes. Due to the retries, we also have an "at least once" delivery guarantee, so engineers must write handlers in an idempotent way (i.e. a handler must be safe to retry multiple times).
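Idempotency mostly means handlers can detect redeliveries. A minimal sketch of the pattern (hypothetical table name; real handlers vary):

    package consumer

    import (
        "context"
        "database/sql"
    )

    // Records processed event IDs in the handler's local DB so that an event
    // redelivered by the retry machinery is applied at most once.
    func handleEvent(ctx context.Context, db *sql.DB, eventID int64, apply func(*sql.Tx) error) error {
        tx, err := db.BeginTx(ctx, nil)
        if err != nil {
            return err
        }
        defer tx.Rollback()

        // INSERT IGNORE affects 0 rows if this event ID was already recorded.
        res, err := tx.ExecContext(ctx,
            `INSERT IGNORE INTO processed_events (event_id) VALUES (?)`, eventID)
        if err != nil {
            return err
        }
        if n, _ := res.RowsAffected(); n == 0 {
            return nil // duplicate delivery: nothing to do
        }
        if err := apply(tx); err != nil {
            return err // rolled back; the event stays in the queue and is retried
        }
        return tx.Commit()
    }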

There's also an additional layer on top of RabbitMQ to support DB sharding and "fair" event dispatching. Our product is B2B, so we shard by company (each company has its own set of users). Some companies are large (the largest is a popular fast food chain with 100k employees), so they produce a volume of events that can dwarf what smaller companies (50-100 employees) produce. So we have a fair dispatcher which makes sure a large company producing, say, 100k events doesn't take over the whole queue for itself (it splits events evenly between all accounts). This system also localizes the problem of stuck events to specific company accounts (the whole global queue is not affected).
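The fairness part is basically per-company round-robin. A toy illustration (not our actual dispatcher):

    package dispatch

    // Event and the interleaving below are illustrative only.
    type Event struct {
        CompanyID int64
        Payload   []byte
    }

    // Drain per-company queues one event per company per round, so a tenant
    // with 100k pending events can't starve the small ones, and a stuck event
    // only blocks its own company's lane.
    func fairOrder(byCompany map[int64][]Event) []Event {
        var out []Event
        for {
            progressed := false
            for id, queue := range byCompany {
                if len(queue) == 0 {
                    continue
                }
                out = append(out, queue[0])
                byCompany[id] = queue[1:]
                progressed = true
            }
            if !progressed {
                return out
            }
        }
    }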

So all in all, this is how it's done here, on top of MySQL/RabbitMQ and some glue code in Go.


This seems to have real legs: the materialised view is created first, and there are real constraints on how to express an event (i.e. you have to write the SQL for the event, meaning the business event must be expressible in terms of the RDBMS right there). So 90% of event sourcing problems are faced up to right at the start, and it’s “just” a layer on top of an RDBMS.

(What I am trying to say is that I have seen things like Event UserAccountRestartedForMarketingSpecial and there is some vague idea that 9 listeners will co-ordinate atomic transactions on 5 MQ channels and …)

But this forces people to say this event must be expressed in this SQL against this database; if not, then there is a mismatch between the SQL model and the business model, and you probably need to split things up. I suspect there is some concept like “normalisation boundaries” here: single events cannot transcend normalised boundaries.

Anyway I like this - POV is worth 80 IQ points.

Still needs a cool name



