
>The solution is to treat these two concepts separately. One subsystem should be used for representing the source of truth, and another should be used for materializing any number of indexed stores off of that source of truth. Once again, this is event sourcing plus materialized views.

At work we decouple the read model from the write model: the write model ("source of truth") consists of traditional relational domain models with invariants/constraints and all (which, I think, is not difficult to reason about for most devs who are already used to ORMs), and almost every command also produces an event which is published to the shared domain event queue(s). The read model(s) are constructed by workers consuming events and building views however they see fit (and they can be rebuilt, too). For example, we have a service which manages users (the "source of truth" service), and another service which is just a view service (backing a complex UI) that builds its own read model/index from the events of the user service (and other services). Without it, we'd have tons of joins or slow cross-service API calls.
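Roughly, one of those view-building workers looks like this (a sketch in Go; the event fields and table names are made up for illustration, not our real schema):

    package views

    import (
        "context"
        "database/sql"
        "time"
    )

    // Hypothetical integration event published by the user service.
    type UserUpdated struct {
        UserID    int64
        Name      string
        CompanyID int64
        At        time.Time
    }

    // Upserts a denormalized row into the view service's own local MySQL table,
    // so the UI can read it without joins or cross-service API calls.
    func handleUserUpdated(ctx context.Context, db *sql.DB, e UserUpdated) error {
        _, err := db.ExecContext(ctx, `
            INSERT INTO user_view (user_id, name, company_id, updated_at)
            VALUES (?, ?, ?, ?)
            ON DUPLICATE KEY UPDATE
                name = VALUES(name), company_id = VALUES(company_id), updated_at = VALUES(updated_at)`,
            e.UserID, e.Name, e.CompanyID, e.At)
        return err
    }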

Technically we can replay events (in fact, we once did it accidentally due to a bug in our platform code, when we started replaying ALL events for the last 3 years), but I don't think we've ever really needed it. Sometimes we need to rebuild views due to bugs, but we usually do it programmatically in an ad hoc manner (special scripts, or a SQL migration). I don't know what our architecture is properly called (I've never heard anyone call it "event sourcing").

It's just good old MySQL + RabbitMQ and a bit of glue on top (although not super-trivial to do properly, I admit: things like transactional outboxes, an at-least-once delivery guarantee, eventual consistency, maintaining correct event processing order, event data batching, DB management, what to do if an event handler crashes, etc.). So I wonder what we're missing with this setup without Rama, what problems it solves and how (from the list above), provided we already have our battle-tested setup and it's language-agnostic (we have producers/consumers in both PHP and Go), while Rama seems to be more geared towards Java.



Sounds like you've engineered a great way to manage complexity while using an RDBMS. A few things that Rama provides above this:

* Rama's indexing is much more flexible. For example, if you need a nested set with 100M elements, that's trivial. An index like that is common for a social graph (user ID -> set of follower IDs). If you need a time-series index split by granularity, that's equally trivial (entity -> granularity -> time bucket -> stat).

* There are no restrictions on data types stored in Rama.

* Rama queries are exceptionally powerful. Real-time, on-demand, distributed queries across any or all of your indexes are trivial.

* Rama has deep and detailed telemetry built in across all aspects of an application. This doesn't need to be separately built/managed.

* Deployment is also built-in. With your approach, an application update may span multiple systems – e.g. worker code, schema migrations – and this can be a non-trivial engineering task especially if you want zero downtime. Since Rama integrates computation and storage end-to-end, application launches, updates, and scaling are all just one-liners at the terminal.

* Rama is much more scalable.

This is looking at Rama from a feature point of view, and it's harder to express how much of a difference the lack of impedance mismatches makes when coding with Rama. That's something you learn through usage.

Rama is for the JVM, so any JVM language can be used with it. Currently we expose Java and Clojure APIs.


So does the “command” (say, update customer address) perform the SQL, with some RDBMS trigger then sending the event onto RabbitMQ, or is it an ORM that sends the SQL and also posts to RabbitMQ?

Plus where do you store events, in what format?

Tell me more please :-)

What you are missing is a cool name for the whole ecosystem


We explicitly generate events (in pseudocode):

   this.eventDispatcher.dispatch(new SomethingChanged())
It can be anything: just a property change ("AddressChanged"), or something more abstract (i.e. part of business logic), such as "UserBanned".

Internally, the dispatcher serializes the event as JSON (because of custom event-specific payloads) and stores it in a special SQL table in the same local DB of the service as the original model. Both the original model and the event are committed as part of the same unit of work (transaction) -- so the operation is atomic (we guarantee that the model and the event are always stored together, and in case of a failure everything is rolled back together). This is the "transactional outbox" pattern. It's required because just pushing to RabbitMQ directly does not guarantee atomicity of the operation and previously resulted in various nasty data corruption bugs.
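In Go terms the outbox write looks roughly like this (a sketch; the table and event names are illustrative, not our actual schema):

    package users

    import (
        "context"
        "database/sql"
        "encoding/json"
    )

    // The domain row and the serialized event are written in the same local-DB
    // transaction, so either both are committed or neither is.
    func changeAddress(ctx context.Context, db *sql.DB, userID int64, addr string) error {
        tx, err := db.BeginTx(ctx, nil)
        if err != nil {
            return err
        }
        defer tx.Rollback() // safe to call after a successful Commit

        if _, err := tx.ExecContext(ctx,
            `UPDATE users SET address = ? WHERE id = ?`, addr, userID); err != nil {
            return err
        }

        payload, err := json.Marshal(map[string]any{"user_id": userID, "address": addr})
        if err != nil {
            return err
        }
        if _, err := tx.ExecContext(ctx,
            `INSERT INTO outbox (event_type, payload, created_at) VALUES (?, ?, NOW())`,
            "AddressChanged", payload); err != nil {
            return err
        }

        return tx.Commit()
    }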

Each service has a worker which finds newly committed events in the local DB and publishes them to RabbitMQ (whose exchanges are globally visible to all services). Consumers (other services/apps) subscribe to the integration events they are interested in and react to them. There are two types of handlers: some do actual business logic in response to events; others (like view services) just fill their view tables/indexes for faster retrieval from their own local DB, which solves some of the problems listed in the OP.
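The relay worker is conceptually something like this (again a sketch with made-up names; the real one does batching, error handling, etc.):

    package relay

    import (
        "context"
        "database/sql"

        amqp "github.com/rabbitmq/amqp091-go"
    )

    // Poll the outbox for rows that haven't been published yet, publish them to
    // a RabbitMQ exchange, then mark them as published. If marking fails after a
    // publish, the event is simply published again -- hence "at least once".
    func relayOnce(ctx context.Context, db *sql.DB, ch *amqp.Channel) error {
        rows, err := db.QueryContext(ctx,
            `SELECT id, event_type, payload FROM outbox WHERE published_at IS NULL ORDER BY id LIMIT 100`)
        if err != nil {
            return err
        }
        defer rows.Close()

        for rows.Next() {
            var id int64
            var eventType string
            var payload []byte
            if err := rows.Scan(&id, &eventType, &payload); err != nil {
                return err
            }
            // Routing key = event type, so consumers bind only to what they care about.
            if err := ch.PublishWithContext(ctx, "domain-events", eventType, false, false,
                amqp.Publishing{ContentType: "application/json", Body: payload}); err != nil {
                return err
            }
            if _, err := db.ExecContext(ctx,
                `UPDATE outbox SET published_at = NOW() WHERE id = ?`, id); err != nil {
                return err
            }
        }
        return rows.Err()
    }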

In our services, the order of processing events is very important (to avoid accumulation of errors), so if an event handler crashes, the event is retried indefinitely until it succeeds (we can't skip a failed event because later events may expect the data to be in a certain state, so skipping would introduce inconsistencies). When a failed event gets "stuck" in the queue (we keep trying to re-process the same event over and over), on-call engineers have to apply hotfixes. Due to the retries, we also have an "at least once" delivery guarantee, so engineers must write handlers in an idempotent way (i.e. a handler must be safe to retry multiple times).
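Idempotency mostly means handlers can detect redeliveries. A minimal sketch of the pattern (hypothetical table name; real handlers vary):

    package consumer

    import (
        "context"
        "database/sql"
    )

    // Records processed event IDs in the handler's local DB so that an event
    // redelivered by the retry machinery is applied at most once.
    func handleEvent(ctx context.Context, db *sql.DB, eventID int64, apply func(*sql.Tx) error) error {
        tx, err := db.BeginTx(ctx, nil)
        if err != nil {
            return err
        }
        defer tx.Rollback()

        // INSERT IGNORE affects 0 rows if this event ID was already recorded.
        res, err := tx.ExecContext(ctx,
            `INSERT IGNORE INTO processed_events (event_id) VALUES (?)`, eventID)
        if err != nil {
            return err
        }
        if n, _ := res.RowsAffected(); n == 0 {
            return nil // duplicate delivery: nothing to do
        }
        if err := apply(tx); err != nil {
            return err // rolled back; the event stays in the queue and is retried
        }
        return tx.Commit()
    }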

There's also an additional layer on top of RabbitMQ to support DB sharding and "fair" event dispatching. Our product is B2B, so we shard by company (each company has its own set of users). Some companies are large (the largest is a popular fast food chain with 100k employees), so they produce a volume of events that can dwarf what smaller companies (50-100 employees) produce. So we have a fair dispatcher which makes sure a large company producing, say, 100k events doesn't take over the whole queue for itself (it splits events evenly between all accounts). This system also localizes the problem of stuck events to specific company accounts (the whole global queue is not affected).
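The fairness part is basically per-company round-robin. A toy illustration (not our actual dispatcher):

    package dispatch

    // Event and the interleaving below are illustrative only.
    type Event struct {
        CompanyID int64
        Payload   []byte
    }

    // Drain per-company queues one event per company per round, so a tenant
    // with 100k pending events can't starve the small ones, and a stuck event
    // only blocks its own company's lane.
    func fairOrder(byCompany map[int64][]Event) []Event {
        var out []Event
        for {
            progressed := false
            for id, queue := range byCompany {
                if len(queue) == 0 {
                    continue
                }
                out = append(out, queue[0])
                byCompany[id] = queue[1:]
                progressed = true
            }
            if !progressed {
                return out
            }
        }
    }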

So all in all, this is how it's done here, on top of MySQL/RabbitMQ and some glue code in Go.


This seems to have real legs: the materialised view is created first, and there are real constraints on how to express an event (i.e. you have to write the SQL for the event, meaning the business event must be expressible in terms of the RDBMS right there). So 90% of event sourcing problems are faced up to right at the start, and it’s “just” a layer on top of an RDBMS.

(What I am trying to say is that I have seen things like Event UserAccountRestartedForMarketingSpecial and there is some vague idea that 9 listeners will co-ordinate atomic transactions on 5 MQ channels and …)

But this forces people to say this event must be expressed in this SQL against this database; if not, then there is a mismatch between the SQL model and the business model, and you probably need to split things up. I suspect there is some concept like “normalisation boundaries” here: single events cannot transcend normalised boundaries.

Anyway I like this - POV is worth 80 IQ points.

Still needs a cool name



