It's funny: while I was reading the "before" architecture I found myself thinking "my god, why don't they just move to the cloud already? This would be much simpler if they did". I "turned the page" and there was GCP :D.
Also, I wonder why they went with GCP instead of AWS. Does Twitter have a deal with Google that I'm not aware of?
It appears to me they kept a lot of the real-time processing in house, and moved the pub/sub to web clients and the processing of that data to GCP, where I'm guessing the rest of their web delivery stack is. I think they're wise to keep the real-time processing in house (e.g. Kafka Streams).
Agreed! The real-time processing should be bespoke, and it's where I think Twitter previously shined. The data processing wasn't a point of differentiation for them, so it makes sense for them to offload it to a cloud provider and let someone else deal with the operations associated with it.