
Sure, there's the issue of what your contract says and what the guarantee is, but all these companies do already track their metrics in ways that at least attempt to detect and respond to the problems the author describes.

They track their metrics by p50 (the median, i.e. what the typical customer experiences) but also by p99, p99.9, etc., which capture the performance/reliability of the extreme edge cases, which is exactly what the author is describing. They already evaluate their systems from the perspective of how they're performing for the worst-affected customers. Again, maybe the issue is the contract itself, sure, but they do already try their best to prevent a small handful of customers from getting overly affected by something.
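To make the p50 vs. p99 point concrete, here is a rough sketch. The per-customer error rates are made up for illustration, and `percentile` is a simple nearest-rank helper, not any particular vendor's method.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical per-customer error rates (%): 980 healthy customers, 20 badly affected.
error_rates = [0.1] * 980 + [40.0] * 20

p50 = percentile(error_rates, 50)            # what the typical customer sees
p99 = percentile(error_rates, 99)            # what the worst-affected 1% see
mean = sum(error_rates) / len(error_rates)   # blends both groups together

print(p50, p99, round(mean, 3))
```

With these made-up numbers the median looks healthy and the mean stays under 1%, while p99 lands squarely on the badly affected group.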



I remember seeing a talk years ago about percentiles and how they lie: https://www.youtube.com/watch?v=lJ8ydIuPFeU

You should be exposing the maximum metric from your app; computing a percentile from an aggregated histogram is lossy.

[edit: Found the link, "How NOT to Measure Latency" by Gil Tene]
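A rough sketch of what I mean, under made-up assumptions: the bucket bounds, host names, and latency samples are all hypothetical, and this is not how any specific metrics system implements it. Merged bucket counts can only tell you which bucket a percentile falls in, while a per-host maximum merges exactly.

```python
import bisect
import math

BOUNDS = [10, 50, 100, 500, 1000, 5000]  # hypothetical bucket upper bounds, in ms

def to_histogram(samples):
    """Count samples per bucket; the final slot catches anything above the last bound."""
    counts = [0] * (len(BOUNDS) + 1)
    for s in samples:
        counts[bisect.bisect_left(BOUNDS, s)] += 1
    return counts

def merge(a, b):
    """Aggregate two histograms by summing bucket counts."""
    return [x + y for x, y in zip(a, b)]

def percentile_upper_bound(counts, p):
    """Upper bound of the bucket holding the p-th percentile (nearest rank)."""
    rank = max(1, math.ceil(p * sum(counts) / 100))
    seen = 0
    for i, c in enumerate(counts):
        seen += c
        if seen >= rank:
            return BOUNDS[i] if i < len(BOUNDS) else float("inf")

# Hypothetical latency samples (ms) from two hosts.
host_a = [5] * 99 + [4200]
host_b = [8] * 99 + [120]

merged = merge(to_histogram(host_a), to_histogram(host_b))
print(percentile_upper_bound(merged, 99))    # bucket bound only; both slow requests sit above it
print(percentile_upper_bound(merged, 99.5))  # a bucket bound, not the real sample value
print(max(max(host_a), max(host_b)))         # the max of per-host maxima is exact
```

In this toy data the aggregated p99 says "at most 10 ms" even though one request took 4200 ms, and p99.5 comes back as the 500 ms bucket bound when the actual sample was 120 ms.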


Here's the thing though. If I'm selling a product and I'm sending more than 10% of the money to a single vendor I have several problems.

If a vendor who can completely stop my operation has an outage, and the SLA says they owe me that 10% as a refund, I'm still having to deal with the 10x that amount I'm losing because one of my vendors is having a bad day.

Those guarantees - if they even honor them, and if you can spare the time to chase them down - are still a quick road to bankruptcy.

So at the end of the day I probably have to raise my costs 10% in order to guarantee that no single vendor can drop me to 0%. And if those two vendors share a vendor, I may still be screwed.


Google loves to talk about billions of users. That is quite a few nines. Obviously there are fewer users of cloud than search. But an engineer can only care about so many before they need to save their sanity. Human attention is the one thing that'll never scale.



