As an ops person this is a super interesting question, so it's really kind of surreal to read dozens of replies in which not a single one mentions throughput, tail latency, or error rate measurements.
For near-realtime systems that scale, it is right up there with the fastest application servers. In fact, if you take the auto-scaling properties into account, it probably beats those servers, because it can scale seamlessly up to an incredible number of requests per second without missing a beat. If you want low latency, you can replicate your offering in as many zones as you like.
People start worrying about throughput, latency and error rates when they become high enough (or low enough) to measure.
My personal biggest worry is that if your Google account dies for whatever reason, your company and all its data go with it. That's the one thing I really do not like about all this cloud business; it feels very fragile from that point of view.
> In fact, if you take the auto-scaling properties into account, it probably beats those servers, because it can scale seamlessly up to an incredible number of requests per second without missing a beat.
Autoscaling is one of those things that's easy to name but hard to actually achieve. I've had some involvement with an autoscaler for a few months and it's been educational, to say the least.
In particular, people tend to forget that autoscaling is about solving an economic problem: trading off the cost of latency against the cost of idleness. I call this "hugging the curve".
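To make that tradeoff concrete, here's a minimal sketch -- the capacity numbers, dollar figures, and function names are all hypothetical, picked just to show the shape of the problem: choose the replica count that minimizes the sum of what idle capacity costs you and what overload (queueing, lost requests) costs you.

    # Minimal sketch of the autoscaling tradeoff; all constants are made up.

    def idle_cost(replicas, demand_rps, capacity_rps=100, dollars_per_replica_hour=0.05):
        """Cost of capacity sitting unused for one hour."""
        unused_rps = max(replicas * capacity_rps - demand_rps, 0)
        return (unused_rps / capacity_rps) * dollars_per_replica_hour

    def latency_cost(replicas, demand_rps, capacity_rps=100, dollars_per_rps_over=0.50):
        """Rough penalty for demand exceeding provisioned capacity."""
        overload_rps = max(demand_rps - replicas * capacity_rps, 0)
        return overload_rps * dollars_per_rps_over

    def best_replica_count(demand_rps, max_replicas=50):
        # "Hugging the curve": track demand as closely as the cost structure allows.
        return min(
            range(1, max_replicas + 1),
            key=lambda n: idle_cost(n, demand_rps) + latency_cost(n, demand_rps),
        )

    for demand in (80, 250, 900):
        print(demand, best_replica_count(demand))   # -> 1, 3, 9 replicas

The two dollar coefficients are the interesting part: they encode how much an idle machine hurts you versus how much a slow or dropped request hurts you, and that is exactly what no platform can know on your behalf.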
No autoscaler can psychically guess your cost elasticity. Lambda and others square this circle by basically subsidising the runtime cost -- minute-scale TTLs over millisecond-scale billing. I'm not sure how long that will last. Probably they will keep the TTLs fixed but rely on Moore's Law to reduce their variable costs over time.
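Some back-of-the-envelope arithmetic on that gap -- the keep-alive window, per-invocation duration, and request rate here are illustrative assumptions, not published provider figures:

    # Illustrative arithmetic only; every number below is an assumption.
    billed_ms_per_invocation = 120   # what the customer is billed for
    keepalive_minutes = 10           # how long the sandbox stays warm after a request
    invocations_per_hour = 6         # a low-traffic function, one request every 10 min

    billed_seconds = invocations_per_hour * billed_ms_per_invocation / 1000
    # One request per keep-alive window means the sandbox effectively never goes cold:
    provisioned_seconds = 3600

    print(provisioned_seconds / billed_seconds)   # ~5000x more provisioned than billed

Whether the multiplier is 50x or 5000x depends entirely on traffic shape, which is why it looks like a subsidy mostly at the low-traffic end.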