As someone who works for Government and Enterprise - all I care about sometimes is how a company behaves when everything goes wrong.
The issue with outages for the Government organizations I have dealt with is rarely the outage itself, but the lack of strong communication about what is occurring, realistic approximate ETAs, or options for mitigation.
Being able to tell the Directors/Senior managers that issues have been "escalated" and providing regular updates are critical.
If all I could say was that a "support ticket" had been logged and we were still waiting on a reply hours later, I guarantee the conversation after the outage would be about moving to another solution provider with strong SLAs.
Very similar thing at our office. Considering the scale at which we run things, any outage could mean a loss of millions _every minute_.
Sure, we use support tickets with vendors for small things. Console button bugging out, etc. But for large incidents, every vendor has a representative within an hour's driving distance who will be called into a room with our engineers to fix the problem. This kind of outage, with zero communication, means the dropping of a contract.
Communication is critical for trust, especially if we're running a business off it.
Going single cloud on that scale is simply irresponsible though.
You need failovers to different providers, and hopefully you also have your own hardware for general workloads.
And suddenly the CEO doesn't care anymore if one of your potential failovers is flaky in specific circumstances.
I'm not saying it's good as it is. Communication as a SaaS provider is, as you said, one of the most important things. But this specific issue was not as bad as some people in this thread insinuate.
You are incorrect about AWS. If you pay for business support and something is happening to your production environment, they are on a call with you in less than an hour.