Southwest Airlines grounds its entire fleet amid giant computer outage (nypost.com)
211 points by tosh on June 15, 2021 | 176 comments


Got stuck in Denver yesterday thanks to this. Flight was delayed for 2.5 hours, airport was insanely crowded and the poor lady at the Southwest desk had a never ending stream of people asking questions she didn't have answers to. The airport bar was certainly making a heck of a lot of sales, though!

On a serious note, it seems somewhat of an oversight that Southwest's entire multibillion dollar operation can be ground to a halt due to the weather pamphlet printing system breaking. Talk about single points of failure!


Is the weather pamphlet system owned by the airport bar business?


synergy


I imagine checking the weather is an important task for pilots.


You tend to want redundancy for important tasks.


On the other hand airlines work on very thin margins, controlling costs is important.


Airlines really do work on the thinnest of margins. I've seen a few YouTube videos on the topic, but it's basically why you see zero startup airlines: it's basically impossible. The immense startup and operating costs are coupled with the fact that your competitors are 100x bigger and can crush you like a bug.


I'm going to call BS on that... Ignoring 2020 and looking at their 2019 report (that was released in Jan 2020) we see they had their 47th consecutive year of profit [0]. They are making plenty of money and margin doesn't factor into it when they are pulling down massive profits like this.

[0] https://www.southwestairlinesinvestorrelations.com/news-and-...


And Southwest is one of the most profitable airlines too. They use huge planes between major hubs and stack you in like sardines, only mildly more comfortable than Spirit.

I believe they were also the most profitable US airline last year because of how streamlined their entire operation is. They could afford it if they wanted it.


I wouldn’t consider 737s huge planes.

Also, their entire fleet is made up of 737s, so it’s not like large airports get massive planes (maybe a MAX vs a -700, but still not a huge difference).


Of course margin factors into it. It always does. Margin accounts for all of that profit.

Looks like their net margin was about 10%. Which isn't super low. But it's no 30% plus like many of the tech giants.


Security is expensive. Redundancy also. Unfortunately they have great savings potential.


Indeed. I feel like many companies have their IT systems in varying levels of disrepair. Just look at the sheer number of data breaches. At least this wasn’t a security breach (though I’d imagine (hope?) that it wouldn’t impact aircraft in flight).


I've had a terrible time because of this. Nothing seems to work well when stressed.

I had a flight with one layover. At the airport my first flight was delayed an hour, making the connection unlikely. I asked the customer service what I could do. Most alternative flights were booked, so I took a flight the next day with one very long layover. They said that because this issue was out of their control they were not obliged to provide a hotel room or anything else. I called the travel agency from my work and booked a flight from my original layover destination with a different airline. I then talked to customer service and changed my booking back to the original itinerary, thinking that the second flight may be delayed as well. Then the first flight was delayed by another hour, making the connection impossible. Thankfully I had a confirmation number with the alternate airline for the second leg of the trip. I flew to the layover destination. In flight I checked the Southwest schedule and saw the original connecting flight had been cancelled. When I landed I went to the other airline customer service desk to print my boarding pass. They said my ticket had been cancelled. I called the travel agency and they said that the airline was trying to avoid overbooking and was not letting any third parties buy tickets. There was another flight available with another airline, but the same thing: they wouldn't let the travel agency buy the ticket. I went back to the customer service desk and bought the ticket myself without any problems other than having overpaid for a flight for reasons beyond my control.

I'm currently sitting here, and will be for the next 7 hours waiting to go home.

There was a man with schizophrenia shouting earlier about needing to get home and not having his medication, and many minutes later I heard him screaming in the distance. Later there was another man shouting about masks. I also met a woman who was with her family trying to get to New Orleans, another couple going my way without a clue how they were going to get there, an old man and woman also going to New Orleans who were told their 30 minute layover (before the second delay) wasn't going to be a problem, and a young man who took it all pretty well and just took the next day rebooking and left to go find a hotel.


I don’t understand how they can say the issue is out of their control. They choose the weather software and could have redundancies in place? If a plane has a fault, then they’d have to put you up for the night.

Someone at Southwest must have looked at how much this was going to cost and panicked.


I don't believe the customer service people knew what had happened. They were chatting about how there was a "service disruption" and that it was from "American Airline's system" not theirs so it was "out of their control". Maybe they were just trying to get the blame off of them, but they definitely weren't telling people flights were being cancelled. I really think they were in the dark and I speculate (tell me if I'm wrong) the default response is to admit no fault until someone higher up says to.

What I really don't like is that I didn't receive any notification about any of this. No email or text or phone call from Southwest or the travel agency. Had I checked HN I would have found out before returning the rental car.


You're probably right about the default response. The gate agent somewhere like Southwest is more likely to get in trouble for giving away too much free stuff than too little.

I already check HN when a big web service is down. The time has come where we should check HN if our airline/train/public service is down too!


Doesn't matter the delay, the desk will always try to get out of it being their fault to save on hotel costs, etc.

I loathe it but this is one of the few circumstances where you should go full assertive. Don't take it out on the desk staff, they're having a far worse day than you, but stand there and make clear you are a problem that isn't going away until there's some resolution. Suddenly the hotel voucher will appear.

I want the whole system reformed. I've had to do this with United several times now and hate it so much I'll quite literally spend $500 more just to avoid them.


Everyone looks at me like I'm insane for driving myself long distances... I never have to deal with this bullshit, though. Insane, it seems, to take your life in your hands in this way.


I found this funny because I did very briefly consider driving to the layover airport. Agency is a great thing to have, and I think it's much more important for a happy life than efficiency, but there are limits. My limit is apparently somewhere under 7 hours (the time for the drive to the first airport).


Driving is statistically FAR more dangerous than flying though.

Think about it, you are hurtling along a highway at 70 mph in a car with numerous components that are aging and can break catastrophically. All it takes is a tire blow-out and your car flips. Or the overweight truck driver in the lane behind you has a heart attack and his truck crushes your car like a soda can. Or a vehicle traveling 70 mph in the opposing lane loses control and flies across the median, hitting your car and killing you instantly before you even have time to react.

Meanwhile, the airplane covers that same distance in a fraction of the time and has not one, but two pilots, with numerous years of experience. If one of the pilots has a heart attack, the other one can take over. There are almost no vehicles on the road with that level of redundancy.


Because this doesn't have to be the general experience?

I often think about how my air-travel experience is different and stress free from others.

As someone that doesn't drive to the airport and can basically just walk through empty security lines, only books direct flights and doesn't have to wait till the holiday to travel, the stress is non-existent.

The worst common thing I encounter is my plane landing early at the destination and having to wait on the tarmac longer because the gate isn't clear yet.

In comparison, many others are driving to long term parking, taking long shuttles, have packed inefficiently resulting in long lines, rebalancing at both the baggage check area and at security, and have to compound the flight promptness gambles with short or long layovers. Can't tell which one is worse. Now that I think about it, many people likely don't have a way to get their ticket available easily either. The stress compounds on itself.

Did I mention the airport lounges? The discovery process can be nice.


Every time I fly, it feels like half the people in the lines have never flown before and their inexperience is compounded by their ignorance of posted signs, call-outs, and general awareness of what other people are doing.

Like...when you're in the security line, you should have your ID and ticket ready to show the TSA agent. I've seen way too many people get to that checkpoint and only then put their bag down and rifle through it to find their wallet and ticket, completely oblivious to everyone that was in front of them in line and already had that stuff ready.

Then, when they have to put their stuff on the scanner...doesn't matter how many times the agents shout that laptops have to be removed from bags and their shoes have to be taken off, they'll try to go through with shoes on and leave their laptop in the bag.


First I am 90% sure your experience is not the "general experience".

Aside from my fundamental problem with security theater at the airports, the actual experience in the airport is not terribly bad but I have been stranded in places due to flight cancellations, I have experienced long delays, etc

Also I am a Big AND Tall person, and I believe the airplane itself has been designed to purposely torture Big AND Tall people. On anything over a 2 hr flight my knees and legs are in serious pain, and the worst is then having to sit for 45 mins at the other end "waiting for a gate".

Further, given my location and my destination, the majority of my flights are on small regional jets for which there is no "business class" to get a larger seat or more leg room, and "exit row" in these small jets is laughable.

No, whenever possible I drive.


Depends on the distance.

I'm in the Portland, OR area. I would absolutely drive to Seattle (~180 miles) or San Francisco (~630 miles). LA (~1,000 miles) is the limit for how far I'm willing to drive.

But when I take a trip to theme parks on the east side of the country (Cedar Point, King's Island, and Holiday World were my last destinations), that's over 2,000 miles. That's 3 days of driving at a reasonable pace. I'm definitely flying that one.


thank you for validating my position that I will not be flying anywhere in 2021. if I need to travel it better be in driving distance or postponed until 2022...

no Air Travel for me this year


Yeah, from what I understand of flight operations "inability to transmit weather information" via their in-house computer system shouldn't be a showstopper for thousands of flights nationwide. It wasn't so long ago that pilots weather info was just a print out.

I'm not a commercial pilot but would love if someone who is could post a plausible hypothesis how this would ground the whole fleet (short of a simple bureaucratic decision to shut down if everything isn't working exactly normally).


Each commercial flight in the US is required to file a flight plan before takeoff. This flight plan is created and approved by an FAA licensed dispatcher, who then tracks the progress of the flight and develops alternative routing as necessary.

An important part of the flight plan is weather conditions along the route. There are obvious safety concerns, but weather conditions are also responsible for the route chosen, the amount of fuel onboard the plane, and passenger comfort. Without weather information, the flight plan cannot be filed and the plane cannot fly. Keep in mind that there are specific regulatory standards that must be met with any information included in the flight plan. A pilot cannot simply open up the weather app on their phone for instance. The weather info needs to come from an approved source.

I have no specific information about Southwest's current issue, but my guess is the third party service they use for weather in the flight plan is out of service. If they can't transmit this info to aircraft in the air or on the ground, then nobody gets to fly today. I'm also guessing that an incident like this will result in backup solutions being put into place, and that there is an IT or systems engineer somewhere who is trying to hide their "I told you so" expression.

Edit: an example flight plan can be found here: https://www.simbrief.com/system/guide.php#ofpsample. This is from a site for sim pilots, but the format of the brief is the same as what a commercial airline would file for a flight.


I'll add to this (as a private pilot) - there are also a lot of requirements of weather at your destination; in terms of wind, visibility, temperatures and how long those forecasts are valid for and so forth.

In order to dispatch a flight you need to ensure that the weather at the arrival airport meets certain minima at that point in the future (hence the need for forecasts), and so does that at suitable alternative airports, and enough fuel is carried to be able to divert (depending on a heap of other factors such as aircraft load, capacity, prevailing winds enroute, at destination, time of day and so forth).

Aircraft are then also required to check the AWIS/ATIS or the Tower (usually the most reliable source of weather at an airfield) to ensure that the weather at the destination is within their safe margins.

There's the FAA requirements, and then airlines and aircraft and airports then also have their own established minimum requirements that all need to be within the safe range.

You can certainly dispatch a flight with potential updates and diversions as they progress / get closer to the destination, but with an outage and no means to update the enroute aircraft - there's usually no easy safe or legal way to dispatch them.
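
The checks described above can be sketched roughly as follows. This is only an illustration of the logic, not how any airline's dispatch software works: the `Forecast` and `Minima` types, the field names, and every number are hypothetical, and real minima depend on the aircraft, airport, approach, and operator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Forecast:
    """Forecast at an airport for the expected arrival time (hypothetical fields)."""
    visibility_sm: float  # visibility in statute miles
    ceiling_ft: int       # cloud ceiling in feet
    crosswind_kt: float   # crosswind component in knots


@dataclass
class Minima:
    """Illustrative limits only; not real regulatory values."""
    visibility_sm: float
    ceiling_ft: int
    max_crosswind_kt: float


def meets_minima(fc: Forecast, m: Minima) -> bool:
    # Every element of the forecast has to be inside the limits.
    return (
        fc.visibility_sm >= m.visibility_sm
        and fc.ceiling_ft >= m.ceiling_ft
        and fc.crosswind_kt <= m.max_crosswind_kt
    )


def can_dispatch(
    dest: Optional[Forecast],
    alternate: Optional[Forecast],
    minima: Minima,
    fuel_kg: float,
    fuel_needed_with_diversion_kg: float,
) -> bool:
    # No forecast from an approved source (e.g. the weather system is down)
    # means the checks cannot be made, so the flight is not dispatched.
    if dest is None or alternate is None:
        return False
    return (
        meets_minima(dest, minima)
        and meets_minima(alternate, minima)
        and fuel_kg >= fuel_needed_with_diversion_kg
    )
```

The point of the `None` branch is the one made in this thread: with the forecast source unavailable, the answer defaults to "no" even if the actual weather is fine.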


Thanks, that is certainly a plausible explanation - basically not being able to sufficiently guarantee compliance with some approved process that had a single point of failure somewhere.

I also agree that, much like when the Space Shuttle blew up, there were engineers somewhere who immediately suspected exactly how and why.


Like the space shuttle o-ring, I at first thought this was announcing 737MAX crashes. That new Boeing design to fit the larger engines should have a new type certification, instead of software which creates a pretend appearance of older 737s.


I agree with everything you wrote except "trying to hide". In these situations, it's impossible to hide "I told you so". :)

Also, the nice thing about wfh is the possibility of screaming about the stupidity of your organization more conveniently...


So if a plane is in flight (and I'm sure some were), and the weather system goes down, then what? Stick with the previous plan? Divert?


In flight weather can be observed (most airliners have weather radar onboard) and the pilots can check on weather along their route with controllers and at the destination or in specific places by electronic inquiry.


In-flight weather by radio is an option but it’s a manual thing. And the airline may not be set up to use that except in an emergency situation.


It is a requirement - AWIS/ATIS and ATC/Tower (all via radio) can provide weather information at airfields, and a lot of instrument approaches do require certain readings before an approach can be commenced.


There is also the PIREPS (Pilot Reports) system where pilots report actual weather conditions as encountered by an aircraft in flight. Some aircraft have automated weather reporting systems (AMDAR).

https://www.skybrary.aero/index.php/Pilot_Report_(PIREP)


I don't think the equipment to receive weather updates is on the Minimum Equipment List, but if dispatch couldn't even update over the radio then I would imagine they would start diverting flights. This is just a guess, hopefully someone with more knowledge can jump in here.


It depends on the airframe - but I believe for Airline Transport category aircraft a storm scope and a few others are required.


I was on one of these flights, and the pilot told us that they actually had the weather information available to them on their iPad. He said the systems that failed were the ones actually creating the physical printout, which the FAA required for takeoff.


That actually makes a fair bit of sense. The FAA probably wouldn't have "safety rated" a retail iPad as meeting the primary requirement. The iPad is considered the backup but probably what they use day to day and the printout is technically the primary. So, no printout, no takeoff.


> It wasn't so long ago that pilots weather info was just a print out.

It also wasn't so long ago that scheduled domestic US airliners crashed on a fairly regular basis. That doesn't happen anymore because of advances in technology, this just being one little piece of that.


Yeah, people sometimes don't seem to understand that a lot of rules are written in blood.


Pure speculation: perhaps the FAA requirements on weather updates have changed since paper weather reports were standard?

edit: s/FFA/FAA/


I never understand people's expectations of companies during stuff like this.

There are people complaining that they can't get through to customer service on their phone. Like, do they expect them to always have enough customer service capacity for when every single customer in the world has reason to call them? That'd be insane.

Flight cancelations are always a risk when flying. You need to always expect that you'll potentially have to be stuck somewhere for a night or two. If the airport floor doesn't suffice for you, fly airlines that have a track record of giving hotels for cancelations and have insurance/funds to cover a few nights if you aren't able to get a room from the airline.

This kind of stuff is unavoidable and just part of air travel. Be prepared, be patient.


When you've travelled 30 times in a row with near-perfect QoS, your expectations get recalibrated. It's an ironic effect of success that your customers become less tolerant of failure.

It's the Louis C.K. bit about being (roughly) 'in a chair in the sky at 500mph: it's a miracle yet no one's happy'.


I chuckle a bit when people complain about the slow satellite based wifi. So, you're in a metal tube, miles in the air, going 500mph, getting internet from a satellite 20,000 miles away. And...the wifi is a bit slow?


We have the technology!

Louis is right about You're flyiinngg! but also flying is normal now, so we always hope that things get better. Rich people don't even fly anymore, they sleep in a luxury hotel in the sky. Things are possible, it's just that they're not cheap yet.


I do think LEO satellites will fix this, but it takes a long time to test, get regulatory approvals, do installs, etc. By the time new tech rolls out on aircraft, it's not really new anymore.


We've already solved it. There's a service on some airline that I forget that lets you even stream videos (Netflix, Youtube, etc) from the plane without issue. It was absolutely amazing.

Gogo, the one that absolutely sucks, locked many airlines into a 10 year contract so they're screwed until that ends. That's the only thing stopping the fast service(s) from rolling out.


Guessing you used a Ka band system. It's 70mbps in ideal conditions, but probably more often 30-50mbps in real life. Works great if not too many of the passengers are streaming.
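
As a back-of-the-envelope check on "if not too many of the passengers are streaming": the sketch below divides the shared link by an assumed per-stream bitrate. The 5 Mbps figure is an assumption for a typical HD stream, not something from the system's specs, and the function name is made up for illustration.

```python
def max_simultaneous_streams(link_mbps: float, stream_mbps: float = 5.0) -> int:
    """How many streams of a given bitrate fit in a shared link (ignores overhead)."""
    if stream_mbps <= 0:
        raise ValueError("stream_mbps must be positive")
    return int(link_mbps // stream_mbps)


# Ideal conditions vs. the more typical range quoted above
print(max_simultaneous_streams(70))  # 14
print(max_simultaneous_streams(50))  # 10
print(max_simultaneous_streams(30))  # 6
```

So on a plane with well over a hundred passengers, only a small fraction can stream at once before everyone's connection degrades.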


I've had a video call on a budget airline before.


Last time I flew back from the EU, I enjoyed the Panasonic system, until we entered US airspace. Frankly, I did not miss a beat. Experience was on par with many terrestrial wifi deals.

Then boom! Back to NoGo, limited craptastic, swore I would never pay for it, service.


From the same clip: "How quickly the world owes him something he didn't know existed 5 minutes ago?"


Baseline reset.


The problem is that we now have entire airlines shutting down due to technical failures. We have power failures that affect most of a state. 911 failures that affect multiple states. Entire retail chains going down due to a system outage. Wide area over-centralization failures are a new thing.



Not having enough human capacity for a random event is reasonable enough, but between their app and their website there should be a smooth, automated, and easy way for customers to rebook or get their money back to book with another airline. That shouldn't require going through a person, especially if there are not enough people available.


How long should that take, for every single person doing business with a company to change their plans?


If you give users the right self-service options, they'll resolve it for themselves between the start of the outage and when they expected to fly. They'll rebook or take their money and go.


I'd assume a nontrivial number of customers are calling because automated methods to reschedule aren't giving them the options that they want, and they assume (probably incorrectly) that they will do better when talking to a human.

I don't know with Southwest, but I know more than one airline has a system that sends automatic "choose one of these options" emails when flights are cancelled. When all of your customers need new flights, though, the issue isn't how the rescheduling is done, it's the fact that almost nobody is going to be able to choose an option they like.


> they assume (probably incorrectly) that they will do better when talking to a human.

Anecdotally, it has worked enough times for me to call/talk to an agent vs dealing with the automated suggestion system. Just don't be a jerk.

One automated system gave me a next flight of the following day. Spoke to an agent and was rerouted through another hub and only delayed by two hours overall.

In another case, a delayed flight caused me to miss my connection. Was officially told that I was out of luck and would have to wait (as were the jerks who asked) for the morning. I went and got a seat on the plane out that night.


Of the two other responses to this thread, one recommends "don't be a jerk" and the other recommends "complain enough".

I wonder if there just isn't a negative feedback loop here (everybody gets helped) so people just assume their way is the most effective.


> one recommends "don't be a jerk" and the other recommends "complain enough"

Those aren't mutually exclusive. You can complain a lot but not be a jerk. You just have to understand and acknowledge the situation the customer service rep is in but still be firm on what you want.


Exactly this, I feel like the "all of your customers" thing is important and what bothered me about the GP comment. I know it's tremendously inconvenient for people who are trying to travel, but it seems like a triumph of technology to be able to sort something like this out in less than a whole day? And to expect literally every customer to be able to jump on an app and get a refund or reschedule in the same hour strikes me as... possible, but not necessarily a thing we should expect?


In my experience if you complain enough the human agent will occasionally agree to rebook you on another airline even when the online system doesn't show that as an option.


That's called an automated re-accommodation system. They do have one. People call anyway.


Yes but who knows about that?

Flight got cancelled? Send me a text, email, whatever. They have all of my information. Text me a link.

The reality is that they don't do that because they'd lose money this way. It's best to underprovide so people who don't want to wait just go elsewhere.

When covid hit last year I had to spend literally hours on the phone trying to cancel/rebook because Alitalia did not allow this online. The low-cost EasyJet instead *sent me an email* with a link to get a refund. It's easy. Nobody had to wait for disaster to hit to think of a solution.

These are just (anti-user) choices that companies make to save money. Some users will still call, but much fewer.


> Flight got cancelled? Send me a text, email, whatever. They have all of my information. Text me a link.

Airlines always do it for me. Well, at least in the last handful of years I've been flying.


It sends a text to affected passengers with a link to the reaccom page.

I don't know about Alitalia. Not all airlines have these systems.


Would these be working during the giant computer outage?


People's experiences are marred by bad behavior on the part of the airlines. Most airlines lock you into whatever you bought and put so many restrictions on it that it's ridiculous (extremely high change fees, baggage fees, etc.). They give you a false option that you can buy out of these things (paying more money, but often those options are obscenely overpriced).

Different airlines handle misconnects differently: Delta sometimes auto rebooks, United and US Airways required you to call and rebook, although US Airways required you to work with the gate or call beforehand and you got issued a new ticket at the kiosk. On top of this, there are different rules that aren't explained to you about what conditions let you break the ticket without being penalized (a schedule change being one).

Lastly, there's the fact that the airline never takes responsibility when they screw up. It's always "eh, you should know better". Leftouthansa sends your bags to EWR on a FRA-LHR-ORD flight? Maybe you'll get your bags in 48 hours? Don't forget to ask for an amenity kit (that's on you to ask for). Bag got lost and you want reimbursement? Please keep your receipts from years ago for the jeans you had in there.


Southwest is actually really good about these sorts of things. You can rebook and apply the full ticket price to another flight (or at least that was the case last time I flew SW). Their normal customer service level is fantastic which makes the cases where it fails that much more dramatic.


People in shock don't think about the wider issues.

"I never understand people's expectations of other people's expectations..." Well, I guess I feel I understand that too.

"Be prepared, be patient..." Good advice, so is "don't have a global outage" but do people follow it? Nooooo


> Good advice, so is "don't have a global outage" but do people follow it?

People on HN should understand better than anyone that this isn't really possible over a long enough period of time. It's not like Southwest has to ground its fleet every other month. This specifically is a very very rare event.


Whenever you go on longer trips you need to do at least some disaster planning, otherwise you’ll get in some wild inconvenience sometime


What can you plan for exactly if your plane gets diverted to Nowheresville?

Should you be carrying a tent and three days' food and water? That would add to your carry-ons. The only thing I can think of is not making appointments on the other side where your life is screwed if your plane is diverted. But how much you can do that or need to do that varies by person.


Some underwear in carry on in case you stay at the airport. Some cash if your cards fail. Ticket bookings with not the cheapest companies in case you need to rebook the next (significant) leg.


Absolutely agreed. The airline industry has become victim of its own success there in a way. Planes are so amazingly safe and reliable and even punctual by and large that people forget that it’s not just a bus going from A to B, and how miraculous it really is. (Some stand-up comedian had a bit about that, I thought it was George Carlin, but I don’t find it.) ETA: Louis C K, yes, thanks.


Louis CK bit on Conan about people complaining about flying. Certainly relatable reading some passenger complaints regarding this Southwest debacle.

https://youtu.be/kBLkX2VaQs4?t=133


> fly airlines that have a track record of giving hotels for cancelations

In the US those are? United, Delta, and American for the most part don't. They'll look for any reason to claim WX issue so they won't pay out.


That's why I put "and" and not "or" after that part of the sentence. Weather is a big one that they don't usually pay out for. You need to be prepared for that to happen, either by being okay with the airport floor or by having other ways to fund a hotel stay.

FWIW I had United give me a hotel in 2019 after a mechanical issue.


Uh, every airline I know sells round trip tickets at least a day apart. A considerable portion of their business is short, business trips. It is unreasonable for a service to have an SLA that allows outages larger than the duration of the service.

Sure weather is hard to predict but it is good enough to alert you the 3-4 hours before your flight so you would know prior to arriving at the airport. Computer systems can be made fully redundant, etc.

I agree that it's pointless to expect customer service to handle the load during these crises, precisely because these events should either be once-per-decade events or at least be handled in a planned, graceful manner.

What would that look like? For every flight let customers at least check what the probability of delay/cancelation is - the airlines certainly have a number.

When a flight is canceled, book first and business class on alternative airlines/flights. Let economy/other customers select an alternative flight or request a refund via a zero human/robocall interaction process. If more than 50% of the flight's capacity has already checked in then alternative lodging should be mandatory and meals should be provided.

It's painfully obvious that the airlines know that as long as all of them provide terrible service then none of them will be expected to.


You're saying that from the perspective of someone who's not currently stranded at an airport. Your points are valid, but I think the irony is your expectations aren't reasonable.

These are people stuck in strange places who are anxious and uncomfortable. Some of them are probably going to miss important things. Others have to worry about kids/pets/etc. That's not to say they shouldn't be patient (and certainly they shouldn't be taking their frustrations out on the airline staff), but you're demanding rationality from people in the exact sort of situation where it can be really, really difficult for someone to be entirely rational.

They're trying to do what they can (call customer service) and find themselves helpless - that's not a situation where they're going to reason through the economics of call center staffing.


> You're saying that from the perspective of someone who's not currently stranded at an airport.

I'm saying it from the perspective of someone who has been stranded at an airport. It's very easy to chill tf out for a while and wait in line for a while at the service desk to see what you've got to work with. You're not dying, it's not literal fight or flight. You're just stuck for a while.

> Your points are valid, but I think the irony is your expectations aren't reasonable.

I don't think it's reasonable to scream at service reps, which I guarantee will happen to nearly every single Southwest rep today. I've seen it so many times.

And it's the people who are confused why the customer service lines are jammed, those who are irrational, who are more likely to be the screamers. That's mainly who my comment was directed at.


> These are people stuck in strange places who are anxious and uncomfortable.

Those people need to relax a little bit.

Just a generation ago they could have been driving down a remote road, in an unreliable car with no cell phones.

What would they do then?

Well they'd start walking and nobody on earth would know where they were.

Being stuck at an airport, you have something in your pocket you can use to inform everyone of what is going on, handle getting your pets taken care of, etc.


I was in the middle of the second outage today that significantly delayed a flight for me. I was fine.. I was just coming home from a trip so no big deal.

That said. I think people were understandably pretty pissed. I talked to a family that was taking their first vacation in 18 months. He only gets a handful of vacation days a year and due to cancellations their vacation has been ended. Just like that.

If I was them I'd be out of my skull. It'll probably be another full year before they get to try again.


I think that is likely the big difference.

If you're traveling for work, no big deal. The company may be out some money, you may be delayed for a meeting, etc.

If you are traveling for personal reasons, these things are not as easy to just reschedule, push back, or change. The family likely also lost money on reservations, especially if it was a "vacation package", which will likely use the same excuse as Southwest to screw the family out of their money.


Some of the anxiety is probably driven by the fact that other people's expectations of you and your ability to keep to a tight schedule have changed from what they would've been back then, as a result of the improved reliability of these things.


I often see people mad about a delay because they booked a flight in order to get to a meeting an hour or two after arriving when the car trip would be at least a day long.


The retail veneer is a powerful edifice. Folks that don't have any experience with building or operating or manufacturing things at scale really have no conceptual footing on just how difficult it is. As a result, at least in my opinion, there's no reservoir of empathy to tap when things don't go according to plan.


Well dude, they're... stuck. They're stuck in the middle of nowhere. That feels horrible no matter what happens. It happened to me yesterday at MCO.


What? An airport in a US city is hardly “middle of nowhere”. There are toilets. Restaurants. Air conditioning. Lighting, running water and electricity. Bookstores and bars.


>do they expect them to always have enough customer service capacity for when every single customer in the world has reason to call them?

Yes, why wouldn't they, they just canceled their flights.


Hope they recover quickly. Southwest, in my experience, is the local (US) airline with the best customer service by far. And one of the best in the world.


Having flown Southwest hundreds of times, their performance is quite good when things work. However, in my experience, when things fail their performance suffers worse than several other airlines. I think it's because they intentionally run their systems to maximize efficiency, which means less resilience to unexpected failures.

Years ago, they compensated for this with generally lower prices. However, in more recent years, I've noticed their pricing isn't nearly as advantageous compared to competitors. Yes, if you're one of the first 20% to book a flight six weeks in advance, and can accept little flexibility in terms, and are willing to go at a less desirable time, there are still deals to be had. But in any other scenario I've found United, American, Delta, etc to sometimes actually be cheaper, which was previously rarely the case.


A mostly point to point flying network is also harder to recover than a mostly hub and spoke one.


They also have slightly more legroom than United, American, Delta, etc. That alone is enough for me to pick SW when I have a choice.


I will never like their boarding process; it feels so much like a cattle drive to me.


A physical process unable to reach the clouds which has been prevented by a logical process maintained in the clouds.

Who's on first. What's on second. I Don't Know's on third. Where are the clouds?


Next we'll get to hear that modernizing old software systems is too hard, so they're still running COBOL on a mainframe out of a single DC that had a fiber cut or something.


If anything, it's probably a newer part of the system that failed. The old stuff tends to stay around precisely because it's so reliable.


Also because anything new needs to go through a byzantine FAA approval process. That puts a strong bias towards keeping the old (and approved) systems online.


There’s a lot of that stuff out there still. I was speaking to a former colleague a couple of years back and he discovered a NT4 box with a Hayes modem dangling off it doing mission critical EDI stuff in VB4 while doing a site migration. Not come across anything that old yet. Last one was a single irreplaceable VAX doing DHCP for about 200 users about ten years after it should have been in a skip.


I had a consultant buddy ask me about FoxPro last year because he knew I had done some work with it at one point. Apparently all of payroll at some largeish company you've maybe heard of (~15k employees) was done using an ancient FoxPro Windows 3.11 version (2.6? 2.7? memory's rusty) running via compatibility on Windows 7 and they were just getting around to trying to replace it.

In a way people probably think things like that are insane but if you step back and look at it, if it was getting the job done all these years it was decent engineering.


You take your eye off the ball for 20 or 30 years and look what happens (throws hands in air)


The book Release It!, which lists some techniques for enterprise software resiliency, has airline booking systems going down as its first real-world horror story, and I have to say, that is any engineer's nightmare fuel.

I was paying attention after that.


Getting more fragile by the day.

An airline that can’t fly planes due to a third party’s weather app. (Which, for all we know, is dependent on other third-party apps.)

These problems will grow exponentially.

Just wait until Company A goes offline because it relies on Company B, which goes offline due to Company C, which (and this is where the fun begins...) goes offline because of Company A.
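To make the A → B → C → A loop concrete, here is a minimal sketch of how you might spot that kind of cycle if you had the dependency map written down. The company names and the `deps` map are made up for illustration; nobody's real vendor graph is this tidy.

```python
from typing import Dict, List, Optional

def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    color = {node: WHITE for node in deps}
    stack: List[str] = []                 # the dependency path currently being explored

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        stack.append(node)
        for dep in deps.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so we have looped back to it
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(deps):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None

# Hypothetical dependency map matching the scenario above.
deps = {
    "Company A": ["Company B"],
    "Company B": ["Company C"],
    "Company C": ["Company A"],
}
print(find_cycle(deps))
# ['Company A', 'Company B', 'Company C', 'Company A']
```

The hard part in practice is not the graph walk, it's getting an honest `deps` map in the first place, since most of these dependencies live in other companies' infrastructure.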


In a similar vein, I recall reading an anecdote a couple of years ago about an SMS provider and an alerting company that discovered they had a circular dependency on each other: the SMS company didn’t get paged about their service failing because the alert system couldn’t get any messages through.


If it’s a circular dependency then you still need an external event at some point to bring the system down.

I guess what you’re saying is what if the weather app crashes both Southwest and Kayak. And once the weather app goes back online if Southwest and Kayak were both dependent on each other then they couldn’t recover even though the original cause had been resolved.


Yeah. Any number of scenarios.

Of course, in the ‘real’ world, these kinds of interdependencies exist all the time. The difference is the speed and scale with which these glitches propagate.

I mean, you had the same issues in the 1800s with a supply chain. But, a lumber mill in Canada that burns down would not have instantly caused houses to collapse in S. Carolina.

Nowadays, the problems happen fast. And, with brilliant IT types, the problems get resolved fast.

But, boy howdy, I get the feeling, just from reading the angst-ridden accounts from overworked, stressed-out tech types on HN, that we may be approaching a point where the ‘fixers’ are going to get overwhelmed by the bugs.


They were using the darksky api too? Kidding of course, but still sour grapes at Apple.


I’m what you might call an Apple fanboy but man did that tick me off. We’d been integrated with dark sky for so many years and had to migrate to the NWS(?) API. It’s free, but so much more painful to use than darksky. We lost so much functionality in the process. Boo, Apple.


Yeah, the app and the API were amazing. I guess Apple extended the API shutdown, but man, buying it and shutting it down really, really pisses me off.


"We can't get a weather forecast, so we're grounding our national fleet".

Listen, I am sure this is a rigorous, safety-critical procedure. Several comments already show very well some details about flight plans and contingencies, and so on.

So how in the flying fsck are there not multiple service providers for a piece of mission critical data like this? Shouldn't this be something provided by multiple vendors over multiple ISPs and hosted in multiple regions by multiple cloud providers?

How much money did they lose in the past two days because they didn't think this could happen?
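For what it's worth, the basic shape of "multiple providers" is not complicated. A minimal sketch, where the provider names, the fake fetch functions, and the returned fields are all invented for illustration and say nothing about how Southwest's dispatch systems actually work:

```python
from typing import Callable, Dict, List, Tuple

Fetch = Callable[[], Dict[str, str]]

def fetch_with_failover(providers: List[Tuple[str, Fetch]]) -> Tuple[str, Dict[str, str]]:
    """Try each provider in order and return (provider name, data) from the first that works."""
    errors = []
    for name, fetch in providers:
        try:
            return name, fetch()
        except Exception as exc:          # any failure means: move on to the next provider
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all weather providers failed: " + "; ".join(errors))

# Hypothetical stand-ins for real vendor clients.
def primary() -> Dict[str, str]:
    raise ConnectionError("vendor A unreachable")

def backup() -> Dict[str, str]:
    return {"station": "KDEN", "summary": "placeholder forecast"}

name, data = fetch_with_failover([("primary", primary), ("backup", backup)])
print(name, data["station"])
# backup KDEN
```

The code is the easy part. The expensive part is contracting, certifying, and regularly exercising the second provider so that it actually works on the day you need it.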


Since other airlines apparently aren't having an issue, it would appear to be a problem with transmitting the weather data within Southwest's internal dispatch system. The weather data provider is likely up and running.


This is what happens when you treat IT as a cost center. Would love to see how many times the IT team urged higher-ups to prioritize and fund computer system modernization and upgrades.



Dang, no update for 2+ hours. Not a good look. I expect this from a GCP/AWS status page but that’s about it.


> In the meantime, Customer Service wait times might be longer than normal

Might?


Ransomware?


Imagine a foreign adversary attacking flights in midair. Or shutting down energy, transportation, distribution, etc.

We rapidly need to reframe our security posture.


> Imagine a foreign adversary attacking flights in midair.

It would suck to have IT systems intentionally disrupted but I don't think even modern airliners can plausibly fall out of the sky even if they suddenly lose contact with all ground based systems.

I think there are procedures in place even if air-traffic control systems, comms or GPS fail. Last time I checked, pilots can still fly the plane with their eyeballs, on-board radar, compass and ground speed. In fact, I recently saw an article where some senior safety pilot was concerned that with all the nifty automation, today's pilots aren't getting enough practice doing, you know, actual flying of the airplane.


I’m 99% sure that losing the automatic systems would immediately result in the situation foretold here: https://www.gocomics.com/calvinandhobbes/1988/05/15


Sure, but if you don’t use a skill you lose it. There’s a lot of math I aced in school with pencil and paper that I’ve done for years since with a calculator that would be much harder for me now than it was when I originally learned the skill.


> suddenly lose contact with all ground based systems

isn't that one of the more benign scenarios? staying connected but getting fed malicious incorrect info seems much more dangerous.


IoT could be a big issue; it looks like a solution to something no one actually wanted.

Like that Twitter account that collects stuff like a shoe that needs firmware updates.


@internetofshit for the curious. some pretty goofy stuff.


We need to reframe the incentives to reframe our security posture.


y'know, between the oil pipeline, the meat processing plant, and now the airlines, it's feeling very Ministry for the Future lately.


Don't forget the ransomware attack at hospitals in Florida last week. (AFAIK they paid the ransom)


I wonder what kind of computer issue requires them to ground their whole fleet? Can’t the pilots just use ForeFlight for the weather?


For liability reasons, I assume if they can't use their standard systems, they can't fly.

Commercial flight safety is all about standardization, reproducibility, and predictability. Switching the weather source for the pilots violates all three of those principles.


Brings to mind the sinking of the El Faro. The captain was relying on the free version of a paid weather app that gated up to date information. The 12 hours out of date information showed the planned/filed path to be safe, but in reality the captain sailed straight into the center of a hurricane, and sank with all hands.

Many other factors contributed to the sinking, but one of the NTSB bulletpoints was to rely on official weather sources only. I wonder if that directive played a role here.

https://news.ycombinator.com/item?id=15554602

https://news.ycombinator.com/item?id=17160396

https://en.wikipedia.org/wiki/SS_El_Faro


I often wonder if delays like this should also be treated as some kind of safety issue.

Somehow we've decided that any loss of life or injury via a crash is unacceptable. Yet we treat delays (both airline and traffic) that cost millions of human life hours as barely newsworthy.

Would it be better if we considered policies in terms of human life hours saved rather than human lives? Is that already being done?


The risk of loss of a passenger aircraft isn't the same thing as not wearing a seatbelt or being late for a party. Nobody sane would want to be personally responsible for the death of 200 or 300 people in one go. The FAA has a no commercial aviation accident policy and I for one support that policy.


Let's also keep in mind that airlines aren't making the calculation of balancing delay-hours against lives-lost. They are making the calculation of lives-lost and delay-hours against dollars.

Meanwhile, passengers largely assume they won't die regardless of which airline they pick, and happily take absurd delays and unnecessary connections in exchange for lower ticket prices time and time again.


The number of people that are ok dying to prevent "lost human life hours" is probably near zero.


That's an actuarial solution. Not without merit, but IIUC the thinking is the public will not abide a crash where the root cause traces to "yeah it was a little unsafe, but think of the time lost if we'd delayed a day!"


I mean OSHA basically exists because people are really REALLY bad at making the time/disaster tradeoff calculation.


No, OSHA exists because liability doesn't completely internalize the harms of disasters to those responsible for working conditions, so other controls are necessary to prevent there being a rational net incentive to knowingly risk disaster.


The optionality value of not being dead is surprisingly high, meaning that most folks would trade 1 hour of not-death for many hours of not-delayed.


Shouldn't the equation be how many hours of delay you would trade for one hour of not-death?

Taken to an extreme, suppose we found 5 hours worth of airline safety video prior to a flight saved 10 human lives per year as it made it less likely people would evacuate the plane with their belongings.

If you led the FAA, would you mandate each passenger watch this video before each flight in order to save those lives? Put another way, is 5 hours * 5 billion airline passengers / year worth 10 human lives / year?


If I led the FAA, I'd stay the course. Being dead is infinite. Delays due to safety are not.


At some point the total life duration cost of the safety actions exceeds the sum of the extended lives due to the safety actions. In the example above (5 hours * 5 billion passengers = 25 billion hours), 10 people live for under 10 million hours in total (1 million hours = 114 years). You'd have spent 25,000 lives' worth of time to save 10, which is not worth it in the aggregate.
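Spelling the arithmetic out, using the parent's numbers (5 hours of video, 5 billion passengers a year, 10 lives saved) and the 1-million-hour lifetime as a generous upper bound. The variable names are just mine:

```python
HOURS_PER_YEAR = 24 * 365.25            # 8766 hours in an average year

video_hours = 5                         # hours of safety video per passenger (parent's figure)
passengers_per_year = 5_000_000_000     # parent's figure
lives_saved_per_year = 10               # parent's figure
lifetime_hours = 1_000_000              # generous upper bound on one human lifetime

lifetime_years = lifetime_hours / HOURS_PER_YEAR
hours_spent = video_hours * passengers_per_year
lifetimes_spent = hours_spent / lifetime_hours
hours_saved_upper_bound = lives_saved_per_year * lifetime_hours

print(round(lifetime_years))            # 114
print(hours_spent)                      # 25000000000
print(lifetimes_spent)                  # 25000.0
print(hours_saved_upper_bound)          # 10000000
```

So the full 5-hour version costs about 25,000 lifetimes of passenger time per year against at most 10 lifetimes saved. Even at one hour per passenger (5 billion hours) it would still be 5,000 lifetimes, so the conclusion doesn't hinge on the exact multiplier.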


That's assuming that all of the people that aren't traveling have just been put in stasis or something. People waiting for their plane when it's delayed go on about their lives. They're not on plan A, but they're still living.

People who die in a plane crash don't get to execute on plans A through Z.

Even if you run the numbers as a comparison of hours lost waiting for your plane to years lost from somebody's actuarial life expectancy because they died in a crash, it's not a one for one comparison.


Yes, this is a good point, an hour of being dead has zero utility whereas an hour of being delayed has at least partial utility.


When an airline has a crash, its stock price drops 1.58% that day on average. [0] Southwest's stock is down 0.28% today, and it looks like most of that drop happened immediately after the market opened, well before this incident, and may be related to today's stocks being down 0.21% overall. At first glance it appears that investors are entirely unconcerned with this delay.

For whatever reason, we as a society have decided that airliner crashes are scarier than car crashes. Just like we've decided terrorism, nuclear energy, and self driving cars are scarier than they actually are. Should we be this way? Probably not, but this is the way the world is.

[0] https://dx.doi.org/10.2139/ssrn.1785906


Yes, I agree the world works that way, people are poor at risk assessment.

My question, though, is how should it work? What's the optimal tradeoff between delays and lives lost?


I don’t have numbers, but intuitively speaking: Wouldn’t the cost differential in saving lives come down over time as the technology and procedures mature and amortize?

In other words, wouldn’t the calculation need to be treated much like any capital investment with net benefits needing some time to kick in?


You would hope, though, that what is standardized would include redundancy; resiliency has to be a concern at some level.


A system that is wholly dependent upon a third party weather application with no redundancies seems poorly built. I wish we'd get more transparency out of airlines when "computer systems" crash and cause delays. A post-mortem would be nice...


Say the people with the skills and incentives to build these systems all go to FAANGs, where they can make 500k/yr and have the budget to build the system, since that system is the revenue generator.

Everywhere else it is 150k-200k a year with the bare minimum headcount, and you are a cost center and treated like one.

So not surprising to me that every company in the world that has a large physical plant, so already has large single points of failure, doesn’t have redundant data centers or cloud based systems.

Given the number of hacks anything online literally can’t be secured, so they are damned if they do and damned if they don’t.

One answer is to go back to analog systems, which ain’t happening!


I'm sorry, but it is laughable to imply that a team of developers each making 200k a year can't build a stable product. It also isn't like the FAANGs of the world don't have downtime themselves.


I’m in industrial control, and the "move fast and break things", never-ending-stream-of-updates, subscription-software mentality has taken hold of the big players like Rockwell and Schneider.

Mediocre software is eating the world


I was going to say the multiple failures at high profile cloud systems just in the last year is enough to prove that all the FAANG engineers in the world won’t save you.


Why would tech companies be more rational at pricing labor than other companies?

Because that's what it is - if your entire fleet is going down and you could have spent more to hire engineers, you just are irrational at pricing the value of engineering labor.


Yet every other airline is up and running, having (presumably) spent roughly the same on engineers.

It’s always easy in hindsight to say they should have spent more, but opportunity cost is very real, and margins are very slim.


"Yet every other airline is up and running, having (presumably) spent roughly the same on engineers"

That's an odd line of thinking. Other airlines have catastrophic outages too. Just at different times/dates. They do sometimes share 3rd party dependencies, but that happened to not be the case this time.


Seems like a profit center to me if the company is 100% shut down without it.


Actually that's much closer to the definition of a cost center.

A cost is a thing you need to pay for, but if you spend more on it you get very little benefit. Thus it has a binary nature, or at least a nature where "good enough" is all you need. A profit center is something where if you make it better you generate more revenue. Therefore things like necessary infrastructure are a cost, whereas adding more routes is a profit.

Whereas I think you are confusing cost and profit to mean "important" and "unimportant". Not having your headquarters fall down is important. Having HVAC for your office workers is important. But no one is going to say "Let's choose this airline, have you seen how awesome the HVAC in their main office is? And how the foundation to their HQ is going to last a hundred years longer than their competitor?" But they will say "Let's choose this airline because it has a direct flight to where we want to go".


Thank you. In retrospect, I had a feeling something was off about my thinking, but I couldn't articulate where. Importance and profitability are different domains. They might penny pinch weather reports, and risk management probably characterized it as "low risk, high impact", saw "low risk" and underinvested.


Southwest's stock price seems to be unaffected by the outage. Southwest is down 0.28% on the day, but most of that drop happened at market open, before this incident, and is probably related to the S&P 500 being down 0.21% over a similar time period. Southwest's stock stabilized shortly before 10am, and is actually up almost a tenth since then.

Seems to me the MBAs ran the numbers a while ago, and here we are.


The news wasn’t priced in yet. Stocks are usually not priced rationally and there is plenty of room for someone good at performing accurate corporate valuations to become handsomely rich. Case in point: Enphase (ENPH). How can a stock of a predictable solar-parts company be priced at $130 then $220, then $120, then $164? All within a 7 month period. Finally what MBAs are you referring to?


200k a year at these companies is a stretch.


"Software is eating the world."


Seems like another ransomware event.


I'd bet a hundred dollars that it's ransomware.


I just flew in from MIA to BWI (flight 1488 if you're curious) on Southwest last night. I am so sorry for the people affected by this. Yesterday was supposed to be my start date at a company I previously worked for and I had to delay by a day because Frontier Airlines, another garbage low-cost airline, cancelled a flight due to "weather" (there were no weather issues) and then refused to rebook with another carrier because of the aforementioned weather issues. Absolute bullshit.

Frontier was dirt cheap, so ok whatever, but Southwest basically charged 535 bucks for a 2.5 hour flight from major city to major city and then didn't even let people drink on it. They even gave us free alcoholic beverage coupons that were then invalidated. Complete bullshit.

Definitely one of my top 5 least favorite flying experiences. Flying is such a shit show these days. It used to be so calm. I can't believe airlines can charge half a g and also not let you drink. I get stressed a bit during flying and they just chuckled and kept walking. Fuck the airlines. It is egregious that this covid culture has resulted in this insane fear of interacting with other people.

I think when I move to Miami in September I am not going to fly, I'll take a train with the highest possible class just to get around the absolute garbage experience that flying has become. It'll take 24 hours but maybe, just maybe, it won't be Kafkaesque


Several airlines have cut alcohol service because their employees were being abused. They continued to serve alcohol through the height of Covid, so blaming "covid culture" is counterfactual. Airport bars exist. If you can't manage to make it through the horrible travail of 2.5 hours without alcohol, feel free to drink up before boarding. Or bring some on the plane with you. Or pop a Xanax. Finally, $500 to safely fly from one city to another is not expensive. Planes cost >$100 million to buy. Pilots are well trained and expensive. Ground crews, maintenance, flight plans, etc. If you don't like flying - please take the train. Nobody is stopping you. But if you do fly, you'll be happier (and the people around you will be happier) if you stop complaining.


Come on, I'm just saying that banning everyone from a typical experience just because there are a few bad actors makes flying extremely irritating. Maybe ban those people instead of making the experience terrible for the rest of us.


It's more than just a few bad actors. From 06 May 2021:

New statistics from the Federal Aviation Administration show a rapid rise in plane rage incidences.

The FAA says it’s gotten 1,300 reports in the past three months, as the number of passengers remains below pre-pandemic levels.

...

In a typical year, the FAA sees anywhere from 100 to 150 cases - only a fraction of those seen since February.

Source: https://www.msn.com/en-us/news/us/faa-reports-rise-in-plane-...


Thanks for the reference. I just read through parts of it. Two part question:

1) So what is wrong with all these psychos?

and

2) and what does any of this have to do with drinking? All of these cases seem to be people either retaliating against former employers or just not wanting to wear a mask. Banning drinking seems to be a band-aid that doesn't solve the issue at all, and doesn't even address the core issue at hand.


Agreed that it sucks when a few bad actors create the need for general restrictions. But that's certainly the case for why we have airport security. Or, in fact any security - anywhere. I look forward to living in your world where either everyone acts well, or we all just tolerate the minor inconvenience of the "bad actors" running wild.



Yes or ban those people from purchasing alcoholic drinks — depending on the incident.



