Airlines Must Improve IT Infrastructure Now or Pay Later

Opinion

At the tail end of this past holiday season—on one of the busiest travel days of the year—thousands of arriving international passengers found themselves stuck for hours in the line at customs, waiting to be processed.

It wasn’t a terror alert, mechanical failure, or nefarious cyberattack that caused long lines and huge delays for travelers on January 2nd. The mundane truth? The processing system of U.S. Customs and Border Protection experienced a four-hour outage due to what was described as a “technical glitch.”

An outage doesn’t have to be maliciously caused for the effects to be catastrophic. The very word “glitch” seems insufficient to describe anything capable of grounding air traffic and flooding terminals with hundreds of frustrated travelers. It happened again on January 22, when an outage of United Airlines’ computer systems, caused by a sudden shortage of bandwidth, grounded all domestic flights for over two hours.

While the threats of terrorism and cyberattacks are likely to remain a reality for the transportation industry in 2017, a far less heralded danger, and one even more likely to occur, is the outage caused by the sort of routine technical misconfiguration that can affect virtually any IT environment of any size. Given the increasing reliance on information technology for processes such as airline flight scheduling and customs control, outages of such systems have never had greater potential to wreak havoc.

How then can airlines begin to address these technical shortcomings and win the battle against outages? The cold truth is that airlines must pay now to equip their infrastructure with integrity monitoring capable of giving a full view of every faulty router or malfunctioning server.

If not, they will pay a steep price later, as they try to restore consumer confidence and market share.

Airlines should proceed as if outages are an inevitable cost of doing business and work to improve the speed and agility of their response to a breakdown, making that cost as low as possible.

Airline IT infrastructures can be riddled with misconfigurations when they combine layers of legacy systems such as Sabre, the airline booking computer system which dates back to the early 1960s.

Other ingredients in the recipe for outages are a lack of testing for the effects of planned changes, and dueling processes which, in addition to duplicating work, might create lethal contradictions and misconfigurations.

Certainly, digital processors do the complex, arduous work for airlines remarkably well most of the time, and on a time-sensitive, constantly changing basis. But it only takes one bad outage to bring air travel to a halt and sink a fiscal quarter.

It is in the transportation industry’s best financial interest, to say nothing of its customers’ safety and wellbeing, to take every reasonable precaution against a damaging outage.

The challenge can certainly seem daunting, particularly for an industry only now beginning to invest in improving its IT infrastructure. But the challenge can be met, and it’s well worth the effort.

Airlines and public agencies servicing transportation, such as U.S. Customs and Border Protection and the Transportation Security Administration, should develop full insight into their IT infrastructures so that they can identify any budding misconfigurations and remediate them before they cause trouble. That is a far cheaper investment than the one airlines will make when furiously attempting to unravel a system outage that is causing thousands of flight cancellations.
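For readers who want a concrete picture of what identifying a budding misconfiguration might look like, here is a minimal sketch in Python. It assumes a system's settings can be captured as simple key-value snapshots and compares a known-good baseline against the current state. The function name `find_drift` and the router settings are hypothetical, invented purely for illustration, and real monitoring products do considerably more than this.

```python
from typing import Any, Dict


def find_drift(baseline: Dict[str, Any], current: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Compare a known-good baseline snapshot with the current snapshot.

    Returns settings that were added, removed, or changed since the baseline.
    """
    drift: Dict[str, Dict[str, Any]] = {"added": {}, "removed": {}, "changed": {}}

    # Settings present now that the baseline never had
    for key in current.keys() - baseline.keys():
        drift["added"][key] = current[key]

    # Settings the baseline expected that have since disappeared
    for key in baseline.keys() - current.keys():
        drift["removed"][key] = baseline[key]

    # Settings present in both snapshots whose values differ
    for key in baseline.keys() & current.keys():
        if baseline[key] != current[key]:
            drift["changed"][key] = {"expected": baseline[key], "actual": current[key]}

    return drift


# Hypothetical router settings, made up for this example
baseline = {"mtu": 1500, "failover_enabled": True, "firmware": "1.4.2"}
current = {"mtu": 1500, "failover_enabled": False, "firmware": "1.4.2", "debug_port": 8080}

report = find_drift(baseline, current)
for kind, items in report.items():
    for key, value in items.items():
        print(f"{kind}: {key} -> {value}")
```

In this made-up case the report would flag that failover has been switched off and that an unexpected debug port has appeared, the sort of quiet change that is cheap to fix on a Tuesday afternoon and very expensive to discover during an outage.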

As United has learned, the fallout from such a failure can bring down serious regulatory scrutiny, but this is hardly the most damaging effect. Any suspension of service can not only be hugely expensive, but can also severely degrade a name-brand reputation. A 2012 outage at online retailer Amazon likely cost the giant firm as much as $66,000 of lost revenue per minute of downtime. Rather than a mere annoyance, outages caused by preventable system issues can affect a company’s bottom line as devastatingly as any attack.

Configuration drift is the unfortunate reality of any large-scale system, such as that of U.S. Customs and Border Protection. As the problems of maintaining and servicing large-scale digital processes have become increasingly widespread, the risks that misconfigurations pose have only grown. A mere five-hour power outage at Delta Air Lines’ Atlanta headquarters in August 2016 cost the carrier at least $150 million in lost revenue and the cancellation of over three thousand flights. A July 2016 outage at Southwest Airlines, caused by a faulty router, accounted for as much as $82 million in losses for the budget company, with more than two thousand flights canceled. And in April 2013, a failure of American Airlines’ computerized booking system forced the grounding of all of the fleet’s flights, a massive outage likely caused by unplanned system changes or a malfunctioning third-party update.

As with Delta and Southwest, this last outage wasn’t merely a customer service nightmare, but also posed a major threat to the bottom line. Emerging from bankruptcy and in the process of merging with US Airways, American Airlines was attempting to recover its footing when the outage came. As explained by the Associated Press, these outages all seemed born of some shared problems:

“Airline technology systems have hundreds of programs that are often of different ages and sources and are layered on top of each other. After recent outages at other carriers, outside experts have questioned whether airlines have enough redundancy in their systems and test the systems frequently enough.”

It doesn’t take a team of skilled, malicious hackers to do what a neglected software update, misconfiguration, or faulty patch can do to an airline. But the effects may be identical. By focusing on how to prevent such outages proactively, and making the business decision to invest in information security and intelligent system management, firms in the airline industry, and every industry for that matter, can minimize the risk posed by factors entirely under their control.

Mike Baukes is co-CEO of UpGuard, a cybersecurity company based in Mountain View, CA, that helps businesses identify risks and prevent breaches. Follow @mikebaukes
