We've all seen the news when an airplane crashes. Sometimes there are survivors, and sadly, more often than not, there aren’t.
But what happens when an airplane crashes and no one knows where? How do rescue teams know where to look? How can they recover the critical evidence they need to understand what happened?
In the absence of physical proof, teams turn to data analytics to understand what happened in those final minutes. In the tragedies of Air France Flight AF447 and Malaysia Airlines Flight MH370, data analytics was critical to the search for, and the investigation into the loss of, both aircraft.
First, let’s look at the accidents.
Air France flight AF447 (the plane itself was an Airbus A330 model) was a red-eye flight, leaving from Rio de Janeiro, Brazil on Sunday 31 May 2009, and scheduled to arrive in Paris, France the following morning. Almost four hours after its departure in Brazil, Air France Flight 447 disappeared over the South Atlantic Ocean on 1 June 2009. There were twelve crew members (3 flight crew, 9 cabin crew) and 216 passengers on board.
Malaysia Airlines Flight 370 (a Boeing 777 model) operated from Kuala Lumpur, Malaysia to Beijing, China. The scheduled flight time was 5 hours and 34 minutes from gate to gate. However, on 8 March 2014, the plane seemed to abandon its route almost immediately, going off-course over the Indian Ocean. Less than an hour after take-off, MH370, carrying 12 crew and 227 passengers, made its last radio contact.
Without knowing if anyone survived, teams employed a variety of data sources to help track down the location of each aircraft and began rescue efforts. These included physical factors, conditional factors, and inferential data.
Physical: locating the debris and its condition as well as the physical debris field
Conditional: wind, ocean currents, radio and satellite signals, and time
Inferential: approximate altitude, airspeed, fuel burn, and weight of the aircraft
Unfortunately, neither crash had survivors. Five days after AF447 crashed, recovery crews discovered two bodies and within two weeks, crews recovered 50 bodies over a wide area of ocean. Recovery teams have failed to find any bodies from MH370, although a few pieces of the aircraft itself washed up in Africa.
For both aircraft accidents, after gathering all sources of available data, the primary approach was then to employ Bayesian statistics. Bayesian statistics is a theory in the field of statistics based on the interpretation of Bayesian probability, where probability expresses a degree of belief in an event. Translated: we know a crash happened; we just need to find where.
In a Bayesian search, the key point is to use negative evidence (a search that turns up nothing) as a component to modify the probability distribution, i.e. since the wreckage is not located here, the probability of it being located somewhere else increases. Search teams eliminate “known no’s” and focus in on non-eliminated places that will hopefully contain a “yes.”
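To make the "known no's" idea concrete, here is a minimal sketch of a single Bayesian update after an unsuccessful search. The four-cell grid, the prior values, and the 90% detection probability are made-up illustrations, not figures from either investigation, and the function name is hypothetical.

```python
def update_after_failed_search(prior, searched, p_detect):
    """Return the posterior over grid cells after a search that found nothing.

    prior     -- list of probabilities, one per cell, summing to 1
    searched  -- set of cell indices covered by the search
    p_detect  -- assumed chance of spotting the wreck if it is in a searched cell
    """
    # Likelihood of "nothing found": (1 - p_detect) in searched cells, 1 elsewhere.
    unnormalized = [
        p * ((1.0 - p_detect) if i in searched else 1.0)
        for i, p in enumerate(prior)
    ]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("the search ruled out every cell; revisit the prior")
    # Renormalize so the remaining belief again sums to 1.
    return [u / total for u in unnormalized]


# Hypothetical example: cell 0 looked most likely, was searched, and came up empty.
prior = [0.4, 0.3, 0.2, 0.1]
posterior = update_after_failed_search(prior, searched={0}, p_detect=0.9)
```

The searched cell does not drop to zero, because an imperfect search can miss wreckage that is really there. Its probability shrinks, and the unsearched cells absorb the difference in proportion to their priors.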
Using prior data (i.e. data obtained from the plane before the crash), teams have a certain probability distribution for the location of the wreckage. What did the satellite navigation systems say and when? What were the weather patterns that night? What was the communication from the pilots? Even the formation of a debris field, compared against wind drift and ocean currents, played a significant part in the eventual location of the aircraft.
Along with prior data, teams used SONAR and canvassed underwater. When their efforts continued to fail to find the wreckage, a US company conducted a Bayesian analysis of the available data. By eliminating the “no’s”, teams found Air France flight AF447’s wreckage, on 2 April 2011, nearly two years after its crash, just off the coast of Brazil, which was close to the final reported location of the aircraft.
Last known position of the aircraft, intended flight path and a 40 NM circle.
All floating debris (found between 6 and 26 June 2009), last known position and wreckage site.
The Bayesian posterior distributions, which took into account the detection effectiveness of each earlier search component, formed a solid basis for planning the next search phase. In fact, the 2011 search commenced in the center of the distribution area and found the wreckage within a matter of weeks. This methodology has also been instrumental in the search for Malaysia Airlines Flight 370.
When MH370 vanished on 8 March 2014, teams immediately began to search for the missing aircraft. The search continued for a whopping 1,046 days until the Governments of Malaysia, Australia, and the People’s Republic of China suspended the search in mid-January 2017.
MH370’s disappearance remains shrouded in mystery as it appears that someone intentionally turned off the transponder and satellite uplinks, meaning someone purposely eliminated data sources from tracking the flight. Furthermore, unlike the Air France accident, there is very little physical data available.
Teams never saw floating debris, so the use of wind drift is relatively meaningless.
Side scan sonar, towed array sonar, synthetic aperture radar, and the use of underwater autonomous vehicles have come up empty-handed, despite mapping over 710,000 square kilometers of the Indian Ocean.
The data available from the accident flight consists mostly of ground-to-air communication messages at approximately one-hour intervals. It also contains a single ping from the First Officer’s cellphone.
So, what can teams use to find the wreck? Bayesian methods.
For roughly six hours after the aircraft passed the final radar point, Inmarsat was able to use data analysis to determine the approximate flight path by looking at the time, length, and frequency of each ping and communication (called a handshake, or “burst,” in the aviation world).
By plotting these bursts onto a map, government officials identified seven rings as the basis for deriving a possible location.
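The ring idea can be sketched with basic geometry: the timing of a handshake implies a distance between satellite and aircraft, and every point at that distance forms a circle on the Earth's surface around the spot directly beneath the satellite. The following is a simplified illustration only, not the method investigators actually used. It assumes a spherical Earth, a stationary satellite at geostationary altitude, and a clean round-trip delay with no equipment delays to subtract. The function name and the sample delays are hypothetical.

```python
import math

C_KM_PER_S = 299_792.458    # speed of light
EARTH_RADIUS_KM = 6_371.0   # mean Earth radius (spherical-Earth assumption)
GEO_ALTITUDE_KM = 35_786.0  # nominal geostationary altitude above the equator


def ring_radius_km(round_trip_s, aircraft_alt_km=0.0):
    """Surface distance from the sub-satellite point to the ring of possible
    aircraft positions implied by one round-trip signal delay."""
    slant = C_KM_PER_S * round_trip_s / 2.0   # one-way satellite-to-aircraft range
    r_air = EARTH_RADIUS_KM + aircraft_alt_km
    r_sat = EARTH_RADIUS_KM + GEO_ALTITUDE_KM
    # Law of cosines in the triangle Earth center / aircraft / satellite.
    cos_theta = (r_air**2 + r_sat**2 - slant**2) / (2.0 * r_air * r_sat)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return EARTH_RADIUS_KM * math.acos(cos_theta)


# Hypothetical delays (seconds) for three handshakes; not real MH370 values.
example_rings = [ring_radius_km(d) for d in (0.245, 0.250, 0.255)]
```

A longer delay means the aircraft was farther from the point beneath the satellite, so the ring is wider. A single ring says nothing about where on the circle the aircraft was, which is why the frequency of each burst and the aircraft performance assumptions matter for narrowing the arc.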
Using Doppler analysis of the signal frequency and assumed locations based on aircraft performance, all of the data points indicate that the aircraft traveled at or beyond a general location, shown below.
Ruling out the areas of lowest probability creates a new, refined area of interest. These plots are based on performance values provided by Boeing, looking at autopilot settings, normal handling of aircraft, and a high-performance flight track.
Although reduced by a factor of seven, the search area still exceeds 100,000 square kilometers and reaches depths few have ever encountered. Data scientists remain hopeful that the aircraft will be discovered soon, and authorities hope to someday understand the disappearance of MH370.
By leveraging all available data sources, rescue teams and aviation experts can focus their efforts on active recovery and reconstruction, as well as on developing future risk mitigation strategies and better understanding human factors. Data usage is critical in tying theory and the hypothetical to reality through the creation of actionable insight.