Flying to the Moon from the Backroom — Mission Control
Lesson XVIII from the lunar landings
The most famous words spoken on the Moon begin, “That’s one small step…”, but close behind are “Houston, Tranquility Base here. The Eagle has landed.”
For Neil Armstrong, astronaut and engineer, the actual landing of the spacecraft was personally of more importance and historical relevance than his first steps. After all, he was first and foremost a pilot…
“Pilots take no special joy in walking: pilots like flying. Pilots generally take pride in a good landing, not in getting out of the vehicle.”
— Neil Armstrong
While “one small step” was directed at all of humanity, when Armstrong gave the “Houston” quote, he was addressing the rest of the NASA team back on Earth. I have already described parts of the Mission Operations Control Room (aka Houston aka Mission Control) in a previous article, but it’s worth going into some further details.
Everything in space flight was pre-planned down to the minute — when the astronauts flipped each switch and when the spacecraft made any maneuver.
While the astronauts performed the mission in space, there were endless tasks performed on Earth to make sure that the mission succeeded. Some of these were reactive tasks, such as monitoring the state of the spacecraft and making corrections (or directing the astronauts to make them) whenever a problem came up. Others were proactive or scheduled tasks, such as periodically making changes to the spacecraft to avoid potential problems. The Apollo 13 disaster was triggered when Mission Control directed the astronauts to stir the liquid oxygen tanks — ordinarily, a routine task designed to smooth out the liquid and make it easier to measure. Another type of task was predictive — the flight controllers would constantly be recalculating the next best action necessary, depending on the recent changes in the mission.
If, for example, a controller performing a reactive monitoring task saw that the spacecraft was slightly off course, he would direct the astronauts to correct the course. That correction would consume more fuel than originally planned, which in turn would trigger a predictive task: checking that the next steps in the mission plan were still correct, since the spacecraft was now flying at a slightly different angle, with a different weight and less fuel than planned for this exact moment.
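The chain described above, where a reactive correction changes the spacecraft's state and so forces a predictive re-check of the plan, can be sketched in a few lines of Python. This is only an illustration: the class, the function names, the tolerance and the fuel figures are all hypothetical and have nothing to do with real Apollo flight rules or software.

```python
from dataclasses import dataclass

# Hypothetical values, chosen only to make the example readable.
COURSE_TOLERANCE_DEG = 0.5   # allowed deviation before a correction is ordered
FUEL_PER_DEGREE_KG = 4.0     # assumed fuel cost of correcting one degree


@dataclass
class SpacecraftState:
    course_error_deg: float  # deviation from the planned course
    fuel_kg: float           # propellant remaining


def reactive_check(state: SpacecraftState) -> bool:
    """Reactive task: is the spacecraft off course beyond tolerance?"""
    return abs(state.course_error_deg) > COURSE_TOLERANCE_DEG


def correct_course(state: SpacecraftState) -> SpacecraftState:
    """Remove the course error, paying for the correction in fuel."""
    burned = abs(state.course_error_deg) * FUEL_PER_DEGREE_KG
    return SpacecraftState(course_error_deg=0.0, fuel_kg=state.fuel_kg - burned)


def predictive_check(state: SpacecraftState, planned_burns_kg: list) -> list:
    """Predictive task: which planned burns, in order, are still affordable?"""
    remaining = state.fuel_kg
    feasible = []
    for burn in planned_burns_kg:
        if burn > remaining:
            break  # this burn and everything after it must be replanned
        remaining -= burn
        feasible.append(burn)
    return feasible


def monitor_step(state: SpacecraftState, planned_burns_kg: list):
    """One monitoring cycle: correct if needed, then re-validate the plan."""
    if reactive_check(state):
        state = correct_course(state)
    return state, predictive_check(state, planned_burns_kg)


if __name__ == "__main__":
    plan = [60.0, 30.0, 20.0]
    new_state, feasible = monitor_step(SpacecraftState(2.0, 110.0), plan)
    print(new_state, feasible)
```

With these made-up numbers, the plan fits exactly when the spacecraft is on course, but a two-degree correction uses enough fuel that the final burn no longer fits and has to be replanned.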
Each station in Mission Control was managed by a different team of engineers, responsible for different aspects of the mission. They worked in shifts so that the station (or console) would be occupied throughout the mission. Specific stations would be unoccupied if not relevant to the mission (for example, BOOSTER was no longer needed after the Saturn rocket had completed its mission to launch the Apollo spacecraft to the Moon).
Today’s Mission Control is similar in concept, but with different consoles dedicated to the different systems in the International Space Station (the telephone booths for the press are presumably no longer present).
The concept of Mission Control managing a flight is very similar to that of a modern war room managing a major operational incident — in both cases, we bring engineers together from different domains to solve a problem as quickly as possible.
The engineers sitting in the heart of Mission Control were not the only ones involved in solving problems as they occurred during a flight. In fact, the flight controllers would usually perform the initial triage and damage control, making sure that the flight could continue despite the problem, and then hand over responsibility for actually solving it to other teams in the building, which would find the longer-term resolution.
Each station in Mission Control had a “backroom” where many more subject matter experts could sit and investigate the problem. Since they had rooms and were not behind a single console, they had more access to documentation and necessary tools.
The two pictures above both show engineers and scientists examining seismographic information from the Moon (part of the Saturn V, which launched Apollo 14, was purposely crashed onto the Moon as a calibration experiment for the seismographs). On the left, in the Mission Operations Control Room, the information is limited to a single CRT display, while in a support back room in the building, the scientists can more easily examine, compare and get insights from the available data.
And so it went for other stations in Mission Control. When the flight dynamics officer needed advanced calculations, he did not go down to the IBM mainframes on the lower floors of the building himself but had his support team perform the calculations for him. What he needed were the results and the insights which the computer could generate, not to operate the computer himself.
Many flight controllers began their careers in the back rooms, supporting their more experienced peers before they performed the same role in future flights. Other support engineers came from the companies which had built the spacecraft components for NASA, such as North American Aviation, Boeing, Grumman, IBM, MIT, and many more.
Communication between the flight controllers and the back room was mostly by voice. Each controller had a headset and would listen to multiple simultaneous conversations: between himself and his backroom team, between the astronauts and the ground, between the flight director (who was orchestrating everyone) and the other controllers, and more. It was a cacophony of information that they trained to manage. Another, more dramatic, form of communication and collaboration was a series of pneumatic tubes: the controllers could roll up a written message or computer printout, place it in a plastic tube-like container and “shoot” it to another room in the building using a series of pipes and air-powered launchers.
If the controller saw something interesting on their console, they could press a button and, down in the basement, a printer would spit out a copy of their screen at that moment on thermal paper. The paper would then be loaded into a “p-tube” and sent to the console.
This would allow the flight controller to compare and contrast current data available on the console with historical data. Still, a deeper investigation would depend on the backroom support team — there was simply too much to do and not enough computing power available to manage all the work from the single console.
In today’s world of operations, we aim to reduce the number of people involved in problem-solving, not raise it! We do this by leveraging solutions that were not available to NASA engineers in the Sixties: more powerful computers and AI capabilities. The improvements in AI Operations (AIOps) in the last few years mean that, more and more, we can use AI solutions to give us insight into problems in our production environments instead of needing to consult with other experts. With AIOps, a computer can tease out insights and find anomalies and aberrations much earlier than a human could by interpreting the same signals.
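To make "finding anomalies" a little more concrete, here is a minimal sketch of one classic technique, a trailing-window z-score: each new sample is compared with the mean and standard deviation of the samples just before it. This is not how Instana or Watson AIOps work internally (the article does not say, and real products use far more sophisticated models); the function name, window size, threshold and latency figures are assumptions made up for illustration.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return the indices of samples that deviate from the trailing
    window's mean by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change at all as anomalous.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Hypothetical response times in milliseconds, with one obvious spike.
    latencies_ms = [100, 102, 99, 101, 100, 103, 98, 250, 101, 100]
    print(find_anomalies(latencies_ms))
```

On this made-up series, only the 250 ms spike (index 7) is flagged; the ordinary jitter around 100 ms stays under the threshold.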
While NASA’s flight controllers had no choice but to depend on their fellow humans, today’s Site Reliability Engineers have solutions such as Instana and Cloud Pak for Watson AIOps, which act as their backroom support.
By leveraging AI, SREs are freed from reactive monitoring tasks, in which they manually watch for problems as they occur. The algorithms and machine learning models embedded in the products do that automatically. Although not discussed in this article, the AI is not only reactive but also performs the proactive and predictive tasks, giving the engineers recommendations on the best solution for a given problem. This allows more time for deeper thought, with less toil and less searching for documents.
And instead of forcing the SREs to handle multiple conversations simultaneously, Watson AIOps makes its insights available in a “virtual war room,” making it a natural part of the SRE’s tool chest instead of “yet another tool to use.”
In further articles, I’ll drill down into some more aspects of Mission Control and show how a solution like Watson AIOps can make life so much easier for modern-day engineers.
If you can’t wait and want to learn more now, then you can go to https://www.ibm.com/watson/aiops-overview/ and https://www.ibm.com/products/watson-aiops for further information.
Alas, Watson AIOps does not include pneumatic tubes, so you’ll need to find a different way to propel paper at a neighboring engineer…
Learn more at www.ibm.com/garage