Three minutes without breathable air
Lesson XIII of the Lunar Landings
As described in an earlier article, a little over 50 years ago, on April 14th, 1970, an explosion blew out the oxygen tanks of Apollo 13’s command/service module Odyssey when the crew was halfway to the Moon.
Instead of landing on the Moon, the astronauts would be in survival mode, flying a spacecraft that could just barely keep them alive for three and a half days until they landed safely back on Earth.
During the trip back to Earth the Odyssey was placed in hibernation mode while the astronauts lived in the smaller Aquarius module, which had been designed to support two astronauts for the two days it would take them to land on the Moon. Supporting three astronauts for three and a half days would be stretching the available resources to their breaking point.
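To get a feel for how far the resources were being stretched, here is a rough back-of-the-envelope sketch using only the figures in the paragraph above. The "person-days" unit and the `person_days` helper are my own illustrative assumptions, and real consumables do not scale perfectly linearly with crew size and time, so treat the result as an order-of-magnitude comparison rather than an engineering number.

```python
# Rough comparison of Aquarius' design budget with what the return trip demanded.
# Figures are taken from the article text; "person-days" is an illustrative unit,
# not an official NASA measure.

def person_days(crew: int, days: float) -> float:
    """Crew size multiplied by duration in days."""
    return crew * days

designed = person_days(crew=2, days=2)      # two astronauts for two days
required = person_days(crew=3, days=3.5)    # three astronauts for three and a half days

# How many times larger the demand was than the design case
stretch_factor = required / designed

print(f"designed for {designed} person-days, needed {required}")
print(f"demand was roughly {stretch_factor:.1f}x the design budget")
```

By this crude measure the lifeboat was being asked to deliver well over twice what it had been designed for.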
There is a common rule of thumb which states:
You can survive three minutes without breathable air.
You can survive three hours in a harsh environment.
You can survive three days without drinkable water.
You can survive three weeks without food.
Food was not a concern (certainly compared to the other issues). There was just enough water to last the astronauts if they rationed themselves, and NASA’s engineers figured out how to conserve enough battery power by shutting down some of Aquarius’ systems while keeping the astronauts alive.
Multiple other issues had to be solved before the astronauts returned safely — navigation issues, spacecraft maneuvering issues, re-animating Odyssey before the landing, and more.
But the problem which loomed largest was whether there would be enough air for the astronauts to breathe. While the oxygen supplies would last the trip, one key function of the spacecraft was filtering out the carbon dioxide the astronauts were constantly exhaling into the confines of the capsule. Both Odyssey and Aquarius were equipped with filtration systems (known as scrubbers) which cleansed the air in the spacecraft and kept the carbon dioxide levels low. Whenever carbon dioxide built up in the filtration system and the filters became saturated, they needed to be replaced. The active compound was lithium hydroxide and it was stored in replaceable canisters.
Therein lay the problem — the active system was that of Aquarius, and its lithium hydroxide canisters were stored in external lockers, only available once the spacecraft landed on the Moon. And while the canisters on the Odyssey were accessible, they had never been designed to fit in Aquarius and were the wrong shape to fit anyway.
While NASA’s engineers had tried to plan for every conceivable contingency, dealing with an explosion of the magnitude which hit Apollo 13 was not one of them. As a matter of fact, during planning sessions before the flight, they had concluded that an explosion capable of damaging the spacecraft to the extent to which Apollo 13 was damaged would destroy it completely, and that there was no point training for how to return the astronauts in that case!
But, against the odds, Apollo 13 had survived the explosion and the three astronauts were hurtling back to Earth — and NASA’s engineers needed to find a way to enable them to breathe till they got back.
Each issue had a team responsible for solving it, and the team in charge of the carbon dioxide issue racked their brains to literally “fit a square canister in a round hole”! Fortunately, NASA was prepared for any eventuality and they had practiced something which, while not the same contingency, was close enough to work.
During the preparation for Apollo 8 in 1968, one of the scenarios they had trained for was that of failure in the carbon dioxide filtration system. In that case, the problem had not been a complete shutdown of the spacecraft, but a minor mechanical failure in a bolt or a screw which merely meant that the canister could not be changed. During the simulated exercise, the engineers managed to design a jury-rig to use a hose and vacuum system to hook up a square canister to a square hole. Now, in a real-world emergency, they dusted off this playbook and tried it again with the difference that the canister and hole now belonged to two different spacecraft modules.
Using some understatement, astronaut Ken Mattingly explains:
On 13 someone says, “You remember what we did on that [training simulation]? Who did that?” So in nothing short, [they] showed up, and we talked about “How did you build that bag and what did you do?”
Oh, it was easy. Solving that problem took an hour, maybe two. Because it’s real now, they made him build a demonstration model, so that took another thirty minutes… Of course it worked like a gem.¹
While I am presenting the creation of the new system as simple, it was most assuredly not. But the team of engineers (led by Ed Smylie²) had the benefit of confidence that they had already solved an equivalent problem during the training periods before flights.
The resilience and reliability of Apollo (after all, the astronauts did survive a crippling explosion and returned safely to Earth) were a result of both the technology and engineering of the spacecraft and, even more importantly, the skill and preparation of the human factor. Culture, people, and processes were even stronger than the steel of which the spacecraft was made.
The Apollo engineers did not have a contingency plan for every possible eventuality. What they did have was enough planning and enough familiarity with the systems that they could adopt and adapt an existing routine and make it work for other scenarios too. The rescue of Apollo 13 was a master class in the value of thousands of training hours, which had produced mountains of documented problems and their solutions.
The fact that the astronauts themselves had been involved in developing these contingency plans together with the engineers who remained on the ground meant that, in the hour of crisis, the two teams had already developed a common language and communicated easily and smoothly.
If you look at the recently published IBM Cloud DevOps Reference Architecture, you’ll see how this teamwork and continuous testing as performed by NASA engineers and astronauts on their way to the Moon can be reflected in modern DevOps practices too. Testing every step of the way ensures that when the unexpected occurs, you and your team will be ready to solve the issues uncovered.
Future articles will discuss some of the other issues the astronauts needed to overcome, such as navigation (how do you aim the spacecraft when the computers are offline?), restarting the hibernating Odyssey (one of the very few tasks for which there was no contingency plan to adopt) and others.
If you’re interested in watching and listening instead of just reading, I’ll be presenting some of these lessons at an upcoming IBM conference:
PREVAIL 2020 is organized by and for technical IT professionals who are passionate about IT resilience, security, and performance. We offer a large variety of keynotes, breakout sessions, panels, workshops, and posters from the best subject matter experts in the field.
In recent times we have seen a growing interest in Agile ways of working, DevOps pipelines and toolchains, microservices development and deployment onto cloud-native container-platforms. The promises of these new methods and computing paradigms are manifold but as solution complexity increases the non-functional aspects might suffer.
PREVAIL 2020 will be held on a follow-the-sun basis starting September 14th in the Americas and running for three consecutive days. All sessions will be recorded so you don’t need to miss anything.
For future lessons and articles, follow me here as Robert Barron, or as @flyingbarron on Twitter or LinkedIn.
1. NASA Johnson Space Center Oral History Project, Thomas K. Mattingly interview, 2001
2. Jim Lovell & Jeffrey Kluger, Apollo 13, 1995