The 84th General Meeting Featured Presentation
James F. Reilly, II, Ph.D.
Managing Safety in Spaceflight: Recognizing Narrow Margins
The following presentation was delivered at the 84th General Meeting Monday afternoon session, April 27. It has been edited for content and phrasing.
Astronaut James Reilly retired from NASA in 2008, but not before experiencing a stellar career that saw him log over 853 hours in space. He also conducted five spacewalks that totaled more than 31 hours. Born in Idaho, Dr. Reilly received his master’s of science degree and doctorate in Dallas from the University of Texas. From 1980 until his selection for the astronaut program in 1994, he was employed as an oil and gas exploration geologist. During this time he was actively involved in the application of new imaging technology for industrial applications in deep water engineering projects and biological research. As part of his professional responsibilities, Dr. Reilly spent 22 days in deep submergence vehicles operated by the Harbor Branch Oceanographic Institution and the United States Navy. As an astronaut, Dr. Reilly flew on three shuttle missions. His 2007 flight on the Atlantis was the 118th shuttle mission and the 21st mission to visit the International Space Station. The successful construction and repair mission involved five astronauts with Dr. Reilly and accumulated nearly 15 hours during two spacewalks. The mission returned after traveling 5.8 million miles in fourteen days.
Dr. Reilly: You might be asking the question, “What is an astronaut doing here with boiler inspectors?” Surprisingly enough, a lot of what astronauts do involves pressure vessels. The theme of your conference, “Safety, a Commitment for Life,” has a double meaning, because safety to us is all about margins. And we are going to talk about how you increase those margins, ways to do it, and what techniques to use. We are going to talk about continuous improvement, which is something that NASA grew up doing, so it became part of our DNA. When I went into my consulting job working with companies like General Mills, Coca-Cola, White Wave, and others, I found they are also constantly looking for ways to improve their processes. Five cents on a box of Cheerios is a big deal in terms of the percentages that they can recover. So they are always looking for ways to make things better, cheaper, faster, safer. Safer is a big piece of what they are concerned about.
At NASA, margins were all about being in space and understanding what our margins were. We had very narrow margins so we had to manage those margins very, very well. How do we increase those margins? It's all about the people, teamwork, and integrity.
The very first thing I'm going to do is to put up the big picture. We are an ordinary planet around an ordinary star in the suburbs of a fairly ordinary galaxy. This is the Andromeda Galaxy, and up there in the upper left is where Edwin Hubble, back in 1923, saw seven variable stars, which are stars that have a known intrinsic brightness, so they flash at the same intensity no matter where they are. Because apparent brightness falls off with distance as 1 over r squared (1/r²), that relationship told him how far away the galaxy was, and Hubble demonstrated that it was nowhere near as close to us as people thought.
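As a rough illustration of the inverse-square reasoning above, the sketch below compares two stars assumed to have the same intrinsic brightness. The function name and the sample brightness values are illustrative assumptions, not figures from the talk.

```python
import math


def relative_distance(flux_near: float, flux_far: float) -> float:
    """Return how many times farther away the fainter star is.

    Assumes both stars have the same intrinsic brightness, so observed
    brightness (flux) scales as 1 / r**2 and r scales as 1 / sqrt(flux).
    """
    if flux_near <= 0 or flux_far <= 0:
        raise ValueError("flux values must be positive")
    return math.sqrt(flux_near / flux_far)


# A star that looks four times fainter is twice as far away.
print(relative_distance(4.0, 1.0))  # 2.0
```

Only the ratio of the two brightness measurements matters here, which is why a star of known intrinsic brightness works as a distance marker.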
That was the beginning of the cosmology that we understand today, and that was in 1923. There are people who are still alive today who were alive in 1923. Everything we know about the cosmos happened in the span of basically one lifetime. All of the space activities we have been pursuing have been in one lifetime. My grandmother, who came from Ireland to the United States in 1922, one year before Edwin Hubble, took the fastest mode of transportation of the day, which wasn't the Sultana, but it was a ship. She crossed the North Atlantic in five days and arrived in New York and lived there her entire life. But she did live long enough to see her grandson repeat that journey going the other direction in 18 minutes. And so it's amazing how far we have come in one lifetime. We try to get that point across to kids, and it's one of my big passions.
A little about my background: surprisingly, no matter how far I tried to get away from engineering, it kept coming back into my life. I started out in aerospace when I first went to school and found math was challenging, and decided I didn't really need math to be a fighter pilot. So I went into sciences fully intending to be a fighter pilot, test pilot, and astronaut. Well, that didn't work initially, so I went to work in the oil business, and then I started getting engineering assignments. I have had engineering assignments most of my life. I started in the oil industry, looking at deep-water materials, but also when we were out flying in space, we had to do other jobs, and most of my jobs were engineering on the space station. My last job was working as part of the design team for the Orion, which was the last time I did a test flight.
[Video presentation] Let's go into space together. I'm going to take you on the shuttle, and we are going to watch. At this point we have been on our backs for about two hours. The three main engines start at six seconds before launch. Now we can feel this rumble come through the spacecraft. You can see a plume shoot out from behind us. And then we go through the countdown to zero. Once the solid rocket boosters are lit at zero, the force kicks you back in your chair and you're starting to sink in. We are off that pad; we are already doing about 120 miles per hour going straight up. As you can see from the sonic shocks inside the solid boosters, it's a pretty rough ride for the first two-and-a-half minutes. The sun you see coming up across the Atlantic is going to set in 45 minutes for you, so one of your first big challenges is your perspective. Right here at one minute, we are going supersonic, and that's a three-and-a-half million pound machine going supersonic in one minute, going straight up.
We are at two minutes and seventeen seconds, and we are going to burn the propellant out of those two external boosters that you see on either side. We are going over 150,000 feet, five times the speed of sound. And then there is going to be a big flash and bang, and away the solid boosters go. And for the next six-and-a-half minutes, we are going to burn the hydrogen and oxygen out of the big orange external tank attached to Atlantis' belly right there. This is a view from our camera that looks down that tank, and you are looking right at the underside of the shuttle.
Just offshore Virginia at about Mach 16, we're rolling heads up so we can now talk through geostationary communication satellites back to Houston. At eight-and-a-half minutes, we are just offshore New York, 800 miles down range, 122 miles in altitude, doing 17,500 miles per hour. And now we turn our rocketship into a spacecraft, and that's going to be our home away from home for the next two weeks while we are in space.
Now, how do we make all of this stuff work? What keeps us safe? It's the same types of things that you probably see in your work.
What keeps us safe: very deterministic processes that are controlled by procedures, and oftentimes, the tools we use to help drive those things; the concept of continuous improvement also keeps us safe. But the most important aspect of all of this is the people: the people that we work with, the people that we train with, and the people that you, no kidding, literally trust your life with whether they are flying with you or if they are on the ground. And integrity is a key part of all of this.
But first let's step back a little bit. When I was thinking about coming and talking to you, I realized I have actually spent a fair bit of my life inside pressure vessels. And here are a couple of them. At the upper right is Johnson-Sea-Link. It's literally a five-inch-thick Plexiglas bubble. I'm sitting in that bubble on the left. We took it down to 3,000 feet of water depth with about 100 atmospheres of pressure on the outside of the sphere. And it's interesting because our seat is actually set on a floating structure inside the sphere, and as you are descending in the water, the sphere shrinks, and your seat pops as you are descending the water column.
Passing about 600 feet the light starts to disappear and it starts to get your attention. You can't really see anything else, but you can sure hear that popping as you are going down, and you’re hoping it's the seat and not the hull. We were working in a dark environment at 3,000 feet, and we found life that lives on things we didn't expect. I did my dissertation on these communities that live on the oil and gas in the Gulf of Mexico. The other vessel you see is the nuclear research submarine that was operated by the U.S. Navy. It's now been decommissioned. It's called the NR-1, and it was a great tool. We were able to take it down and stay in the water for several days at a time.
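The figure of about 100 atmospheres at 3,000 feet can be checked with a simple hydrostatic estimate. This is a minimal sketch: the seawater density and the constant names are assumed typical values, not numbers given in the talk.

```python
# Assumed typical constants (not from the talk).
SEAWATER_DENSITY_KG_M3 = 1025.0
GRAVITY_M_S2 = 9.80665
PASCALS_PER_ATM = 101325.0
METERS_PER_FOOT = 0.3048


def pressure_at_depth_atm(depth_ft: float) -> float:
    """Approximate absolute pressure in atmospheres at a seawater depth."""
    depth_m = depth_ft * METERS_PER_FOOT
    # Weight of the water column above the hull, per unit area.
    water_pressure_pa = SEAWATER_DENSITY_KG_M3 * GRAVITY_M_S2 * depth_m
    # Add one atmosphere for the air pressing on the sea surface.
    return 1.0 + water_pressure_pa / PASCALS_PER_ATM


print(round(pressure_at_depth_atm(3000)))  # roughly 92 atm
```

Under these assumptions the estimate lands a little above 90 atmospheres, in line with the rounded figure quoted for the dive.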
In the oil industry, I worked on these kinds of rigs pictured in the next slide. The one I worked on is shown in the upper left. Fortunately, I was not on the one on the lower right. That's the Deepwater Horizon operated by BP, and of course, you know how that turned out. It's an interesting case study to look at because they had just about every kind of failure you can have in materials, processes, procedures, and management; and the biggest failure was communications in a vertical sense right there on the rig and also to the people back in the Houston office: all of the things that we fought against every day in NASA. But of course we had our own problems. In fact, one of the big failures in our Columbia accident turned out to be a vertical communication failure.
Here are the other pressure vessels I spent a lot of time in. There are actually 14 of them: the shuttle is one; the Soyuz is one; the ATV, which is the transfer vehicle operated by the Europeans, is another; plus the 10 other modules on the space station -- that gives you 13. Where is the fourteenth one?
Meeting Attendees: Space suit.
Dr. Reilly: That’s right. In fact, if anything bad happens in that suit, it's pretty much a bad day for me. So we have a lot of trust in the people on the ground when they say they are going to test it. And they know all about record keeping, like Nathaniel Gee explained in his presentation when he spoke about how the dam was built. We know where every piece of metal in that suit came from all the way back to the mine that it came from. We have a complete history on all the materials the suit was built out of. And the people that maintain it know exactly how it is going to respond and know that I'm going to be safe in it, which is what I put my trust in, because obviously we don't have time to do that ourselves. We put a lot of trust in people. And it's all about integrity.
A little story: I got sworn in as the Seventh Honorary U.S. Marshal because I have a great, great, great-grandfather who was a Deputy U.S. Marshal back in the Western District of Arkansas. I won't go into detail other than to say that in the law enforcement realm, a fairly common denominator is integrity. In other words, you do what you say you are going to do, and people can trust that you are going to get it done. There are no excuses. You are either going to do it or you are not going to do it, and that's going to be the deciding factor.
One more brief story related to this. Your keynote speaker today was well-known actor James Caan, whom I’ve never met, but I did get the chance to meet another actor. One time I was in Scotland giving a presentation, and I heard a very distinct voice next to me. I turned around and there was a guy with a gray beard and bald head. His name is Sean Connery. You may have heard of him.
Now, Sean was standing there, and I turned around and said hello, and he greeted me back. And I said, “You know, we actually share some commonality in our history.” He said, “Really, what is that?”
I said, “Well, you did a movie called Outland where you were a marshal in space.” He said, “Oh, yes, it was a very good movie, I had a very good time with that.” To which I replied, “Well, as a matter of fact, I actually happen to be a U.S. Marshal in space.”
And without missing a beat, he looked at me and he said, “Oh, yes, how interesting. But I bet you have never been James Bond.” That was a highlight for me, but he will always beat me.
Let’s turn our attention to the idea of continuous improvement and the recognition that the margin between success and failure is very narrow, for us in space particularly. Margins can get smaller over time. That was something we always had to battle in the space shuttle because it was a fairly long-lived spacecraft. As already mentioned, it had seen 118 missions by the time we flew my last one. So we were always looking for ways to improve those margins wherever we possibly could. And one of the ways we did it was to test and train to maximum efficiencies, and then it's all about the details. You have to sweat the details. And it's not one of those things you can leave for later. You have to do it now. The details you miss can kill you, like the details on the Deepwater Horizon. They should have sweated those details. They would have made different decisions and perhaps not have lost that well nor killed 11 people as part of that process.
Question everything. As Nathaniel Gee presented, people were asking: if something happened over there at that dam, what about ours? What's happening to our penstocks? Do we have to worry about that? We question like that as well. If something happens on a spacecraft, we question what it means. And when we didn't do that, we had situations like Challenger, where we saw blow-by or a burn-through on some of the O-rings on the solid rocket booster that should never have happened, but since it did happen and it didn't fail the first time, people got complacent.
Complacency will kill you. You cannot accept out of hand abnormal performance on any of the systems. So question everything. Act like everyone has your life or your career in their hands. And we, of course, had to do it that way. Everyone literally did have our lives in their hands, and they took it very seriously.
At NASA we lived this concept: plan, brief, execute, debrief, and replan. That was how we approached everything, from a macro level to missions to programs all the way down to the micro, and that was all the way down to the tasks.
Communicate objectives and expectations. It's critical that you communicate both vertically and horizontally and make sure that your communication channels are open. What happened on Columbia was that the communication channels vertically were closed because the program manager was insisting that you had to prove something wasn't safe. How do you do that? It's not possible. You have to prove it is safe, because your margins are very narrow. You don't have any room to accept failure at any level in any of your systems.
Colonel John Boyd was an Air Force fighter pilot, and he came up with the OODA Loop, which is the “observe, orient, decide, and act” loop, to help fighter pilots train and figure out how to get faster than their adversary. Our adversary is failure, so we try to observe the problem, orient ourselves to it, decide what the fix will be, and then act on it, getting around that loop fast enough to get inside the failure mechanism itself.
If you get inside the failure loop, then you can keep a bad thing from happening. That's what we are looking for. I mentioned that we train repeatedly. Our spacewalk training is performed in the water. We spend about seven hours for every hour we do a spacewalk. So if we are going to do an eight-hour spacewalk, we will spend somewhere close to 55-56 hours in the water before we go out and do that spacewalk. It's a big, expensive overhead to get this training done.
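The training overhead described above is a simple ratio. The sketch below uses the seven-to-one figure quoted in the talk; the function and constant names are illustrative assumptions.

```python
# Ratio quoted in the talk: about seven hours in the water per spacewalk hour.
POOL_HOURS_PER_SPACEWALK_HOUR = 7


def pool_training_hours(spacewalk_hours: float,
                        ratio: float = POOL_HOURS_PER_SPACEWALK_HOUR) -> float:
    """Estimate hours of underwater training for a planned spacewalk."""
    return spacewalk_hours * ratio


print(pool_training_hours(8))  # 56
```

An eight-hour spacewalk works out to 56 hours in the water, matching the 55 to 56 hours mentioned.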
While we are training, we look at every single task we perform while in the water and find ways we can make it more efficient; look for where we can increase our margins. And our margins are safety, efficiency, and time. And the last point is really critical for us, because if we can buy 10 seconds, that's 10 seconds you can be a tourist. Our day in space is actually 16 hours long. We get up at eight in the morning and go to bed at eight at night, and it's booked at 15-minute intervals all the way through.
My first space flight was interesting from the standpoint that we flew to the Russian Space Station Mir, and saw engineering projects from a Russian perspective versus a U.S. perspective, which was quite an eye-opening experience. They literally built their spacecraft for war. Their hatches and everything were huge and heavy and they did things that we would never do, but they had lots and lots of margin on them. But they had some other things too. Their communication was not very good. This is one of the big challenges. It turns out that when we merged our two programs -- they had grown up independently -- we had to figure out how we were going to operate together to do the International Space Station.
The crew got together and we became friends very quickly. Our communication in space was actually very good. The challenge was how we worked with the ground. Everybody you see in the red, white, and blue shirts are Americans on the crew, with the exception of Salizhan Sharipov. He flew with us as a Russian crew member on the shuttle mission to the Russian space station as his very first flight in space, which had to be terrifically ironic for him. Anatoly Solovyev is in the picture, too. But everybody in the red, white, and blue shirts is the American crew, and we worked very collaboratively with our folks on the ground. If we had any questions, we would call them up and have a discussion about it, come up with a plan, and we would execute it. The Russian ground crew operated differently. They were very directive. They would say, “This is what you are going to do.”
I was a rookie on the flight, and one of the things that astronaut Terry Wilcutt had to do was give the rookies jobs. Mine was inventory management on the shuttle and the space station. That was taking 9,000 pounds of gear and hauling it from one spacecraft to another in seven days -- an interesting challenge.
I worked with Anatoly directly every morning, and we actually got along very well. I would go over and discuss what we were planning on doing, and we usually agreed. Then one day he's looking at his binder that came up from the ground, and it's about six feet of paper, and I'm looking at it, and while I'm talking to him, Anatoly is tearing out two or three inch strips of paper and clipping them on the bulkhead, and then he'd take about a foot of this paper, wad it up, and throw it away.
I was curious. “Anatoly, that's the daily plan from Moscow; right?” He said, “That's right.” I said, “What is that you’re putting over here?” He said, “Well, that's what we’re going to do today.” I then asked, “Well, what's that you’re throwing away?” And he said, “That's what we are going to tell them we did today.” You can see, they are disconnected.
In fact, I won a bottle of vodka for the space station because they were firmly convinced they knew where everything was because they kept really good records on the ground. Unfortunately, because communication was so poor, those records were worthless. The job that I had on that flight was to come up with an inventory management system with a team of folks and sell it to the Russians, and we had to do that together with the folks at mission control in Moscow. They were firmly convinced they knew where everything was, so I bet them the best vodka they could find that they wouldn't be able to tell where all the inventory was inside the Mir. And just before the deorbit, they went all the way from one end to the other, tested the system we built, and they found inventory they didn't know was up there. The most telling example was when they opened one of the last panels that hadn't been opened in a very long time, if at all, in space, and when they opened it up, out floated a six-pack of Bolshevik beer. Nobody ever admitted what happened to that six-pack of beer, but it did win me a bottle of vodka and it changed the perspective of how we worked together.
Terry Wilcutt had a very interesting way of challenging people that he worked with. He had what we called a three-question management style. In fact, I just published a paper on this recently with a retired FBI counter-intelligence specialist. Terry would ask three things. “Do you have a plan?” And if I said yes, he really didn't need to know many more details. He would ask, “Is it working?” And as long as I could say “yes, it appears to be working,” that was fine. If I had a problem, I would detail the problem. And then, “Are you ahead or behind?” And as long as I could say we were on track or ahead, we were just fine.
He might ask a few more questions for more details. And then the fourth question was always, “What would you do differently?” And the answer to that question was the information we passed down to the crews behind us – what we might change in our daily events for the next day. So it was plan, brief, execute, and debrief.
This is what it looks like inside the spacecraft as we rendezvous with Russian Space Station Mir. The guy with his legs sort of back is Harry Seagrave, and that's me, and then to the left of me behind Salizhan Sharipov is Joe Edwards. He and I are actually flying the spacecraft, and we are doing this with two spacecraft just feet apart going 17,500 miles an hour. And you'd think Terry, because he's the boss and he's pretty much signed off on ownership of this spacecraft, might be a little concerned about all of us, but you notice he's really not paying too much attention to what's going on behind him at all. He's listening for problems, but as long as everything is going fine, he's not getting involved in it at all. He is letting us deal with it.
My second crew was a bunch of overachievers, and there were five of us on this flight. This was the first spacewalk flight for me, and I did three of them on this one. My job was to put the airlock on board. We met up with three crew members at the space station: Yury, Susan, and Jim. They had been up there for three months. We actually stole this ball cap from one of our flight crew. When we sent this picture to the ground team, it was the first time they saw the cap. It had been missing for three months. We wanted to carry something up to space that represented those 250 folks on the ground who were directly assigned to our mission. They worked three shifts, 24 hours a day, for the entire time we were up in orbit.
Our job was to put up that airlock on the left. It weighed about 18,000 pounds. And then on the upper right you see a high-pressure gas tank weighing about 500 pounds. And the arm is being driven by Susan inside the space station, and she can't see us at all. She's listening to what we are saying and the digitals that she has on the arm to understand where that big piece of equipment is. That's me in the upper right. In fact, I'm only about a few inches away from that high-pressure gas tank. If anything happened to that thing and it smacked me, I would be a separate satellite, which would also be a really bad day in space for me. But fortunately, a smack like that didn't happen because one of the things we did is spend a lot of time talking and working out and training with the folks at the space station even though they were in space and we were on the ground. It was the ultimate in remote training; remote education. We all recognized up there that our lives depended on the team in space and the people on the ground.
We try to limit the risks everywhere we can, but there is a certain amount of risk that you have to accept, and this represents one of them. This is called micrometeoroid and orbital debris (MMOD). The Air Force tracks about 23,000 items in space that we can see, and there are millions of things that we can't see, and this is one of them. It looks like a bullet hole. It's about the size of my thumb, probably a result of a high-velocity impact. It went through a radiator on the space station, and there are three more out on the end of it, 45 feet to the right where you see Rick Mastracchio, which is the area that I was in three months before. He didn't even see this when he was outside working; he didn't notice it until we saw the image later when he came back down.
There are actually a few things wrong with the movie Gravity. One of which is Sandra Bullock’s hair, which was always perfect. You will see Suni Williams in the next slide. Hair is never perfect in space. The other thing about being in space: when the debris starts flying at you, you can't see it. It's traveling too fast. Your eyes can't register it. Also in the movie, when she looked up and saw something coming at her, music of doom was playing. There is no music of doom in space either, fortunately. I'd probably get nervous about that.
This is a picture of my last crew, STS-117. This was my most challenging flight because we had a lot of failures on board. There is Suni with her hair. She actually spent 194 days in space, so she was ahead of the curve. Our job was to install the S3/S4 truss on board the space station. That's what you see on the left side. And this weighed about 38,000 pounds.
We were faced with all kinds of unanticipated challenges. We actually had a voltage spike go through the system. We are not really sure how that happened. We picked up a charge, I think, on the structure, this S3/S4 truss, before we attached it. And one of the things that is critical about assembling a mission or assembling a spacecraft like this is that many of the pieces will not see each other on the ground, so we do a lot of testing before we ever get it there by testing against other pieces of equipment that mimic the space station equipment that's already there. It was very successful in most cases.
One of the things that bit us a few times, and which may have this time as well, was doing software testing. In fact, one of the things that software engineers were fond of trying to do was something called test by analysis. That was one of those key phrases that when I heard it, I wanted to shoot the person who was uttering that phrase. Because it meant that it wasn't tested against a mate or a mutual end piece that would mimic what we were going to see. What it would do is evaluate the software and basically do a walk-through. And the consequence is that's where we usually found our failures, where we did not test properly. So it's one of the things we had to continuously do.
Next, you will notice this little piece right here bolted up on an external part of the space shuttle. This is a key component coming back home. That's our thermal protection system. I mentioned we have 16-hour days, so we can't do anything about it, but we did have a team of folks on the ground. In fact, where that is located on the left is on the tail of the orbiter right underneath the propellant tanks where our orbital maneuvering system is. Having heat go through the column into that compartment back there would again be one of those really bad days for us. Our folks on the ground came up with processes and procedures, and Heidi right there is one of our astronauts, and this is Dieter, one of our engineers, working on it, and we have this team of about 12 other engineers that were working on this literally overnight for four nights to get ready for us to go out and make the repair. This is what it looked like after we took their procedures and went out and did it, and it worked flawlessly. Otherwise, I probably wouldn't be here talking to you today. So they did a beautiful job. And of course, we trusted them. They sweated the details for four nights getting everything ready for us, and we just went out and executed the work.
[Video] This is a shot from Atlantis on our way home. We are out ahead of that inside this plasma sheet, and that's what you see outside overhead of this area of the flight deck. All the flashes and the light that you see out there is the atmosphere. It's quite hot; it's about 2,700°F. As the camera pans around, you can see the white hot air as it hits the front windows. We come overhead; we are still supersonic at this point. We are just a little bit over Mach one. We are all the way through that plasma, the 2,700°F. And here is what I wanted to mention about margins. The 2,700°F is a critical component. We have to stay within 10 percent of that number, especially on the upper side. Because if it gets to 3,100°F, all of that leading edge you see there, the carbon-carbon composite that also comprises the nose cap, auto-ignites, and if we go from 2,700°F to 3,100°F, it's pretty much a bad day in space. We don't want to do that. That was why repairing that piece of damage to the right side was so critical for us and why we always look at our margins, that it's only 10 percent, very small. If it varies, it's usually a bad day.
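The size of that thermal margin can be worked out from the two temperatures quoted above. This is a minimal sketch; the function name is an illustrative assumption, and the result depends only on the round numbers given in the talk.

```python
def margin_fraction(nominal: float, limit: float) -> float:
    """Fraction by which the limit exceeds the nominal value."""
    return (limit - nominal) / nominal


# Temperatures quoted in the talk, in degrees Fahrenheit.
NOMINAL_ENTRY_TEMP_F = 2700.0
IGNITION_TEMP_F = 3100.0

margin = margin_fraction(NOMINAL_ENTRY_TEMP_F, IGNITION_TEMP_F)
print(f"{margin:.1%}")  # 14.8%
```

With these rounded figures the headroom is roughly 15 percent, the same narrow order of magnitude as the 10 percent cited.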
When we finally come in and do our big right-hand turn, you see us lining up with the runway. We are coming in about six times steeper than a conventional airplane at this point. And we are crossing through 11,000 feet on the right, going about 300 knots on an 18-degree glide slope. We are pretty much coming down like a piano at this point. And at about 1,200 feet, we are going to pull the nose up, and at this point we are sort of flying pretty much like a conventional airplane. We are going to touch down on the runway, about 2,500 feet there, and roll to a stop at about 10,000 feet on the runway. Right here we are slowing down from 300 knots to about 195 knots when we cross the end of the runway. We will touch down somewhere between 190 and 195. And then immediately after you see that puff of smoke, we will pop a drag chute. And then that's where we picked up 5.8 million miles on that mission. We roll to a stop. I’ve got about 18 million miles to my credit. Unfortunately, not a single one of them is a frequent flyer mile.
In summary, we had three different teams, all of them successful, even though we were faced with a lot of challenges, both cultural and just the number of people involved. Mission is critical. Everybody has a role required for success. And integrity is the piece that you have to have. It's all part of the absolute trust of everybody on the team; not only those in orbit, but also the folks on the ground. And when that communication breaks down, like we saw in the Russian case, that's where things go divergent and that's the last thing you want to happen on missions like that.
Train like you fly, fly like you train. It's all about sweating the details over and over and over again, look for that little bitty improvement. It might not seem like much, but when you add it all up together, that's where you buy an extra 10 seconds, that extra margin of safety. Those are the things we cared about.
Continuous improvement is all about training and developing the processes and procedures. And more critically are the people. Question everything. If something doesn't make sense or you just have that funny feeling in the back of your head, start asking the questions. Something was bothering us about the flight right before Columbia, and we didn't question everything, or not hard enough. And as a consequence we were a little relaxed and had a negative outcome. Check details if you ever get relaxed.
Continuous improvement is a lifestyle. It's something that we grew up in at NASA, but it's something that we teach when I work with other companies, and particularly, with the military and law enforcement. It's all about finding where your vulnerability is and continuously improving those processes and buying yourself a little extra margin.
And, of course, everything rests on communication. I want to thank you very much for the opportunity to be here today. Know your team, know your mission, work together for success, protect your margins, and more critically, enjoy the ride.