
Not considering such a basic error condition seems like a gross omission.

This can't even come from the software engineering, but must be some kind of managerial failure (e.g. we're short on time, but have to report great progress to my boss, so skip this scenario).



This is the software engineering version of "I am very badass", passing judgement on software at the cutting edge of science, while sitting at home writing React code.


In football journalism they call that Monday Morning Quarterbacking.


In any kind of aggressive manly kind of story, especially involving the military, they're "Keyboard Cowboys".

Or some new ones I've heard kicked around:

- Gravy Seals
- 101st Chairborn
- Chair Force pilots

...And so on...


You're mixing up terms here. "Gravy Seals" is an insult, typically used for out-of-shape people who are heavily into gun/militia/i-am-very-badass type culture.

The latter two are just ribbing jokes about the Air Force, from the other branches usually. My old (Army) boss used to tell me to 'take off your air force gloves' if he ever saw me with my hands in my pockets.


Not having served (but I did help out on Operation Code for a couple years):

"Chair Force" does sound like something they would say about armchair generals. No?


Absolutely. Just pointing out that one is strictly a pejorative, while the others would likely be read as 'someone in the Air Force.' I think OP wanted more of the former, like gravy seals, meal team six, y'all queda, etc.


I like y'all queda!


I am a React developer and I am in this picture and I don't like it


> seems like such a gross omission.

Almost all space mission code only ever has the so-called 'happy path'. We rely on extremely tight mechanical and aerospace engineering tolerances to achieve that happy path.

The Hubble Space Telescope's primary mirror grinding was off by a matter of micrometres, and resulted in blurry images.

Consider all the Mars rovers. Imagine some wind gust threw the descending module off course, or a retro-rocket failed because of vibrations.

Writing code for space missions isn't like writing a CRUD app. Developers can't just teleport to a space probe millions to billions of kilometres away to rectify errors and debug running code on the fly.

For the record, the 'failure path' for Apollo 11 was to get the US President to announce to the world that the two astronauts would likely be marooned on the Moon. Apollo 13 very nearly failed, too.


Apollo 11 landed on the "failure path". Neil Armstrong noticed that the target was a boulder field, took manual control, discovered his new spot had a crater, and finally found level ground. They had low fuel but not dangerously low.

If Apollo 11 had followed the "happy path", they would have crashed and died.

Hubble was also the "failure path". The main mirror was flawed and had to be corrected.


If you write the failure path, then you also have to ensure it doesn't fire when there wasn't really a failure and perform unnecessary heroics.

Same reason you can't program a self driving car to save a person by sacrificing a squirrel. It's just going to run over the squirrel when it didn't need to.


When saying "we", do you mean you write code for the space missions?

Writing only happy path code as a standard practice in the space sector seems quite absurd. You won't ever achieve absolute precision and errors do happen, yet it seems like systems recover most of the time.

Recently, the antenna of Voyager 2 got misaligned, but it is expected to recover from that. That was only the last problem it encountered over its very long mission - and it managed to recover from all of those so far!


Voyager 2 has already recovered. They waited until it was at the best possible (but still wrong) orientation and just yelled at it loudly enough that it heard, even while misaligned.


Voyager's programming just brings joy to me to think about.. I mean it's the system as a whole (of course) .. but the fact that they've been capable of flying through the environments they have been, using points of light to align themselves, among other things.. for decades .. and recover from incidents.. is just something I marvel at.


Wow I can't believe I didn't hear about this. It was all over the news when they broke it, so I figured it would be just as widely reported when they fixed it. It's been almost 3 weeks.


There was definitely a prominent NYTimes story when it was fixed, that's how I heard about it.


There was a NASA project to start developing flight software that's smarter in this kind of way, the Remote Agent. It got an award after flying, but if they continued that line of research I haven't heard about it. https://ntrs.nasa.gov/api/citations/20000116204/downloads/20...


Trying to hand roll a super robust AI software usually backfires. Emergency mode triggers right in the middle of the happy path and ruins your uh, a day if you're lucky. They know that, even I kinda know that.


That can't still be the case these days, can it? Extremely tight mechanical and engineering tolerances are very expensive compared with merely 'very very' tight tolerances, and I'd imagine the difference between the two can be bridged with more intelligent software in place of "gyroscope + clock + maybe PID loop"?


Yeah, this reads like two people defining happy path subtly differently: one is saying the happy path is anything within acceptable strictly defined mission parameters and tolerances, the other thinks it is the sequence of steps that is expected to successfully execute the mission without ever encountering an exceptional situation (which is the conventional software view of the term), but there are exceptional situations which may be covered in the specification of the mission and so “on the happy path”.

In the case the system strays outside mission success parameters then aborting could make sense. The question there looks to be if the success parameters were defined too narrowly - it sounds like an error in specification that prioritizes landing in the required area over the possibility of landing at all.


The classic hardware engineering response of "we'll just fix it in software". Turns out fixing things in software is even more expensive because it's just so easy to make changes that a combinatoric number of changes sneak in.


Ohhhh yes. I've been on the receiving end of this one. Designing a system which can accommodate higher tolerances on some hardware components through software controls is one thing. Being handed a poorly performing piece of hardware and told to "just fix it in software" is quite another.


Writing code for something that flies into space is not nearly as easy as you think it is. Perhaps the next time you write a comment you could first develop the software which you're complaining about. I'm sure it would be a trivial task for someone of your stature :)


I'm not saying it is easy to handle. But this mode of failure could be expected and prepared for. It's not that uncommon that the spacecraft finds itself in a position which was not calculated.

Or are you saying it's expected that the mission did not account for this scenario, and that future missions don't need to account for it either?


You are saying "gross omission" like this is some Python script, like they are skipping the else clause for a condition. Imagine trying to land a plane that is flying at Mach 2, with no direct control, only a video feed with 4 seconds resolution, a bunch of sensors and a tank of fuel for a retrograde burn to slow you down. Can you even fathom the number of scenarios that can happen? Your application may have 1 happy path and 2 sad paths. Here you get only 1 happy path, a few not-so-happy paths where your probe lands sideways or just rolls down a crater, and the rest are every other combination of your probe's orientation, speed vector and collision location.

Hell, you can run a few thousand simulations for every scenario you can think of during descent, including loss of a burner, a propellant leak, etc., and then during the actual descent a chip gets burnt because of a stray cosmic ray. There will still be somebody on HN calling you out for cutting corners.


No, that’s a bit unfair. ISRO, Israel, and Japan all had reasons for their failures that were mainly technical.


Out of curiosity, how many space programs have you been involved with?


It probably wasn’t even the error. It could have been an accumulation of error % on some unbounded input.


That's probably why they haven't officially straight-up announced the issue.

This wouldn't be the first time that a mission failed due to embarrassing failures in basic software practices (eg Starliner's initial software bugs emerging from a lack of integrated testing).

Main difference is that you aren't triggering a billion overly sensitive nationalistic folks when you point out similar embarrassing errors in most other countries' programs. Eg the time NASA lost a probe due to miscommunicated units, the Apollo 1 disaster, the space shuttle disasters, or the tape around the wiring in Starliner, which was intended to be fire retardant actually turning out to be flammable...

Hell, Japan's Hakuto-R also failed because the software's error detection was buggy, and they openly admitted as much without any bluster about how no one but other people with experience writing code for space probes can criticize them.


> That's probably why they haven't officially straight-up announced the issue.

What do you mean by "they haven't officially straight-up announced the issue." ? They did so - several times actually.


Edit:nvm, I'm wrong, see below

They've given out vague explanations such as a software glitch, while holding the detailed post-mortem back claiming the obviously absurd excuse of national security concerns.

This is counter to how they typically operate as well as how most other agencies/companies around the world operate these days, where they at least explain what went wrong. eg Hakuto-R's team explaining that their flight software thought the radar altimeter was malfunctioning when it wasn't, causing it to rely on the IMU and thus it thought the surface was much higher than it actually was.
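To make that failure mode concrete, here is a minimal sketch of a plausibility check that rejects an altimeter reading when it disagrees too much with the IMU-propagated estimate. It is not Hakuto-R's actual flight software: the function names, the 500 m threshold and the altitude numbers are all made up for illustration.

```python
def fuse_altitude(imu_estimate_m, radar_reading_m, max_disagreement_m=500.0):
    """Pick an altitude and report whether the radar reading was trusted.

    Hypothetical plausibility check: a reading that disagrees with the
    IMU-propagated estimate by more than the threshold is treated as a
    sensor fault and ignored.
    """
    if abs(radar_reading_m - imu_estimate_m) > max_disagreement_m:
        return imu_estimate_m, False
    return radar_reading_m, True


def simulate_descent(radar_readings_m, start_altitude_m, descent_per_step_m):
    """Propagate the altitude estimate one step per radar reading."""
    estimate = start_altitude_m
    history = []
    for reading in radar_readings_m:
        # Dead-reckon from the previous estimate (stand-in for IMU integration).
        predicted = estimate - descent_per_step_m
        estimate, trusted = fuse_altitude(predicted, reading)
        history.append((estimate, trusted))
    return history


# Invented scenario: the ground drops away by 3000 m (e.g. a crater edge),
# so a perfectly healthy radar suddenly reports a much larger height.
readings = [4900.0, 4800.0, 7700.0, 7600.0]
history = simulate_descent(readings, start_altitude_m=5000.0, descent_per_step_m=100.0)
for estimate, trusted in history:
    print(f"estimate={estimate:.0f} m, radar trusted={trusted}")
```

Once the first correct-but-surprising reading is rejected, every later reading also fails the check, so the estimate drifts further from reality while the software stays confident in it.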


Might want to update your general knowledge. The ISRO Chief explained this in an interview. It wasn't just passed off as "software glitch" with no explanation.

Chairman S Somanath has given three main reasons that led to the crash-landing of the Vikram lander on September 6, 2019 just minutes before the touchdown.

The ISRO chairman said, “The primary issues were: One, we had five engines which were used to reduce the velocity (called retardation). These engines developed higher thrust. When such a higher thrust was happening, the errors on account of this differential were accumulated over some period. All the errors accumulated, which was slightly higher than what we expected.

When it (lander) started to turn very fast, its ability to turn was limited by the software because we never expected such high rates to come. This was the second issue.

The third reason for failure was the small site of 500m x 500m for landing of the lander.”

Rectifying those mistakes this time, the Isro chairman said, “This time we have kept an area of 4.2 km (along the track) x 2.5 km (width) for the landing site. So, it can land anywhere, so it doesn’t limit you to target a specific point.”

Somanath said “instead of a success-based design, Isro has this time opted for a failure-based design” and focused on what all can fail and how to protect it and ensure a successful landing.

“We looked at sensor failure, engine failure, algorithm failure, calculation failure. So, there are different failure scenarios calculated and programmed inside. We did new test beds for simulation, which was not there last time. This was to look at various failure scenarios,” he explained.

The ISRO chief said the Vikram now has additional solar panels on other surfaces to ensure that it generates power no matter how it lands.
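A rough illustration of the second issue quoted above (a software cap on turn rate that is lower than what the situation demands): the toy loop below is not ISRO's guidance code, and the function names, rates and limits are invented. It only shows how attitude error keeps growing once the required rate exceeds the cap.

```python
def clamp(value, limit):
    """Restrict value to the range [-limit, +limit]."""
    return max(-limit, min(limit, value))


def track_attitude(required_rates_deg_s, max_rate_deg_s, dt_s=1.0):
    """Return the attitude error (degrees) after each time step.

    Hypothetical controller: each step it asks for the required turn rate
    plus whatever is needed to cancel the existing error, but the software
    never commands more than max_rate_deg_s.
    """
    error_deg = 0.0
    errors = []
    for required in required_rates_deg_s:
        commanded = required + error_deg / dt_s
        achieved = clamp(commanded, max_rate_deg_s)
        # Any shortfall between required and achieved rate accumulates.
        error_deg += (required - achieved) * dt_s
        errors.append(error_deg)
    return errors


# Invented numbers: the required rate jumps from 2 to 8 deg/s mid-descent.
rates = [2.0, 2.0, 8.0, 8.0, 8.0]
print(track_attitude(rates, max_rate_deg_s=5.0))   # cap too low, error grows
print(track_attitude(rates, max_rate_deg_s=10.0))  # cap has margin, error stays zero
```

With the 5 deg/s cap the error climbs by 3 degrees every step after the jump; with a 10 deg/s cap the same controller keeps up.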


Huh, I missed that, thanks for the info!


There was a JWST documentary on Netflix that explained a NASA technique for handling points of failure. ISRO may be using something similar.



