GitHub does it wrong. This is the sort of situation that calls for a type-the-name-to-act approach. It ensures that you're acting on the object you think you're acting on.
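For what it's worth, here's a rough sketch of the pattern in shell form (the repo name and wording are made up; GitHub's real flow is a web form, this is just the CLI analogue of the check):

    #!/bin/sh
    # Hypothetical type-the-name-to-confirm check. "$repo" stands in for the
    # full name of the thing about to be deleted.
    repo="example-org/example-repo"
    printf 'This will permanently delete %s. Type its full name to confirm: ' "$repo"
    read -r answer
    if [ "$answer" != "$repo" ]; then
        echo "Name did not match; aborting." >&2
        exit 1
    fi
    echo "Proceeding with deletion of $repo"

The point is that the confirmation input carries the object's identity, not just a generic "yes".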
The developer was on automatic because he had just been working with another repo and the two repo names were similar, so the type-to-confirm step didn't prevent the mistake. The blog post suggests other changes GitHub could make.
"If you make something idiot-proof, someone will just make a better idiot."
I think there's a literal psychological disease in American culture, where if you fuck up, then someone else is to blame for not preventing you from doing it. I think we need less of that (not more of it), so that people understand they need to consider the effects of their actions before they take them.
Accident investigations intentionally do not apply blame. Humans are fallible. We make mistakes. If one human makes a mistake, others will too.
The author of the httpie blog post does not apply blame. In fact, he goes out of his way to explain his error, then suggests changes that would have prevented his mistake, to hopefully save others from the same mistake.
So we can blame humans as you seem to want to do, or we can accept human behavior and design our systems to be more forgiving.
cdelsolar at the top of this thread wrote that "I get emails, every single day" and that "It literally drives me crazy."
So what should cdelsolar do? Blame all the humans accidentally deleting their accounts, or find a way to design around it?
> Accident investigations intentionally do not apply blame.
Yeah, they do, and pilot error is very common:
> During 2004 in the United States, pilot error was listed as the primary cause of 78.6% of fatal general aviation accidents, and as the primary cause of 75.5% of general aviation accidents overall.[27] For scheduled air transport, pilot error typically accounts for just over half of worldwide accidents with a known cause
They can try to address that through training, certification, recertification and updates to manuals and checklists, but at some point some level of human failure is going to be inevitable. And with UI/UX like this there isn't much ability to do that, since people don't get certified on using GitHub.
Meanwhile, it is going to be very difficult to break people who are stuck on "autopilot" and just aren't reading anything out of it via any kind of dialogs or messages. Gonna delete 8,000,000 stars on this project? Good. Oh wait. Whoops. I'd like to see whether the UX change to show that information has made any actual measurable impact on the accidental-deletion incidents that GH sees.
There isn't any good substitute for just learning early that if you are going to delete something you need to stop, think, read, and wait a second before hitting that button. Trying to fashion a world that people can navigate successfully on completely thoughtless autopilot all the time isn't possible.
Also, that blog post is irritating because "Lesson #1" should have been for the author to STOP AND THINK when deleting something or when presented with a scary modal dialog warning that the action would be permanent. My reading of the post is that the author didn't actually accept any blame, didn't see any need to change any of their own behavior, and shifted it all onto GH. It is the equivalent of a non-apology apology.
Modern accident investigators avoid the words "pilot error", as the scope of their work is to determine the cause of an accident, rather than to apportion blame.
And from the NTSB:
"The NTSB does not assign fault or blame for an accident or incident; rather, NTSB investigations are factfinding proceedings with no adverse parties and are not conducted for the purpose of determining the rights, liabilities, or blame of any person or entity."
> should have been for the author to STOP AND THINK when deleting
I've been doing system administration and programming since the early 90s. One of the first things I learned from a greybeard at the time was: after typing a destructive command, take my hands away from the keyboard and read back what I just typed before pressing Enter. It was a great lesson and has saved me from a lot of mistakes.
Guess what? I still occasionally make mistakes. Because I'm human.
So another thing I learned was to make any shell on a production system visually distinct. All my production machines have a red shell prompt. That's saved me from even more mistakes than reading back commands. And it protects not just me, but anyone else who logs into those machines. (Yes, ideally, no one is ever logging into a production machine.)
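In case it's useful, this is roughly what that looks like in a ~/.bashrc (the marker file here is just an example; use whatever already distinguishes your prod hosts):

    # Make production shells visually unmistakable.
    # /etc/this-is-prod is a hypothetical marker file.
    if [ -e /etc/this-is-prod ]; then
        PS1='\[\e[1;31m\][PROD]\[\e[0m\] \u@\h:\w\$ '   # bold red [PROD] tag
    else
        PS1='\u@\h:\w\$ '
    fi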
The problem with "STOP AND THINK" is that it requires every person to learn that lesson individually. Meanwhile, if you accept that human behavior is what it is and design around it, you can save many people from mistakes in the first place.
There's a saying: do you want to be right or do you want to win? Personally, I want to win (i.e. be effective). Winning means accepting humans are fallible and doing my best to design systems for them instead of holding them at fault in cases where a better design would have prevented the mistake.
> The probable cause of the crash was "the pilot's failure to maintain control of the airplane during a descent over water at night, which was a result of spatial disorientation".
At some point there's not much else to do. Pilot exceeded their training and flew into the water. You could try to prevent that with more and more autopilots and warnings, but at some point pilots have to be able to fly the thing on manual and need to have the actual training for the conditions. I doubt the modern NTSB would be able to say anything better.
The prompts that you have set up also won't stop you from doing a `sudo shutdown -r now` in production quickly one day on autopilot (and typing sudo doesn't help much at all, because it is now so common and you probably do that on preprod as well). STOP AND THINK is still going to be the best tool in your toolbox.
And in addition to having been a system administrator since the early 90s (and having worked at Amazon for 5 years), I'm also a cave diver. There are a variety of methods in cave/technical diving to avoid breathing the wrong gas at the wrong depth. The best one is probably simply to label all your bottles with their MOD (maximum operating depth), always analyze your gas and put a gas analysis sticker on the tank that matches the MOD sticker, then always go through the gas switching procedure with your buddy and verify the MOD and analysis sticker every time you do a gas switch (even if you think you're not carrying any gas that is outside the range where you're diving that day). Alternatives like different colored regulators or carrying gas on different sides run into trouble with exceptional procedures, when regulators fail or tanks wind up getting dropped, picked up, and put on the wrong side. You can then wind up following "correct" gas switching procedure at the gas switch time and still die. The best solution anyone has come up with has just been to always verify gas contents before sticking it into your mouth -- stop and think every single time. Adding anything else generally creates an overcomplication that causes more secondary problems than it solves.
> a better design would have prevented the mistake.
You still need to show that a "better" design actually does stop the mistakes. Does having the stars and forks listed in the message really do that much, or do people who are on autopilot and not reading still just hit submit? Has anyone really measured that?
I do not share your unbridled optimism that the perfect UI/UX design is out there, one which solves all problems and has no unintended consequences. And one of the unintended consequences, I think, is social: we learn to expect that if we screwed up, someone else should have stopped us.
And I learned that early growing up on systems where deletion was largely permanent and people didn't have backups (although I did get paid some money in high school to manually undelete files on an Apple floppy disk that someone else had nuked by accident), so I learned to think through permanent actions early. And I didn't learn to blame someone else for my own failure. I leveraged that at Amazon and ran all kinds of nasty commands in prod on a daily basis and never destroyed the company (all of that was done in the service of standardizing the configuration management of the systems and in the long run reducing the amount that prod needed to be touched, but early Amazon had a chicken-and-egg problem with a massive production deployment and no good standardization).
Another good example is the Union Street I5 exit in Seattle:
There's not much more to be done about the UI/UX of that exit. People are already driving past signs telling them to slow down to 20 mph, and they're driving at freeway speeds straight at a large concrete wall with no sense of self-preservation. You could argue that there should be flashing lights which turn on when people are coming in too hot, but the 11foot8 bridge channel shows just how useless that is (and in Seattle they dropped the speed limits down to 25 mph without any road diets and added those radar speed limit signs that everyone just ignores now, so don't bother suggesting something like that). They aren't going to be able to reengineer that exit because it's stuck in the middle of downtown Seattle and they'd need to demolish too many structures. IMO, the problem there is that people are so used to being able to take exits at highway speeds that they can't recognize an exit where they can't -- better "UI/UX" everywhere else causes problems where you can't retrofit it in. Any accident analysis that goes beyond individual driver responsibility should just point to our lax testing and certification standards, the lack of any recertification, and the way we treat driving as a right and not a privilege.
And going back to IT operations, the problem that I have with blameless postmortems (or whatever less dramatic term is in favor these days) is that most often the blame should fall on management, and that lets them off the hook. The standard form for post-incident analysis should include a field for what management decisions and business prioritization led to the incident. Yeah, we shouldn't be blaming engineers most of the time (although, given something along the lines of a bell curve in competency, there have to exist operational engineers who should just be encouraged to find other jobs one way or another), but the way to stop doing that shouldn't necessarily be to stop assigning blame entirely; it should be to be a bit more accurate about why these failures are happening. Although to be fair, perfect operations or perfect security is neither desirable nor attainable by any organization. But again, that points to the fact that perfect UX/UI to avoid incidents is not going to be attainable (in general, "perfect" anything is not attainable -- it's up there with drug-free societies and worlds without any abortion).
We're talking past each other. I'm not saying we can make the world perfectly safe or that humans are never at fault.
Rather, that after a mistake or accident it's not helpful to blame. I'll just quote from OSHA:
"The prevention of another future incident is the purpose of incident investigation, not to lay blame or find who’s at fault. The investigation should identify the causes of the incident so that controls can be put in place to prevent the same incident from happening again. [...] For the incident investigation to be effective, management must have a plan in place for implementing the corrective actions and making system improvements."
You can blame those drivers for not reading the speed limit signs all day long, but that doesn't prevent future accidents. That off-ramp is poorly designed. That it can't be fixed today, for reasons, is beside the point. There are still obvious lessons to take away from it when designing new off-ramps. One obvious lesson is that signs are not an effective way to slow people down.
I've personally known people who strongly exhibit these symptoms. There is no accountability (and no arguing in good faith) with such people, and I notice a seemingly similar pattern of thought in the linked blog post. Some excerpts which seem to intentionally shift blame:
> It’s a peculiarity of GitHub, to put it mildly, that making a repo private permanently deletes all watchers and stars.
> GitHub’s conceptual model treats users and organizations as very similar entities when it comes to profiles and repos.
> The problem is that the box looks exactly the same for repos with no commits and stars and for repos with a decade-long history and 55k stargazers and watchers.
I could go on. Anyway, just trying to offer another voice in agreement. Also, that particular call-out of "psychological disease" is especially notable to me for personal reasons. (Not that I'm intending to diagnose this specific blogger; just that the tone of the blog post indeed seems to be blame-shifting, which is a characteristic behavior of those with NPD.)
All those stories about people suing for million-dollar settlements probably don't help. And the entire unscrupulous lawyer class that enables it (seriously, don't they have a bar association in America?!).
A human can literally drive all the way home from work and not remember doing so. You are surprised such an animal can fill out a text field and click some buttons?
I'm a professional developer and all that, and I recently accidentally formatted my phone (and screamed at my own stupidity). In TWRP there is a "Wipe" section that's used to, well, wipe the phone, or optionally just wipe the cache. I went in sure I was only wiping the cache; it clearly stated it would erase stuff, and I still did it.