Hacker News

You're getting hammered for this, but for some jobs you're absolutely right. If you can't handle being put on the spot for a simple interview question, how are you going to go when a production line is down and it's costing your client $100k/hr, and the foreman's standing looking over your shoulder wanting to know when it'll be working?

Not all jobs give you all the time in the world on a test server without interruptions.



All these "yeah but imagine this horror scenario" descriptions (sorry to pick on your example) sound like badly managed situations. Even in a total failure like that, the foreman shouldn't be looking over your shoulder. The first step is to take a deep breath, assess the overall situation, and get the programmers into their most comfortable mode to fix it. That can very well mean letting them think for an hour before doing anything. You can ignore that 100k price tag, because it just leads to "just do something" thinking. It's very likely that manically hacking away will end up making the situation worse and costing more.

Also, these situations can be trained for to a certain degree. There are well-established precedents in crisis management. If you write code for a customer with a 100k/h downtime risk, you should have done a very thorough risk analysis and have mitigation plans for most scenarios. The unknown unknowns are obviously the hard cases, but you should be able to justify that these are going to be expensive no matter what. And even for these cases, bricolage is trainable to a certain degree.

If you know failure costs 100k/h, you should also state very specifically in the job description that you are looking for these stress-coping skills (and then the interviewee should come prepared).


In my experience, high-pressure "fix it now" scenarios are about leaning on what you know about the system, interpreting metrics and log messages, understanding how it's going wrong, and deploying a (usually very simple) fix. Sometimes just blindly hitting the "rollback" button as a first step fixes it, too.

Performing well in this kind of scenario is much less about algorithms and data structures, and much more about systems thinking and the level of your understanding of your system.


Both of you have points, but I absolutely agree with this. Outside of "Healthcare.gov goes live tomorrow and turns out it's completely broken" scenarios I've never seen something that would require a ground-up design and build of a system under emergency production time constraints.

System knowledge and knowing which one bolt to replace or turn is almost always more effective.

PS: Not to mention, if we're getting into algorithmic complexity levels of engineering on a monkey patch to a production system, then so many alarm bells are already ringing.


See my longer comment above. Last paragraph:

An interview situation is much more adversarial: they are looking for reasons not to hire you, and you are competing with others for a job only one of you can get! There is no such consideration on the job, unless the work environment is completely and ridiculously broken. When are you ever in a work situation where several people work on competing solutions and everybody but the person who made the winning one is fired? At work you are working together, and to solve a problem, not to weed you out of the pool.


If you've done your job properly in the first place you have sufficient testing and redundancy in place that the situation you're referring to is pretty much impossible.

If there's ever going to be a situation where something costs $100k/hour if it's offline then you have 5 test servers and 2 or 3 load-balanced production servers so a failure doesn't take you offline, and you've written procedures and processes to handle critical problems long before they happen.

You definitely don't try to hotfix things while the foreman complains. That will make things worse.


That's a terrible analogy. In a production crisis, you have full knowledge of and access to the system being diagnosed, have all your standard tools and information sources available to you, and are able to enlist your co-workers to aid you. Been there, done that, and it's nothing like an interview whiteboarding session.



