This is one of my favorite papers. The core point that I think should get a lot more attention is this one:
> When manual take-over is needed there is likely to be something wrong with the process, so that unusual actions will be needed to control it, and one can argue that the operator needs to be more rather than less skilled, and less rather than more loaded, than average.
The "remaining" operational work once automation has done its job is more complex and weirder, and requires more knowledge and skill, than the basic task. On top of that, computers can (and do!) get systems into states that humans wouldn't, so those skills may exceed the skills needed by purely manual operators. This is something that people who work on things like driver aids talk about a lot, but I don't see it getting as much attention in the systems observability field.
> One can therefore only expect the operator to monitor the computer's decisions at some meta-level, to decide whether the computer's decisions are 'acceptable'. If the computer is being used to make the decisions because human judgement and intuitive reasoning are not adequate in this context, then which of the decisions is to be accepted? The human monitor has been given an impossible task.
This is particularly visible with things like data integrity in a complex database schema. Above a trivial scale, things both change too fast for a human to monitor, and change in ways that don't make sense to humans. When a human sees an anomaly, who's to know if it's expected?
For example, one of the things that computers do a lot in cloud "control planes" is placement optimization. I've got this compute job, or this data, or this packet: where should I put it? Computers can solve these kinds of problems much faster than humans, and much better as the dimensionality goes up. If things aren't looking as rosy as usual, is it because the workload has shifted and become harder to pack, or because something has gone wrong with the automation?
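To make the shape of the problem concrete, here's a toy sketch of placement as bin packing, using a greedy first-fit-decreasing heuristic. This is purely illustrative (the job names, single CPU dimension, and heuristic are all my invention, not any real control plane's algorithm); real placement optimizers juggle many dimensions at once, which is exactly where human intuition about "is this packing reasonable?" starts to fail.

```python
def place(jobs, capacity, num_hosts):
    """Greedy first-fit-decreasing: assign each job's CPU demand
    to the first host that still has room for it."""
    used = [0.0] * num_hosts  # CPU currently used on each host
    placement = {}
    # Placing the biggest jobs first tends to pack tighter.
    for job, demand in sorted(jobs.items(), key=lambda kv: -kv[1]):
        for h in range(num_hosts):
            if used[h] + demand <= capacity:
                used[h] += demand
                placement[job] = h
                break
        else:
            raise RuntimeError(f"no host can fit {job}")
    return placement

# Hypothetical workload: four jobs with fractional-CPU demands.
jobs = {"a": 0.6, "b": 0.5, "c": 0.4, "d": 0.3}
print(place(jobs, capacity=1.0, num_hosts=2))
# → {'a': 0, 'b': 1, 'c': 0, 'd': 1}
```

Even in this one-dimensional toy, an operator eyeballing the result can't easily tell whether a loosely packed fleet means the workload mix changed or the optimizer regressed; with real multi-dimensional constraints, that judgment is far harder.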
These problems, with humans operating numerical optimization processes, are only going to get harder and more relevant as ML techniques become more ubiquitous. At least we need to be prepared for some very tricky post-hoc analysis of computer decisions. Years of work to understand a millisecond of decisions might not be unusual.
Still a great, thought-provoking read after 37 years.