I agree in principle. But I've got to get stuff done in the real world. I don't have an extra week to read the 22,554 words of JUST the video/audio spec in HTML5, and try to figure out not just how all the browser versions differ from each other, but also how they differ from the spec:
Sometimes user-agent sniffing is the only remotely sane thing you can do. Blame the browsers, not the programmers who have to work around their "idiosyncrasies".
I'm cheerfully assuming that you're using distilled documentation and not the literal spec. If there is, say, conflicting documentation from Mozilla and Google, then we certainly have a bigger problem.
You would assume rightly, of course. Unfortunately, Mozilla documentation, which is the best around, is sorely lacking on the details of the HTML5 audio implementation, such as proper usage of the load() method and when it is or isn't supposed to be called. [1] And I don't really know what Google documentation you're talking about.
So that's the whole point. People who go on about "feature detection" and "follow the spec", and think it's that simple, really don't seem to understand what's actually going on.
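For anyone following along, here is roughly what the "feature detection" side of the argument tends to mean for audio. It is a minimal sketch, not anyone's production code: `CanPlayTypeLike`, `detectSupport`, and `pickSource` are names made up for illustration. The only real API it relies on is the media element's `canPlayType()`, which returns "", "maybe", or "probably". Note that this only reports what the browser claims about a format; it says nothing about whether load() and friends behave as documented, which is the complaint above.

```typescript
// Hypothetical minimal shape: just the part of a media element this sketch uses.
interface CanPlayTypeLike {
  canPlayType(mimeType: string): string;
}

type Support = "no" | "maybe" | "probably";

interface Candidate {
  src: string;  // URL of the audio file (illustrative)
  type: string; // MIME type, optionally with a codecs parameter
}

// Normalize the browser's answer. canPlayType() returns "", "maybe", or
// "probably"; a missing element or method is treated as "no".
function detectSupport(el: CanPlayTypeLike | null | undefined, mimeType: string): Support {
  if (!el || typeof el.canPlayType !== "function") return "no";
  const answer = el.canPlayType(mimeType);
  return answer === "probably" || answer === "maybe" ? answer : "no";
}

// Pick the first candidate rated "probably", else the first rated "maybe",
// else null (at which point you fall back to whatever workaround you have).
function pickSource(el: CanPlayTypeLike | null | undefined, candidates: Candidate[]): string | null {
  for (const wanted of ["probably", "maybe"] as const) {
    const hit = candidates.find((c) => detectSupport(el, c.type) === wanted);
    if (hit) return hit.src;
  }
  return null;
}
```

In a browser you would pass `document.createElement("audio")` as `el`. The element is taken as a parameter so the logic can be exercised without a DOM.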
Assume good faith.