Better yet, why do my codecs have access to anything on my computer other than their input stream and output stream? They probably want to run on their own core anyway; isolate them and forget about them.
(Ok, I'll give you an H.264 or MPEG-2 being tied into my video hardware for great performance and low power consumption, but the rest of them…)
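Concretely, something like this: a minimal POSIX sketch of the "pipe in, pipe out" model, where the codec lives in its own process and never touches anything but fd 0 and fd 1. (The ./codec binary is hypothetical.)

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int in_pipe[2], out_pipe[2];          /* [0] = read end, [1] = write end */
        if (pipe(in_pipe) < 0 || pipe(out_pipe) < 0) {
            perror("pipe");
            return 1;
        }
        if (fork() == 0) {                    /* child: becomes the codec */
            dup2(in_pipe[0], STDIN_FILENO);   /* compressed bytes come in here */
            dup2(out_pipe[1], STDOUT_FILENO); /* decoded frames go out here */
            close(in_pipe[1]);
            close(out_pipe[0]);
            execl("./codec", "codec", (char *)NULL);
            _exit(127);                       /* exec failed */
        }
        /* parent: write the bitstream to in_pipe[1], read decoded
           frames from out_pipe[0], and otherwise forget about it */
        close(in_pipe[0]);
        close(out_pipe[1]);
        return 0;
    }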
I'm only going to cover the desktop/laptop space here. This approach actually makes a lot of sense in the smartphone/tablet market, especially with all the multi-core chips coming out.
It's true that the bare minimum a codec needs to do is take an input bytestream and produce the various output (video and audio) bytestreams. So, for hardware-accelerated video, this is pretty simple: all you need to do is pipe your input stream to the hardware, which handles all the magic of getting it out to the screen and speakers. However, for codecs without native hardware acceleration (i.e., not H.264 or MPEG-2), this process involves a lot more hardware and software to actually get something to the screen.
Codecs without native hardware acceleration need access to (at a bare minimum) the sound and video system APIs. Unless the codec developers are certifiably insane, the codec will be using DirectX (on Windows) or OpenGL (everywhere else) to handle the actual writing to the screen, especially if you want to use the video hardware for faster decoding. While it's possible for browsers to wrap those API calls, doing so creates a lot more work for every developer involved and buys little security at the cost of both complexity and performance. So it makes sense for any codec to have access to the lower-level APIs for displaying things on the screen.
Sandboxing the process should be possible on the file-input side, but I'm not sure it's really possible for the outputs in the cases where the codec is using a system-level API.
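For the input side, Linux actually ships a mechanism that enforces exactly that contract: seccomp strict mode. After the prctl() call below, the kernel allows only read(), write(), _exit(), and sigreturn(); any other syscall (including open()) kills the process. A sketch, with a placeholder passthrough standing in for the real decoder:

    #include <linux/seccomp.h>
    #include <sys/prctl.h>
    #include <unistd.h>

    static void decode_loop(int in_fd, int out_fd) {
        char buf[4096];
        ssize_t n;
        while ((n = read(in_fd, buf, sizeof buf)) > 0)
            write(out_fd, buf, n);            /* real decoding would go here */
    }

    int main(void) {
        /* open every descriptor you need BEFORE this point;
           open() is forbidden inside the sandbox */
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            _exit(1);
        decode_loop(STDIN_FILENO, STDOUT_FILENO);
        return 0;
    }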
EDIT: After reading what I wrote, this is a good argument for the browsers themselves wrapping the codecs and placing them in a sandbox. Unfortunately, that will never happen, because of the legal minefield almost all codecs sit in right now...
Actually, the codecs themselves usually aren't responsible for displaying to the screen via DirectX, OpenGL, etc. In the case of <video>, the browser needs to be able to mix the video into the web page, which means the browser needs the video data in its own memory, or as a pixmap/texture in video card memory.
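To make that browser-side step concrete, here's a rough sketch in plain OpenGL of uploading a decoded frame as a texture the compositor can then mix into the page like any other layer. This assumes a current GL context and a tightly packed RGB frame; the function name is made up.

    #include <GL/gl.h>

    GLuint upload_frame(const unsigned char *rgb, int width, int height) {
        GLuint tex;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        glPixelStorei(GL_UNPACK_ALIGNMENT, 1);   /* rows are tightly packed */
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB,
                     width, height, 0,
                     GL_RGB, GL_UNSIGNED_BYTE, rgb);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        return tex;   /* the compositor draws this into the page */
    }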
But most codecs are going to be written in C and assembly language for speed, which brings the potential for buffer overflows and other low-level exploits. Plus, that video data does eventually make it to the kernel and then the video hardware (often via a separate overlay interface like DirectShow, Xv, or VDPAU, though that is probably not the case with web browsers), so a vulnerability at any point along the chain is a serious issue.
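For a flavor of that bug class, here's the classic pattern: a length field read straight off the untrusted bitstream and trusted. Both functions and the chunk layout are made up for illustration.

    #include <stdint.h>
    #include <string.h>

    #define MAX_CHUNK 4096

    /* VULNERABLE: chunk_len is attacker-controlled, so a crafted
       file can ask for a copy far larger than the MAX_CHUNK-byte buf. */
    void parse_chunk_bad(const uint8_t *stream, uint8_t *buf) {
        uint32_t chunk_len;
        memcpy(&chunk_len, stream, 4);       /* straight off the wire */
        memcpy(buf, stream + 4, chunk_len);  /* overflow if > MAX_CHUNK */
    }

    /* FIXED: validate against both source and destination sizes. */
    int parse_chunk_ok(const uint8_t *stream, size_t stream_len,
                       uint8_t *buf, size_t buf_len) {
        uint32_t chunk_len;
        if (stream_len < 4) return -1;
        memcpy(&chunk_len, stream, 4);
        if (chunk_len > buf_len || chunk_len > stream_len - 4)
            return -1;                       /* reject, don't trust */
        memcpy(buf, stream + 4, chunk_len);
        return (int)chunk_len;
    }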