Is it really _that_ hard to create a generic video decoding DSP whose firmware could be updated? Most codecs are very similar to each other. IIRC Texas Instruments used multicore DSPs to decode MPEG back in the '90s.
Or maybe we should have written codecs to be amenable towards GPU shader implementation...
The problem with a generic codec DSP is how fast do you make it? Newer codecs often require twice as much computation as older ones, so do you make the DSP twice as fast as you need today and hope that's enough to run future codecs? Meanwhile you're wasting transistors on a DSP that won't be fully used until years later.
To some extent the PS3 did this; the Cell SPEs were fast enough to implement Blu-ray and streaming video playback in software and they made several updates over the life of the PS3.
> Or maybe we should have written codecs to be amenable towards GPU shader implementation...
They are, but programmable GPU shaders are nearly always more power-hungry than fixed-function, purpose-specific silicon. That's why many key parts of GPUs are still fixed function, including triangle rasterization and texture sampling/filtering.
I think one of the biggest use cases for encoding is game streamers (is that right?). They should have decent dGPUs anyway, so their iGPU is just sitting there.
Elemental wrote a GPU shader h264 encoder for the Radeon 5870 back in the day, marketed towards broadcasters who needed quality and throughput: https://www.anandtech.com/show/2586
Intel used to write hybrid encoders (that used some fixed function and some iGPU shader) for their older iGPUs.
So the answer is yes... if you can fund the right group. But video encoders are hard. The kind of crack developer teams that can pull this off don't grow on trees.
Shaders have little benefit for anything with "compression" in the name. (De)compression is maximally serial/unpredictable, because if any part of the data were still predictable, it wouldn't be fully compressed.
People used to want to write them because they thought GPU = fast and shaders = GPU, but this is just evidence that almost no one knows how to write a video codec.
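One way to see the serial dependency: a toy LZ77-style decoder (the token format here is invented for illustration, not any real codec's bitstream). Each back-reference copies from output that must already exist, so token i+1 cannot be expanded before token i:

```python
# Toy LZ77-style decoder (hypothetical format, illustration only).
# A token is either a literal byte (int) or a (distance, length)
# back-reference into the output produced so far -- which is why the
# loop is inherently serial.

def decode(tokens):
    out = bytearray()
    for t in tokens:
        if isinstance(t, int):            # literal byte
            out.append(t)
        else:                             # (distance, length) back-reference
            dist, length = t
            for _ in range(length):
                out.append(out[-dist])    # depends on bytes just written
    return bytes(out)

# Literals 'a', 'b', 'c', then "copy 6 bytes starting 3 back" (the copy
# overlaps its own output), then literal 'x':
decoded = decode([97, 98, 99, (3, 6), 120])  # b"abcabcabcx"
```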
The Elemental encoder was supposedly quite good, but it was a different time. There was no hardware encoding, and high res realtime h264 was difficult.
That's not really true; the motion estimation stage is highly parallel. Intel's wavefront-parallel GPU motion estimation was really cool. Real world compression algorithms are nowhere close to optimal partly because it's worth trading off a little compression ratio to make the algorithm parallel.
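A minimal sketch of why that stage parallelizes (helper names and the full-search strategy are my own simplification, not any real encoder's search): every block's search reads only the current and reference frames, never another block's result, so all blocks can be searched concurrently, e.g. one GPU work-item per block.

```python
# Toy full-search block motion estimation (simplified illustration;
# real encoders use much smarter search patterns).
import numpy as np

def sad(a, b):
    # Sum of absolute differences between two equal-sized blocks.
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def best_motion_vector(cur, ref, bx, by, bsize=8, search=4):
    # Exhaustively search a +/-`search` pixel window for the lowest-SAD match.
    # Depends only on the two frames -- no cross-block state, so each block
    # can be processed in parallel.
    block = cur[by:by + bsize, bx:bx + bsize]
    best_mv = (0, 0)
    best_cost = sad(block, ref[by:by + bsize, bx:bx + bsize])
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if 0 <= y <= ref.shape[0] - bsize and 0 <= x <= ref.shape[1] - bsize:
                cost = sad(block, ref[y:y + bsize, x:x + bsize])
                if cost < best_cost:
                    best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost

# Synthetic check: current frame is the reference shifted by (dx=2, dy=1).
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (32, 32), dtype=np.uint8)
cur = np.roll(ref, (1, 2), axis=(0, 1))
mv, cost = best_motion_vector(cur, ref, 8, 8)  # finds mv == (-2, -1), cost == 0
```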
IIRC x264 does have a lookahead motion estimation that can run on the GPU, but I wasn't sure I could explain this properly.
That said, I disagree: while motion estimation is parallel, motion coding is not, because it has to be "rate-distortion optimal" (depending on your quality/speed tradeoff). Finding the best motion for a block depends on the entropy coder's state after the previous block, because coding inaccurate/biased motion can save a lot of bits.
That's why x264 and FFmpeg use per-frame CPU threading instead (I wrote the decoding side of it): the entropy coder resets across frames.
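A toy sketch of that serial dependency, under an invented rate model (`mv_bits` and the greedy loop are assumptions for illustration, not any real codec's RDO):

```python
# Toy model: the bit cost of a motion vector is the cost of coding its
# difference from the previous block's vector (the predictor). Block i's
# rate-distortion-optimal choice therefore depends on block i-1's choice,
# forming a serial chain across the row.

def mv_bits(mv, pred):
    # Invented rate model: larger differences from the predictor cost more bits.
    return abs(mv - pred) + 1

def choose_mvs(candidates, lam=2):
    # Greedy RD optimization over a row of blocks.
    # candidates[i] is a list of (mv, distortion) options for block i.
    pred, chosen = 0, []
    for opts in candidates:
        # Minimize distortion + lambda * rate; the rate depends on `pred`,
        # which was set by the *previous* block's decision.
        mv, _ = min(opts, key=lambda o: o[1] + lam * mv_bits(o[0], pred))
        chosen.append(mv)
        pred = mv  # serial: the next block's costs depend on this choice
    return chosen
```

Changing only the first block's choice flips the second block's optimum: `choose_mvs([[(0, 0)], [(0, 2), (6, 0)]])` picks `[0, 0]`, while `choose_mvs([[(5, 0)], [(0, 2), (6, 0)]])` picks `[5, 6]`.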