This comment is mostly why I have never trodden in the feared world of multimedia...

jfb · on Oct 12, 2014

Audio and video really aren't that closed; they just require a commitment to learning things that are pretty different from what a typical programmer does, and they get math-heavy. But they're really, really fun to play with once you knuckle down, and there's a fairly gentle learning curve from understanding formats and containers to samples to statistical signal processing and psychoacoustic and visual models.

And being good at it means writing your own ticket, if you're a careerist.

lifeisstillgood · on Oct 12, 2014

Is there a recognised / common syllabus for that gentle learning curve? It might make a worthwhile hobby ... but it's definitely a "one day" for the moment.

chipsy · on Oct 12, 2014

I don't know of a defined syllabus but here is what I did:

1. Write some basic synthesis code (a naive oscillator and a volume envelope) and some way of sequencing it. Write little toy trackers and some procedural audio with this technique.

2. Try and fail a few times to write a dataflow engine for audio like Pure Data. (This is kind of a big project, and it turns out you really don't need it to experiment.)

3. Write a Standard MIDI File playback system for a one-voice PC beeper emulation. This turned out to be a gateway drug for learning more in depth, because all you have to do is add "just one more" feature and every MIDI file you play sounds a little better.

4. Expand the MIDI playback and the synthesizer in tandem. End up with a polyphonic .WAV sampler, then SoundFont playback. Learn everything about DSP badly, and gradually correct misconceptions. (DSP material can be tricky, since the math concepts map to illegible code and a lot of the resources are for general engineering applications instead of audio.)

5. Rewrite things and work on a custom subtractive synthesizer, after getting everything wrong a few times over in the first synthesis engine. Still do not know how to write nice-sounding IIR filters; steal a good free implementation instead.
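
For anyone wondering what step 1 looks like in practice, here is a minimal sketch in Python, not the code described above. It assumes a sine oscillator, a linear attack/release envelope, a 44.1 kHz sample rate, and a hard-coded note list standing in for a sequencer; `render_note`, `render_sequence`, `write_wav`, and the `toy.wav` filename are all names made up for this example.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (an assumption; any common rate works)


def envelope(i, total, attack, release):
    """Linear attack/release volume envelope; returns a gain between 0 and 1."""
    if i < attack:
        return i / attack
    if i >= total - release:
        return (total - i) / release
    return 1.0


def render_note(freq_hz, seconds, amplitude=0.5):
    """Naive sine oscillator shaped by the envelope; returns a list of floats."""
    total = int(SAMPLE_RATE * seconds)
    ramp = max(1, int(0.01 * SAMPLE_RATE))  # 10 ms ramps to avoid clicks
    return [
        amplitude
        * envelope(i, total, ramp, ramp)
        * math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(total)
    ]


def render_sequence(sequence):
    """Play notes back to back; sequence is a list of (frequency Hz, seconds)."""
    samples = []
    for freq_hz, seconds in sequence:
        samples.extend(render_note(freq_hz, seconds))
    return samples


def write_wav(path, samples):
    """Write the samples as mono 16-bit PCM."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(frames)


# C major arpeggio: C4, E4, G4, C5
SEQUENCE = [(261.63, 0.25), (329.63, 0.25), (392.00, 0.25), (523.25, 0.5)]

if __name__ == "__main__":
    write_wav("toy.wav", render_sequence(SEQUENCE))
```

Swapping `math.sin` for a square or sawtooth function, or letting notes overlap instead of playing back to back, is the "just one more feature" path from here.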

And that is where I stand today. I know enough to engineer complex signal chains, how some of the common formats are structured, and some tricks for improving sound quality and optimizing CPU time. What I miss is the background for writing original signal processing algorithms, which gets really specialized (there are some people who devote themselves entirely to reverbs, for example). These algorithms and their sound characteristics are effectively trade secrets, so the opaqueness of the field is not just a matter of the problems being hard - DSP just hasn't become as commodified as other software.

lifeisstillgood · on Oct 13, 2014

With due respect to yourself (and jfb), the first paragraph is what I am talking about. I can just about guess that you mean: produce a sine wave, chop it into time slices, and output those in some audio format.

I think I will give it a go shortly - close your ears :-)

jfb · on Oct 13, 2014

Chipsy's peer comment is excellent. For video I'd:

1. write an MPEG-4 parser -- this is much simpler than you probably think;

2. decode the H.264 metadata;

3. decode the H.264 picture data and write it to files, one per frame -- do not be ashamed to use an existing decoder!

4. put these frames back into an MJPEG file, for instance;

5. try your hand at developing a DEAD SIMPLE I-frame only video codec, using e.g. zip for the frames.
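
To give a feel for why step 1 is simpler than it sounds, here is a rough Python sketch. It assumes "MPEG-4 parser" means walking the box structure of an .mp4 file (the ISO base media file format), where every box starts with a 4-byte big-endian size and a 4-byte type. `iter_boxes`, `dump`, and the `CONTAINERS` set are names invented for this example, and the container list is deliberately partial.

```python
import struct

# Box types that only hold other boxes (a partial list, enough for a first look).
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl", b"edts", b"dinf"}


def iter_boxes(data, start=0, end=None):
    """Yield (type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the enclosing range.
            size = end - pos
        if size < header or pos + size > end:
            break  # truncated or corrupt; stop rather than guess
        yield box_type, pos + header, pos + size
        pos += size


def dump(data, start=0, end=None, depth=0):
    """Return an indented listing of the box tree, one line per box."""
    lines = []
    for box_type, p0, p1 in iter_boxes(data, start, end):
        name = box_type.decode("latin-1")
        lines.append("  " * depth + f"{name} ({p1 - p0} bytes)")
        if box_type in CONTAINERS:
            lines.extend(dump(data, p0, p1, depth + 1))
    return lines
```

Reading a file into memory with `open(path, "rb").read()` and printing `"\n".join(dump(data))` should list the top-level boxes such as `ftyp`, `moov`, and `mdat`. Decoding what is inside the leaf boxes (sample tables, codec configuration) is where step 2 begins.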

This could show you whether you are interested in video without too much conceptual overhead. I had a friend and coworker who did #5 in Ruby, so don't think you need to get into GPU vectorization and signals theory right away.
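
As a hedged illustration of how small #5 can be, here is a sketch in Python (not the Ruby version mentioned above). It assumes raw RGB24 frames, uses `zlib` (the same DEFLATE compression zip uses) on each frame independently, and wraps the result in a made-up layout: a `TOYV` tag, width, height, frame count, then length-prefixed frames. None of this is a real format.

```python
import struct
import zlib

MAGIC = b"TOYV"  # hypothetical format tag, invented for this sketch


def encode(frames, width, height):
    """Compress raw RGB24 frames (each width*height*3 bytes) one at a time."""
    frames = list(frames)
    out = [MAGIC, struct.pack(">HHI", width, height, len(frames))]
    for frame in frames:
        if len(frame) != width * height * 3:
            raise ValueError("frame has the wrong size")
        # Each frame is compressed on its own, so every frame is an I-frame.
        packed = zlib.compress(frame)
        out.append(struct.pack(">I", len(packed)))
        out.append(packed)
    return b"".join(out)


def decode(blob):
    """Inverse of encode(); returns (width, height, list of raw frames)."""
    if blob[:4] != MAGIC:
        raise ValueError("not a TOYV stream")
    width, height, count = struct.unpack_from(">HHI", blob, 4)
    pos = 12  # 4-byte tag plus 8-byte header
    frames = []
    for _ in range(count):
        (n,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        frames.append(zlib.decompress(blob[pos:pos + n]))
        pos += n
    return width, height, frames
```

Because no frame depends on another, seeking is just skipping length-prefixed chunks. The obvious next experiment is storing the difference from the previous frame instead, which is the first step toward P-frames.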