It's not just a fuzzer: it guarantees coverage of every part of the spec (subject to which "profile" you're implementing). It's also not free; it's a product sold to HEVC implementers for verification purposes.
AFL might get significantly more code coverage than these test streams do because it actively seeks out more code coverage by observing the behavior of an individual binary on various inputs.
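That feedback loop can be sketched in miniature. Here `run` stands in for an instrumented execution that reports which coverage points an input hit — the names and structure are illustrative, not AFL's actual internals:

```python
import random

def coverage_fuzz(run, seed, iters=2000):
    """Toy coverage-guided loop: mutate corpus entries and keep any
    mutant that reaches coverage we have not seen before."""
    corpus = [seed]
    seen = set(run(seed))
    for _ in range(iters):
        data = bytearray(random.choice(corpus))
        data[random.randrange(len(data))] = random.randrange(256)
        cov = run(bytes(data))
        if not cov <= seen:          # new coverage point discovered
            seen |= cov
            corpus.append(bytes(data))
    return corpus
```

Real fuzzers get `run`'s coverage signal from compile-time instrumentation or QEMU tracing; the point is only that inputs are judged by the paths they exercise, not by their output.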
You could imagine a parser that deals correctly with every single one of the test streams and implements every single feature in the spec, yet also has an undetected exploitable vulnerability because it made an assumption about objects' sizes (which the spec permitted it to make, but which an attacker could take advantage of).
(On the other hand, maybe I don't understand enough about H.264 to appreciate a reason why this isn't possible in this specific context.)
One extreme might be if you had a backdoor where the presence of a particular byte sequence in the input intentionally triggers some kind of malicious activity. The test streams can't detect this because they presumably don't contain that exact byte sequence, whereas something like AFL can find it because it can (potentially, depending on the nature of the test that recognizes the backdoor sequence) deduce what input would trigger coverage of that code path.
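A toy version of such a backdoor makes the asymmetry concrete. In compiled code, a byte-by-byte trigger check typically becomes one branch per byte, and that is exactly the signal coverage feedback exploits — each correct byte unlocks a new path, even though the observable output is identical for every near-miss (the function and trigger here are invented for illustration):

```python
def backdoored_parser(data: bytes) -> str:
    """Toy target: a byte-by-byte comparison leaks one new branch per
    correct byte of the trigger sequence."""
    trigger = b"\xde\xad\xbe\xef"
    for i, b in enumerate(trigger):
        if i >= len(data) or data[i] != b:
            return "normal"        # output identical for every near-miss
    return "backdoor!"             # reached only by the exact sequence
```

A fixed test-stream suite will never stumble on those exact four bytes, and a black-box fuzzer needs on the order of 2^32 tries; a coverage-guided fuzzer that sees each comparison as a separate branch needs only on the order of 4 × 256.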
Fuzzing is fun, and still there are easy-to-discover issues lurking in even widely used tools.
For example, I set up a site where I require users to upload an SSH key (for access to a git repository), and figured I'd do what GitHub etc. do in the display: show the fingerprint.
Given an SSH key you can get a fingerprint like so:
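The command itself didn't survive here; presumably it was something like `ssh-keygen -lf key.pub`. What that computes is just a hash of the base64-decoded key blob — a minimal Python equivalent of the OpenSSH-style SHA256 fingerprint:

```python
import base64, hashlib

def ssh_fingerprint(pubkey_line: str) -> str:
    """OpenSSH-style SHA256 fingerprint of a public key line
    like "ssh-ed25519 AAAA... comment"."""
    blob = base64.b64decode(pubkey_line.split()[1])
    digest = base64.b64encode(hashlib.sha256(blob).digest())
    return "SHA256:" + digest.rstrip(b"=").decode()
```

Even this small step — splitting and base64-decoding untrusted input — is exactly the kind of surface that fuzzing tends to crack.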
Yeah, AFL is an incredibly useful bit of work - I ran OpenJPEG through it and had a number of reasonably actionable bug reports for the maintainers after a few hours. That class of tool used to be a LOT noisier.
There is one in that list for the OpenBSD kernel in 2016. When tmpfs was removed from OpenBSD, the reason given was "lack of maintenance". Did the problem found by afl have nothing to do with the decision?
I don't think that's what that article is about. He intercepted images from Libyan internet traffic and, by some method akin to steganography, encoded Gaddafi's book, 'The Green Book', into the images.
Thank you for being the one to express the childlike wonder that I felt. The other comments are informative, but seemed to hit the technical aspects of this post (entirely appropriate on Hacker News!) as opposed to my simple, open-mouthed "wow!"
So afl-fuzz could technically be considered the most advanced universal software license key generator? :) I suppose the days are numbered for offline validation.
Ignoring the actual "fuzzing" going on here—wouldn't it be possible to use this approach (or something like it) to make a sort of 'universal wire-protocol auto-discovery-and-negotiation library'? Picture a function like the following:
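The code block for that function didn't survive here; the shape of the idea was presumably something like the following, where every name is hypothetical and `probe` abstracts "send bytes at the peer, get bytes back":

```python
from typing import Callable, Dict

def derive_client(probe: Callable[[bytes], bytes]) -> Dict[bytes, bytes]:
    """Hypothetical sketch: poke an opaque peer with candidate messages
    and return a map of the message formats it appears to respond to."""
    discovered = {}
    for seed in (b"\x00", b"GET / HTTP/1.0\r\n\r\n", b"HELO a\r\n"):
        reply = probe(seed)
        if reply:                   # any response at all is signal
            discovered[seed] = reply
    return discovered
```

A real version would mutate the seeds fuzzer-style rather than using a fixed list, and would return a generated client rather than raw observations.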
I imagine that if we're ever doing the Star Trek thing, gallivanting around in starships encountering random alien species with their own technology bases, this would be the key to anyone being able to meaningfully signal anyone else.
You're right; this approach wouldn't work with an "opaque peer", as most machines on the Internet would count as.
However, assuming the above sci-fi scenario—a cooperative peer that wants to communicate, but shares no common signalling standards with you—you could assume that everyone is doing broadcasts on loop of their daemon binaries to let their peers analyze them.
This would require some bootstrap logic on each side, to find what signalling methods their peer is using to broadcast the bootstrap. This could be made an approachable problem if everyone assumes everyone else will use "lowest-common denominator" signalling technologies for their bootstrap transmissions (e.g. binary over AM radio.)
You'd also have to figure out the ISA of the recovered binaries, in order to begin to fuzz them. This isn't impossible to do automatically either; it basically requires an "inverted" fuzz process—fuzzing up a VM with various decoding strategies and opcode definitions, and then seeing what it does when fed the provided binary, attempting to find something that produces a "normal" program execution (maybe according to a neural network trained on what memory-states of running programs usually look like.)
---
To jump back out of the sci-fi context, though—if you assume an "opaque peer" that nevertheless has its source code published online somewhere (e.g. as FOSS on GitHub), then you can do something like this:
1. set up a web spider that downloads source code from the internet, compiles it, and then fuzzes the resulting binaries (presumably all in a sandbox) to create mappings between project URLs and fingerprints of behavior under fuzzing.
2. when trying to communicate with an arbitrary server, first fuzz it a bit to try to fingerprint the response using the index.
3. Take the results as a fingerprint; feed them to the index service; get a source URL; clone the relevant project.
4. Fuzz-analyze the source code of the project along with the responses of the live peer, comparing the two as you go to try to get your local "mental model" of the peer as aligned as possible with how it's really behaving. (E.g., use the live peer's responses to generate the config file for your local copy of the project.)
5. With a behavior-matched local copy, generate an IDL client as above. (Which can, of course, be a very easy process if it turns out the server obeys some previously-discovered protocol.)
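Steps 2-3 above amount to collapsing a canonical set of probe/response observations into a stable key and looking it up in the index the spider built. A sketch, with the index contents left empty since they're whatever step 1 produced:

```python
import hashlib, json

def behavior_fingerprint(probe_responses: dict) -> str:
    """Collapse {probe: response} observations into a stable short ID,
    independent of the order the probes were sent in."""
    canon = json.dumps(sorted((p.hex(), r.hex())
                              for p, r in probe_responses.items()))
    return hashlib.sha256(canon.encode()).hexdigest()[:16]

# fingerprint -> project URL, as built by the spider in step 1
index: dict = {}
```

The canonicalization matters: two clients probing in different orders should land on the same index entry.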
Technically, you just need fine-grained enough measurement of the request/response delay, since you could probably use time of response to measure how many instructions had been executed in response to your fuzzed input.
Over the internet, this probably gets lost in the noise, but maybe with enough inputs, it wouldn't.
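A sketch of that measurement, taking the median of many trials to push back against network jitter (names illustrative; `send` is whatever transmits the fuzzed payload):

```python
import statistics, time

def timed_probe(send, payload: bytes, trials: int = 200) -> float:
    """Median round-trip time for one fuzzed payload; the median is far
    more robust to jitter than the mean for this kind of side channel."""
    samples = []
    for _ in range(trials):
        t0 = time.perf_counter()
        send(payload)
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)
```

Comparing `timed_probe` results across payloads would then serve as the coverage proxy.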
It's theoretically possible that two branches with relevantly different behavior will each result in the execution of an equal number of instructions, though. Then you can't actually tell that the behavior behind the scenes was different if the data sent over the wire is unchanged. (By contrast, AFL can tell that a branch was taken or missed even if the output is the same because it directly sees the execution state, just like in the beginning of this JPEG synthesis example where it noticed that the first byte should be 0xff even though that didn't change the output.)
But your idea is very clever and seems like a potential new application of timing attacks! Someone should definitely try this.
I'm not sure knowing the exit code is required. Inputs, both valid and invalid, will activate various paths in djpeg/libjpeg, and that's the basic goal here.
In case you'd like to test persistent fuzzing (which should be faster due to a few factors), you can try modifying https://github.com/google/honggfuzz/blob/master/examples/lib... - it will probably require writing a main() that reads data in an AFL_LOOP() loop and calls LLVMFuzzerTestOneInput with that input.
Is that feasible with source code instead of JPEG? Compilers and interpreters also have tons of warnings and errors that could help the fuzzer. Is anyone aware of such an experiment?
What if generating an image far too large to fit in memory were more likely than generating a normal-sized picture? Would the fuzzer be able to produce anything in practice?
It's very common when fuzzing. That's why you normally want to place memory limits on the target process, to avoid bringing the system down. AFL does that automatically, most other fuzzers have a config option.
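On POSIX systems that limit is a one-liner with `resource.setrlimit`. A sketch of capping a child target's address space before each run (the limit value is arbitrary; this is not how AFL does it internally, but AFL's `-m` flag serves the same purpose):

```python
import resource, subprocess

def run_capped(cmd, data: bytes, mem_bytes: int = 1 * 1024 ** 3):
    """Run one fuzz target with an address-space cap, so an absurdly
    large decoded image fails its allocation instead of thrashing the box."""
    def cap():
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
    return subprocess.run(cmd, input=data,
                          capture_output=True, preexec_fn=cap)
```

The oversized input then shows up as a fast, clean failure in the target rather than as swap death for the whole machine.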
Running Windows as guest instead of as host OS has made my life significantly easier. The sole exception used to be games, but that's getting better, too.
honggfuzz (http://honggfuzz.com) will run under Cygwin - it uses similar algorithms (feedback driven by code-coverage metrics), and last time I checked it under Cygwin it worked well with clang's -fsanitize-coverage=trace-pc modes.
Why? The performance hit on VMs isn't that big, and if you really care, why don't you just run one? 7 hours vs. 27 isn't that huge a difference in the grand scheme of things...