Tom MacWright

tom@macwright.com

Fuzz Testing

Fuzz testing is a neat and under-appreciated way to find bugs in software by brute-force.

Fuzzing is just creating random invalid input and seeing what happens. With enough invalid input, it’s often possible to uncover stuff that will crash servers and cause JavaScript to throw exceptions.

Fuzzing is especially great for parsers. Let’s say you have a parser for math expressions:

2 + 5 = 7

A fuzzer will mutate this into a bunch of invalid variants and run them against your code. The vast majority will be invalid, some minority will be valid but wrong, and then, usually, there will be a few that throw a wrench in the gears and cause a crash:

2 + 5= 7 // valid
2 5= 7 // invalid
2 + 5 = 7fdsaflka // invalid
.+ 5 = 7fdsaflka // CRASH
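The core mutation step is easy to sketch. This is a toy illustration of the idea, not the internals of any real fuzzer:

```javascript
// Sketch of a naive string mutator: pick a random position, then
// delete, duplicate, or replace the character there.
function mutateString(str) {
  var i = Math.floor(Math.random() * str.length);
  var op = Math.floor(Math.random() * 3);
  if (op === 0) return str.slice(0, i) + str.slice(i + 1);          // delete a character
  if (op === 1) return str.slice(0, i) + str[i] + str.slice(i);     // duplicate a character
  var c = String.fromCharCode(33 + Math.floor(Math.random() * 94)); // random printable character
  return str.slice(0, i) + c + str.slice(i + 1);                    // replace a character
}

// Generate a handful of mutated variants of a known-good input.
var variants = [];
for (var n = 0; n < 5; n++) variants.push(mutateString('2 + 5 = 7'));
```

Run a few thousand of these variants through a parser wrapped in try/catch, and the crashing inputs fall out.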

In The Wild

I’ve written a handful of parsers and generators, like tokml, togeojson, geojsonhint, and others. They’re usually satisfying to write, because they can be purely functional and sometimes get to use good specs, like GeoJSON, as their basis.

These libraries are deployed in places like geojson.io, mapbox.com, and elsewhere. Geojson.io has been essential to the development process, because it’s hooked up to getsentry, a tool that catches crashes and errors and emails me backtraces. This way, I’ve learned about a lot of bizarre ways in which files can be invalid: weird variations on KML, GeoJSON, WKT, and absolutely everything else.

I’ve learned that the range of data malformations and screwups is far wider than anyone can imagine: every possible variation exists in the wild. So, fuzzers to the rescue: I wrote a tiny fuzzing library called fuzzer and ran it against a few libraries. It quickly identified corner cases in tokml, a KML generator; in wellknown, a WKT parser; and elsewhere. fuzzer then grew a binary called fuzz-get that runs mangled GET requests against an API endpoint, so that I could make sure some new Mapbox web services wouldn’t crash.

The one really neat implementation detail of fuzzer is that it uses random-js with a fixed seed: the mutations look random, but they always come in the same series. So tests that pass locally will pass everywhere, forever.
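The principle can be sketched with a toy seeded generator. This is a stand-in for random-js, and the mutation is deliberately trivial; fuzzer’s real mutations are more varied, but the determinism works the same way:

```javascript
// Toy linear congruential generator: the same seed yields the same
// sequence on every run, on every machine.
function makeRandom(seed) {
  var state = seed >>> 0;
  return function() {
    // Numerical Recipes LCG constants; >>> 0 keeps state a 32-bit uint.
    state = (state * 1664525 + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// A mutation driven by the seeded generator: drop one character.
function mutate(str, rand) {
  var i = Math.floor(rand() * str.length);
  return str.slice(0, i) + str.slice(i + 1);
}

// Two runs with the same seed produce identical mutations.
var runA = mutate('POINT(1.1 1.1)', makeRandom(0));
var runB = mutate('POINT(1.1 1.1)', makeRandom(0));
console.log(runA === runB); // true
```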

For example, the tests for wellknown now include this fuzzing run:

// requires for tape, fuzzer, and wellknown (which exports its parser)
var test = require('tape');
var fuzzer = require('fuzzer');
var parse = require('wellknown');

test('fuzz', function(t) {
  fuzzer.seed(0);
  var inputs = [
    'MULTIPOLYGON (((30 20, 10 40, 45 40, 30 20)), ((15 5, 40 10, 10 20, 5 10, 15 5)))',
    'POINT(1.1 1.1)',
    'LINESTRING (30 10, 10 30, 40 40)',
    'GeometryCollection(POINT(4 6),\nLINESTRING(4 6,7 10))'];
  inputs.forEach(function(str) {
    for (var i = 0; i < 10000; i++) {
      try {
        var input = fuzzer.mutate.string(str);
        parse(input);
      } catch(e) {
        t.fail('could not parse ' + input + ', exception ' + e + '\n' + e.stack);
      }
    }
  });
  t.end();
});

There are several types of fuzzers, and fuzzer is the simplest: it mutates known ‘good’ input. Inside, all it’s doing is incrementing and decrementing numbers, chopping letters off of words, and messing with object properties: not rocket science. Down the line, it should learn to generate input from given models. It’s also somewhat limited in what it can produce: only JavaScript objects in and out. To really kick the tires on togeojson, it’ll need to generate and mutate XML documents. In other languages there are pretty solid, well-tested fuzzers, like rfuzz for Ruby, sulley for Python, fuzzdb, and untidy for XML.
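Those mutation operations are simple enough to sketch directly. These are illustrative stand-ins, not fuzzer’s actual code; each takes a rand() function returning a number in [0, 1):

```javascript
// Increment or decrement a number.
function mutateNumber(n, rand) {
  return rand() < 0.5 ? n + 1 : n - 1;
}

// Chop letters off the end of a word.
function chopWord(word, rand) {
  return word.slice(0, Math.floor(rand() * word.length));
}

// Mess with object properties: drop one at random.
function dropProperty(obj, rand) {
  var keys = Object.keys(obj);
  var key = keys[Math.floor(rand() * keys.length)];
  var copy = JSON.parse(JSON.stringify(obj)); // clone so the original survives
  delete copy[key];
  return copy;
}
```

For example, dropProperty({type: 'Point', coordinates: [1, 1]}, Math.random) can yield an object missing its type or its coordinates: exactly the sort of almost-valid input that trips up a GeoJSON-handling library.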

Not all code needs to be bulletproof. In the past, I’ve believed that core code had the right to crash on invalid input. But, for parsers, a ‘validate’ step is essentially a parse in and of itself, so where do you check? And given the hackiness of using try{}catch{} everywhere, and the idea that code is eventually exposed to the outside world, I think parsers must be more tolerant of problems.