Flow in Mapbox Studio

I have not avoided certainty

It has always just eluded me

I wish I knew

I wish I knew for true

Errors, bugs and mistakes are the dark matter that surrounds working code. Software favors entropy, disintegrating code standards, from the moment a program grows beyond console.log('hello world'). Mapbox Studio is a large piece of software with many types of potential bugs, enabled by technical debt and fast development cycles. Most large software is the same way.

I’ll get to the point soon, but first there are two principles.

The Rule of Repair: When you must fail, fail noisily and as soon as possible.

This is one of the ‘Unix principles’, and I think it’s one of the most interesting because newbies tend to think the opposite. I’ve seen code that wraps a try/catch block around every place an error has occurred, trying to silence errors wherever possible.

And isn’t a program that works for a few minutes before crashing better than one that crashes immediately? Who would want to expedite failure, rather than defer it? But to say this differently - failing ‘soon’ means failing at the automated testing stage, or at your text editor’s linting stage, or during QA, rather than in production. Wouldn’t you prefer all of those, to failing under the eyes of an actual end-user?
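To make the Rule of Repair concrete, here is a minimal sketch of the two attitudes side by side. The function names and the port setting are hypothetical, invented only for illustration.

```javascript
// Hypothetical example: reading a port number out of user-supplied JSON.

// Quiet version: swallows every error, so bad input turns into `undefined`
// and surfaces much later, far away from the real cause.
function readPortQuietly(json) {
  try {
    return JSON.parse(json).port;
  } catch (e) {
    return undefined;
  }
}

// Noisy version: fails immediately, with a message that names the problem.
function readPortNoisily(json) {
  const config = JSON.parse(json); // throws SyntaxError on malformed JSON
  if (typeof config.port !== 'number') {
    throw new TypeError('config.port must be a number, got ' + typeof config.port);
  }
  return config.port;
}
```

Given a config with a typo, like '{"prot": 8080}', the quiet version returns undefined and lets the program carry on, while the noisy version throws right where the mistake is.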

Here are some errors organized by time:

Fast: syntax highlighting in vim tells me I forgot to write the closing ‘
git pre-commit hook runs npm test and alerts me of a new logic error
CircleCI runs browser integration tests in Sauce Labs and finds a new production error
A logic error sneaks through static analysis and is triggered by a user’s unusual configuration
Slow: A memory leak grows slowly and mysteriously takes down a server every month

Note that these are also ordered by level of aggravation.

The diversity of failure: Errors come in many different kinds. Read the messages, understand the species and types of the errors. The types and details matter.

I wrote a bit about this in errors and bugs. Like the rule of repair, people usually start off on the opposite side. There’s a binary concept of ‘working’ or ‘not working’, especially when you haven’t used developer tools and seen error messages.

Knowing the varieties of errors pays off. Once you have a working knowledge of TypeError versus ReferenceError, you’ll be able to glance at your developer tools and know not just where the error occurred, but the most likely cause and fix - given just 100 chars or so of information.
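As a rough illustration of how different these species are, the sketch below triggers three of them on purpose and records each error’s name. The helper errorNameOf is made up for this example.

```javascript
// Run a function and report the name of whatever error it throws.
function errorNameOf(fn) {
  try {
    fn();
    return 'no error';
  } catch (e) {
    return e.name;
  }
}

// ReferenceError: the identifier does not exist at all.
const missingVariable = errorNameOf(() => undefinedVariable + 1);

// TypeError: the value exists, but is used in a way its type does not allow.
const notAFunction = errorNameOf(() => ({ hello: 'Tom' }).hi());

// SyntaxError: the text cannot be parsed. It is triggered through JSON.parse
// here, because a real syntax error would stop this file from loading at all.
const badSyntax = errorNameOf(() => JSON.parse('{ unquoted: 1 }'));

console.log(missingVariable, notAFunction, badSyntax);
// prints "ReferenceError TypeError SyntaxError"
```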

And in the longer term, knowing the types of errors enables pattern matching. Instead of reviewing a day’s work and saying ‘I sure had a lot of errors today’, you can say ‘I had a lot of SyntaxErrors’ and you can actually construct a plan of avoidance.

For instance:

SyntaxErrors: neomake or syntastic for vim, or the eslint plugins for Atom or Sublime Text
ReferenceErrors: eslint will catch these too
TypeErrors: flow or typescript

Obligatory note that none of these are a silver bullet. Never assume anything is a silver bullet.

eslint

eslint is an unsung hero of the JavaScript world and Mapbox’s software. It’s the successor to JSHint, and both are static analysis tools: software that analyzes your code statically, without running it. In eslint’s case, that analysis is usually detection of code style problems like semicolon rules. But eslint has a tiny bit of actual, real, amazing bug-finding ability in its no-undef rule.

Before, we could write code like var b = a; and find this bug haunting us in production:

var b = a;
^
ReferenceError: a is not defined

eslint doesn’t fix the bug or cover it up: it makes your software fail sooner by failing eslint’s no-undef check. Through eslint, this code produces this error as soon as it hits eslint - whether in near-realtime in your editor, in the npm test script, or in continuous integration like CircleCI or TravisCI.

/Users/tmcw/src/undef.js
  1:5  error  "b" is defined but never used  no-unused-vars
  1:9  error  "a" is not defined             no-undef

✖ 2 problems (2 errors, 0 warnings)

Simply introducing eslint to a number of Mapbox projects identified crashing bugs within a few minutes. These bugs were in code-reviewed, fairly well-tested software that had been in production for months or years.

Flow

Flow is a powerful piece of software, and an intimidating one at first. It’s a “static type checker” that requires hand-written type annotations in your code and is powered by a magical backend written in OCaml. Its issue tracker is filled with people requesting specific sorts of type inference, refinements, unions, sentinels, and other dungeon creatures.

Culturally, it’s important to understand that this is partially a reflection of type-oriented programming: there’s a significant subset of programmers whose first priority is to get a robust and complex type system in place that represents all of the needs of their application, and once that’s done, the program should write itself. Like the functional programming set, there’s a jungle of jargon and some legitimate but distant mathematical underpinnings for this approach.

Type obsession is not for me: I view Flow as a better eslint. An eslint you have to try harder to use, but rewards you so much more in the end.

So what does Flow do, for us?

Flow can find null references

Like eslint, Flow identifies missing variables that would cause ReferenceErrors. But eslint isn’t able to think beyond the existence of a variable. This code

var world = { hello: 'Tom' };
console.log(world.hi);

travels through eslint without a single complaint, even though it logs undefined.

But through flow:

/* @flow */
var world = { hello: 'Tom' };
console.log(world.hi);

It properly warns you before the code even runs:

undef.js:3
  3: console.log(world.hi);
                       ^^ property `hi`. Property not found in
  3: console.log(world.hi);
                 ^^^^^ object literal

Magic.

This is a contrived example. But much more common is the instance where an object actually comes across a module boundary:

/* @flow */
var fs = require('fs');
fs.readFileSink('foo');

Produces:

undef.js:3
  3: fs.readFileSink('foo');
        ^^^^^^^^^^^^ property `readFileSink`. Property not found in
  3: fs.readFileSink('foo');
     ^^ module `fs`

It’s syntactically valid JavaScript code, but it crashes the moment you run it: Flow makes it crash even sooner.

Flow enforces argument types

Mapbox Studio has a few utilities that are required throughout the application, like redux action creators and mathematical utilities. It’s essential that these are called correctly and that when we refactor a shared module the changes are reflected application-wide.

This was the first place we did type-checking, but we started with tcomb. It’s a library that does similar type-checking to Flow, except in vanilla JavaScript - it doesn’t require extra preprocessing.

We would write action creators like

destroy: func([Obj, maybe(Str)], Prom).of(function(source, loadingType = 'blocking') {
  return async(MapboxClient.mapDestroy(source.get('id'))
    .then(() => {
      dispatch({
        actionType: SourceConstants.SOURCE_DESTROY,
        value: source.get('url')
      });
  }), actions.source.destroy, arguments, loadingType);
}),

Wrapping the creator in tcomb’s func method and giving types to it meant that calling it with any other types would instantly make the application throw an error. This prevented actions from ever introducing sneaky errors into the application and made refactoring more predictable, but still only triggered the errors in runtime.
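The idea of runtime argument checking can be sketched in a few lines of plain JavaScript. This is a hand-rolled stand-in, not tcomb’s actual API: checked, isObject, isString and maybe are all names invented for this example, and the action creator is reduced to a stub.

```javascript
// Wrap a function so that every call validates its arguments against a
// list of predicates before the original function runs.
function checked(predicates, fn) {
  return function (...args) {
    predicates.forEach((predicate, i) => {
      if (!predicate(args[i])) {
        throw new TypeError('argument ' + i + ' failed check ' + predicate.name);
      }
    });
    return fn.apply(this, args);
  };
}

const isObject = (x) => typeof x === 'object' && x !== null;
const isString = (x) => typeof x === 'string';

// maybe(p) accepts a missing value (undefined or null) or anything p accepts.
const maybe = (predicate) => {
  const maybePredicate = (x) => x === undefined || x === null || predicate(x);
  return maybePredicate;
};

// Hypothetical stub with the same argument shape as `destroy` above.
const destroy = checked([isObject, maybe(isString)], function (source, loadingType = 'blocking') {
  return { id: source.id, loadingType };
});
```

Calling destroy({ id: 'abc' }) works, while destroy('abc') throws a TypeError on the spot. That is noisy, but the error still only appears once the offending call actually runs.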

With flow we write the same action like

destroy: function(source: Object, loadingType: ?string = 'blocking'): Promise {
  return async(MapboxClient.mapDestroy(source.get('id'))
    .then(() => {
      dispatch({
        actionType: SourceConstants.SOURCE_DESTROY,
        value: source.get('url')
      });
  }), actions.source.destroy, arguments, loadingType);
},

This way, if we change the signature of an action, we can run flow and immediately see all the places it might be called with incorrect parameters. It’s much less guesswork and has made refactors easier.
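Here is a rough sketch of that refactoring scenario. It uses Flow’s comment-style annotations so the file stays valid JavaScript without a build step, which is a substitution made for this example rather than how the Studio code is written. The destroy function is a simplified, hypothetical stand-in for the real action creator.

```javascript
/* @flow */
// Suppose `source` used to be a plain id string and was refactored
// into an object with an `id` property.
function destroy(source /*: { id: string } */, loadingType /*: ?string */ = 'blocking') /*: string */ {
  return loadingType + ':' + source.id;
}

// Updated call site: matches the new signature.
const result = destroy({ id: 'tmcw.abc123' });

// A stale call site such as destroy('tmcw.abc123') is the kind of thing
// Flow should flag statically. Without Flow it runs without complaint and
// quietly produces 'blocking:undefined'.
```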

Flow works really well with React

Mapbox Studio uses React for its components. React Components are similar, externally, to HTML’s elements: they accept attributes and render the results. We heavily use React’s PropTypes to enforce the types of their attributes (in React’s lingo, props). PropTypes are pretty powerful: you can specify whether a prop is required, whether it’s a string, or an array, or even a complex object with a specific structure. In practice, they’re assistive type-checking: if you give an incorrect type or forget a prop, you get an informative error rather than something deeper in your application.

Flow has been astonishingly effective at identifying a specific sort of PropTypes mistake we’ve been making:

Here’s an example module:

/* @flow */
var React = require('react');

var Example = React.createClass({
  propTypes: {
    val: React.PropTypes.any.isRequired,
    onChangeMultiCheckbox: React.PropTypes.func
  },
  onChangeMultiCheckbox() {
    this.props.onChangeMultiCheckbox(this.props.val);
  },
  render(): React.Element {
    return (<div key={this.props.val}
      onClick={this.onChangeMultiCheckbox}>
      {this.props.val}
    </div>);
  }
});

module.exports = Example;

Perhaps you can spot the error? Flow can too:

example.js:10
 10:     this.props.onChangeMultiCheckbox(this.props.val);
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ call of method `onChangeMultiCheckbox`. Function cannot be called on possibly undefined value
  7:     onChangeMultiCheckbox: React.PropTypes.func
                                ^^^^^^^^^^^^^^^^^^^^ undefined


Found 1 error

Flow notices that we haven’t marked the onChangeMultiCheckbox prop as required, but we treat it as if it’ll always be there. Because of this mistake, we can instantiate an <Example> component without an onChangeMultiCheckbox prop and it’ll fail in production when a user clicks the button. If you guard the reference to onChangeMultiCheckbox by checking for its existence, or make the prop required, the Flow error disappears, and so does the potential for failure.
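The guard can be sketched without React at all. In this hypothetical, framework-free version, props stands in for this.props on the component, and the boolean return value exists only to make the behavior visible.

```javascript
// Framework-free sketch of the guarded click handler.
function handleChange(props) {
  // Only call the optional callback if the parent actually passed one.
  if (typeof props.onChangeMultiCheckbox === 'function') {
    props.onChangeMultiCheckbox(props.val);
    return true;
  }
  return false;
}
```

The other fix is declarative: marking the prop as React.PropTypes.func.isRequired, so that a missing callback is reported as a PropTypes warning instead of crashing on click.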

The cost

Flow doesn’t ‘just work’. You need to write type annotations all over your application, and then remove them using something like Babel, before it can run in production again. A lot of the extra syntax Babel supports in your JavaScript has a high chance of being accepted as a core part of ES6, ES7, or some future standard. Flow’s annotations might be too, in a very far future, but it’s much less likely.

These annotations take time to write: I’ve been slowly covering the Mapbox Studio codebase and have finished 326/401 source files - measuring progress along the way with are-we-flow-yet. Sometimes the annotations take a while to write, sometimes source needs actual changes before it can be annotated.

Vanilla

Please don’t take this as a signal to use Flow everywhere. I work on Mapbox Studio’s codebase most of the time. It’s an application that uses ES7, Promises, Flow, Redux, JSX, every trick in the book. But every single part of living in that post-JS future has a price. Testing is harder, test coverage is nearly impossible, people’s editors need new configs, team members take longer to onboard, knowledge transfer is harder, builds take longer. Really, we have introduced these new technologies as they’ve become unavoidably necessary, not for fun.

And Mapbox Studio is an application: its code-compatibility with other pieces of software is not a top priority. You don’t expect to require('mapbox-studio') and do things with it. It is a consumer of modules, and it’s complex. Simple modules are simple, and can be required by other code: if you only have 50 lines of code to write, write it in ES5 vanilla JavaScript and it’ll be faster, simpler, easier to test, and better than if you had used Babel and all the latest tricks. Structural complexity should match the complexity of the task.

January 5, 2016 Tom MacWright (@tmcw, @tmcw@mastodon.social)