Charts in 2015

https://archive.org/details/S63-16822

d3 has a good idea. Unlike many of its predecessors, such as HighCharts, which aimed to implement every chart type individually, d3 went a level down, giving users the fundamental tools they need to implement hundreds of unique graphics. This approach was wildly successful: it made d3 hard to outgrow, so professionals could write front-page graphics for newspapers with the same basic elements that beginners use to implement bar graphs.

It's been a while since I worked on a d3-centric project like iD or geojson.io, or gave a talk about d3. In the past year of working on an application based on React and WebGL, I've had time to process where d3 & charting libraries fit into the style and form of the web.

SVG

To quell the kneejerk reaction: d3 isn't tied to SVG. It works perfectly well with HTML, fairly well with Canvas, and even a little bit with WebGL. And for a specific range of usage, SVG is acceptable, and will continue to be perfectly usable.

But while JavaScript has consistently gotten faster over time and even the DOM has some promising perf tweaks like the shadow DOM in the works, SVG and its implementations have stalled for years. The performance threshold of "too many SVG elements" is so low that we hit it almost immediately in iD. The browser bugs were numerous and many had been reported for years with no action. It isn't a panacea, but WebGL/Canvas is the one way of breaking free of this system and increasing performance in an application like iD - and it requires a fundamentally different approach to rendering than the DOM-element-like system in d3.

Animation

Animation on the internet is a failed experiment.

Virtually every attempt to use CSS transitions to do non-frill effects runs into a bug or something that should have been in the spec. You can choose the flakiness of CSS's transition property, or the performance hit of setting style properties in JavaScript. Performance and robustness don't come in the same package. And requestAnimationFrame doesn't fix that. Once the DOM is on a separate thread from JavaScript we can implement the animations we've always dreamed of, but for now it's a hack, and a bad one.

Animation is a high priority for d3, and the library implements it directly: instead of using CSS transitions or jQuery-style shortcuts, it gives a robust API for tweening properties and styles. The advanced effects this enables are one of the main whiz-bang features of the library, and are remarkably browser-safe because they put so little trust in browsers.
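To make "tweening" concrete, here is a minimal sketch of the idea: interpolate between a start and end value, shape the progress with an easing curve, and write the result on every frame. The names lerp, easeInOutCubic, and sampleTween are hypothetical helpers made up for this illustration, not d3's API, and the sampling loop stands in for a real timer.

```typescript
// Hypothetical helpers for illustration only; none of these are d3 APIs.

// Linear interpolation between two numbers for t in [0, 1].
function lerp(from: number, to: number): (t: number) => number {
  return (t) => from + (to - from) * t;
}

// A common cubic ease-in-out curve: slow start, fast middle, slow end.
function easeInOutCubic(t: number): number {
  return t < 0.5 ? 4 * t * t * t : 1 - Math.pow(-2 * t + 2, 3) / 2;
}

// Sample a tween at evenly spaced steps (steps must be at least 1). Each
// value is what a timer loop would write to an attribute on that frame.
function sampleTween(from: number, to: number, steps: number): number[] {
  const interpolate = lerp(from, to);
  const values: number[] = [];
  for (let i = 0; i <= steps; i++) {
    values.push(interpolate(easeInOutCubic(i / steps)));
  }
  return values;
}

// Five frames of a bar growing from 0 to 100 pixels wide.
const widths = sampleTween(0, 100, 4);
console.log(widths);
```

Because every intermediate value is computed in JavaScript, nothing depends on how a browser implements CSS transitions, which is the robustness described above. The price is that the script does the work on every frame.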

Unfortunately, this comes at a high cost. The separation of .enter(), .exit(), and update built into d3's data joins exists in large part because of animation. Without transitions to accommodate, .exit() could simply zap elements from the page, and .enter() could automatically append elements to the page. Instead of three ways to change the DOM given the state of each element, there would be one, and d3's DOM-related API would be vastly simpler.
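The three-way split can be sketched as a keyed join over two arrays. This is a simplified illustration under my own assumptions, not d3's implementation: the join function and the JoinResult type are hypothetical, and real selections track DOM nodes rather than plain objects.

```typescript
// A hypothetical keyed join, for illustration only; this is not d3's code.
interface JoinResult<T> {
  enter: T[]; // incoming data with no existing counterpart
  update: Array<{ old: T; next: T }>; // keys present on both sides
  exit: T[]; // existing items that no incoming datum matched
}

function join<T extends object>(
  existing: T[],
  incoming: T[],
  key: (d: T) => string
): JoinResult<T> {
  const byKey = new Map<string, T>();
  for (const d of existing) byKey.set(key(d), d);

  const result: JoinResult<T> = { enter: [], update: [], exit: [] };
  for (const d of incoming) {
    const k = key(d);
    const old = byKey.get(k);
    if (old !== undefined) {
      result.update.push({ old, next: d });
      byKey.delete(k);
    } else {
      result.enter.push(d);
    }
  }
  // Whatever is left in the map was not matched by any incoming datum.
  result.exit = Array.from(byKey.values());
  return result;
}

const oldData = [{ id: "a", value: 1 }, { id: "b", value: 2 }];
const newData = [{ id: "b", value: 5 }, { id: "c", value: 3 }];
const joined = join(oldData, newData, (d) => d.id);
console.log(joined.enter, joined.update, joined.exit);
```

If nothing needed to animate, a caller would never have to see these three sets: exits could be removed and enters appended automatically. The sets are exposed mainly so that each one can be given its own transition.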

This is what React is. React builds a DOM by comparing and filling in the difference between it and a virtual DOM. It even uses keys, just like d3, to simplify the joining computation.

For React, transitions are deemphasized: while react-animation aims to implement them as a third-party library, the core idea is that the DOM you propose in render() is what the page contains, immediately. In my experience, this vastly simplifies the art of imagining what's on the page: whatever the render() method returned is the truth.
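A rough sketch of that model, assuming nothing about React's internals: render is a pure function from state to a description of the page, and a keyed diff works out what has to change. The VNode and ChartState types and the render and diff functions here are hypothetical stand-ins, not React's API.

```typescript
// Hypothetical types for illustration; this is not React's API.
interface VNode {
  key: string;
  tag: string;
  attrs: Record<string, string | number>;
}

interface ChartState {
  bars: Array<{ label: string; value: number }>;
  barHeight: number;
}

// A pure function: the same state always yields the same description.
function render(state: ChartState): VNode[] {
  return state.bars.map((bar, i) => ({
    key: bar.label,
    tag: "rect",
    attrs: { x: 0, y: i * state.barHeight, width: bar.value, height: state.barHeight },
  }));
}

// List the changes needed to turn one rendered tree into another,
// matching nodes by key. Comparing attrs via JSON is a shortcut that only
// works here because render always emits attributes in the same order.
function diff(prev: VNode[], next: VNode[]): string[] {
  const prevByKey = new Map<string, VNode>();
  for (const node of prev) prevByKey.set(node.key, node);
  const nextKeys = new Set(next.map((node) => node.key));

  const ops: string[] = [];
  for (const node of prev) {
    if (!nextKeys.has(node.key)) ops.push(`remove ${node.key}`);
  }
  for (const node of next) {
    const old = prevByKey.get(node.key);
    if (old === undefined) {
      ops.push(`insert ${node.key}`);
    } else if (JSON.stringify(old.attrs) !== JSON.stringify(node.attrs)) {
      ops.push(`patch ${node.key}`);
    }
  }
  return ops;
}

const firstTree = render({ barHeight: 20, bars: [{ label: "a", value: 10 }, { label: "b", value: 20 }] });
const secondTree = render({ barHeight: 20, bars: [{ label: "b", value: 25 }, { label: "c", value: 5 }] });
console.log(diff(firstTree, secondTree));
```

The caller only ever writes render. Removing, inserting, and patching are derived from it, so there is a single way to describe a change rather than three.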

The Future

But in the meantime, the truth is that d3 is the best game in town: even if transitions are a boondoggle and SVG is stagnant, there's no other charting library with such a solid conceptual foundation.

The biggest change to d3 in years is going to be Mike's modularization of the library, which will enable other developers to include chunks of it with browserify, node, or webpack. I can't imagine it's long before someone implements a charting library in React-ART using those parts.

Vega has potential: its declarative way of representing charts makes it uniquely cross-language compatible - Python, R, or Julia could easily generate Vega specifications for their output and eventually grow their own renderers as well. The hope is that the UW Interactive Data Lab gives Vega the support it needs to take off.
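The cross-language appeal comes from the chart being plain data. The sketch below uses a hypothetical, heavily simplified spec format of my own (BarChartSpec is emphatically not the Vega schema) to show the shape of the idea: any language that can emit JSON can produce the spec, and any renderer that can read JSON can draw it.

```typescript
// A hypothetical, heavily simplified chart spec. This is NOT the Vega schema;
// it only illustrates describing a chart as plain, serializable data.
interface BarChartSpec {
  width: number;
  height: number;
  data: Array<{ category: string; amount: number }>;
}

interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// One possible renderer: turn the spec into rectangle geometry, scaling the
// tallest bar to the full chart height.
function layoutBars(spec: BarChartSpec): Rect[] {
  const max = Math.max(...spec.data.map((d) => d.amount));
  const band = spec.width / spec.data.length;
  return spec.data.map((d, i) => {
    const h = max > 0 ? (d.amount / max) * spec.height : 0;
    return { x: i * band, y: spec.height - h, width: band, height: h };
  });
}

// The spec survives a trip through JSON, so it could just as well have been
// written by a Python, R, or Julia program.
const specJson = JSON.stringify({
  width: 200,
  height: 100,
  data: [
    { category: "A", amount: 50 },
    { category: "B", amount: 100 },
  ],
});
const rects = layoutBars(JSON.parse(specJson) as BarChartSpec);
console.log(rects);
```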

How Everyone Documents

Markdown Documents

This is what we've done with Mapbox.js, and Leaflet does the same in HTML. Heck, node takes this approach. But I've felt the burn of the "just do it manually" approach:

  • It's really hard to refer to other things in Markdown. If you want to link to a different class, where is it? The answer changes whether you're in GitHub, with its auto-generated permalinks, or on a website with different pages or different permalinks.
  • Combining DIY documentation, as we have done with Mapbox.js+Leaflet, is a big ol' hack that just doesn't work well.
  • Markdown tables are the farthest thing from tables possible, but we usually want function parameters to be in tables. Yes, I hear your groaning - this might not mean the HTML table element, but it means some HTML structure more complex than an ordered list.
  • Markdown isn't data: I really like it for text, but it just ain't the one for everything else. And YAML doesn't cut it. It isn't simple and isn't right for this purpose - using "YAML frontmatter" would restrict the way we separate and format docs.

Documentation Generation

This is popular in other languages, like Go and Python: you write some 'magic comments' in your code and a tool analyses your code to extract them as data and turn them into readable formats. A neat bonus is that fancy coding environments can read these docs too and give inline help: the most famous example is Intellisense, which ranks just below Word For Mac 98 as Microsoft's best work.
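As a minimal sketch of "extract them as data", the function below pulls JSDoc-style comment blocks out of a source string and returns structured entries. The extractDocs function and its types are hypothetical, and it relies on regular expressions only; a real tool would parse the code itself so it can attach each comment to the function it documents.

```typescript
// A hypothetical extractor for illustration; real documentation tools parse
// the code itself rather than relying on regular expressions alone.
interface ParamDoc {
  name: string;
  type: string;
  description: string;
}

interface DocEntry {
  description: string;
  params: ParamDoc[];
}

function extractDocs(source: string): DocEntry[] {
  const entries: DocEntry[] = [];
  // Match each /** ... */ block and capture its body.
  const blockPattern = /\/\*\*([\s\S]*?)\*\//g;
  let match: RegExpExecArray | null;
  while ((match = blockPattern.exec(source)) !== null) {
    // Strip the leading " * " decoration and drop empty lines.
    const lines = match[1]
      .split("\n")
      .map((line) => line.replace(/^\s*\*?\s?/, "").trim())
      .filter((line) => line.length > 0);

    const entry: DocEntry = { description: "", params: [] };
    for (const line of lines) {
      const param = /^@param\s+\{([^}]+)\}\s+(\S+)\s*(.*)$/.exec(line);
      if (param) {
        entry.params.push({ type: param[1], name: param[2], description: param[3] });
      } else if (!line.startsWith("@")) {
        // Untagged lines are joined into the free-text description.
        entry.description += (entry.description ? " " : "") + line;
      }
    }
    entries.push(entry);
  }
  return entries;
}

const sample = [
  "/**",
  " * Adds two numbers.",
  " * @param {number} a first addend",
  " * @param {number} b second addend",
  " */",
  "function add(a, b) { return a + b; }",
].join("\n");

console.log(JSON.stringify(extractDocs(sample), null, 2));
```

Once the comments are data, the same entries can be rendered as an HTML parameter table, a Markdown page, or inline editor help, which is the flexibility hand-written Markdown lacks.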

Literate Programming

The literate programming approach is of historical and linguistic interest: instead of documenting the outside interface of a program, you try to make all of the guts understandable, documenting individual loops and variables in crazy detail. I've tinkered with this, building literate-raytracer, simple-statistics, and literate-game-of-life in this style. It's a fantastic exercise in questioning your confidence: is this really good code, and do I really understand what I'm writing at a deep level? But it's the highest work level of any documentation style, and it fails dramatically at being skimmable: I've never seen literate programming as an effective API doc.

Recently

Tour was amazing: Teen Mom & BRNDA drove across the middle of America, ordering from the Taco Bell vegan menu, meeting bands and dogs, playing pool, swimming in pools.

[Photo: Lake Michigan]

[Photo: Milwaukee River]

[Photo: Spooky Christmas, a dog we met in Cincinnati]

[Photo: The Comet in Cincinnati]

[Photo: We watched hours of this]

[Photo: Pittsburgh]

Some of the best bands we played with:

Consumption

Listening