Tom MacWright

tom@macwright.org

Miscellaneous

Like many coders, I also code for fun: sometimes for the fun of learning a new technology or solving a problem, or just to explore ideas. Self-education is great, and learning by doing comes naturally to me.

So, I’m starting ‘Miscellaneous’, a complement to monthly Recently posts and a more fun, less-intense addition to technical posts about tiny things. And, to be clear, this is about learning, about things that I’m far from an expert in, so take it with a grain of salt. If I write something that’s wrong, I’m happy to take input.

This weekend started off with the Friday workday, in which I ground away at documentation.js, refactoring the intricate way in which it infers membership based on context. While satisfying, it confirmed a feeling that JavaScript is great, but sometimes I need a break - and it’d be great to hack on non-JavaScript things when not at work.

anyway

I was thinking back on this quote from Kanye West Superstar, a flippant and incredibly enjoyable book about Kanye West by Byron Crawford.

Having sold over a million copies, 808s & Heartbreak was successful relative to, say, a Little Brother album, but the fact of the matter is that each subsequent Kanye West album has sold fewer copies than the one that came before it, going all the way back to The College Dropout in ‘04.

I’m sure he made more money from 808s & Heartbreak than I’ll make in my entire life, but still. He was reaching the point in his career at which his media presence outstripped his actual commercial appeal. Hardly a day went by when he wasn’t in the news, and yet he was merely a 1x platinum artist. Not an umpteenx platinum artist like Adele or someone.

Crawford, Byron. Kanye West Superstar (p. 103). Byron Crawford. Kindle Edition.

[Chart: Kanye West album sales, axis marked $1M, $2M, $3M. Albums, in order: The College Dropout, Late Registration, Graduation, 808s & Heartbreak, My Beautiful Dark Twisted Fantasy, Yeezus, The Life of Pablo]

These are Wikipedia’s numbers, which show Late Registration as Kanye’s peak. Unfortunately, album sales data is an estimate at very best, so it’s hard to really get the answer here.

For the curious, my process to make that chart was to copy the table into Numbers.app, export to PDF, import to Sketch, improve the design, export to SVG, optimize with svgo, and then copy & paste the SVG content directly into this Markdown-formatted post. I obsessively optimize this website, so this fits the qualifications of being the smallest possible resolution-independent representation of data, without requiring JavaScript or any external services.

Getting Billboard data

So I wanted to look at the success of artists over time, and ran into a lot of dead ends. The RIAA’s US Sales Database is less a database than a set of visualizations of aggregates. Alan Guo’s billboard python module finally solved the problem:

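import json
import os

import billboard

# Make sure the output directory exists before writing into it.
os.makedirs("charts", exist_ok=True)

# Start from the current Hot 100 and walk backwards one week at a time.
chart = billboard.ChartData('hot-100')

while chart.previousDate:
    # Chart entries are plain objects, so serialize them via their __dict__.
    with open("charts/%s.json" % chart.date, 'w') as f:
        json.dump(chart.entries, f,
                  default=lambda o: o.__dict__,
                  sort_keys=True, indent=4)
    chart = billboard.ChartData('hot-100', chart.previousDate)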

And soon I had a charts directory filled with data like

[{
  "artist": "Ricky Nelson", 
  "change": "0", 
  "lastPos": 1, 
  "peakPos": 1, 
  "rank": 1, 
  "spotifyID": "33FPsMEl3UwpytDuyf9VYq", 
  "spotifyLink": "https://embed.spotify.com/?uri=spotify:track:33FPsMEl3UwpytDuyf9VYq", 
  "title": "Poor Little Fool", 
  "videoLink": "", 
  "weeks": 11
}]

For every hit, going back to 1958. Then I wanted to rearrange chart data over time, so that instead of having a series of weekly charts, I had a song-oriented representation.

Implementing this kind of transform in a language I’m comfortable with, like JavaScript, Python, or Ruby, would take about 15 minutes: I implemented it in Ruby to confirm that thought. In my mind, these all blend together - imperative, mostly-dynamically-typed languages are all roughly ‘the same thing’, with maybe a 3x time difference between my native tongue, JavaScript, and something I don’t write often, like Ruby.
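For illustration, here is a minimal Python sketch of that kind of transform. It is not the Ruby version mentioned above. It assumes the charts directory holds one JSON file per week, named by chart date, with entries shaped like the sample earlier. The function names load_charts and songs_from_charts, and the output shape (a list of date and rank pairs per song), are hypothetical choices.

```python
import glob
import json
import os
from collections import defaultdict


def load_charts(directory):
    """Read every <date>.json file in `directory` into {date: [entry, ...]}.

    Hypothetical helper: assumes file names are chart dates (YYYY-MM-DD).
    """
    charts = {}
    for path in glob.glob(os.path.join(directory, "*.json")):
        date = os.path.splitext(os.path.basename(path))[0]
        with open(path) as f:
            charts[date] = json.load(f)
    return charts


def songs_from_charts(charts):
    """Regroup {date: [entry, ...]} into {(artist, title): [(date, rank), ...]}.

    Hypothetical helper: the output shape is an assumption, not the
    representation used in the original post.
    """
    songs = defaultdict(list)
    # ISO-style dates sort chronologically as plain strings.
    for date in sorted(charts):
        for entry in charts[date]:
            key = (entry["artist"], entry["title"])
            songs[key].append((date, entry["rank"]))
    return dict(songs)


# Placeholder data, only to show the shape of the result.
example_charts = {
    "2017-02-11": [{"artist": "Artist A", "title": "Song X", "rank": 2}],
    "2017-02-04": [
        {"artist": "Artist A", "title": "Song X", "rank": 1},
        {"artist": "Artist B", "title": "Song Y", "rank": 2},
    ],
}
history = songs_from_charts(example_charts)
```

With real data, the call would look like songs_from_charts(load_charts("charts")). Keying on the artist and title pair is a simplification: it treats two different recordings with the same artist and title as one song.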

Weekend coding requires harder stuff: a different language family demands much more of you. So I decided to start with the hardest: Haskell.

Haskell

And, predictably, I failed. Tinkering with Elm made this experience with Haskell much better than all of my past experiences: Elm is a godsend of a learning tool for functional programming, a smart, friendly community combined with a use case where FP shines. But I got maybe halfway in about 4 hours of struggling. Haskell’s tooling was seamless, cabal worked perfectly, and there were even helpful articles about using Aeson, the leading JSON parser.

Unfortunately, though I’m comfortable with Elm’s Effects system for input, output, and things that might be asynchronous, Haskell’s IO monad proved to be frustrating. I wanted to learn about do syntax, which seemed like a decent way of dealing with the IO monad, but the second result was about how do notation is harmful, and that article’s recommendations are vague.

Perhaps it’s useful to note that Either a b is also called the coproduct, or sum, of the types a and b. Indeed it is now common to use

The Functional Programming Voice affliction ran strong in StackOverflow answers.

Working productively with the IO monad is the top of my list when I return to Haskell land. This is the farthest I got.

Rust

So, next up was Rust. This time I succeeded, and was able to rearrange the data, using serde for JSON parsing.

Rust is an incredibly exciting new systems programming language, initially sponsored by Mozilla. It’s, hopefully, the replacement for C++, making programs easier to write and safer, in terms of memory. Haskell, in fact, is one of Rust’s many influences. It has attracted cool kids like Steve Klabnik and Ashley Williams to try making faster programs that run safer.

Rust’s culture is a lot different than Haskell’s: it’s a new language born in 2010 - 20 years after Haskell and 15 after JavaScript. And the community is, like many others now, being intentional about being accepting and helpful to beginners, to try to ward off the elitism and monoculture of the past.

Where the IO Monad loomed large in Haskell, Rust’s borrow checker is the majority of the learning curve for beginners like me, and it certainly took time to understand. Francis Gagné was an invaluable help when I asked the StackOverflow masses: my fundamental mistake was forgetting about the difference between references and objects themselves: switching from .iter to .into_iter made all the difference.

Hours later, after listening to both Yip/Jump music and 1990, I had a ‘finished product’.

Now, it’s pretty clear that there aren’t many people using Rust to generate PDFs - Rasmus Kaj’s improved pdf_canvas crate works well but isn’t very complete. And the quantity of data that I’m trying to visualize in a ‘flat representation’ is enough to crash Preview.app.

Next weekend: maybe I’ll dive into Haskell and see if I can make progress. Or I’ll be in Atlanta visiting friends and won’t have time to code. Yep, the latter - the next Miscellaneous is probably the weekend of the 11th.

February 26, 2017 @tmcw