Tom MacWright

tom@macwright.org

Map, a moderately better dictionary datastructure for JavaScript

Objects are in between data and function

From the beginning:

At the bottom of programming languages are primitive data types. They include the booleans, true and false, all sorts of numbers like integers and floating point, characters (letters and numbers), and sometimes other types. Primitive data types can’t usually be broken into pieces - they’re like atoms.

And then there are composite data types, which combine primitive data types into collections of data. At the very least, most languages have two kinds: arrays, also known as lists or tuples, and dictionaries, also called maps, objects, hashes, or structs. Lists are used to store data that’s organized like a sequence, whereas dictionaries store data that has named parts, because dictionaries allow you to retrieve each part by name.

If you’re dealing with real-world information, chances are you’ll need to use a dictionary at some point. It’s the datastructure that makes it easy to store people’s phone numbers and email addresses together without needing to remember whether you stored the phone number in [0] or in [1].

{
  "email": "bill@microsoft.com",
  "phone": "1 (555) 555-5555",
  "age": 61
}

From this example and the above introduction, you should have some expectations about JavaScript’s dictionary type. It would be fair to expect objects to work roughly like the diagram of their encoded form from JSON’s documentation:

{string:value,}

That is, they map from string keys to values. Digging deeper, the claim would be that for every string and every value, an object can map that string to that value.

There’s good reason why math notation has a special symbol, ∀, for ‘universal quantification’. There is a big difference between a statement being true for a few examples, versus remaining true for all members of the domain (in this case, all strings).

JavaScript objects don’t map from any string to any value

Unfortunately, this statement isn’t true: objects, as commonly used, don’t actually support mappings from any string to any value.

If you’re curious, you can run these counterproofs in a Node.js REPL or in your browser’s console.

There are some keys that you can’t set:

> var x = {};
> x['__proto__'] = 'test'
'test'
> x['__proto__']
{}

And other keys that already exist for ‘empty’ objects:

> var x = {};
> x['constructor']
[Function: Object]

So, much unlike empty, unopinionated containers for our own data, JavaScript’s objects contain quite a bit of pre-existing functionality, as well as opinions about certain keys that can’t be overwritten.
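As a rough sketch of how those pre-existing keys leak into everyday code, consider a word counter built on a plain object. The countWords function and its sample input are hypothetical, made up for this illustration:

```javascript
// Hypothetical helper: count how often each word appears,
// using a plain {} as the dictionary.
function countWords(words) {
  var counts = {};
  words.forEach(function (word) {
    // For an inherited key like 'constructor', counts[word] is already a
    // function, so the `|| 0` fallback never kicks in.
    counts[word] = (counts[word] || 0) + 1;
  });
  return counts;
}

var counts = countWords(['map', 'constructor', 'map']);

console.log(counts['map']);                // 2
console.log(typeof counts['constructor']); // 'string', not 'number'
```

The count for 'constructor' ends up as the inherited function's source text with a 1 appended, rather than a number.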

Why? Well, objects in JavaScript play many roles. In this article, I’m talking about them as containers for data, but they’re just as commonly used in the realm of object-oriented programming, in which they contain functionality, state, and other features bundled up together. So when we use them as containers for data, they have some unwanted properties hanging around.

Why it matters

You can go a long time before getting bitten by this problem, but it will eventually catch up to you. For instance, we often use JavaScript’s objects to contain the values of dictionaries coming from other languages or environments, like the contents of a query string in a URL. A query including __proto__=true is perfectly valid in a URL, but would be incorrectly ignored if query parameters are represented with a JavaScript object.
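To make the query string case concrete, here is a deliberately naive parser, assuming parameters are separated by & and =. The parseQuery name and the sample query are invented for this sketch, not taken from any particular library:

```javascript
// Hypothetical, deliberately naive query-string parser that stores
// parameters in a plain object.
function parseQuery(query) {
  var params = {};
  query.split('&').forEach(function (pair) {
    var parts = pair.split('=');
    params[decodeURIComponent(parts[0])] = decodeURIComponent(parts[1] || '');
  });
  return params;
}

var params = parseQuery('__proto__=true&page=2');

console.log(Object.keys(params)); // [ 'page' ]
console.log(params.page);         // '2'
```

The __proto__ parameter is silently dropped: assigning a string to that key on a normal object does nothing.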

Similarly, we often use objects to ‘index’ values dynamically. In this example, we take a string that’s perfectly valid as a value in an object, try to use it as a key instead, and fail.

function indexById(array) {
  var x = {};
  array.forEach(function (val) {
    x[val.id] = val;
  });
  return x;
}

var indexed = indexById([{
  id:"__proto__"
}]); // == {}

Realizing this endemic problem made me think back to all the times I had used objects in this manner, and shudder at the potential bugs that usage might cause.

The ‘right way’ to use objects as data: Object.create(null)

There’s a trick you can use to create objects that don’t have any special pre-filled or protected properties: Object.create(null). See, the properties like __proto__ and constructor that we just identified are due to normally-created objects, the ones you get with {}, inheriting functionality from the Object prototype.

I’ve been using Object.create(null) for data structures in documentation.js with some success, but it comes with drawbacks. As we’ll mention below, Object.create(null) creates an object with no methods, not even useful ones like .hasOwnProperty, so you’ll have to use some dodgy hacks to use those methods when you need them. Similarly, if you’re passing your method-less objects into other people’s libraries and APIs, those libraries often reasonably expect that the objects you pass in have the full prototype, rather than being bare-bones data representations created with Object.create(null).

> var x = Object.create(null);
> x['__proto__'] = 'hi';
> x['__proto__']
'hi' // yay!
> Object.prototype.hasOwnProperty.call(x, 'emptyKey') // tedious!
false

Unfortunately, though Object.create(null) fixes some of the problems of using objects as data, it’s non-obvious, somewhat laborious, and not a very popular technique.
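As a sketch of both sides of that trade-off, here is the earlier indexById function rebuilt on a prototype-less object, followed by one way the missing methods can bite. The name indexByIdBare is invented for this illustration:

```javascript
// Same idea as indexById, but the index has no prototype,
// so no key is special.
function indexByIdBare(array) {
  var index = Object.create(null);
  array.forEach(function (val) {
    index[val.id] = val;
  });
  return index;
}

var bareIndex = indexByIdBare([{ id: '__proto__' }, { id: 'constructor' }]);
console.log(Object.keys(bareIndex)); // [ '__proto__', 'constructor' ]

// The drawback: code that assumes the usual Object.prototype methods breaks.
var conversionFailed = false;
try {
  String(bareIndex); // there is no toString or valueOf to call
} catch (e) {
  conversionFailed = e instanceof TypeError;
}
console.log(conversionFailed); // true
```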

Introducing the Map

In an effort to solve this problem and improve JavaScript’s data types, ES6, a new version of JavaScript supported in Node 4+ and a wide range of browsers, introduced the Map. It’s a new data structure that, unlike objects, is able to truly map from any string to any value.

And it does one better, by also supporting other kinds of keys, including numbers, functions, and objects.

var myMap = new Map();

myMap.set(1, 'This is the number 1');
myMap.set('1', 'This is the string "1"');

myMap.get(1) // This is the number 1
myMap.get("1") // This is the string "1"
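The same goes for objects and functions as keys. A small sketch, with made-up sample values; note that object keys are matched by identity, not by contents:

```javascript
var alice = { name: 'Alice' }; // hypothetical sample data
var greet = function () {};

var notes = new Map();
notes.set(alice, 'keyed by an object');
notes.set(greet, 'keyed by a function');
notes.set('__proto__', 'keyed by a troublesome string');

console.log(notes.get(alice));             // 'keyed by an object'
console.log(notes.get({ name: 'Alice' })); // undefined: a different object
console.log(notes.get('__proto__'));       // 'keyed by a troublesome string'
console.log(notes.size);                   // 3
```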

Unfortunately, the Map data type isn’t strictly better - in a few ways which I’ll describe, it’s less convenient than traditional objects. But, when you need a way to use a dictionary as an index or otherwise index it with strings that you don’t know ahead of time - in other words, use ‘any string’ as a key - Map is a big improvement.
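For instance, the indexById function from earlier could be sketched with a Map instead of an object. The name indexByIdMap and the sample ids are assumptions for this illustration:

```javascript
// Build an index keyed by each item's id; any string is a safe key here.
function indexByIdMap(array) {
  var index = new Map();
  array.forEach(function (val) {
    index.set(val.id, val);
  });
  return index;
}

var mapIndex = indexByIdMap([{ id: '__proto__' }, { id: 'abc' }]);

console.log(mapIndex.size);                // 2
console.log(mapIndex.has('__proto__'));    // true
console.log(mapIndex.get('__proto__').id); // '__proto__'
```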

Pro: testing for emptiness is easier with Map

Testing if a Map is empty is super easy:

> myMap.size === 0
true

Whereas testing if an object is empty is super error prone. The most succinct way to do it would be this:

> Object.keys(myObject).length === 0
true

But the performance-minded will note that that method is inefficient - for gigantic objects, you’re generating a huge list of keys, just to compare it to 0. Similarly inefficient is the JSON stringification method:

> JSON.stringify(myObject) === '{}'
true

To efficiently test for emptiness, you’ll need this function:

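function isEmpty(value) {
  for (var key in value) {
    // for...in also visits inherited enumerable keys, so only an own key
    // counts as evidence that the object is not empty
    if (Object.prototype.hasOwnProperty.call(value, key)) {
      return false;
    }
  }
  return true;
}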

So, in terms of testing for emptiness, Maps have a big advantage.

Pro: testing for a key’s existence is easier with a Map

The Map object has a nice method named has() which tests whether a key has been set.

> myMap.has('hi')
true

For objects, it’s not quite that easy. If you create an object with {}, you might be tempted to write

> myObject['hi'] !== undefined
true

But what if you defined the value as undefined, so the key is, in fact, set? In that case, you can upgrade to

> myObject.hasOwnProperty('hi')
true

But what if you created myObject with Object.create(null) to dodge the predefined keys issue? In that case, your myObject doesn’t even have the hasOwnProperty method, so you’ll need to borrow it from Object.prototype:

> var myObject = Object.create(null);
> myObject['hi'] = true;
> Object.prototype.hasOwnProperty.call(myObject, 'hi')
true

That’s gross, and most people won’t go through the trouble to do it ‘the right way’, especially because the right way looks like a hack.
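For contrast, a Map handles both awkward cases, a value of undefined and the predefined keys, with the same has() call. A small sketch with made-up data:

```javascript
var settings = new Map();      // hypothetical example data
settings.set('hi', undefined); // the key is set; its value happens to be undefined

console.log(settings.get('hi'));          // undefined
console.log(settings.has('hi'));          // true
console.log(settings.has('bye'));         // false
console.log(settings.has('constructor')); // false: nothing is inherited
```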

Con: Maps and JSON don’t get along

The flipside of the Map’s flexibility in terms of keys - that it can accept numbers, objects, and other types as keys - is that it’s no longer similar to JSON’s idea of objects. JSON’s objects are, in fact, just strings mapped to values, so they are a subset of what a Map can represent. For this reason, the Map data type doesn’t fluidly become JSON like a traditional JavaScript object can.

Stringifying a simple JavaScript object:

> JSON.stringify({ x: 1 })
'{"x":1}'

Stringifying a Map:

> var myMap = new Map();
> myMap.set('x', 1)
Map { 'x' => 1 }
> JSON.stringify(myMap);
'{}' // oops, that didn't work!

So… simply calling JSON.stringify doesn’t quite work in this case - you’ll need a helper function, and sadly that function will simply convert the Map to an object before turning it into JSON.

function mapToObject(map) {
  var o = {};
  // Map's forEach passes the value first and the key second
  map.forEach(function (value, key) {
    o[key] = value;
  });
  return o;
}
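One alternative, offered as a sketch rather than a recommendation, is to serialize the Map as an array of [key, value] pairs. That keeps number keys distinct from string keys, though keys that are objects or functions still won’t survive the trip. The helper names mapToJson and jsonToMap are made up for this example:

```javascript
// Serialize a Map as a JSON array of [key, value] pairs.
function mapToJson(map) {
  return JSON.stringify(Array.from(map));
}

// Rebuild a Map from that array-of-pairs JSON.
function jsonToMap(json) {
  return new Map(JSON.parse(json));
}

var original = new Map();
original.set('x', 1);
original.set(1, 'one');

var json = mapToJson(original);
console.log(json); // [["x",1],[1,"one"]]

var restored = jsonToMap(json);
console.log(restored.get('x')); // 1
console.log(restored.get(1));   // 'one'
```

The cost is that the JSON no longer looks like an ordinary object, so anything else reading it needs to know about the pair format.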

Con: Maps don’t work very well with Flowtype

Flowtype is a layer on top of JavaScript that provides static types, as in C++, and helps you catch errors before a program even runs. It works incredibly well with traditional JavaScript objects - you can declare which keys can contain which values and it’ll enforce that you use those values correctly from then on.

Unfortunately, like Immutable.js’s objects, Maps are too dynamic for Flow to properly track them, so you can’t define Flow types for the data in a Map.


Summing up, here are some of the things you might want to do with data, and whether they’re easy, hard, possible or impossible with objects and maps.

                            Object      Map         Object.create(null)
is it empty?                hard        easy        hard
keys can be any string      impossible  possible    possible
keys can be non-strings     impossible  possible    impossible
asking if a key is in it    easy        easy        hard
you can define Flow types   possible    impossible  possible
JSON stringify & parse      easy        hard        easy


March 13, 2017 @tmcw