Tom MacWright

tom@macwright.com

I wish there was a better default for database IDs

Every database ID scheme that I’ve used has had pretty serious downsides, and I wish there was a better option.

The perfect ID would:

  • Be friendly to distributed systems - multiple servers should be able to generate non-overlapping IDs at the same time. Even clients should be able to generate IDs.
  • Have good index locality. IDs should be semi-ordered so that new ones land in a particular shard or end up near the end of your btree index.
  • Have efficient database storage: if it’s a number, it’s stored as a number. If it’s binary, it should be stored as binary. Storing hexadecimal IDs as strings is a waste of space: Base16 takes up twice as much space as binary.
  • Be roughly standardized and future-proof. Cleverness is great, but IDs and data schemas tend to last a long time, and if they don’t last that long, need to survive migrations. A rare boutique ID scheme is a risk.
  • Obscure order and addresses - in other words, not be an auto-incrementing number. It is bad to reveal how many things are in a database, and also bad to give people a way to enumerate and find things by tweaking a number in a URL.

Almost nothing checks all these boxes:

  • Auto-incrementing bigints are almost perfect, but they aren’t friendly to distributed systems because only one computer knows what the next number is. They also reveal how many things are in a database. You can use Sqids to fix that, though - a surprisingly rare approach.
  • All of the versions of UUIDs that are fully standardized have pretty bad index behavior, and cause poor index locality - even v1. But they’re very distributed-systems friendly, and they definitely obscure numbering.
  • Orderable new schemes like ulid are cool, but there isn’t a straightforward way to store them as binary, in Postgres. UUIDs are stored as binary, and they’re relatively niche - there’s no postgres implementation of ulids, for example. ULID can be stored in UUID columns, but isn’t valid as a UUID.
  • UUID v7 looks like it checks every box, but it’s not fully standardized or broadly available yet. The JavaScript implementations are great but have very little uptake, and Postgres, both by default and in the uuid-ossp module, doesn’t support it.

So for the time being, what are we to do? I don’t have a good answer. Cross our fingers and wait for uuid v7.