Rails 4 UUID primary keys with PostgreSQL

JasonPunyon · on April 27, 2013

(Caveat: I know this about SQL server, not Postgres)

If you use random UUID's (as opposed to sequential UUID's) for your primary key your database will spend an extra hunk of time on reordering your PK index on disk on insert. This bit us at Stackoverflow. So remember: just because you can do something doesn't mean you should.

tszming · on April 28, 2013

Yes, what you said also apply to MySQL/InnoDB and MongoDB. Someone did an experiment with MongoDB some times ago for various id schemes: http://i.imgur.com/clm9D.png (https://groups.google.com/d/topic/mongodb-user/1gPqVmFHExY/d...), I don't know if Postgres is being different but people should do their own testings instead of blindly listening to advices from the Internet.

jashkenas · on April 28, 2013

What's the advantage of a sequential UUID over an auto-incrementing integer?

(Keeping in mind that both can be started at N on different shards, or prefixed.)

axelfontaine · on April 28, 2013

You can merge any two data buckets, without any risk of id collision.

labria · on April 27, 2013

Yepp, mentioned it here: http://labria.github.io/2013/04/28/rails-4-postgres-uuid-pk-...

ryeguy · on April 28, 2013

No you didn't. Not fully at least. OP is saying that there is a penalty just for doing the inserts themselves.

jshen · on April 27, 2013

Just use a k-ordered uuid.

tenderlove · on April 27, 2013

I commented on the OP's article, but do this for the pk:

    create_table(:users, id: :uuid) do |t|
      t.string 'name'
    end

And to make sure the extension is enabled, (in your migrations) do:

    enable_extension 'uuid-ossp'

For now, make sure to use SQL schema dumping. I don't think Ruby schema files will work (as I haven't written a test for it yet).

EDIT: I forgot to mention, you can enable hstore by using "enable_extension('hstore')"

floormat · on April 27, 2013

Sorry for the up coming stupid question but...

What is the value of this? Why can't unique identification be done using just regular increments on an ID column? Or even a composite key?

radicalbyte · on April 27, 2013

As pallinder said, it can be very handy: the IDs can be generated by the nodes, not the db server. Very useful in disconnected environments. Imagine being able to create data on a smartphone whilst sitting on the plane, and not having to do anything messy with ID replacement when you sync with the server in the office.

The cost? The keys are larger, and (unless using a sequential algorithm) are poor candidates for clustered keys (because they force page splits). The impact can be rather large (this lead to terrible performance in early versions of Sharepoint).

jacques_chester · on April 27, 2013

To be clear, MAC-based UUIDs are sequential. PRNG (v4) UUIDs are random and don't work well with indexes.

bdon · on April 27, 2013

It's also useful if you don't want your URLs to expose how many of a certain thing you have, whether it be users, posts, payments etc. A lot of sites let you derive how much activity they have based on how fast their numeric IDs increment. You could use a separate token alongside the pkey to do the same but this feature just makes it simpler.

__Joker · on April 27, 2013

Along the lines

http://www.wired.com/autopia/2010/10/how-the-allies-used-mat...

http://www.jackofallclouds.com/2009/09/anatomy-of-an-amazon-...

labria · on April 27, 2013

Only downside: this makes the URLs very long sometimes.

pallinder · on April 27, 2013

Imagine a distributed system where you want to preserve uniqueness across the board. Using a uuid more or less (by the sheer number of possibilities) guarantees that this will be the case.

cheald · on April 27, 2013

UUID v1 uses the MAC address of the computer doing the generation as a part of the UUID, which ensures uniqueness so long as you aren't cloning MAC addresses in your infrastructure.

The downside of this is that it can leak information about the machine that generated the UUID, but if you require deterministic uniqueness, there you go.

zimbatm · on April 27, 2013

Another not mentioned here: sharding becomes much easier. You don't need a central authority that controls the increment.

labria · on April 27, 2013

At the DB level, it facilitates master-master replication set-up, eliminating the auto_increment collision problems. Master-master replication, in turn, allows building distributed applications that can handle net splits reasonably well.

zimbatm · on April 27, 2013

I think the primary_key line should be:

    t.primary_key :id, :type => :uuid

labria · on April 27, 2013

It's actually: t.primary_key :id, :uuid

labria · on April 27, 2013

Sorry, fixed that.

netghost · on April 27, 2013

That makes a lot more sense, I was staring at the example, and then just figured I'd have to try it later.

hiddenfeatures · on April 27, 2013

I was really wondering where the difference to the normal syntax was. Thanks for the clarification

jashmenn · on April 28, 2013

If you'd like to use UUIDs in Rails 3, I've been maintaining the activeuuid plugin for a while now:

https://github.com/jashmenn/activeuuid

labria · on April 27, 2013

See my second, more detailed post here: http://labria.github.io/2013/04/28/rails-4-postgres-uuid-pk-...

joseph4521 · on April 27, 2013

Or just use Sequel already, works with all versions of Rails (and without Rails too).

flog · on April 27, 2013

Now if only I could get this working with Postgres.app

pg_bot · on April 27, 2013

Running on rails 4 it should be:

   $ rails g migration add_uuid_to_postgresql

   class AddUuidToPostgresql < ActiveRecord::Migration
     execute("CREATE EXTENSION uuid-ossp;")
   end

   $ rake db:migrate && rake db:test:prepare

Then create a table as shown in the article, and it should work.

Legion · on April 27, 2013

For this to work, the PG role that the Rails app is using has to be a superuser, as AFAIK only superuser roles can execute CREATE EXTENSION.

Does anyone consider this an issue? I have been using non-superuser roles within my Rails apps, and using an outside superuser role to add extensions with an external tool (like pgAdmin).

labria · on April 27, 2013

Yes, that's exactly I didn't include it in the migration in the example.

wtn · on April 27, 2013

There's a shortcut in the Active Record PostgreSQLAdapter:

  enable_extension 'uuid-ossp'

scotth · on April 27, 2013

Can you use find_in_batches or find_each with UUIDs?

pg_bot · on April 27, 2013

Of course, both of those methods are members of ActiveRecord and will work regardless of how you filter your records.

ryeguy · on April 27, 2013

But don't those methods rely on the sequential ordering of auto incrementing ids?

pg_bot · on April 27, 2013

After reading the documentation more clearly. http://edgeapi.rubyonrails.org/classes/ActiveRecord/Batches..... It looks like there may need to be a patch for this feature, however the batching process forces the database to make a query in Ascending order for the primary keys so as long as you can generate new UUIDS in ascending order you should be fine. However problems look like they will occur if you generate a new record that has a UUID that would be inserted randomly between two numbers in the UUID range. I will see if I can bang something out and issue a pull request.

w-ll · on April 27, 2013

I was dealing with a similar issue on a project that all id's are UUID's because I wanted non-sequential id's.

I found it was just as simple of selecting ranges based of their date_created attribute.

dickeytk · on April 28, 2013

No because ActiveRecord will try and do > uuid comparisons that won't evaluate.

Also, if it's not sequential, you may miss records while iterating through a collection.