JPA and UUID Primary Keys

For a recent project, I decided to use UUIDs as the primary key for my tables. The pros and cons of this decision have been debated to death already and are not worth repeating. However, once I decided to use UUID there were a couple of options as to how I would store them in the database.

A UUID in its binary representation is 16 bytes. This is the most efficient way of storing a UUID – half the size of its text representation (32 bytes after removing dashes). To work with the binary representation in Java we need to deconstruct the ID out of the UUID class and into a byte array. Note that not all JPA providers can have a byte array as the primary key – Hibernate had no problems with it, but OpenJPA balked. The following block shows the JPA annotations for creating the field:

	@Id
	@Column(columnDefinition = "BINARY(16)", length = 16, updatable = false, nullable = false)
	private byte[] id;

Also note the BINARY(16) column definition; I am using MySQL as the database, and defining a column as BINARY tells MySQL to compare each character directly instead of using character sets – this will be faster than interpreting the column using UTF-8 when doing comparisons

Now that we have the field defined, we need some code for generating the UUID and serializing it to a byte array. We also need functionality for getting a String representation of the UUID, whether we just generated it or we got the bytes from the database.

	@Transient
	private String uuid;

	public String getId() {
		if (this.id == null) {
			UUID u = UUID.randomUUID();
			ByteBuffer bb = ByteBuffer.allocate(16);

			bb.putLong(u.getMostSignificantBits()).putLong(u.getLeastSignificantBits());
			this.id = bb.array();
			this.uuid = u.toString();
		} else if (this.uuid == null) {
			ByteBuffer bb = ByteBuffer.wrap(this.id);
			UUID u = new UUID(bb.getLong(), bb.getLong());
			this.uuid = u.toString();
		}

		return this.uuid;
	}

This function needs to be called at least once before an object is persisted to generate the ID; I placed it in a @PrePersist method myself. I rolled these functions (as well as some others for maintaining created / update timestamps) into an abstract base class that all of my entities extend.

  1. Hi, nice job. But “columnDefinition” is not portable! Is there a way to rewrite this like that will be usable for all rmdbs…

    • If you have to be database-independent then just leave off the columnDefinition. Note that since it is binary data and not a valid string, storing it in a CHAR column might make the database angry depending on the character set that is defined for said column; I guarantee that invalid UTF-8 sequences will be generated. Annotating it with @Lob might help here, but my understanding is that @Lob just indicates large amounts of data, not necessarily binary

      • When I do this way (remove “columnDefinition”), hibernate gave me an error:

        ERROR org.hibernate.tool.hbm2ddl.SchemaExport – Unsuccessful: create table Person (id tinyblob not null unique, version bigint, identifier integer, name varchar(255), surname varchar(255), primary key (id)) ENGINE=InnoDB

        ERROR org.hibernate.tool.hbm2ddl.SchemaExport – BLOB/TEXT column ‘id’ used in key specification without a key length

  2. Hi Robert

    Thanks for sharing – I’ve found this to be very useful. Quick question – what strategy do you use for querying from the database by id? I’m having issues with EntityManager.createQuery when querying based on the id (which is binary).

    Jason

    • How are you distributing the primary key? I encode it using Base64 (in commons-codec) which encodes a string that can be passed around (even in urls).

      You can then decode this back into a byte array and use it to find the entity:

      TestEntity te = em.find(TestEntity.class, Base64.decodeBase64(“BASE64 STRING HERE”));

      or

      TestEntity te = em.createQuery(“SELECT x FROM TestEntity x WHERE x.id = ?1″, TestEntity.class).setParameter(1, Base64.decodeBase64(“BASE64 STRING HERE”)).getSingleResult();

  3. Thanks Robert – got it figured out. To answer your question, we distribute the IDs as UUID Strings

    EntityManager().find(TestEntity.class,convert(id))

    public static byte[] convert(String uuidAsString) {
    UUID u = UUID.fromString(uuidAsString);
    ByteBuffer bb = ByteBuffer.allocate(16);
    bb.putLong(u.getMostSignificantBits()).putLong(u.getLeastSignificantBits());
    return bb.array();
    }

  4. Hi Robert

    As I’m digging more into this, you may want to use a UUID generation strategy that is not random – perhaps a time-based strategy. As data is inserted into the database, you want to have the primary keys to be incrementally increase to avoid paging issues. “High Performance MySQL”, by Schwartz et al does a good job in explaining why.

  5. Hi, I use the byte[] as a primary key in my project too.
    But i got some problem with remove the object.
    How do you deal with it?
    Thanks.

  6. Hi,
    I am using @Id as byte[] but when I do findUserByUserId(byte[] id) in Spring I am getting exception as “cannot convert [B] to [java.lang.Object]”
    Thank you.

Leave a Reply

%d bloggers like this: