Object Id

Monday, 2 July 2001
Does a modern database need the third normal form?
  • Surely you can identify each row uniquely with an object id? That coincides with Object Oriented software, where every object is unique too.
  • An object id is a nice and standard way of identification. Very easy for generic software.
  • And an object id is good for performance as a join spans just one column.
Yes, well, how wonderful this all sounds. And yet... some remarks.

Need the Third Normal Form?

Yes! Yes! Yes! You do need the third normal form, for every database design.

The database designer must perform data analysis, even when using object ids. Object ids are just a possible technical implementation, but please let it be of an working logical model. Object ids are not a valid excuse to just skip data analysis. Behold: you need it for an object model too. Object model and database design are not identical twins. The are just half-brothers with a common parent: a conceptual model (for example an ERD).

Every object unique by own object id

Yes. In an OO world every object is unique, just like it is in the real world. However this is a fake reason, an irrelevant argument.

It is part of the real world too to avoid ambiguities.

  • Any given tax number belongs to only one person, never 2.
  • For any given combination of tax number and newspaper name there is only one subscription.
Smalltalk offers two wonderful tools to avoid double occurrences: a Dictionary and a Set. In Java a Hashmap will do fine. And guess what makes an object unique in a Set, Dictionary or Hashmap? Exactly, the keys from the ERD!

With object Ids the database looks just like an objectbase, but is that important? Thesis: Object Ids are just good for internal usage in the object layer. Please keep them hidden from the user interface and exchanges with other systems.

Object ids are so elegantly generic

Yes, how beautiful. So, what is it good for? Why is it an advantage to have something generic? If being generic truly is such an advantage, please take it one step further: Create a generic table with the following columns:
entity name (key) object id (key) column name column value
... ... ... ...
Have fun with the programming effort and the performance!

Fast joins

Hmm. For performance no join is much better than a fast join.

And how do you avoid joins? Have a look at denormalisation. In third normal form keys are denormalised in a very natural way, available without join. And it is exactly this key information that often has the highest frequency of usage, so the biggest impact on the performance.

Yes, yes, you can still denormalise that key information. And yes, you can add an extra index for uniqueness. So, where is the advantage of the object id for the performance??

Till next week,
Nut

Saturday 30 June 2001: Suzanne Piet passed her C diploma in swimming. Congratulations!