Good database design is crucial to obtaining a sound, consistent database, and, in turn, good database design methodologies are the best way to achieve the right design. These methodologies are taught to most Computer Science undergraduates as part of any introductory database class. They can be considered part of the “canon”, and indeed, the overall approach to database design has been unchanged for years. Should we conclude that database design is a solved problem?

The problem of database design is difficult, and it encompasses issues that may not be amenable to formalization. Hence, any method is likely to have some limitations and drawbacks. However, this is not a reason to ignore the serious problems that the traditional approach is running into:

  • the traditional approach is not followed in practice;
  • we ask practitioners to follow a model that is demanding and yields, in return, some very limited results.
Why does it fail us? Because we make the following assumptions:
  • Users are faceless objects for whom the systems are designed. It’s “everything for the people, but without the people”. (IT people, like communists, love central planning.)
  • The information system is strongly consistent. (There is no such thing in practice.)
  • Our semantics is absolute. There is a single valid point of view. (Maybe if you live in a cave…)
  • Our models are static. Changes are uncommon. (Nonsense. The world is changing at a break-neck pace.)
Instead, data design methodologies should encourage us to design for
  • a distributed world and
  • imperfect knowledge.

What do you think?

Further reading: I wrote a long-form paper on this topic with Antonio Badia: A Call to Arms: Revisiting Database Design, SIGMOD Record 40 (3), 2011. ACM makes the PDF freely available. We back all our observations with references.

4 Comments

  1. “IT people, like communists, love central planning.”

    I love this quote because it’s true. I am consistently having to dissuade people from coming up with a centralized service that controls other services. That path is fraught with danger.

    I have put your paper on my reading list with high priority.

    Comment by Geoff Wozniak — 23/10/2011 @ 13:24

  2. @Geoff Thanks for the good words.

    Comment by Daniel Lemire — 23/10/2011 @ 16:02

  3. For what it’s worth, I enjoyed the paper. I think the themes apply to programming in general and I’m working on a post on the matter.

    Comment by Geoff Wozniak — 4/12/2011 @ 17:57

  4. @Geoff Please keep me posted.

    Comment by Daniel Lemire — 4/12/2011 @ 18:27
