How information technology is really built

One of my favorite stories is how Greg Linden invented the famous Amazon recommender system after being forbidden to do so. The story is fantastic because what Greg did is contrary to everything textbooks say about good design. You just do not bypass the chain of command! How can you meet your budget and deadline?

In college, we often tell students a story about how software and systems are built. We gather requirements, we design the system, we get a budget, and then we run the project, eventually finishing within budget while respecting the agreed-upon time frame.

This tale makes a lot of sense to people who build bridges, apparently. It is not like they can afford to build three different bridge prototypes and then ask people to choose which one they prefer, after checking that all of them are structurally sound.

But software systems are different.

Consider Facebook. Everyone knows Facebook. It is a robust system. It serves 600 million users with only 2000 employees. Surely, they are excessively careful. Maybe they are, but they do not build Facebook the way we might build bridges.

Facebook relies on distributed MySQL. But don’t expect any 3 Normal Forms. No join anywhere in sight (Agarwal, 2008). No schema either: MySQL is used as a key-value store, in what is a total perversion of a relational database. Oh! And engineers are given direct access to the data: no DBA to preserve the data from the evil and careless developers.
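
To make "a relational database used as a key-value store" concrete, here is a minimal sketch. It is not Facebook's code: SQLite (from the Python standard library) stands in for a MySQL shard, and the table name `kv` and the helpers `put` and `get` are hypothetical names chosen for illustration. The database sees only opaque strings, and the "join" happens in application code.

```python
import json
import sqlite3

# Assumption: an in-memory SQLite database stands in for one MySQL shard.
conn = sqlite3.connect(":memory:")
# One generic table: no per-entity schema, no foreign keys.
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)")


def put(key, obj):
    """Store a JSON-serializable object under a key as an opaque string."""
    conn.execute(
        "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)",
        (key, json.dumps(obj)),
    )
    conn.commit()


def get(key):
    """Fetch and decode the object stored under a key, or None if absent."""
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return None if row is None else json.loads(row[0])


put("user:1", {"name": "Alice", "friends": ["user:2"]})
put("user:2", {"name": "Bob", "friends": []})

# The "join" is done by the application, one key lookup at a time.
friend_names = [get(friend)["name"] for friend in get("user:1")["friends"]]
print(friend_names)
```

The trade-off is visible even at this scale: the database can no longer enforce structure or answer ad hoc queries, but every operation is a single-key lookup that is easy to spread across many machines.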

Because they don’t appear to like formal conceptual methodologies, I expect you won’t find any entity-relationship (ER) diagram at Facebook. But then, maybe you will find them in large Fortune 100 companies? After all, that is what people like myself have been teaching for years! Yet no ER diagram was found in ten Fortune 100 companies (Brodie & Liu, 2010). And it is not because large companies have simple problems. The average Fortune 100 company has ten thousand information systems, of which 90% are relational. A typical relational database has between 100 and 200 tables with dozens of attributes per table.

In a very real way, we have entered a post-methodological era as far as the design of information systems is concerned (Avison and Fitzgerald, 2003). The emergence of the web has coincided with the death of the dominant methods based on analytic thought and has led to the emergence of sensemaking as a primary paradigm.

This is no mere coincidence. At least two factors have precipitated the fall of the methodologies designed in the seventies:

  • The rise of the sophisticated user. These days, the average user of an information system knows just as much about how to use the systems as the employees of the information technology department. The gap between the experts and the users has narrowed. Oh! The narrowing is only apparent: few users even understand how the web works. But they know (or think they do) what it can do and how it can work. Yet, we continue to see users as mere faceless objects for whom the systems are designed (Iivari, 2010). The result? 93% of accounts are never used in enterprise business intelligence systems (Meredith and O’Donnell, 2010). Users now expect to participate in the design of their tools. For example, Twitter is famous for its hashtags, which are used to mine trends and which are the primary source of semantic metadata on Twitter. Yet did you know that they were invented by a random user, Chris Messina, in a modest tweet back in 2007? It is only after users started adopting hashtags that Twitter, the company, adopted them. Hence, Twitter is really a system which is co-designed by the users and the developers. If your design methodology cannot take this into account, it might be obsolete. Recognizing this, Facebook is not content to test new software in the abstract, using unit tests. In fact, code is tested during deployment for user reactions. If people react badly to an upgrade, the upgrade is pulled back. In some real way, engineers must please users, not merely satisfy formal requirements representing what someone thought the users might want.
  • The exploding number of computers. According to Gartner, Google had 1 million servers in 2007. Using cloud computing, any company (or any individual) can run software on thousands of servers worldwide without breaking the bank. Yet Brewer’s theorem says that a distributed system cannot guarantee consistency, availability and partition tolerance all at once; since networks do fail, in practice you cannot have both consistency and availability (Gilbert and Lynch, 2002). Can your design methodology deal with inconsistent data? That is what many NoSQL database systems (such as Cassandra or MongoDB) offer. Maybe you think that you will just stick with strong consistency. JPMorgan tried it and they ended up freezing $132 million and losing thousands of loan applications during a service outage (Monash, 2010). Most likely, you cannot afford to have strong consistency throughout without sacrificing availability. As they say, it is mathematically impossible. Brewer’s theorem is only the tip of the iceberg though: what works for one mainframe does not work for thousands of computers, any more than a human being is a mere collection of thousands of cells. There is a qualitative difference in how systems with thousands (or millions) of computers must be designed compared with a mainframe system. Problems like data integration are just not on your radar when you have a single database. We have moved from unicellular computers to information ecosystems. If your design methodology was conceived for mainframe computers, it is probably obsolete in 2011.
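
To make the consistency/availability trade-off concrete, here is a toy sketch of two replicas that keep accepting writes while they cannot reach each other, then reconcile afterwards. Everything in it is an assumption made for illustration: the `Replica` class, the explicit timestamps, and the "last writer wins" merge rule are one simple choice among many, not a description of how any particular NoSQL engine works.

```python
class Replica:
    """A toy replica that stores, for each key, a (timestamp, value) pair."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, timestamp):
        # Keep the write only if it is newer than what we already hold.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def merge_from(self, other):
        # Reconciliation: replay the other replica's entries (last writer wins).
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


a, b = Replica("a"), Replica("b")

# During a network partition, both replicas stay available for writes...
a.write("status", "at work", timestamp=1)
b.write("status", "on vacation", timestamp=2)
# ...so readers may see different answers depending on which replica they reach.
diverged = (a.read("status"), b.read("status"))

# Once the partition heals, the replicas exchange state and converge.
a.merge_from(b)
b.merge_from(a)
converged = (a.read("status"), b.read("status"))

print(diverged, converged)
```

The system never refused a write, which is the availability half of the bargain. The price is the window during which the two replicas disagreed, and that window is what a design methodology has to account for.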

Building great systems is more art than science right now. The painter must create to understand: the true experts build systems, not diagrams. You learn all the time or you die trying. You innovate without permission or you become obsolete.

Credit: The mistakes and problems are mine, but I stole many good ideas from Antonio Badia.


15 thoughts on “How information technology is really built”

  1. @jld

    Not sure what you mean. This blog post is right in line with a lot of things I have been writing about for a long time, including NoSQL, recommender systems and the failures of formal conceptual design.

  2. @Justin

    “Methodology” is a tool, and like any tool has to be applied sanely.

    Absolutely. And this means rejecting a methodology if it fails you.

    I have a feeling that the examples are not universally applicable. (…) I really wouldn’t want lost updates, data races et al in my banking or medical informatics application.

    Working for a bank is very different from working for Facebook. But I submit to you that banks are also susceptible to disruptive innovation. They have to be conservative, but they have to keep up with the pace of technology and offer things like micro-transactions, otherwise others will.

    One of the reasons I chose the bank I am with is that they have a decent web site, with cool features that other banks did not implement, possibly because they are too conservative. This must be a trade-off for my bank: each one of these features is a security risk.

    In any case, banking systems are largely built around eventual consistency, not strong consistency. If someone pays someone else, but it takes time to check whether the payment can actually be made, you record the transaction (optimistically) but flag it as frozen for a time. Optimistic transactions are just one way to deal with possible inconsistencies. Several NoSQL engines use this model.
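
    The optimistic pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Ledger` class, its `pay` and `settle` methods, and the status names are hypothetical, and a real banking system would involve far more than this.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    balances: dict
    pending: list = field(default_factory=list)

    def pay(self, payer, payee, amount):
        """Record the payment at once, without checking funds, flagged as frozen."""
        tx = {"payer": payer, "payee": payee, "amount": amount, "status": "frozen"}
        self.pending.append(tx)
        return tx

    def settle(self):
        """Later check: apply each frozen payment if it can be covered, else reject it."""
        for tx in self.pending:
            if tx["status"] != "frozen":
                continue
            if self.balances.get(tx["payer"], 0) >= tx["amount"]:
                self.balances[tx["payer"]] -= tx["amount"]
                self.balances[tx["payee"]] = (
                    self.balances.get(tx["payee"], 0) + tx["amount"]
                )
                tx["status"] = "settled"
            else:
                tx["status"] = "rejected"


ledger = Ledger(balances={"alice": 100, "bob": 0})
ok = ledger.pay("alice", "bob", 60)        # accepted immediately
too_much = ledger.pay("alice", "bob", 60)  # also accepted immediately
ledger.settle()                            # the verification happens later
print(ledger.balances, ok["status"], too_much["status"])
```

    Both payments are accepted without delay, yet money is only moved at settlement time, so the total across accounts never changes. The second payment is rejected after the fact instead of blocking the system up front.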

    Moreover, a lot of the data recorded by banks is of lesser importance. Surely, a banking system needs a web log to detect fraud. But does this web log need to be ACID? Probably not.

    A good engineer working for a bank would identify the critical paths and prove that the system cannot leak any money over time. However, you cannot avoid exceptions. So you probably want to build a system that can respond to exceptions without freezing everything. Moreover, the good engineer would also identify the right availability/consistency trade-off. He would not require ACID compliance of everything.

    As for the medical systems, at least in Canada, they are already quite a bit lossy. If you undergo a test, there are few checks to make sure the result ever makes it to your doctor. Papers move around, but mistakes are made all the time. An eventually consistent electronic system would be far superior to the current setup.

    Let me offer you a scenario. A laboratory has uploaded a bunch of tests to your record, which includes a financial transaction. For strong consistency, there must be a financial transaction for each medical test. Yet there is a problem with your bill and the system refuses it (maybe your billing account was wrong). Do you want to abort the whole transaction? Then your doctor would only get access to the data once your billing account is back in good standing. Possibly, this could delay a diagnosis and result in your death. Surely, that is not what we want. So ACID compliance is not required, at least not for the entire system.

    So a good engineer working on medical systems would differentiate between the critical operations and the non-critical ones.

  3. I have a feeling that the examples are not universally applicable.

    IMHO, Facebook can trade off consistency because the value of the data is not critical. I really wouldn’t want lost updates, data races et al in my banking or medical informatics application.

    “Methodology” is a tool, and like any tool has to be applied sanely.

  4. I agree partially. We could also add the level of uncertainty that many web companies face when they redefine their business as the wind changes, as well as their lack of knowledge about the user, compared to the well-known and more stable user requirements of intranet systems. So if we focus only on Facebook and YouTube, we will have a very biased perspective of the whole IT situation. Smaller teams have better synergy and fewer risks than bigger teams, where structured and conservative approaches attempt to handle the situation.

  5. @Bob

    There’s a whole literature on this kind of thing under the headings extreme and agile programming.

    True. Agile programming is closely related. But I never paid much attention because I think it is more of a social awakening than an intellectual discovery. Was any good software developer unaware of the ideas expressed by the Manifesto for Agile Software Development? I think people have been doing agile programming well before the term was coined. I certainly was.

    I was maybe most inspired by Fred Brooks who is probably the person who spent most time writing and talking about real-world system design. His latest book is well worth the price tag. You can see my short review here:

    PS: Could I write “decem” as the response to “Sum of ûnus plus novem?”? I played it safe and went with “10”. Being able to write “X” would be cool, too.

    Maybe for version 2.0.

  6. There’s a whole literature on this kind of thing under the headings extreme and agile programming. The so-called software engineers (aka architecture astronauts) have an even larger competing literature on process.

    Even for a large project, the biggest reason to not front-load design too much is that a little prototyping can go a long way in clarifying a design and reducing risk by evaluating how some things can work in a more realistic setting than a whiteboard or UML diagram.

    The second biggest reason is so that management can see something concrete and not cancel you and/or the project before it gets off the ground. There’s no argument for feasibility like a demo, even if it’s a prototype.

    PS: Could I write “decem” as the response to “Sum of ûnus plus novem?”? I played it safe and went with “10”. Being able to write “X” would be cool, too.

  7. @Jakob

    But they have also been created this way before!

    True, but the question I ask is why are they created this way? We have formal methods that should work. Why do they appear to fail? They appear to have failed more dramatically since the end of the nineties. Why is that? I identified two elements which help partially answer the question: sophisticated users and distributed computers.

    “Post-methodological” sounds like just playing around without thinking. (…) But I strongly doubt that good designers do not have methodology. It needs creativity *and* expertise in methods to build great systems.

    People will always use some methodology to get work done. What Avison and Fitzgerald meant is that there is now a great diversity of methodologies, including informal ones. Certainly, some organizations (e.g., Amazon or Facebook) have their own methodologies.

  8. Thanks for the interesting summary. Yes it is right to stress how systems are actually created. But they have also been created this way before! Agile programming is just one example. About the general limits of formal conceptual methodologies you are right too, but often some more informal conceptual methodologies would still help. “Post-methodological” sounds like just playing around without thinking. To some degree this is also important to experiment. But I strongly doubt that good designers do not have methodology. It needs creativity *and* expertise in methods to build great systems.

  9. Clearly, in human processes the last word will never be pronounced; we always need to customize our own ways, as we are not machines with a behaviour to standardize. There is no final methodology. It is as wrong to think of working without organization as it is to follow a methodology completely, so this looks pretty absurd to debate.
    We are just talking about how trends and experience lead us to evolve smoothly, because we don’t buy flying pigs anymore.

  10. I’m not ready yet to predict the demise of methodology, but I can say that personally, I have always preferred “craftsmanship” to methodologies.

    The interesting thing about craftsmanship is that:

    * It can be formalized into a bunch of simple rules and tricks
    * But it is not as monolithic as a methodology. You can cherry pick the tricks that are appropriate for you.

    This is why I really like Agile “methodologies”. Extreme Programming, in particular, is more a collection of “habits of highly successful software teams” than it is an actual methodology.
