What will be the next Web? A prediction

Gutenberg’s printing press was a major technological advance that made it possible for the (initially quite rich) commoner to publish affordably. Books became affordable “soon” after, and knowledge could spread further, faster and more accurately.

The Web has had a comparable effect: it allows me to publish the content you are reading right now. I can reach thousands of people daily (and I do) at a minimal cost.

What these things have in common is that they made something that was once available only to very rich institutions available to everyone. This is when meaningful revolutions happen. This is when technology people win, and this is where technology researchers have to be.

Since the Web came about, we’ve been looking for the next big thing: the technology that would create a “boom”. We need “booms” at regular intervals because, otherwise, technology people (like myself) get marginalized.

As an example, I did my Ph.D. on wavelets. Wavelets were hot, and poorly understood. Information about wavelets was scarce. There was a lot of hype. Now? Now wavelets are commonplace, and free or inexpensive wavelet software and hardware are available. There is no urgent need for wavelet experts anymore.

So, what are the next “booms”? I don’t need to find them all, just aim approximately right. First, it was going to be nanotechnology, which we now see more as an incremental advance in pharmaceutical technology. We’ve seen Web Services, Intranets and the Semantic Web come by… People have been searching for the “next Web” or “Web 2.0”.

  • Though useful, Web Services are not the next big breakthrough. They are just distributed computing reinvented with the same limitations, the same hype.
  • Semantic Web is not the next big breakthrough. For the most part, it is AI techniques that didn’t work before, but that we are porting to the Web hoping that they will now work. They won’t work any better.
  • Collaborative filtering and personalization are not the next breakthrough, though I feel this is where the most interesting R&D is happening right now on the Web. However, they will play an increasingly important role as the next revolution happens. [See some of my work on collaborative filtering.]

So, what will be the next breakthrough?

I bet it is going to be ubiquitous massive storage. Very soon, in 5 years, we will reach the point where individuals will have access to infinite storage. Note that I didn’t write infinite bandwidth or infinite computing power. Both bandwidth and CPU cycles will remain limited for the foreseeable future.

But the applications are not here yet. There are very few useful applications that can leverage freely available infinite storage.

What problems will we be facing?

  • Processing extremely large data sets with limited bandwidth. That’s going to be a huge problem and Google is just the first big success story… but when everyone can store as much data as Google has… the problem becomes huge. Smart indexing and aggregating techniques are going to become extremely important. When I can record my entire life and the life of my kids, and all the transactions I ever made, how do I search and summarize such data? How do I find out automatically how my work has been spread out, and how do I automatically assess my productivity? We need to bring data warehousing to the masses.
  • Security and confidentiality: very soon, even smaller stores will be able to record absolutely everything about their clients. Even the tiniest details. In Canada, we have a law to protect us, but what if you have a PDA which records your entire life and you lose it? Is someone able to take over your life? He sure has more information about you than you can remember, so who is the real you? We need to bring Enterprise-class security to the masses.
  • Social software is going to grow to new heights. Smart people will find ways to leverage infinite storage to create amazing collaborative working environments. I bet collaborative filtering will play a huge part in this. Want to find cool music? Use indiscover.net or webjay.org. These sites are only the beginning of what we can do with infinite storage (and they are storage limited). Wikipedia is much closer to what the future of social software is. We need to move all social software to the Wikipedia level and beyond. The social software of the future will be based on inexpensive software, and inexpensive infinite storage.

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

13 thoughts on “What will be the next Web? A prediction”

  1. Hello Daniel!

    It’s sure that infinite storage will change things, but will this really be the next big thing? It’s a support for a future infrastructure. What will be the big thing: the support itself or the thing that uses this support? The Internet, and the mass of information that composes it, will always grow, and will grow faster with this infinite storage. But we already have many problems handling this mass of information. We have problems storing it (not in terms of storage space) and we have problems searching it. So, will the next big thing be the thing that helps us search effectively through this mass of information?

    If the answer is yes, then I think that it’s a little bit hasty to say that the Semantic Web could not be the next big thing.

    Have a good day!



  2. Thanks Seb. You always have great links.

    Fred: what was important in Gutenberg’s time… the printing press, or the books? I think trivially, it is not the object itself that matters, but what it will bring. I could be wrong, but I think that affordable infinite storage has little to do with the Semantic Web which itself has little to do with searching. Semantic Web is about AI applications, not Information Retrieval. If the Semantic Web was about searching, then Google and Yahoo Overture would be making use of Semantic Web ideas, but they don’t. There isn’t a single idea in the Semantic Web that can help me browse through 200 GB of images. There is research in this topic, but it isn’t part of the Semantic Web research. But fair is fair, I want people to disagree with me, otherwise, I might have stated the obvious…

  3. It’s true that the revolution was the printing press, no doubt. But the question I have is: what’s the use of such space if we can’t effectively use what it contains?

    We need a way to search, to classify, and to have access to what we really need. We don’t want to spend 20 minutes on a simple search or 1 hour on a longer one. The more information we have, the longer the search will be and probably the less relevant the results will be.

    Given this, what will be the revolution, the space or the infrastructure supporting it?

    I can also be wrong; I only throw ideas as they come; I try to cope with your point of view.

    On the Semantic Web point, can we agree that the Semantic Web is only a way to classify information? The name says it: to give semantics, meaning, to resources. The languages used have been developed to help machines understand the meaning, the semantics, of these resources. This specific attribute, machine understandability, can eventually be used by some artificial intelligence programs as a knowledge database. No? Semantic Web theories can be helpful for AI applications but, tell me if I’m wrong, the first goal of SW was not to serve as a simple AI application but to give meaning to things, to help users in their day-to-day work.

  4. In order to mark things semantically, you don’t need Semantic Web… See, to me, this is marked semantically:

    The [geography]river[/geography] is [color]green[/color].

    It has nothing to do with the Semantic Web. People have been using Semantic Markup for years… think DocBook. Semantic Web is not about data, it is not about semantic markup… Don’t believe me? Search for intersection between database research and Semantic Web… there is hardly anything. SW aims to represent knowledge. It is a very different goal, at least from a philosophical point of view.

    For the Semantic Web, whether you have 1 GB or 10000 GB of data, there is very little difference… precisely because it doesn’t care about data, only about knowledge…

    The Semantic Web is based on RDF. It takes RDF as a foundation. RDF is not about semantic markup, but about representing knowledge using graphs, and eventually, building ontologies. I’d say most of Semantic Web activities have to do with ontologies or ontology-related issues.

    “Giving meaning to things”, that’s what ontologies are… But it doesn’t help find things, per se. It is an approach, a vision, a form of philosophy. It is a way to write down knowledge. What do you do with knowledge if you want to share it? Well, you write an article using English. Some people say that this is good enough for humans, but we want machines to be able to consume this knowledge, so you need to express it formally… but can knowledge be expressed formally? What is knowledge?

    So you see, Semantic Web is not about data, it is about knowledge representation.

    Here’s the difference… The SW approach to managing 100 GB of MP3 would be to express relationships between the music and an ontology of music… that is, a formal description of what we know about music… maybe it would say that such a song is a form of jazz and so on.

    But if your problem is that you want to quickly browse through your MP3 collection to find what you listened to 10 months ago… well, SW can’t help you there.

    No… the revolution won’t be, I think, massive storage per se. It will be what we will do with it. Just like the Web wasn’t about the Web, but about what we could do with it.

    Or maybe I got all of this wrong… 😉

  5. Hello Daniel!

    Thanks for the crash course on SW.

    The Semantic Web is the Web viewed semantically. The semantics are expressed with languages based on RDF and RDFS, like OWL (for future compatibility). RDF is a suggestion, adopted, for the moment, by the community. Eventually, other languages could be developed, and OWL ontologies would eventually be able to import these future ontologies, written in another language, with some sort of transformation process.

    But as I see SW, with a newcomer’s eyes, giving meaning to resources could help us in the search process. The search time I was talking about is not the “hardware” latency time, but the time a normal person would spend searching for what he really needs: relevant resources. For this task, SW would be able to help us, acting as a translator between humans and resources, to bring back the relevant ones using the meaning hidden in the user’s query string.

    “No… the revolution won’t be, I think, massive storage per se. It will be what we will do with it. Just like the Web wasn’t about the Web, but about what we could do with it.”

    I agree with this assertion 



  6. Web 2.0 Weekly Wrap-up, 21-27 Mar 2005
    This week: ETech notes, Yahoo love-in, Web 2.0 acquisition deals continue, Hacking Web 2.0, ubiquitous storage.

  7. Regarding comment #5, I disagree. RDF is just one of many possible ways of describing knowledge and meaning. Your tagged example is another. However “semantic” is defined as “Of or relating to meaning, especially meaning in language”. Meaning and knowledge are closely related. When I express knowledge in RDF, what I am expressing is properties of an entity that define its meaning. This is the same in your example: “color” is just a property describing the type of the entity “green”. The same could just as well be expressed with an RDF triple with a URN for green as the subject, a URN for “color” as the object, and a URN for whatever you would describe their relationship as, as the predicate.

    So in other words, what you presented is just another way of representing the data, meaning and knowledge that the Semantic Web initiative is working on a _uniform_ representation of that is suited to machine reasoning.


  8. Vidar: This is what I’m saying: “the Semantic Web initiative is working on a uniform representation of that is suited to machine reasoning”. That is, the Semantic Web is not about semantic markup. It is about knowledge representation for AI. So, whether you believe the Semantic Web will take off or not depends very much on whether you think expert systems, rule engines and so on, will take off. The fact of the matter is that machine reasoning has been around for 30 years, having it running off the Web doesn’t change its nature. Machine reasoning is not about managing massive (infinite) quantities of data. It is about AI, having intelligent computers…

  9. Pingback: Anonymous
  10. I’m a couple of weeks late, but I’ll chime in anyway. Qualitatively greater amounts of storage, network bandwidth and processing power will be enablers of Big New Things, but I don’t think they will be commonly regarded as big new things themselves. I think this is consistent with your reasoning: ubiquitous massive storage will be an enabler of things such as the three you suggest at the end of the original post.

    I have a quibble with the first two of those things. I’ll be surprised if the processing of massive datasets ever gets much popular mindshare. Even if a newfound ability to aggregate, mine, and otherwise process data on a massive scale brings change of profound significance, I suspect that it won’t be thought of as a Big Thing in the same way that the Web was. It may be part of the playing field, but it isn’t the game.

    As for security and confidentiality, I don’t think we have to wait for ubiquitous massive storage before we declare a state of crisis. It’s here now! But again, I think it’s the playing field, not the game.

    I think the third, the rise of social consciousness in cyberspace, is right on. Of course, it’s hard to be more specific. The one thing I’m fairly sure of is that the Next Big Thing will be familiar: it won’t be part of some alien new world. It will be a reflection of what people have been for a long time. What is important to people? Mainly, we communicate with each other. We communicate useful get-through-the-day facts and longer range planning. We gossip and small-talk to maintain or strengthen social relationships. And we produce and consume art to fulfill some deeply ingrained need to find resonance with other people. (Oh yes, and pornography, which is kind of in a class by itself.) So far, the big things in IT have all been direct reflections of those social needs: the Web, e-mail, instant messaging, cell phones, Napster/KaZaA, Skype, “social networking”, iTunes, video-on-demand, etc. I expect this web of communication to mature into something in which reputation and recommendation are pervasive, in a way that mirrors practices that we are already comfortable with, but with dramatically increased efficiency and/or accessibility. The open question for me is whether the increase in efficiency or accessibility will be sufficient to have an impact approaching that of Gutenberg’s press.
