Many democratic systems require vote diversity. You do not become prime minister of Canada merely by rallying the largest number of voters: your votes must also be spread out over several regions.

Similarly, Scott Karp argues that completely open social networks fail. He takes two examples: Digg and Wikipedia.

Digg recommends web sites based on user votes. They recently modified their algorithm:

The algorithm change effectively holds back from the homepage any story that is Dugg by the same groups of friends, i.e. a group that is not “diverse,” (…)

As for Wikipedia, Karp points out that it is not really an open system, since a group of editors has a great deal of control.

Stephen Downes asks an interesting question: what constraints make a network effective?

The wisdom of crowds is not obtained by mere voting. What is required — as the new Digg algorithm explicitly recognizes — is diversity.

I would like to formalize this problem. You are given a set of users and their votes on several issues, as in the Digg community. You are not told explicitly what the cliques (sets of friends) are. Is there a canonical way to take diversity into account when counting votes?
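To make the setup concrete, here is a minimal Python sketch of one possible heuristic. It is not a canonical answer and it is not Digg's algorithm. It assumes each user's past votes are available as a set of story ids, infers "closeness" from the overlap between those sets (Jaccard similarity), and discounts each vote by how similar the voter is to the other voters on the same story. The names `jaccard`, `diversity_score`, and `history`, and the sample data, are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def diversity_score(voters, history):
    """Diversity-discounted vote count for a single story.

    voters  -- user ids who voted for the story
    history -- dict: user id -> set of story ids the user voted for before
               (assumption: the story being scored is not included)
    """
    voters = list(voters)
    score = 0.0
    for u in voters:
        # Total similarity between u and every other voter on this story.
        overlap = sum(jaccard(history[u], history[v]) for v in voters if v != u)
        # A voter with identical twins shares one vote with them.
        score += 1.0 / (1.0 + overlap)
    return score


# Hypothetical data: one tight clique and three unrelated users.
history = {
    "ann": {1, 2, 3}, "bob": {1, 2, 3}, "cat": {1, 2, 3},
    "dan": {4, 5}, "eve": {6, 7}, "fay": {8, 9},
}

clique_score = diversity_score(["ann", "bob", "cat"], history)
diverse_score = diversity_score(["dan", "eve", "fay"], history)
print(clique_score, diverse_score)
```

Under this heuristic, three users with identical voting histories count for roughly one vote, while three users with disjoint histories count for three. The cliques are never given explicitly; they only show up through the similarity of past votes.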

6 Comments

  1. This is a very interesting question. A good answer could have big social benefits.

    In machine learning, a common meta-learning algorithm is to combine multiple learning algorithms by voting. It is well known that this meta-learning algorithm works best with a pool of diverse base learning algorithms. A natural measure of diversity among learning algorithms is conditional information:

    http://en.wikipedia.org/wiki/Conditional_information

    This suggests that perhaps each vote in, say, Digg should be weighted by its conditional information. You would need to keep a history of each voter’s voting pattern to calculate this.

    You may get some other useful ideas by searching through the machine learning literature on voting as a meta-learning strategy.

    Comment by Peter Turney — 25/1/2008 @ 10:14

  2. Clever comment Peter!

    Comment by Daniel Lemire — 25/1/2008 @ 11:15

  3. The idea of choosing a diverse subset comes up frequently in the experimental design literature, as a way to cover the parameter space well enough given the number of experiments that can actually be done. Most of the techniques presume some sort of existing distance measure, but such a measure is likely to be available in a social network environment.

    Comment by Mike — 25/1/2008 @ 14:09

  4. Most of the techniques presume some sort of existing distance measure…

    We use a lot of informal “closeness” criteria in all our thinking, intuitive or not, but it could be that this is a blind alley because the “obvious” nearest neighbour metrics don’t scale to high dimensionality.

    So, if metrics become useless, how the heck are we going to handle similarities and analogies? Hey Peter (wink!), this ruins the whole spatial level of abstraction.

    Comment by Kevembuangga — 26/1/2008 @ 3:00

  5. Interesting idea Daniel. I think it might apply to citation-based recommenders too, although I don’t know how to adapt Conditional Information as a way of measuring “paper diversity”.

    Comment by Anonymous — 29/1/2008 @ 20:57

  6. “anonymous” in that last comment was me.

    Comment by Andre Vellino — 29/1/2008 @ 20:58
