Final Word on SIAM Data Mining 2007

So, the conference is over. For me, this was a pretty good experience: I was not sick, I met cool people, and some folks appreciated my work. The conference was well organized: the coffee was good and the hotel was well chosen. For people who know me, this is quite a review, since I usually complain a lot about my trips.

However, I am a tad disappointed. Actually, I was disappointed the minute I looked at the list of accepted papers. Data Mining has lost its way.

What is Data Mining? It seems that people have totally forgotten what it is about. No, Data Mining is not Machine Learning, though Machine Learning can be applied to Data Mining problems. Data Mining is primarily concerned with very large data sets; that is its essence. Any algorithm running in quadratic time with respect to the size of the data set is automatically out.
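To put rough numbers on the quadratic-time claim, here is a back-of-the-envelope sketch. It assumes a machine that performs a billion simple operations per second and that running time is exactly proportional to the operation count; both are simplifying assumptions for illustration, not measurements, and the function names are made up for this example.

```python
import math

# Assumed throughput: one billion simple operations per second (illustrative only).
ASSUMED_OPS_PER_SECOND = 1e9


def estimated_seconds(operations: float, ops_per_second: float = ASSUMED_OPS_PER_SECOND) -> float:
    """Convert an operation count into seconds under the assumed throughput."""
    return operations / ops_per_second


def compare(n: int) -> dict:
    """Estimate running times for linear, n log n, and quadratic algorithms on n points."""
    return {
        "linear": estimated_seconds(n),
        "n_log_n": estimated_seconds(n * math.log2(n)),
        "quadratic": estimated_seconds(n ** 2),
    }


if __name__ == "__main__":
    seconds_per_year = 3600 * 24 * 365
    for n in (10**6, 10**9):
        estimates = compare(n)
        print(f"n = {n:.0e}")
        for name, seconds in estimates.items():
            print(f"  {name:>9}: {seconds:.3g} s ({seconds / seconds_per_year:.3g} years)")
```

Under those assumptions, a linear pass over a billion points takes about a second, while a quadratic algorithm needs about 10^9 seconds, which is roughly three decades.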

Data Mining is not only about prediction or classification. Data Mining is also about visualization, explanations, approximations, databases, Business Intelligence, and so on. It is about applying MapReduce to large data sets. It is about scaling up to billions of data points. It is about dirty data.

Something is wrong with the review process: obviously, the program committee is overly focused on Machine Learning. I cannot complain because my paper was accepted, but, surely, a broader range of papers should have been accepted.

2 thoughts on “Final Word on SIAM Data Mining 2007”

  1. Well, I think it is not biased towards machine learning, but towards math (that is, problems that can have a well-defined optimum goal). It might be due to the M in SIAM, Mathematics, although there are also the words “Industrial” and “Applied” in there.

    For a bystander (I’ve never attended or submitted to SIAM, yet), it seems that SIAM’s accepted papers always contain more formulas than those of other DM conferences. Not that there’s anything wrong with formalizing or writing things in a compact way… but it shouldn’t be only about that (or a +0.73% enhancement in the algorithm’s efficiency).

  2. Well, to be fair, there were also a couple of interesting application papers in SDM-07. For instance,

    A System for Keyword Search on Textual Streams
    Vagelis Hristidis, Oscar Valdivia, Michail Vlachos and Philip S. Yu

    Preventing Information Leaks in Email
    Vitor R. Carvalho and William W. Cohen

    Rank Aggregation for Similar Items
    D. Sculley
