Technorati allows time-based text mining

Matthew is reporting that Technorati now allows you to plot word usage frequency over time in the blogosphere. Here’s the usage of the word “segmentation” over time:

I think BlogPulse has been offering this sort of thing for some time. I’m confused by the relationship between these various services. However, these services could benefit from OLAPish concepts (shameless plug):

Steven Keith, Owen Kaser, Daniel Lemire, Analyzing Large Collections of Electronic Text Using OLAP, APICS 2005, Wolfville, Canada, October 2005.
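
To give a rough idea of what “OLAPish” could mean here, the sketch below treats word counts as a tiny data cube with two dimensions (day and word), rolls the day dimension up to months, and then slices out a single word. It is only an illustration under assumptions: the sample posts are made up, the function names (build_cube, roll_up_by_month, slice_word) are hypothetical, and it uses neither the Technorati or BlogPulse services nor the method from the paper.

```python
from collections import Counter
from datetime import date
import re

# Hypothetical sample data: (publication date, post text) pairs.
# A real collection would come from a crawl or a feed, not from a list literal.
POSTS = [
    (date(2005, 9, 3), "Image segmentation is hard. Segmentation again!"),
    (date(2005, 9, 20), "Market segmentation and blogs."),
    (date(2005, 10, 1), "Nothing relevant here."),
    (date(2005, 10, 15), "Time series segmentation."),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_cube(posts):
    """Count occurrences per (day, word) cell: the finest-grained cube."""
    cube = Counter()
    for day, text in posts:
        for word in tokenize(text):
            cube[(day, word)] += 1
    return cube


def roll_up_by_month(cube):
    """Roll up the time dimension from days to (year, month)."""
    monthly = Counter()
    for (day, word), count in cube.items():
        monthly[((day.year, day.month), word)] += count
    return monthly


def slice_word(monthly, word):
    """Slice the cube on one word, giving a month -> count time series."""
    return {month: count
            for (month, w), count in sorted(monthly.items())
            if w == word}


if __name__ == "__main__":
    series = slice_word(roll_up_by_month(build_cube(POSTS)), "segmentation")
    for (year, month), count in series.items():
        print(f"{year}-{month:02d}: {count}")
```

The same cube could be rolled up along other dimensions (by week, by blog, by topic) without recounting the text, which is the appeal of thinking about these services in OLAP terms.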

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

One thought on “Technorati allows time-based text mining”

  1. You can do this and more yourself using their API, a Unix shell account and a PHP/Perl script running on cron. I’m planning on doing it… (just reached your blog accidentally).
