Ah! Just learned about Geometric Wavelets. Most recent work on wavelets as been very uninspiring to me, but this is pretty good. Seems it has been around for a number of years (most recent paper I can see is Le Pennec and Mallat in 2000).

The idea is simple, but I only saw one talk about it, so don’t take my word for it, read the papers.

Take a function f(x) over any (convex) region S. The function is approximated by some polynomial (of fixed degree), call it p(x). Split the region exactly in two by a straight line, so that the two remaining regions S1 and S2 are still convex. Over each subregions, by regression, solve for the polynomials (of fixed degree) p1(x) and p2(x) approximating f(x). Optimize the splitting so that the approximation error made by p1 and p2 is minimal. Then your two new wavelets are the polynomials p(x)-p1(x) and p(x)-p2(x) limited to S1 and S2 respectively. These wavelets are not continuous, but if f(x) is itself a polynomial (up to a fixed degree), then the wavelets vanish. Repeat this splitting for as long as you need to.

The gist of the result is that you can do state-of-the-art image compression. This is surprising because the wavelets are rather complex and difficult to encode especially if you consider that encoding their support region is no joke. The key point is that you can define the region over which each wavelet is defined simply by encoding the various splitting lines, and that’s an easier problem than encoding polygonal regions from scratch.

I strongly suspect that these algorithms are not practical though. They use up too many CPU cycles to save up on bandwidth and storage space in a world where storage is infinite but CPU cycles are comparatively more expensive. But the idea is neat!

This is too good not to pass on. Ronen Feldman posted slides from his ICML 2006 Tutorial on Text Mining.

In this tutorial we will present the general theory of Information Extraction and will demonstrate several systems that use these principles to enable interactive exploration of large textual collections. We will present a general architecture for information extraction and will outline the algorithms and data structures behind the systems. The Tutorial will cover the state of the art in this rapidly growing area of research. Several real world applications of information extraction will be presented in the areas of business intelligence, competitive intelligence, bio information, and military intelligence.

(Source: Yuhong (by email).)

In a few hours, I’m “planing” (opposite of “deplaning”) for Avignon, France where I’ll attend Curves and Surfaces 2006. The web site is currently down.

I will be talking about monotone curves.

For applications such as pattern recognition, curve reconstruction and so on, it is important to be able to study the properties of curves and chains (a chain is just a discrete, usually finite, curve). For discrete functions, while we can’t talk about smoothness, we can talk about monotonicity, for example. A perfectly monotone (or piecewise monotone) discrete function (or signal) is unlikely to be noisy.

What is the equivalent of monotonicity for curves?

The typical definition of a monotone curve is a so-called v-monotone curve: under a change of basis where v is aligned with the x-axis, then the curve’s x-component is always increasing. For most settings, this is a very strong requirement.

We decided to look at an alternative definition of what it could mean for a curve (or a chain) to be monotone. For functions, we know that f is monotone if the inverse image of balls are connected. So, we decided that an arc-length parametrized curve s would be R-monotone if the inverse images of balls are connected. We go on to show it is a sensible definition. The definition also applies to chains. We can then filter noisy chains to increase their degree of monotonicity (R).

In the coming months/weeks, I’ll post the preprint. It is also available to those who ask by email.

Update: I posted my slides on the web. Comments are invited even if you don’t attend Curves and Surfaces.

I met Kamel in Montreal today. I invited him over for a post-doc and he arrived 2 months ahead of time (talk about eagerness!) from Lyon (Eric laboratory). Kamel is a very promising, young (well, younger than me) researcher in OLAP, Data Warehousing and Data Mining. Among other things, Kamel will be working of view size estimation in data warehouses.

Oh! How did Kamel learned about me and my work? Nah! Not a conference or a talk… nothing like that, he just got to me through my blog! So this blog has a purpose after all!

According to Slashdot, Mathematician Peter Woit of Columbia University calls String Theory ‘a disaster for physics.’ Reading the article, you learn that “Virtually every young mathematically inclined particle theorist must sign on to the string agenda to get an academic job.” I think this is probably more common than we care to admit. Academia tends to reinforce its own bias.

  • Big shot professor X comes up with an idea.
  • He gets 20 Ph.D. students to work on it. A few good results come up, but mostly, the idea remains a promise.
  • Other soon follow the new trend, or at least, they send some of their Ph.D. students to the front, just in case.
  • There is little evidence that the idea is worthwhile, but this is quickly dismissed… “we just need more research”…
  • As the trend grows, researchers who have invested a lot in the idea become numerous, and they start infiltrating hiring committees, program committees, funding agencies, and so on. Their career now depends on the fact that the trend continues. After all, trying something new is much harder than just continuing whatever we are doing!
  • The idea finally bears some fruits. There is some very limited industry adoption. It is hyped by academia the world over.
  • Nothing further comes up, people start recycling old ideas. Researchers refuse to see the lack of progress for what it is.
  • Big shot professor Y comes up with a related idea which builds on professor X’s original idea, but in a novel way”.
  • The cycle continues…

There is a simple solution to this problem: encourage independent thinking. Next time you review a paper or a funding application which attempts something fresh, don’t dismiss it because “the paper ignores 20 years of research.” We need to ignore fashions more, and encourage novel ideas. Don’t reward people because they follow the trends. We are supposed to give people tenure so that they can take risks, say things others don’t want to hear. Let’s build on that. Next time you are on a hiring committee and someone wants to hire a string theorist, or the equivalent in your field, be watchful.

Next Page »

Powered by WordPress