Saturday, July 21, 2007

EU Google Competitor Project Gets Aid Worth $166 Million

EU Google Competitor Project Gets Aid Worth $166 Million: [...] Dow Jones reports: "The aim is to develop new search technologies for the next generation Internet, including 'semantic technologies which try to recognize the meaning of content and place it in its proper context.' The semantic Web has been considered the next evolution of the Internet at least since Tim Berners-Lee, widely considered a creator of the current version of the Internet, published an article describing it in 2001. In theory, a semantic Web could receive a user request for information about fishing, for example, and automatically narrow the results according to the user's individual needs rather than blanket the user with pages related to numerous aspects of fishing. The Commission's funding approval Thursday immediately sparked talk of building a potential European challenger to Web search leader Google Inc." (Via Slashdot.)

I fear that the EU is suffering from magical thinking of the kind identified by Drew McDermott in his classic Artificial Intelligence Meets Natural Stupidity (not available online), which should be required reading for everyone studying or investing in this area. Calling some technology "semantic" doesn't make it so. All search engines try to "recognize the meaning of content and place it in its proper context." It's just that doing so accurately and efficiently in general is extremely hard. Significant progress depends on unpredictable research advances, not on predictable development efforts. Putting around 1,000 person-years into a focused project like this creates false expectations and actually hurts basic research in the field.

Competition in search is good. The major search engines have substantial research efforts, as could be seen, for instance, from their publications at the recent natural-language processing conferences in Prague, and there are several startups exploring new approaches in the field. More research in this area is good. But the EU should have learned from the limited success of big initiatives, from EUROTRA to the framework programs, that major advances cannot be willed by bureaucratic fiat.

The seeds of current search technology were not in major coordinated development efforts, but in academic research at schools like Berkeley, CMU, Cornell, and Stanford, and in unpredictable benefits from industrial research at Bell Labs, IBM, and PARC in areas like machine learning and information retrieval. None of this work came from a grand plan, but rather from the initiative of researchers and research managers in exploiting the resources available to them (and it could be argued that the current funding climate in the US, which puts greater emphasis on top-down initiatives and applicability than before, may well reduce the creativity of the research system here). The most important effect of these efforts was not in technologies, but in creating opportunities for creative people (students, faculty, researchers) to play with new ideas and recognize their potential. Without institutional reform in Europe to open up comparable opportunities through increased flexibility in education, research, and funding, much of this $166 million will end up as institutional welfare payments to hidebound universities and corporations, as has been the case for much of the previous EU investment in research and development.


Steve Renals said...

I think your comments make sense, but I believe that Theseus is a German national (ministry of technology) project, rather than an EU project. Thus the relevant comparisons would be projects such as Verbmobil and Smartweb.

Fernando Pereira said...

The original story I linked to discussed EU funding for Theseus. Unfortunately, that link is now stale. I did a bit more searching, which if anything reinforces my concerns.