Anatomy of a Paper: Part I, Inspiration: I will never understand how people can suggest replacing conferences or seminar visits with talks broadcast over the internet. That’s like trying to improve a restaurant experience by making sure the plates and cutlery are really shiny, and doing away with the food entirely. Conferences aren’t about talks, although those are occasionally interesting. [...] They’re about the ongoing low-level interaction between the participants at meals and coffee breaks. That’s where the ideas get created! Then you can each go home and apply yourself to the nitty-gritty work of turning those ideas into papers. (Via Cosmic Variance.)
Monday, July 30, 2007
Thursday, July 26, 2007
Next after these messages: people who learn how to swim do better in water sports, study shows.
Wednesday, July 25, 2007
More on Pat Schroeder's comments on the NIH policy: William Walsh, Schroeder follows Dezenhall's script, Issues in Scholarly Communication, July 24, 2007. Excerpt:
There's a nice story on the NIH proposal this morning in Inside Higher Ed. (See Peter Suber's comments on it.) In it, Pat Schroeder, president of the AAP, seems to be following the script laid out for publishers by pricey consultant Eric Dezenhall. Schroeder, of the publishers’ association, acknowledged that opinion in higher education has shifted in favor of open access. But she said that was based on a lack of knowledge. “Any time you tell somebody they are going to get something for free, they think ‘yahoo.’ ” The problem, she said, is that “no one understands what publishers do.” If academics realized what publishers did with the money they charge — in terms of running peer review systems — they would fear endangering them. (Via Open Access News.)
My experiences with the peer-review systems of the open access journals JMLR, BMC Bioinformatics, and PLoS Computational Biology are all much better than those I've had with many closed access journals over the years. The quality of a peer review system comes from the commitment and skill of the scientific editors and from a well-chosen workflow system, not from paper pushers at headquarters, who in some cases serve mainly to slow down the process.
Sunday, July 22, 2007
I flew from SFO to PHL yesterday on United, checking one bag. At the bag claim in PHL, at first I thought my bag hadn't made it. But it had, except that it had on it an unexpected bright green TSA-approved lock. Which I could not open. Talking with the pleasant United baggage man, we figured out what might have happened: the TSA bag screeners took the lock from another bag using their master key and put it on my (possibly similar) bag by mistake. Unfortunately, it was after 11 pm and the TSA office in PHL was closed. Today I called them, but I ended up in voice-mail hell. So I just went to the local Home Depot to get some bolt cutters, and the silly lock has gone to join the choir invisible of vacuous security precautions.
Given how trivial it was to remove the lock with bolt cutters, why are people wasting their money on these?
Saturday, July 21, 2007
EU Google Competitor Project Gets Aid Worth $166 Million: [...] Dow Jones reports: "The aim is to develop new search technologies for the next generation Internet, including 'semantic technologies which try to recognize the meaning of content and place it in its proper context.' The semantic Web has been considered the next evolution of the Internet at least since Tim Berners-Lee, widely considered a creator of the current version of the Internet, published an article describing it in 2001. In theory, a semantic Web could receive a user request for information about fishing, for example, and automatically narrow the results according to the user's individual needs rather than blanket the user with pages related to numerous aspects of fishing. The Commission's funding approval Thursday immediately sparked talk of building a potential European challenger to Web search leader Google Inc." (Via Slashdot.)
I fear that the EU is suffering from magical thinking of the kind identified by Drew McDermott in his classic Artificial Intelligence and Natural Stupidity (not available online), which should be required reading for everyone studying or investing in this area. Calling some technology "semantic" doesn't make it so. All search engines try to "recognize the meaning of content and place it in its proper context." It's just that doing so accurately and efficiently in general is extremely hard. Significant progress depends on unpredictable research advances, not on predictable development efforts. Putting around 1,000 person-years into a focused project like this creates false expectations and actually hurts basic research in the field.
Competition in search is good. The major search engines have substantial research efforts, as could for instance be seen from their publications at the recent natural-language processing conferences in Prague, and there are several startups exploring new approaches in the field. More research in this area is good. But the EU should have learned from the limited success of big initiatives, from EUROTRA to the framework programs, that major advances cannot be willed by bureaucratic fiat.
The seeds of current search technology were not in major coordinated development efforts, but in academic research at schools like Berkeley, CMU, Cornell, and Stanford, and in unpredictable benefits from industrial research at Bell Labs, IBM, and PARC in areas like machine learning and information retrieval. None of this work came from a big grand plan, but rather from the initiative of researchers and research managers in exploiting the resources available to them (and it could be argued that the current funding climate in the US, which puts greater emphasis on top-down initiatives and applicability than before, may well reduce the creativity of the research system here). The most important effect of these efforts was not in technologies, but in creating opportunities for creative people (students, faculty, researchers) to play with new ideas and recognize their potential. Without institutional reform in Europe to open up comparable opportunities through increased flexibility in education, research, and funding, much of this $166 million will end up as institutional welfare payments to hidebound universities and corporations, as has been the case for much of the previous EU investment in research and development.
Monday, July 16, 2007
Monday, July 9, 2007
RIM's CEO sees iPhone as "dangerous": Research in Motion head Jim Balsillie believes the iPhone could be potentially toxic for the cellphone industry, according to a recent interview. The head of the BlackBerry firm points out that while AT&T has obtained a multi-year contract for the device, the terms leave the carrier out of much of the sales process and give it little influence over the customization of the phone's hardware or software. [...] "It's a dangerous strategy. It's a tremendous amount of control," he says. "And the more control of the platform that goes out of the carrier, the more they shift into a commodity pipe." (Via MacNN.)
I'm scared. Imagine, carriers having to focus on sending packets to their destination instead of forcing on us awfully designed, crippled, restrictive applications and services. Carriers providing an arena for unfettered innovation rather than keeping us frozen in 1960s communication and software models. That can't be allowed. After all, what would that do to proprietary platforms like RIM's?
Friday, July 6, 2007
Automated versus Human Judgments: A couple of posts provoke an interesting discussion: William Cohen points to the issue of the popularity contest approach to ranking which may have undesirable consequences [...] As for the issue of automated ranking of web pages. The problem cited above exposes the frailty of addressing a content problem (finding a document whose text is appropriate) via an orthogonal structural solution. The structural solution (counting links and propagating results) may do well in some domains where it is regarded as a proxy for measurements of 'authority', however, the ambiguity in the structure cannot be determined, leading to the type of problem William cites. This is where solutions like Powerset come in. (Via Data Mining.)
Thanks for the link to William's blog, which I didn't know about. Regarding this particular search ranking issue, where related lexical items have very different contexts of usage and associated sentiments — negative vs neutral or positive — I'm curious what NLP methods, whether embodied in Powerset's system, in any other system, or even in early research prototypes, Matt would recommend as a solution. The problem is not one of syntax, semantics, or even pragmatics and local discourse, as can easily be seen from several controversies in this country where a word is considered derogatory when used by some people but friendly or even complimentary when used by others; and people can and will get into hot water when they breach those invisible but very real boundaries. There's a lot more in the context and charge of writing than any of our current automated methods can discern, whether they use global statistics or local structure. It's not a matter of ambiguity — the denotation of the terms is not in question — but one of association and rhetorical force: what ideas and feelings are triggered in the minds of different readers and writers by particular terms as a result of their social and cultural backgrounds and of their (lack of) sensitivity.
The original post by Lauren Weinstein that triggered this thread was about the visible global impact of search rankings, but William's discussion suggests a less global but possibly more powerful effect in search personalization: a personalization algorithm could become a strong reinforcer of prejudice without the counter-pressure of critical discussion that globally visible search rankings receive.
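The "counting links and propagating results" that the quoted post alludes to can be made concrete with a small sketch. This is a minimal PageRank-style power iteration over a toy link graph; the graph, damping factor, and tolerance are illustrative assumptions, not any production search engine's ranking method, and it shows exactly what such a structural proxy for "authority" does and does not see: link structure only, never the content or connotations of the pages themselves.

```python
# Minimal PageRank-style power iteration over a toy link graph.
# The graph, damping factor, and tolerance are illustrative assumptions.

def pagerank(links, damping=0.85, tol=1e-9, max_iter=100):
    """links: dict mapping each page to the list of pages it links to."""
    pages = sorted(set(links) | {p for outs in links.values() for p in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start with uniform rank
    for _ in range(max_iter):
        # Every page gets a small "teleport" share, then link shares.
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outs in links.items():
            if outs:
                share = damping * rank[page] / len(outs)
                for target in outs:
                    new[target] += share
            else:
                # Dangling page: spread its rank over all pages.
                for p in pages:
                    new[p] += damping * rank[page] / n
        if sum(abs(new[p] - rank[p]) for p in pages) < tol:
            return new
        rank = new
    return rank

# Toy graph: "c" is linked to by both "a" and "b", so it accumulates
# the most "authority" regardless of what any of the pages actually say.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

The point of the sketch is the limitation under discussion: the computation consumes only the adjacency structure, so two pages with identical link profiles but opposite rhetorical force are indistinguishable to it.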
Thursday, July 5, 2007
Elsevier invites Google and Google Scholar to index its journals: Peter Brantley, Science Direct-ly into Google, O'Reilly Radar, July 3, 2007. Excerpt: ScienceDirect (SD) is a compendium of scientific, technical, and medical (STM) literature from Reed Elsevier [...] Ale de Vries, the SD product manager, informs me in an email: “About Google/Google Scholar: we're making good progress. As you may be aware, we did a pilot with some journals on SD first, and now we are working to get them all indexed. We're making good progress there - it's a lot of content to be crawled, but going along nicely. Both Google Scholar and main Google are gradually covering more and more of our journals. ” (Via Open Access News.)
Several other closed-access journals are already indexed by Google Scholar. For an academic user like me, it is very convenient to have unified search across all scholarly sources, open or closed, rather than having to search my institutional e-resource index and the open-access literature separately. It is plausible to assume that accesses to closed resources have been dropping as a fraction of all accesses as open resources become more available. Certainly, I am much more likely to search on Google Scholar than on SD even though Penn gives me access to it. For Reed Elsevier, Google indexing will bring more traffic into SD from users like me, which will help them justify their high subscription prices to budget-pressed academic libraries.
Wednesday, July 4, 2007
My sister blogs at aguarelas de Turner and she just linked to a beautiful photo tour along the route of tram 28 in Lisbon. Memory sleeps for a long time, and then the right images wake it up with a start.