Earning My Turns: November 2007

Thursday, November 29, 2007

bumps and valleys...

I was using Google maps for its satellite view and noticed a terrain view tab ... (Via tingilinde.)

How come Steve chose one of the most popular near-backcountry skiing destinations for this example, the Catherine's/Lake Mary/Wolverine area NE of Alta? Is he trying to rub in the lack of snow there until a few days ago?

Monday, November 26, 2007

Moteurs : Comparaison Google-Yahoo

Moteurs : Comparaison Google-Yahoo: The most surprising result comes from the use of Wikipedia. This use was marginal in December 2005 (see study). At the time, across all 10 results on the first page, Google returned 2% of links from Wikipedia and Yahoo 4%. On the first link alone, Google returned no Wikipedia results (at least in our sample) and Yahoo 7%. [...] The average rating given by users when the result is in Wikipedia is nearly one point higher, for Google as for Yahoo, than the rating given to other results. Pushing Wikipedia to the extreme is therefore a strategy that pays off at little cost. It is, however, dangerous. The day users realize that, for example thanks to the Firefox search bar, they can search directly in Wikipedia if they want encyclopedic information, in Wikio for news and blogs, in Allociné for movies, and so on, the concept (old-fashioned, in my view) of the general-purpose search engine will be badly wounded. We are beginning to see its limits. (Via Technologies du Langage.)

Interesting study, but I disagree with this conclusion, at least in the absence of further experimental investigation. While Wikipedia may be a convenient first stopping point with high average credibility, having a diversity of sources on the first search page is very important. Often I search for a term that I already understand at a basic level in order to find more specialized resources, and then the Wikipedia entry adds little to what I already know. Encyclopedias are good as a first entry point into a subject, but not that good for detail, associated material, or timeliness.

This Climate Goes to Eleven

This Climate Goes to Eleven: Gerard H. Roe and Marcia B. Baker, "Why Is Climate Sensitivity So Unpredictable?", Science 318 (2007): 629--632 [...] Roe and Baker's argument is simple but ingenious and compelling. The climate system contains a lot of feedback loops. This means that the ultimate response to any perturbation or forcing (say, pumping 20 million years of accumulated fossil fuels into the air) depends not just on the initial reaction, but also how much of that gets fed back into the system, which leads to more change, and so on. [...] Suppose, just for the sake of things being tractable, that the feedback is linear, and the fraction fed back is f. [...] What happens, Roe and Baker ask, if we do not know the feedback exactly? Suppose, for example, that our measurements are corrupted by noise --- or even, with something like the climate, that f is itself stochastically fluctuating. The distribution of values for f might be symmetric and reasonably well-peaked around a typical value, but what about the distribution for G? Well, it's nothing of the kind. Increasing f just a little increases G by a lot, so starting with a symmetric, not-too-spread distribution of f gives us a skewed distribution for G with a heavy right tail. (Via Three-Toed Sloth.)

Interesting study. Besides the scary implications of this finding for climate change, there's the more academic question of whether processes of this type can explain the skewed distributions we find in other fields.
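The skew is easy to see numerically. Here is a minimal sketch, assuming the usual linear-feedback gain G = 1/(1 - f) (the quoted excerpt elides the formula, so this is my reading of it) and a Gaussian f. The mean, the spread, and the cutoff that keeps f below 1 are illustrative values, not numbers from the paper.

```python
import random
import statistics


def gain(f):
    """Gain of a linear feedback loop: 1 + f + f**2 + ... = 1 / (1 - f) for |f| < 1."""
    return 1.0 / (1.0 - f)


def sample_gains(mean_f=0.65, sd_f=0.1, n=100_000, seed=0, f_max=0.95):
    """Draw f from a symmetric Gaussian and return the corresponding gains.

    mean_f, sd_f and f_max are hypothetical; f_max only keeps samples away
    from the singularity at f = 1.
    """
    rng = random.Random(seed)
    gains = []
    while len(gains) < n:
        f = rng.gauss(mean_f, sd_f)
        if f < f_max:
            gains.append(gain(f))
    return gains


def summarize(values):
    """Return the mean together with the 5th, 50th and 95th percentiles."""
    ordered = sorted(values)
    n = len(ordered)
    return {
        "mean": statistics.fmean(ordered),
        "p05": ordered[int(0.05 * n)],
        "median": ordered[n // 2],
        "p95": ordered[int(0.95 * n)],
    }


if __name__ == "__main__":
    for name, value in summarize(sample_gains()).items():
        print(f"{name:>6}: {value:.2f}")
```

Although f is symmetric around its mean, the gap between the 95th percentile and the median of G comes out much wider than the gap between the median and the 5th percentile, and the mean sits above the median: the heavy right tail.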

Friday, November 23, 2007

Art, Science & Truth: Deep Nonsense

Art, Science & Truth: Jonah Lehrer: Reading Jonah Lehrer's Proust Was a Neuroscientist is something like watching Jacoby Ellsbury in the Red Sox outfield. [...] Lehrer's stylish little book is a brief for art in an age of science. He stands with artists, for starters, because as he argues in eight signal lives, they hit the target first, about brain science in particular: poet Walt Whitman's intuition of "the body electric," for example; or novelist George Eliot's confrontation with systems thinking (Herbert Spencer, in person, and the invented Casaubon in Middlemarch) and her elevation of the indeterminacy of real life; or Paul Cezanne's methodical discovery of our eye's part (and our imagination's) in completing the experience of a painting. [...]

Scientists describe our brain in terms of its physical details; they say we are nothing but a loom of electrical cells and synaptic spaces. What science forgets is that this isn't how we experience the world. (We feel like the ghost, not like the machine.) It is ironic but true: the one reality science cannot reduce is the only reality we will ever know. This is why we need art. By expressing our actual experience, the artist reminds us that our science is incomplete, that no map of matter will ever explain the immateriality of our consciousness.
Jonah Lehrer, Proust Was a Neuroscientist, page xii.

(Via Open Source.)

I listened to this podcast at the gym today. I must have worked out harder to burn off the irritation with so much flim-flam. Science is Chris Lydon's weakest area by far. He's too willing to accept mystical pieties from his subjects that he would probe sharply in an interview about Iraq or Emerson.

Proust and Musil supplied important places away from my research when I was in graduate school. Reading À la Recherche du Temps Perdu in the original required such concentration that the difficulties with my work were erased for a while. But neither Proust nor Musil was really outside my most serious research concerns. Proust on Elstir or Musil on Moosbrugger raised tantalizing questions about perception, consciousness, and free will. So I was ready to be sympathetic towards Lehrer's book, which was on my “to read” list. No more.

In the interview, Lehrer talks in hushed tones about the “essential mystery” of individual experience that cognitive science will “never” answer. Lydon seems almost relieved that there's some mystical core left after all.

Lydon doesn't realize that Lehrer's mystery is trivial, a result of confusion between the particular and the general.

What cognitive science seeks is a general account of cognitive mechanisms. What art provides are particular accounts of experience, valuable exactly because of their particularity. A general account of cognitive processes cannot predict particulars any more than the logic diagram for this Intel Core Duo can predict what instruction will execute next. That instruction is determined by a combination of the processor, the contents of memory, and events in the outside world, like the keys I tap and the packets that arrive on the net interface.

Even if we had a complete wiring diagram of someone's brain, we could not predict the next neuron firings, let alone the next action of the subject, because we don't know the contents of memory, encoded in the states of synapses and of individual neurons (such as feedback-stabilized patterns of gene expression), nor what particular sensory events will happen next.

More generally, Lehrer seems to be totally oblivious to the huge 20th-century discovery that unpredictability is the rule for sufficiently powerful computing devices. The unsolvability of the halting problem is just the most extreme case of unpredictability: no general method can predict in finite time whether an arbitrary program will halt. A good pseudo-random number generator is unpredictable if we do not know its seed. Thinking of individual experience as a unique bit stream, it is not surprising that individual behavior is so unpredictable: we all have different seeds. In addition, cryptographic arguments show that a combinatorial circuit of sufficient complexity cannot be reconstructed from a polynomially sized sample of its input-output behavior.
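To make the seed point concrete, here is a small sketch of a deterministic generator whose output looks unrelated from one seed to the next. It is an illustration under stated assumptions, not a vetted cryptographic construction: bit_stream and hamming_fraction are hypothetical helper names, the seeds are made up, and the claim that the stream is hard to predict without the seed rests on unproven assumptions about SHA-256.

```python
import hashlib


def bit_stream(seed: bytes, n_bytes: int) -> bytes:
    """Expand a seed into n_bytes of output by hashing seed || counter."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:n_bytes])


def hamming_fraction(a: bytes, b: bytes) -> float:
    """Fraction of bit positions at which two equal-length byte strings differ."""
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (8 * len(a))


if __name__ == "__main__":
    alice = bit_stream(b"alice", 4096)
    bob = bit_stream(b"bob", 4096)
    # Same mechanism, different seeds: the streams disagree on about half their bits.
    print(f"fraction of differing bits: {hamming_fraction(alice, bob):.3f}")
```

The mechanism is a dozen lines and completely known, yet knowing it tells us nothing useful about a particular stream unless we also know the seed.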

If Lehrer wanted to puzzle over a real question that matters in this argument, he could have asked about our current lack of proof for the cryptographic assumptions used in the above argument. Now, there's a mystery. Not an “essential” one, we hope, but certainly a resistant one.

It is somewhat depressing that even highly educated people like Lehrer are so ignorant of the amazing discoveries on the limits of computation since 1936, and what they may imply for our understanding of the mind; and that they seem ready to go all weak at the knees with mystical copouts as soon as the opportunity presents itself.

To admire Proust or Musil I need no mystery: it is enough that they could create compelling experiences that illuminate the uniqueness and weirdness in all of us, which will stand however much we know about brains and minds, not because of any mystery, but because computation has limits. Unpredictability makes us free.

Update: Complementary claims of nonsense.

Thursday, November 22, 2007

In memoriam Maurice Gross. (arXiv:0711.3452v1 [cs.CL])

In memoriam Maurice Gross. (arXiv:0711.3452v1 [cs.CL]): Maurice Gross (1934-2001) was both a great linguist and a pioneer in natural language processing. This article is written in homage to his memory. (Via cs updates on arXiv.org.)

I met Maurice Gross a few times. He had long-standing connections with Penn, and he was a charming host when I visited his lab for a thesis defense. After the event, we had a memorable dinner at a local restaurant, where Maurice amused us with stories about his country house, wine-making, and I'm sure many other topics I don't remember now. He recommended that I try clafoutis for dessert; I had never had it before, and it was superb. This experience agrees well with Eric Laporte's account.

Maurice Gross was right and ahead of his time in his focus on the local grammar of lexical items, and in recognizing the combinatorial uniqueness of individual lexical items, in contrast with the very impoverished tag sets of standard generative grammar. The use in his laboratory of finite-state transducers for local grammars was highly original. However, I'm less sure that their specific approach is sufficient. The local grammar approach imposes extreme constraints on the interactions between lexical items, and seems too brittle to handle natural variation. Local grammars are better as compact summaries of observations than as models of the interactions and variations that may occur. Lexicalized TAG has a similar flavor of local grammar, but it allows greater combinatorial flexibility and generalization. Still, both the overall view of language and the specific methods that Maurice Gross pioneered deserve continued study. In our rush to build theories and systems, we keep forgetting that language is much more an assemblage of particulars than the neat result of a few general principles. We need to savor it slowly, as if we were sitting down to dinner with Maurice.

Wednesday, November 21, 2007

Hands on with Kindle

Hands on with Kindle: [...] The layout? Not so great. Forced justification with apparently no hyphenation dictionary or hinting in the format. That's a huge failure. On a private list, I noted that, "Justification without hyphenation is like taxation without representation." That is, the poor letters and word groupings have no input into how they're displayed, which makes for a poor republic.

(Via TidBITS.)

That's all we need to know about the Kindle. Thanks, Glenn Fleishman!

Thursday, November 15, 2007

Edgar Bronfman, Jr. Reported to Talk Straight

Edgar Bronfman, Jr. Reported to Talk Straight: Edgar Bronfman, Jr.'s efforts to become a powerful figure in the entertainment world, via money derived from his father's Seagram empire, are long-standing. [...] The boss of Warner Music has made a rare public confession that the music industry has to take some of the blame for the rise of p2p file sharing.

"We used to fool ourselves," he said. "We used to think our content was perfect just exactly as it was. We expected our business would remain blissfully unaffected even as the world of interactivity, constant connection and file sharing was exploding. And of course we were wrong. How were we wrong? By standing still or moving at a glacial pace, we inadvertently went to war with consumers by denying them what they wanted and could otherwise find and as a result of course, consumers won."

(Via The Patry Copyright Blog.)

It only took ten years for him to start getting a clue. Efforts to work with him and other music executives on digital distribution go back much further than the start of Apple's music store. It's a sad reflection of their stewardship of their business that instead of leading the charge to digital, they dug in their heels for a decade and only started waking up when p2p and Apple pushed them into a corner. I'm sure their shareholders are delighted.

making bicycling safe in the us

making bicycling safe in the us: Important stuff - what Berkeley has done. It isn't the Netherlands, but far better than most US cities (except for Davis, CA perhaps) (Via tingilinde.)

I'd be a bit more sympathetic to such initiatives were it not that, in over six years in Philly as a pedestrian and public-transportation user, most of my close calls with injury have come from cyclists switching between roadway and sidewalk at high speed, riding against traffic on one-way streets, and riding through red lights.

There's a holier-than-thou attitude among cycling activists that really rankles. Sharing the road applies to all road users, not just motorists.

Tuesday, November 13, 2007

Cristina Branco

CRISTINA BRANCO canta SHAKESPEARE: My sister links to a YouTube clip of Cristina Branco singing Se a Alma te Reprova, a translation of Shakespeare's Sonnet 136, which is also included in the album Sensus (also available as an Amazon MP3). For me she's the best Portuguese singer in a long time, musically more creative and emotionally more subtly expressive than others who are better known. (Via aguarelas de Turner.)

Tuesday, November 6, 2007

Jim Lehrer sticks up for traditional media

Jim Lehrer sticks up for traditional media: "The bloggers are talkers, commentators, not reporters. The talk-show hosts are reactors, commentators, not reporters," Lehrer said. "The search engines can search but do not report. All of them, every single one of them, have to have the news in order to exist and thrive."

Let's grant him that. The question is then, how will serious reporters outside public broadcasting make a living as the paper media shrivel and the broadcast media replace news with infotainment? Even if online advertising were a sufficient revenue source to support the reporters, how would the revenue flow to reporters without the current (obsolescent) news production mechanisms? Maybe through some new form of syndication, but we don't have those mechanisms in place yet (open-access scientific publication suffers from a parallel problem). We had better start researching the design of new syndication mechanisms.

Friday, November 2, 2007

Open Source ML Software Track in JMLR

Open Source ML Software Track in JMLR: What a great idea. (Via Cranial Darwinism.)

I agree, but then I had a teeny bit to do with developing the prospectus for this new JMLR track, which came out of a workshop at last year's NIPS. At the workshop, I argued that resources follow academic recognition and citation, so we need a means to have software peer-reviewed and cited to attract more resources to open-source development efforts in machine learning.
