Sunday, May 31, 2009

Podcasts and authors

Recently I caught up with a backlog of Radio Open Source podcasts where Chris Lydon interviewed authors of recently published fiction:

I knew Hemon from Nowhere Man, and I have his The Lazarus Project on deck on my bedside table, but I didn't know the others. Their wonderful conversations with Chris Lydon convinced me that I need their books. Tinkers is already on deck.

Data set selection

On Moon Landings, Michelle Malkin, P-Values, the Clintons, and the Magical Mystery Dealergate Conspiracy Theory: [...]The way this data is being used is almost the same. Singer ran six sets of regression analysis: one each for Obama, McCain, Clinton, Democratic and Republican donors, and another for those dealers who had made no political contributions at all. She was therefore testing six hypotheses. If these hypothesis were independent from one another (which, to be clear, in this case they aren't), the odds that at least one of the six would return a p-value of .125 or lower are better than 50:50! Not only are false positives possible -- they are practically inevitable, particularly if you test enough hypotheses and tolerate a low enough threshold for statistical significance. [...] (Via FiveThirtyEight.com: Electoral Projections Done Right.)

I feel so much better that it's not just machine learning that practices the arcane crafts of post hoc hypothesis and data set selection.
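The arithmetic behind the quoted "better than 50:50" claim is easy to check. This is a minimal sketch under the quote's own simplifying assumption that the six tests are independent (which, as the quote notes, they are not in Singer's case) and that every null hypothesis is true, so each p-value falls at or below the threshold with probability equal to the threshold. The function name is mine, for illustration only.

```python
def prob_at_least_one_false_positive(num_tests: int, threshold: float) -> float:
    """Chance that at least one of `num_tests` independent tests yields a
    p-value <= `threshold` when every null hypothesis is actually true.

    Under a true null, a p-value is uniformly distributed on [0, 1], so a
    single test falls at or below the threshold with probability `threshold`.
    """
    prob_no_false_positive = (1.0 - threshold) ** num_tests
    return 1.0 - prob_no_false_positive


if __name__ == "__main__":
    # Six regressions at the 0.125 threshold mentioned in the quote.
    p = prob_at_least_one_false_positive(num_tests=6, threshold=0.125)
    print(f"6 tests at 0.125: {p:.3f}")

    # The same calculation for other numbers of tests, to show how fast it grows.
    for k in (1, 3, 12):
        p_k = prob_at_least_one_false_positive(k, 0.125)
        print(f"{k} tests at 0.125: {p_k:.3f}")
```

With six tests the result is roughly 0.55, which matches the quoted claim. Because the real tests are correlated, the true figure would differ, but the direction of the argument holds: more hypotheses and looser thresholds make a spurious "finding" more likely.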

Saturday, May 30, 2009

Study: hacks often bamboozled by flacks

Study: hacks often bamboozled by flacks: Steven Woloshin et al., "Press Releases by Academic Medical Centers: Not So Academic?", Annals of Internal Medicine, 150(9): 613-618:

Background: The news media are often criticized for exaggerated coverage of weak science. Press releases, a source of information for many journalists, might be a source of those exaggerations.
Conclusion: Press releases from academic medical centers often promote research that has uncertain relevance to human health and do not provide key facts or acknowledge important limitations.
[...]The best thing, it seems to me, would be to enrich the journalistic ecosystem with more species in niches like the one that Goldacre's Bad Science column occupies — agile, razor-clawed predators culling the herds of science-news herbivores that graze the green shoots of press releases on the endless media plains. (Via Language Log.)

Or like Language Log, Real Climate, The Loom, Statistical Modeling, or Effect Measure, to mention some of my current reading. The blogospherian explosion has created a wealth of innovative lineages. I don't know how they will evolve and survive, but we are already getting more informed discussion of science than we ever did from “the press.”

Monday, May 25, 2009

Computation != Deliberation

Travel, work, and all-consuming new research ideas keep getting in the way of blogging and slowing down reading. I'm still struggling with Out of Our Heads, which keeps switching between infuriating cluelessness about computation and intriguing insights about the lack of a clear-cut boundary at the information processing level between the “inside” and the “outside” of the brain. Alva Noë repeatedly assumes that “computation” in the mind is just a kind of rule-following conscious behavior. Following this misconception, the intuitive leaps of an expert chess player or human recognition of faces are, according to him, not (plodding, deliberate, searching, rule-following) computation. There must be something really weird in the coffee at the Department of Philosophy at Berkeley that keeps some of them (Dreyfus, Searle, Noë) from recognizing that even a simple amoeba computes to maintain some awareness of, and ecologically appropriate behavior towards, its shifting environment. (Thank you, Dennis Bray, for the apropos example.)

To paraphrase again their colleague Brad DeLong's long-running lament: why oh why can't we have more computationally literate philosophers?

Monday, May 4, 2009

Diversity in scientific data

How important is WolframAlpha?: I don’t know those areas well enough to give an example that will hold up, but I can imagine WA becoming the first place geneticists go when they have a question about a gene sequence or chemists who want to know about a molecule. (Via Joho the Blog.)

These are the worst possible examples. I've worked quite a bit with biologists and medical researchers, and the last thing they want is a single source for their research data. Genomic sequences or the 3-D structures of complex molecules are works in progress, with many sources with different strengths and weaknesses. Two of my recent bioinformatics papers are on how you can get better genomic annotation by combining multiple sources of evidence developed by different researchers with different methods. Much of the current progress on genomics, proteomics, and systems biology is about different approaches to annotation and information integration, and advances from comparing and combining different types of information.

Highly curated, single-source data is useful only in those areas where how the data is collected and curated is not a central part of the scientific debate. I can't think of a single area of science that I follow in which the core data are settled, from biology to linguistics. Diverse sources, openly exchanged, contrasted, and combined, are the lifeblood of data-driven science.

Sunday, May 3, 2009

Parental bragging

You'll probably hate it, Daniel, but I'm too happy and proud to not blog this news:

Daniel Pereira teaches English at a small, private school in Springfield that serves students who have dropped out or somehow fallen through the cracks in public schools. He shares his love of literature at GW Community School and over time has served as a mentor for students struggling with drug addiction, social anxiety or learning disabilities.
He uses offbeat techniques to engage students, such as teaching a class on graphic novels or using a Run-DMC song to teach iambic pentameter. As the school's college counselor, he also helps many students make a sometimes difficult transition. One parent wrote that his son, who was unhappy and shut off as a teenager, began to pay attention in Pereira's class. The teen developed interests in poetry and philosophy and is studying creative writing in college.
"Students often say that Mr. Pereira is the toughest teacher they've ever had, but also their favorite," wrote Alexa C. Warden, the school's director, in her nomination of Pereira.

Backlog

Too much happening, not enough time to write properly about each:

  • Broke my ankle from a fall on a Donner Pass chute.
  • Moved from Palo Alto to Menlo Park.
  • Playing with NumPy for machine learning experiments.
  • Got stuck in Alva Noë's book where he goes off the rails discussing computation.

50 to 1

50 to 1: As Greg says, Tufte would be proud 

(Via tingilinde.)

Superior visual communication. Besides the bus-out-of-wrecked-cars, the real-time car counter nails it.

Mariza

Went to Oakland's beautiful Art Deco Paramount Theatre last night to hear Mariza. Even for those of us who became utterly cynical about fado as it was force-fed to us by the censored radio of an oppressive regime (maybe especially for us), Mariza live breaks the cynicism. She acts the songs, she shares good jokes with the audience and the band, she recreates fado standards by bringing out a rhythmic core that had been swamped by treacle in the “official” renderings, she takes songs and poems that we had consigned to the dustbin of self-indulgent lament for lost glories and loves and revives them in a fierce, self-aware fight to take this culture from the hypocrites that exploited and suffocated it. She may not always succeed (there's too much baggage in 400 years of self-pitying colonialism), but she fights with such intelligence and energy that she completely won this audience, even us cynics. Her band (Angelo Freire, Diogo Clemente, Marino de Freitas, Vicky Marques, Simon James) is an outstanding group of traditional and contemporary musical talent from Portuguese guitar (Angelo Freire) to samba-inspired drums (Vicky Marques).

Maybe only the daughter of a European father and an African mother, growing up in the not-so-subtly racist former colonial capital, could have given its music back to a culture still paralyzed by guilty denial.