Monday, April 30, 2007

Ortovox S1 hits the streets running

Ortovox S1 hits the streets running: It has been a long wait for the Ortovox S1 but it seems to have been worth it. Supplies of this revolutionary avalanche transceiver are slowly making it into the shops. Original slated for the start of the 2005/2006 season the S1 is 18 months late. Development beset by a number of problems and Ortovox's desire to get things right. When we tested a sample of the S1 in January 2006 it seemed to live up to its promise of simplifying the search process, especially when there is more than one victim.

Worth reading this article and Halstead's review if you are a backcountry skier considering a new transceiver. However, some points in the article and in the review make me a bit wary:

Original slated for the start of the 2005/2006 season the S1 is 18 months late. Development beset by a number of problems and Ortovox’ desire to get things right.
It may not sound like much of an issue but in the heat of a search it is easy to outrun a beacon by rapid movements and this can lead to confusion. [...] Halstead reports that the beacon could take 12-15 seconds to catch up. A representative from Ortovox told us that a software update is scheduled for the autumn and that all current Ortovox’s will be recalled.
Only when the S1 was totally maxed out with too many transceiver signals (like 20-30 signals = think LOTS of heli skiers milling around a parking lot) did it really "lock-up," for a longer amount of time.

There is a cost to complexity in SAR gear. It can increase confusion in the field, and obscure bugs can cause major mistakes. A device that locks up in one situation may lock up in others: lock up may indicate a deadlock broken eventually by a time-out, and the possible causes of deadlock are notoriously difficult to diagnose. Long delays followed by product release followed by a mandatory software update raise software engineering process concerns. Additional complexity with the built-in digital compass may also lead to unnecessary bugs. Ortovox is a deservedly famous pioneer in transceiver development, but even very experienced designers are not immune to the digital featuritis Kool-Aid.
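To make the "deadlock broken eventually by a time-out" scenario concrete, here is a minimal sketch in Python. It is purely illustrative: nothing here is known about the S1's firmware, and the function name, worker names, and the 0.2 second timeout are all assumptions. Two workers take two locks in opposite order, which would hang forever without a timeout. With one, the system stalls for the timeout period and then recovers, which from the outside looks like a temporary lock-up.

```python
import threading
import time


def stalled_then_timed_out(timeout=0.2):
    """Two workers take two locks in opposite order; each gives up after `timeout`.

    Hypothetical illustration only, not a model of any real transceiver.
    """
    lock_a, lock_b = threading.Lock(), threading.Lock()
    both_hold_first = threading.Barrier(2)    # both workers now hold their first lock
    both_tried_second = threading.Barrier(2)  # both attempts on the second lock are over
    got_second = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()
            # Without the timeout this acquire would block forever (a true deadlock).
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            if acquired:
                second.release()
            # Keep holding `first` until the other worker has also given up.
            both_tried_second.wait()

    start = time.monotonic()
    threads = [
        threading.Thread(target=worker, args=("w1", lock_a, lock_b)),
        threading.Thread(target=worker, args=("w2", lock_b, lock_a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second, time.monotonic() - start


if __name__ == "__main__":
    outcome, elapsed = stalled_then_timed_out()
    print(outcome, f"stalled for about {elapsed:.2f}s")
```

The point of the sketch is that the visible symptom (a pause, then recovery) says very little about which pair of resources was involved, which is why such bugs are hard to diagnose from field reports.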

I like simpler. In a recent transceiver practice session with a multiple deep-burial scenario, the two fastest and most accurate searchers were an old-timer with an analog transceiver, and me with a first-generation digital Tracker DTS. Searchers with more advanced digital transceivers did not show any advantage.

Sunday, April 29, 2007

Series of accidents in the Alps

Series of accidents in the Alps: Despite snow cover resembling the end of May the conditions remain dangerous. The principal risk is from falls on hard snow but a Finnish ski mountaineer was caught by a slab avalanche today while skiing below les Bosses (4513 m) on the north face of Mont-Blanc.

Scary stuff. I had to deal with icy conditions a couple of weeks ago in the Tahoe backcountry. Ski crampons are your friend on the ascent. It is a pain to stop to put them on, and it has to be done on a safe spot because it requires getting out of one's bindings, but it is totally worth the delay, as I've learned the hard way. Small wind slabs can be deceptively easy to miss in spring conditions, too.

The Coming Patent Apocalypse

The Coming Patent Apocalypse: Many people in computer science believe that patents are problematic. The truth is even worse: the patent system in the US is fundamentally broken in ways that will require much more significant reform than is being considered now.

John makes some excellent points in this entry. I wrote on his comment thread:

Another perverse effect of the current regime is that it slows down progress by encouraging trade secrets at the expense of openness through the patent system. Suppose company X comes up with a way to perform a service more effectively. If they keep it a trade secret as they deploy the improved service, it may be very difficult for others to recognize that the new service infringes on some patent, because some aspects of services can be implemented in many different ways. If, on the other hand, company X files a patent that cites as prior art a patent by company Y, they open themselves to having to pay royalties to Y, or worse.

Weinberger's Miscellany

Weinberger's Miscellany: David Weinberger, one of the smartest of our many smart neighbors, has a new book about books and planets, Staples and Amazon, 20 questions and the periodic table, Carl Linnaeus and Melvil Dewey, data and metadata -- about everything, in other words: Everything is Miscellaneous. [...] It's hard to summarize his theory of everything in one sentence, but this is pretty close: "To get as good at browsing as we are at finding -- and to take full advantage of the digital opportunity -- we have to get rid of the idea that there's a best way of organizing the world." Weinberger is the first to admit this is a mighty tall order. We were organizing the world (and, implicitly, privileging our particular organizing principles) long before Linnaeus and Dewey. As Weinberger explains, we're basically hard-wired to organize all the atoms and planets we see: "We invest so much time in making sure our world isn't miscellaneous in part because disorder is inefficient -- 'Anybody see the gas bill?' -- but also because it feels bad."

I was listening to this podcast during my hard interval workout today, and I didn't even feel that I was working out, although I was reaching the usual high intensity levels. I kept smiling and saying "yes!" to myself. Weinberger made, far better than I have, some of the points I have been making here against hierarchical organizations of information in natural-language processing, the "semantic web," and other contemporary attempts to squeeze networked digital information into a traditional hierarchical organization. One of Weinberger's best observations in the show is how traditional forms of organization derive from physical space: everything has a place, and a place cannot contain two things simultaneously. Two points that Weinberger did not make -- they may be in his book, which I'll be getting:

  • It is plausible that our cognitive organization is evolutionarily tuned to those properties of physical space, and thus categorization and hierarchy appear natural and inevitable to us;
  • these physical constraints affect also how information can be organized on paper.
Even though digital memories obey the same constraints at the bit level, efficient replication and indexing create powerful abstractions that effectively erase the constraints. Forcing digital information organization into the old structures will be as silly as it would have been to force paper to degrade what is written on it to simulate the limitations of our brain memories. There is no reason to believe that search will be most effective if it is forced to obey those old structures; in fact, there are reasons to suspect that categorization for the most part gets in the way in search, as we see from the repeated failure of supposedly superior approaches to search based on hierarchy or clustering.

The show discussed scientific information just briefly, mainly around the upheavals in biological classification as a result of evolution and genomics. The show did not discuss the fact that hierarchical classification through ontology development is the dominant paradigm in extremely expensive international efforts to digitally organize biomedical information. As an observer of several of those efforts, I can't avoid the feeling that they are misguided, in that biological knowledge advances much faster than those systematization efforts, and new discoveries constantly cross-cut existing categories. Just as one example, until recently "gene" referred to a portion of DNA that codes for a protein, but now the study of "miRNA genes", which do not code for proteins, is all the rage.

The alternative of developing search and distributed sharing of tags and other user-constructed metadata seems much more scalable, and also more likely to allow for approximate matches and ranked answers that may reveal unexpected associations that would just be forbidden by a fixed categorization scheme.

The show mentioned PennTags a couple of times. It's nice to hear the home team being recognized!

French presidential election: the press did better than the pollsters

2007: The press did better than the pollsters: Without knowing it, the national press had predicted the result of the first round almost to the decimal point. In any case better than the official polling institutes. That is the extremely surprising result that comes out of my press analysis tool, Presse 2007, which continuously scans the sites of six dailies (Les Échos, Le Figaro, L’Humanité, Libération, Le Monde, Le Parisien) as well as the Marianne 2007 site.

Jean Véronis reports that a simple calculation of the relative numbers of mentions of each candidate in the major dailies predicted the outcome of the recent first round of the French presidential election significantly better than the polls. This is very interesting but not totally surprising to me given recent work by two of our undergraduates that shows that news text analysis can predict the movement of political prediction markets. We hope to prepare this work for publication soon.
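The kind of calculation described, turning relative mention counts into predicted vote shares, can be sketched in a few lines of Python. This is only my guess at the simplest version of the idea, not Véronis's actual tool. The function names are hypothetical and the candidate labels and numbers are invented placeholders, not real counts or election results.

```python
def mention_shares(mention_counts):
    """Convert raw mention counts into percentage shares of all mentions."""
    total = sum(mention_counts.values())
    if total == 0:
        raise ValueError("no mentions counted")
    return {name: 100.0 * n / total for name, n in mention_counts.items()}


def mean_abs_error(predicted, actual):
    """Average absolute gap, in percentage points, between two sets of shares."""
    return sum(abs(predicted[k] - actual[k]) for k in actual) / len(actual)


# Invented placeholder data, for illustration only.
counts = {"Candidate A": 300, "Candidate B": 250, "Candidate C": 180, "Others": 270}
results = {"Candidate A": 31.0, "Candidate B": 26.0, "Candidate C": 18.5, "Others": 24.5}

predicted = mention_shares(counts)
print(predicted)
print("mean absolute error:", mean_abs_error(predicted, results))
```

Running the same error measure on poll numbers would give the comparison between press and pollsters.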

Friday, April 27, 2007

EMNLP papers, tales from the trenches

EMNLP papers, tales from the trenches: I know that this comment will probably not make me many friends, but probably about half of the papers in my area were clear rejects. It seems like some sort of cascaded approach to reviewing might be worth consideration. The goal wouldn't be to reduce the workload for reviewers, but to have them concentrate their time on papers that stand a chance.

This points to a major flaw in the current "selective" conference model in computer science: the reviewing process is all or nothing, with no filtering, no iteration, and no meaningful discussion between authors and editor (no, the reply periods do not count). It's not an issue of process, but simply that there is too much to do in too short a time. In addition, the surge in reviewing demand means that program and area chairs have to scrape the bottom of the reviewing barrel just to get the job done.

The field must get serious about developing alternatives in fast turn-around journals and innovative research validation and ranking methods. If biology can do it, why can't we?

With a recently published paper in PLoS Computational Biology, I had the best reviewing experience as an author that I've ever had in thirty years of submitting papers for publication. Reviews and editor responses were detailed and very useful, allowing for effective revision and resubmission and quick publication, less than nine months from the initial submission to publication, which is not much longer than, say, ACL or NIPS, for a much better result.

Sunday, April 22, 2007

Two nature books

My recent trip to California primed me to pick up two books: The Last Season by Eric Blehm, about Sierra backcountry ranger Randy Morgenson, and The Wild Trees, by Richard Preston, about the search for the tallest redwoods in the temperate rain forest remains of Northern California. The Last Season, like its subject, asked for pauses. It asks tough questions of anyone who tries to balance the mountains with the rest of life. As for The Wild Trees, even though I knew part of the material from The New Yorker, I got it Friday afternoon at the Penn Book Center, and I didn't put it down for long until I finished it at 3 am Saturday.

Saturday, April 21, 2007

Visual perception and skiing

Bill Bialek gave a wonderful Pinkel Lecture at Penn yesterday. Much of the lecture was on simple models of visual categorization and tracking, including a very intriguing use of undirected graphical models (in CS jargon) to explain the joint distribution of spikes in groups of retinal neurons. Earlier in his talk, Bill mentioned in passing that low contrast leads to poor velocity estimation by the neural circuitry responsible for estimating velocity from visual stimuli (Bill's well-known experiments were on flies). This is pretty obvious in hindsight, and there's in fact a long literature on it, but I never realized how it provides an obvious explanation for the difficulty of skiing in poor visibility. Falling snow, fog, or low light all reduce contrast. A typical problem for me in those conditions is overestimating or underestimating my speed. For example, in our tour on Ralston Peak last Saturday, I double-ejected from my bindings when I hit a patch of dense windpack, even though my bindings are set higher than the standard to avert such problems in the backcountry. I was just going faster than I thought. Everyone in the group fell on similar patches. The common factor was very low contrast from fog and wind-driven snow.

Another visual perception factor in skiing comes up in the advice for skiing in trees: look at the gaps, not at the trees. William Warren's models of optical flow as a control signal for object avoidance seem to fit the observations: by looking at the gaps, the average difference in optical flow vectors from the left and right visual fields indicates how fast you are moving away from the safe middle of the gap, since you tend to ski towards what you are looking at. Looking at the tree, the optical flow on the two sides is similar, and you have no lateral obstacle-avoidance signal.

The postmodern web

The postmodern web: Mark Liberman blogs on a critique of the Semantic Web by Malcolm Hyman and Jürgen Renn at the NSF/JISC Repositories Workshop. Mark makes several good and entertaining points, but he left one for me to make. He quotes the following from Hyman and Renn's text:

Web 2.0 is the protestant vision of the Semantic Web: where central authorities have failed in mediating between the real world of data and the transcendental world of meaning, smaller, self-organized groups feel that they are called upon to open direct access to this trancendental world in terms of their own interpretations of the Great Hypertext. The traditional separation between providers/priests and clients/laymen is thus modified in favor of a new social network in which meaning is actually created bottom up. The unrealistic idea of taxonomies inaugurated by top-down meausres is being replaced by the more feasible enterprise of "folksonomies" spread by special interest groups. As their scope remains, however, rather limited and the separation between data and metadata essentially unchallenged, the chances for developing such a social network into a knowledge network fulling [sic] coping with the real world of data are slim.

He then quotes several bullets from their talk outline, including the following:

Moving from servers and browsers to interagents that allow people to interact with information
Replacing browsing and searching with projecting and federating information
Enabling automated federation through an extensible service architecture
Extending current hypertext architecture with granular addressing and enriched links

These goals contradict Hyman and Renn's bottom-up fantasy. Extensible service architectures, federated information, and fine-grained links have all pretty much failed because they just import into software design the challenges of agreeing on a common vocabulary and its computational bindings among multiple parties with differing interests and no central coordination. Successful large-scale open-source projects require a BDFL like Linus Torvalds or Guido van Rossum who is respected and accepted by the great majority as a gatekeeper for requirements, design, implementation standards, and testing.

The problem is that we have no idea of how to reproduce in the computational realm the socially and cognitively self-organizing evolution of human language. Linguistic semantics is grounded in human perception, action, and social interaction. Current computational methods are too brittle, so they require centralized design if they are to work at all. Coarse-grained modularity (operating system, applications, ...) allows some degree of distributed development, but agreement on interfaces is a major bottleneck. For these reasons, shallow, robust methods operating on mostly flat text (search as we know it) are more effective than supposedly more discriminating methods operating on allegedly richer representations. This may change as we develop more robust methods for data integration, but Hyman and Renn's "protestant vision of the Semantic Web" is at present just a cute analogy without substance.

Thursday, April 19, 2007

Tahoe Backcountry

I had arranged for a weekend trip to South Tahoe for Spring touring after research talks at Microsoft Research Silicon Valley and Google. My tour mates were Stefan Riezler and Graham Katz. We might call ourselves Computational Linguists in the Backcountry, which I'll admit has much less cachet than Babes in the Backcountry. We left Menlo Park at 6:30 am Saturday morning and we were ready at the Echo Summit trailhead for an attempt to ski NE Ralston Peak. It was not to be. The forecast snow showers turned into a real storm with high winds and very poor visibility above 8500 ft. We slogged over the frozen lakes and rolling terrain to the planned ascent route and we trudged up in snow and fog until the poor visibility started to make it treacherous. We turned around and skied a mix of ice, dense shallow wind slabs, and sun crust with a veneer of new snow in low visibility. The worst of skiing in those conditions is that a bit of speed makes it easier to ski crud, but speed is not a good idea when rocks, small cliffs, and deep wind pillows are hiding in the murk. Still, the last 50 turns or so below the fog were enjoyable. Stefan's route knowledge allowed us to ski and skate back over rolling terrain and frozen lakes without putting skins on. The stiff wind on our backs helped too.
Although the original forecast was for partly cloudy conditions after the storm, Sunday came in hardly any better. Our target was the South route up Jakes Peak. We got there at 9 am and soon after we started skinning over the patchy snow, we hit bare steeper pitches. We walked up for a while. As we were about to give up for lack of snow, Stefan noticed a more continuous patch. Starting as a nasty steep melt-refreeze mix, the snow became more skinnable, although still challenging in places (a few inches of slippery wet snow over frozen crust), and in less than two hours we had skinned up a steep ravine and a less steep summit ridge to the South summit of Jakes Peak. The clouds did not let us see much and the wind was stiff and cold, so we turned right around and skied the same steep nicely spaced trees that we had ascended. The snow that was a challenge to skin up turned out to be quite skiable and we had a great run for 1500 vertical feet or so to the edge of the snowpack. A few hundred feet down, the clouds suddenly opened up and gave us an eerie view of Lake Tahoe under storm clouds, touched by patches of intense blue light. That provided the only photos of the weekend, but my mediocre photo skills cannot do justice to the magical view. A quick downclimb got us to the car at 1 pm for the drive back to the Bay Area.
There's more to Spring skiing than sunny corn mornings...

Monday, April 16, 2007

Mount Dana

My flight from SFO to PHL routed between Tahoe and Yosemite. From my right-side window, the view South over the High Sierra was beautifully clear, with no trace of the weekend's storm that made our Saturday and Sunday Tahoe ski tours a bit of a challenge. Starting from Mono Lake, it was easy to find Mount Dana with its twin snow-filled couloirs, Dana Couloir on the left, bracketing the summit, and Dana Plateau below with its cliffs down to Ellery Bowl and to Lee Vining Canyon. Several of the plateau's couloirs were also visible. I kept checking on all of the Tioga Pass peaks I could recognize until the scene disappeared behind the plane. I wasn't able to visit Tioga Pass this winter or last, and I heard that the last two winters have not been kind to the winter lodge operation at Tioga Pass Resort. I hope they will recover and that I can visit next winter.

Sunday, April 8, 2007

Is Google the root of all pain?

Is Google the root of all pain?: Washington Post: 'If all of the newspapers in America did not allow Google to steal their content, how profitable would Google be?' Zell said during the question period after his speech. 'Not very.'

Mr. Zell is at one level very confused here, because most searches and the corresponding profitable sponsored links have nothing to do with news. However, at another level, Mr. Zell is making an important point, which is that the current arrangements on the internet do not provide a reliable way of compensating news outlets for their product. However, the problem is not caused by search engines. Any news outlet could avail itself of robots.txt to stop being indexed. The problem is that the news outlets want to be found more than they want to protect their content, leading to an arms race in which outlets that block full access lose popularity compared with those that are open. In theory, a system in which outlets allowed themselves to be indexed but required some compensation for reading anything beyond a headline or snippet could address Mr. Zell's problem. Except that the system would not work if some outlets decided to be fully open, and a peace treaty among outlets to keep everyone's content restricted could well be illegal.
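For readers who have not looked at robots.txt, here is a small sketch of the two policies discussed above, checked with Python's standard `urllib.robotparser` module. The `news.example.com` URLs, the `/headlines/` path, and the `ExampleBot` crawler name are all made up for illustration. Note that robots.txt only asks well-behaved crawlers to stay away: it can express "index headlines only," but it has no way of demanding compensation.

```python
from urllib.robotparser import RobotFileParser

# Policy 1: ask every crawler to stay away from the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Policy 2 (hypothetical site layout): headlines may be crawled, nothing else.
HEADLINES_ONLY = """\
User-agent: *
Allow: /headlines/
Disallow: /
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "http://news.example.com/articles/story.html"
    headline = "http://news.example.com/headlines/today.html"
    print(may_fetch(BLOCK_ALL, "ExampleBot", story))
    print(may_fetch(HEADLINES_ONLY, "ExampleBot", headline))
    print(may_fetch(HEADLINES_ONLY, "ExampleBot", story))
```

Either policy is a few lines of text on the outlet's own server, which is why the real obstacle is the competition among outlets rather than anything the search engines do.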

I can see how Mr. Zell would prefer to just get a kickback from the search engines when they show search results from news outlets. But the search engines don't profit directly from showing organic search results, just from the sponsored results, which may or may not have much connection with the organic results. And search engines already pay some outlets like the AP to be able to show their product in particular ways. It may well be that what Mr. Zell dislikes is that the information market is not placing as much value as he thinks on what his papers add to the wire services. It's an empirical question, which he could test by restricting access to some of his papers.

I feel quite virtuous here, since I subscribe to a physical newspaper that I enjoy reading by the window with my morning espresso. But I use the Web for anything else involving news. Search engines have become very profitable while offering their main services for free, while newspapers are sinking. It could be that the two types of service are very different and what worked for the search engines will not work for newspapers, but it could also be that the newspapers have been clueless. Before deprecating search engines, Mr. Zell might consider learning from them. As Portuguese folk wisdom has it, you don't trap flies with vinegar (“Não é com vinagre que se apanham moscas”).

When is Natural Language useful?

When is Natural Language useful?:
In talking about Powerset and natural language search, I am frequently asked "When is Natural Language search useful?". The idea here is that maybe there are some specific situations where you really want natural language search. My general response is that this is like asking "When is Natural Language useful?" to talk to other people? The very question assumes that there are some particular situations where you want to use natural language, and others where you would prefer to just grunt out a few words.

The answer is obviously: when the interlocutor understands what you are saying. Using complex speech with dogs may satisfy our anthropomorphic urges, but as the famous Far Side cartoon reminds us, it might not achieve all that we hope for.

The question about natural language in search is not whether it would be useful, but whether it would be usefully understood by the search engine. If the search engine is like Ginger, it might be more effective to make that clear to users. There is a steep tradeoff between depth of understanding and robustness in all current computational linguistics methods. In seeking deeper understanding, we may get deeper confusion instead, from a system that is unable to recognize its own confusion. Whether there is a useful point in that tradeoff is an empirical question, not one that can be answered by in-principle arguments.

The Real Reasons Phones Are Kept Off Planes

The Real Reasons Phones Are Kept Off Planes: Mike Elgan argues that the real reason that cell phone calls are not allowed is fear of crowd control problems if calls are allowed during flight. Also, the airlines like keeping passengers ignorant about ground conditions. The two public reasons, both involving interference with other systems, could easily be tested, but neither the FAA nor the FCC manages to do such testing.

The source article descends into utter silliness:

If gadgets can't crash planes, then the ban is costing billions of hours per year of lost productivity by business people who want to work in flight.

What about the billions of hours per year of lost peace, and, yes, productivity, for those of us who prefer to read, write, think without being distracted by inane nattering from the next seat? Any time I take a non-silent car on Amtrak, the great majority of the conversations I am forced to overhear are in my estimation a net loss for productivity. The author mentions but dismisses the issue:

The airlines fear "crowd control" problems if cell phones are allowed in flights. They believe cell phone calls might promote rude behavior and conflict between passengers, which flight attendants would have to deal with. [...] One way to deal with callers bothering noncallers would be to designate sections of each flight where calling is allowed -- like a "smoking section." But the ban is easier.

Yeah, right. Just like the old smoking section.

With the current problems with air travel — obnoxious security, poor service, thirst if you forget to bring water from past security, disgusting food, painfully cramped seats — loss of wireless connectivity feels like a rare benefit, an opportunity to avoid constant nattering and use one's mind.

AI, Language and Symbols

AI, Language and Symbols:
However, when it comes to communicating with other agents in what we perceive to be the real world, we have created an interface that does appear to have all of these nice qualities: symbols, structure, stereotypes and so on are all used to externalize our thoughts and as an input mechanism to grasp the inner workings of our fellow beings. And while it is attractive to believe in the emergence of intelligence via huge data sets and massive but simple processing power, that intelligence will arise from the simplest machines if only we throw enough data at them - the fact of the matter is that much of what we learn as humans we do so by the consumption of structured symbols of various types.

I didn't say nor do I believe that "intelligence will arise from the simplest machines if only we throw enough data at them." However, Matt's comments seem to indicate a distortedly anthropocentric view of intelligence. Anyone who reads experimentally-grounded work on animal cognition and behavior from the last couple of decades will know that our non-linguistic relatives exhibit important aspects of intelligence that are unlikely to be based on (whatever passes for) symbol-mediated computation. Here are a few highlights from my reading:

  • The Number Sense, Stanislas Dehaene
  • Primates and Philosophers, Frans de Waal
  • Toward an Evolutionary Biology of Language, Philip Lieberman

As for "much of what we learn as humans we do so by the consumption of structured symbols of various types," have you counted? We don't know much about the evolutionary emergence of language except for some very tentative and controversial dates, and we certainly do not know which if any critical mutations might have been associated with it. It is at least worth considering that what Matt calls "human" is really "post-writing settled division-of-labor-based." The low reproduction error rate and long-term durability of writing relative to the evanescence of speech and human memory have played an important role in the “symbolization” of our culture. Long arguments and proofs need external memory. Euclid's “Elements” was a book, and Leibniz's formalism follows the explosive development of the printing press. We know much less about what “intelligence” was before writing, let alone before speech. But we do know that “educated” people, even today, have difficulty in attributing equal intelligence to pre-literate cultures or illiterate individuals. That is, Matt's “intelligence” is a loaded term in this debate. It is (Humpty Dumpty comes to mind) what his literate and formalistic assumptions say it is rather than what experimental, operational evidence might indicate.

But how did we get from the genomic data - represented as simple sequence - to the problem of finding patterns in it? That requires all the symbolic, hierarchical structured knowledge: the genetic model.

This is the old standard canard against artificial intelligence: it was merely implanted by the programmer. My point is that the hypotheses generated by these programs were not anticipated by their creators. In science, the highest flattery is “I wish I had thought of that.” These programs “thought” of hypotheses that their creators and others never considered. The “genetic model” is as determinant of the predictions as general relativity and quantum mechanics are determinant of the distribution of mass and energy in our current universe. Maybe totally in an idealized Laplacian universe, but irrelevantly so from the point of view of scientific discovery.

In (partial) answer to Fernando's question - clearly the parallelism of the brain is considerable, but that is not the type of scale that Larry Page is talking about (that is to say, the symbols - or units/mechanism of representation - and operations involved are quite different).

That's another assertion that needs evidence, isn't it? Matt and I can speculate endlessly about what is significant or insignificant in alleged differences between representations and computational models, but the reality is that neither he nor I know nearly enough to decide the issue. Neither does anyone else as far as I know. I repeat that I prefer experimental evidence. And the experimental evidence is on my side, in the sense that almost all the practical progress in artificial intelligence over the last twenty years has been based on improvements in methods and tools for extracting generalizations from data.

Saturday, April 7, 2007

Mountain climbers witness global warming (AP)

Mountain climbers witness global warming (AP): Mountaineers are bringing back firsthand accounts of vanishing glaciers, melting ice routes, crumbling rock formations and flood-prone lakes where glaciers once rose.

It's not just how it affects climbers and skiers, but the longer term consequences in less stable water supplies (both drought and flooding), disasters caused by decreased soil stability in mountain environments, and loss of ecological niches.

My neighborhood just replaced street lights to "increase safety." They used to be 50W, now they are 100W. Whose safety?

Wednesday, April 4, 2007

Magic skis

In February I bought a new pair of Ski Trab Stelvio Freerides, a light, mid-fat, fairly stiff ski for my backcountry tours. I'm awed by their outstanding performance on everything from deep untracked powder in the Monashees to windpack and crust at Copper Mountain to powder, icy groomers, frozen moguls, sloppy crud, and very dense wind deposits at Alta last weekend. These are the lightest, most maneuverable, most forgiving and yet stable and strong-edging skis I've ever used. By far. Observers noted that I was skiing more smoothly, more balanced, faster. It's rare that a new piece of gear is such a great fit. Italian design and craftsmanship at its best.

Monday, April 2, 2007

EMI goes DRM-free on iTunes Store

EMI goes DRM-free on iTunes Store: EMI has announced that the company has gone DRM-free with its entire music catalog on the iTunes Store, and has bumped the sound quality of its files.

This will be an interesting experiment. Their use of AAC can be justified on technical grounds, but it also reduces the attractiveness of unauthorized copying in a mostly MP3 environment.

For me, the relatively low bit rate and the hassles with authorizing and deauthorizing computers when I replace them have been deterrents to buying more from iTMS. This could change if higher-quality DRM-free tracks and albums became available for artists I care about — mostly not from EMI, though.

Sunday, April 1, 2007

Belated trip report

I spent spring break skiing out of a backcountry lodge in the Monashee range of British Columbia. Here's a brief photo report. Brief because I was too busy climbing and skiing to dig my camera out of my backpack, and it was too snowy to keep the camera in a pocket (the lens fogged up). Don't let that deceive you, though. Whether you are an experienced backcountry skier or you just want to try it out, Sol Mountain Lodge is the place for you. Beautiful terrain, deep snow, cozy lodge, excellent food, and most important of all, outstanding guiding by Aaron Cooperman. I came back fitter, a better skier, and hungry for more. I'm planning to go back in early January 2008, when the snow is coldest and lightest.

Why Google's AI Vision Is Wrong (Not)

Why Google's AI Vision Is Wrong:
The central theme of Google's AI is that massive scale, vast data sets and planet-sized computers will, eventually - almost naturally - result in AI. This is a weak 'vision'. The reason it upsets me is that driving for scale of this type sidesteps the fundamental power to generalize. [...] To be honest, when I talk about AI, I really mean: systems that exhibit human-like intelligence (which could be far more powerful in some dimension than a human, but ultimately with a capacity to reason, conjecture, plan and execute). AI, as used by Eric Schmidt, clearly means something more like: a useful tool.
The obvious question to ask Matt is "How do you know?" Where does the power to generalize come from? The hypothesis that it is based on a very large associative memory is at least as credible as any of the alternatives. It's certainly the case that all of the advances in information retrieval, speech recognition, machine translation, image understanding, and genomics of the last twenty years are basically advances in extracting statistical regularities from very large data sets. No other approach has had even a teeny fraction of that impact.
Let me talk a bit more about genomics, because it touches on what we all recognize as the most impressive examples of the “capacity to reason, conjecture”: scientific discovery. Evolution predicts that the genomes of organisms will have similarities and dissimilarities governed by descent with modification and by conservation of genes for valuable traits. Computational biologists developed a variety of statistical pattern-matching methods that discover the specific conserved and mutated elements by comparing the genomes of related species. These methods are able to find generalizations that lead to experimentally testable predictions of, for instance, important conserved regulatory modules. That's just one among many discoveries that would be totally impossible without statistical generalization from huge data sets. These statistical methods are discovering new facts about biological evolution and function. Is that intelligence?
Matt's criticism appears to me archaically essentialist. For him, there must be something in “intelligence” beyond mere statistical pattern discovery, in the same way that, for some, there must be something to life beyond mere blind mutation and natural selection.
I prefer empirical evidence to metaphysical claims. While it is obvious that current statistical generalization methods are very coarse compared with the most refined products of the human intellect, they are already able to amplify that intellect in ways that were unimaginable just a couple of decades ago, and we see no evidence of fundamental roadblocks to further progress. Just, sometimes, failures of imagination.
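
To make the genomics example above a little more concrete, here is a toy sketch of finding conserved stretches by comparing sequences from related species. It is only an illustration under strong assumptions: the sequences are made up, they are assumed to be already aligned to equal length, and the function names, window size, and threshold are hypothetical choices. Real comparative-genomics methods are far more sophisticated than this column-counting heuristic.

```python
from collections import Counter


def column_conservation(aligned):
    """Fraction of sequences sharing the most common symbol in each column.

    Assumes the sequences are pre-aligned; any gap character is simply
    counted as one more symbol.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    scores = []
    for column in zip(*aligned):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(aligned))
    return scores


def conserved_windows(aligned, window=5, threshold=0.9):
    """Start indices of windows whose mean conservation meets the threshold."""
    scores = column_conservation(aligned)
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean >= threshold:
            hits.append(start)
    return hits


# Made-up toy sequences standing in for the same region in three species.
species = [
    "ACGTTGACCA",
    "ACGTTGTCGA",
    "ACGTTGAGTA",
]
candidate_regions = conserved_windows(species, window=5, threshold=0.9)
```

The windows returned in `candidate_regions` are the kind of candidate one would then take to the lab: a statistical regularity in the data turned into an experimentally testable prediction.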