Sunday, July 1, 2018

Jazz wonders

  • Heard Christian McBride's New Jawn Quartet at Stanford last night. I knew it would be good, but it was way above good, it was revelatory. All new music, no pretension, no concession to easy listening, superbly tight. I had heard them at SFJAZZ in 2016, they were good but they did not make this intense impression. Yesterday, Waits was like a fast current in a deep channel, many overlaid rhythms, no derivative splash. Waits and McBride set an intense pace, which Evans and Strickland rode creatively without ever going lax or derivative. Still recovering, like a scary steep ski descent. They'll be recording, album expected Sept-Oct. But best, they are touring widely. Go hear them!

  • Great detailed review of Charles Lloyd and the Marvels + Lucinda Williams's Vanished Gardens.
  • Wonderful, long historical interview with Bill Frisell. Includes a link to a bootleg video of Paul Motian Trio (Motian, Frisell, Joe Lovano) at Jazz em Agosto, Lisboa 1986. Frisell and Lovano look so young!

Monday, June 25, 2018

Photos of my work office small music player

Music SSD in the foreground, player in the middle, DAC (green/black) and amp (blue) in the background. SSD is connected to one of the Pi 2 USB ports. Digital audio goes out through a BNC connector from the 502DAC bolted to the back of the Pi touchscreen and plugged into the Pi hat connector. BNC coax cable connects to the DAC. Left picture shows the main piCorePlayer menu; right picture shows the Now Playing screen, in this case for the first track of Marc Ribot's excellent Silent Movies.

Saturday, June 23, 2018

Building a small standalone digital music player

For the last several years, all my recorded music listening has been from lossless PCM FLAC stored on a Synology NAS. I went through several iterations of Ethernet-based streaming from the NAS, but now I've settled on a Roon server on an Ubuntu Intel NUC that streams to a couple of different Roon endpoints: an Auralic Aries Femto for the living room speaker system, and an Allo USBridge for the study headphone system. Those endpoints connect to DAC>amp>transducer chains.

For work, however, I want my music system there to be standalone for convenience and security. After a bit of exploration, I settled on the following hardware:
  • Pi 2 single-board computer.
  • Adafruit 7" touchscreen.
  • SmartiPi touchscreen stand and Pi case.
  • Pi2 Design 502DAC, to be used as a high-quality S/PDIF source, not as a DAC. A possible alternative would be the Allo DigiOne board, which I did not consider at the time for whatever reason.
  • Samsung 1TB USB-C SSD for music storage.
  • sBooster ECO 5-6v LPS to power the above. Probably overkill, but you definitely need 3A@5V, which is more than the typical USB wall wart.
The player connects with an S/PDIF coax cable to my Soekris dac1541 DAC/amp, which on its own is a very nice source for my MrSpeakers Æon Flow Closed headphones. However, I currently have an extra Neurochrome HP-1 headphone amp, and that sounds even better between the DAC and the headphones than the dac1541's built-in amp.

What remained was to find standalone music player software that would run well on that low-power computer. I started with RuneAudio, which worked but was sluggish, often missed command touches, and sometimes got stuck doing harder chores. Scrolling through my 1TB music library was really annoying. I also tried Volumio, but I could not get it to run stably on my hardware. Eventually, I heard about piCorePlayer on an audio forum as being really lightweight -- it runs from RAM disk -- and decided to give it a try. All three of those Linux-based players are mainly designed to stream music from a separate server or streaming service, but RuneAudio and Volumio can in principle work standalone from a local USB disk.

My adventure was to try to get piCorePlayer to work standalone, even though it is mainly a replacement for the formerly proprietary networked Squeezebox player. I managed to get this to work thanks to advice from kind users of an audio forum as well as a lot of searches for documentation and other forum info. In brief, I eventually succeeded, and the resulting player works really well, scrolling quickly through my big library and allowing me to select the right album for what I'm working on those precious times I'm not in meetings. Here's what I did:
  1. Downloaded piCorePlayer 3.5.0.
  2. After downloading and unzipping, used Etcher on my MacBook Pro and an SD-2-microSD carrier to flash the piCorePlayer image onto a 64GB SanDisk microSD.
  3. Inserted the card into the Pi 2 microSD carrier, reassembled the unit, connected it to my home LAN, DAC, USB SSD, and power.
  4. On boot-up, boot messages on touchscreen are upside-down. Don't worry, it will be solved later.
  5. Connected to the piCorePlayer software running on the Pi with the Web browser (Chrome) on my MacBook Pro. For convenience (it will come in especially handy later), I assigned a fixed IP on my LAN to the Pi, which is really easy to do with the Web interface to the Ubiquiti EdgeRouter that manages my home LAN.
  6. Enabled Beta software on the main piCorePlayer control Web page. This will come in handy later.
  7. Using the piCorePlayer Web interface, installed LMS. This requires resizing the boot partition, which involves several rebooting dialogs, and then actually installing LMS.
  8. Set up your preferred name for the piCorePlayer, and also tell LMS about it on the LMS configuration page. Mine is "Ebnefluh," after a memorable ski tour I did in April (all my music-related machines are named after peaks I visited on skis).
  9. Make sure that the LMS flag to bypass mysqueezebox is set and saved.
  10. Install jivelite package to manage the touchscreen.
  11. Once jivelite is installed, use its configuration page on your browser to adjust screen rotation. In my setup, I had to select the option to flip upside-down.
  12. Use the LMS Web interface to tell LMS where your music library is on the USB drive, and to get it to index it. There's a page that shows indexing progress. If you did step 9 above, you won't be prompted to get a mysqueezebox account.
  13. Wait until your music is indexed. In my case, this stopped somehow, and I had to poke it on the LMS Web interface. But it eventually got done.
  14. Now you can test that you can control play to your DAC from the touchscreen. Enjoy testing with some known tunes!
  15. Just to be sure everything so far is remembered, use the "Backup" option on the piCorePlayer control page to save your current configuration to the microSD card.
  16. The final step is to make your device work standalone. On the beta options on the piCorePlayer control page, click to set a fixed IP address. That gets you to a network configuration Web page. Set DHCP to off, enter your fixed IP address, netmask, default gateway, and default DNS. You should set this to what you have on the LAN you are configuring the device on, so that it talks correctly to it when you bring the device back to it for software updates etc.
  17. Back up your whole configuration again to microSD. This is critical!
  18. Shut down the device, unplug it from power, and let it rest for a while so that its RAM resets. Also unplug it from your network.
  19. Power up the device again. Once it is up, you should be able to access your music and control play from the touchscreen.
  20. Troubleshooting: at step 19, if you see a boot-up message that the device is waiting for network and that stays for a while, outputting periods on the screen, that means that what you did in step 16 did not stick, probably because you forgot to back up the configuration to microSD before powering down. If that is done correctly, the network should come up right away, and LMS will also boot up quickly. If not, after the long failed wait for network, LMS will not spin up and Squeezelite won't find your music.
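
Before switching DHCP off in step 16, it may be worth sanity-checking the static settings, since a gateway that falls outside the device's subnet is an easy slip. Here is a minimal Python sketch to run on any computer (not on the player). The addresses are made-up examples, and `check_static_config` is a hypothetical helper of mine, not something piCorePlayer provides:

```python
import ipaddress


def check_static_config(ip, netmask, gateway):
    """Return a list of problems found in a static IPv4 configuration."""
    problems = []
    # Build the network the device would belong to from its IP and netmask.
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    address = ipaddress.ip_address(ip)
    gateway_address = ipaddress.ip_address(gateway)
    if address == network.network_address:
        problems.append("IP is the network address")
    if address == network.broadcast_address:
        problems.append("IP is the broadcast address")
    if gateway_address not in network:
        problems.append("gateway is outside the device's subnet")
    if gateway_address == address:
        problems.append("gateway and device share the same IP")
    return problems


if __name__ == "__main__":
    # Example values only; substitute the ones for your own LAN.
    print(check_static_config("192.168.1.50", "255.255.255.0", "192.168.1.1"))
```

An empty list means the three values are at least consistent with each other; it says nothing about whether the address is actually free on your LAN.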

Friday, May 25, 2018

Seven days of recorded music explorations

Alessio Bax / Southbank Sinfonia / Simon Over Beethoven: Piano Concerto No. 5; Works for Solo Piano
Angelika Niescier Trio The Berlin Concert
Brad Mehldau Trio Seymour Reads the Constitution!
Cyro Baptista Vira Loucos: Cyro Baptista Plays the Music of Villa Lobos
Gallicantus, Gabriel Crouch Sibylla
Jeff Parker Slight Freedom
Jeff Parker The New Breed
Jerusalem Quartet Debussy: Quatuor, Op. 10; Ravel: Quatuor
Jory Vinikour Bach: Goldberg Variations
Joshua Redman Still Dreaming
Leila Josefowicz / David Robertson / Saint Louis Symphony Orchestra John Adams: Violin Concerto
Marc Sinan/Oguz Buyukberber White
Stephen Schultz / Jory Vinikour J.S. Bach: Sonatas for Flute and Harpsichord
Wil Blades Field Notes

Saturday, June 10, 2017

A (computational) linguistic farce in three acts


I had not blogged for 3 years. Many plausible excuses, but the big reason is that it is easier to dash a tweet or a short incidental social media post than to structure a complex argument, which uses mental resources that I need full time at work. But the argument about deep learning, natural language, publication styles and venues that Yoav Goldberg posted on Medium reminded me of something that one day (not today) I would like to get to, the complicated, sometimes hazy, often contentious history of the science and engineering of language as a computational process (I know, I know, even putting it that way could trigger many social scientists and philosophers, but this is just a blog post, not a treatise).

I call this a farce not in a derogatory way, but for its many misunderstandings and pratfalls, in the best traditions of comedic theatre, opera, and silent movie. Who has not stepped on a rhetorical rake in heated academic discussion may cast the first water balloon. After all, these debates issued from the very serious work of intellectual giants of the 50s and early 60s: Kleene, Shannon, Harris, McCulloch, McCarthy, Minsky, Chomsky, Miller, ... One day I'd love to see a careful, thoughtful intellectual history of the origins of AI in general and of the computational turn in language in particular, but we don't have one, so I'm free to make up my own comedic version.

Act One: The (Weak) Empire of Reason

Much of the work on computational models of language and language processing until the 80s was based on an implicit or explicit hope that relatively simple algorithms would capture much of what mattered. Researchers (including me) created models and algorithms that claimed to capture the “essential” phenomena in a modular, compositional way. Once that was done, practical applications would follow easily, since the nice combinatorics of compositionality would cover the infinitely many ways people express meaning.

That was nice, but there was the nagging problem that none of those models or systems could parse, let alone usefully interpret, most of the language occurring in the wild. Even back then, artificial neural network fans argued that those crisp formal models of language failed because they did not have enough “flex” at their joints. That led to some epic food fights, but the reality is that the NN models and algorithms, and above all the puny computers and datasets we played with back then, could not even match those carefully handcrafted rule systems.

One (temporary) escape from this mismatch between models and actual language was to turn ourselves into formal linguists (I did that too) and argue that we were using computational tools to investigate the core of language, leaving that wild mess of actual language for later decades when we'd have finally dug up the keys to the treasure house. This was a nice detour for both symbolic and neural-network researchers, and it had a not totally unreasonable methodological defense in that, say, physicists also investigated simplified systems (oh, that physics envy!) Of course, this sidestepped the uncomfortable feeling that language, as an evolved biological and social phenomenon, might not have a simple description at all. Incidentally, there's a bit of a parallel here with how biology and biomedical research went on a “simplicity” trip after the discovery of the genetic code (and even later after the sequencing of the human genome), only to keep being foiled by the daunting mess that evolution has left us. Nevertheless, I'd still argue that some of the descriptive models of language developed then still capture the range of certain actual combinatorial possibilities in language at a level of detail that has not been bested. That's mainly a story for another time, except that the lure of simplified settings and models comes back in Act Three.

The field was very small back then. Everyone knew everyone, even those who might despise each other's work and say it loudly in ACL question periods. As a result, a few powerful arbiters of research taste set the tone for each sub-community. Combined with the limited means of research circulation then, that led to small, cohesive cliques. When such a group captured control of research resources (funding, plum academic or industrial roles), as did happen, alternative ideas did not have much room to grow.

Act Two: The Empiricist Invasion or, Who Pays the Piper Calls the Tune

The not insignificant research funding that computational research on language had received from the late 70s to the late 80s, combined with changes in research funding climate (a whole interesting story in itself, but too long and twisted to go into here) created an opening for bold invaders to convince funders that the Emperor of Reason had been committing research in the altogether.

The empiricist invaders were in their way heirs to Shannon, Turing, Kullback, I.J. Good who had been plying an effective if secretive trade at IDA and later at IBM and Bell Labs looking at speech recognition and translation as cryptanalysis problems (The history of the road from Bletchley Park to HMMs to IBM Model 2 is still buried in the murk of not fully declassified materials, but it would be awesome to write — I just found this about the early steps that could be a lot of fun). They convinced funders, especially at DARPA, that the rationalist empire was hollow and that statistical metrics on (supposedly) realistic tasks were needed to drive computational language work to practical success, as had been happening in speech recognition (although by the light of today, that speech recognition progress was less impressive than it seemed then). It did not hurt the campaign that many of the invaders were closer to the DoD in their backgrounds, careers, and outlooks than egghead computational linguists (another story that could be expanded, but might make some uncomfortable). Anyway, I was there in meetings where the empiricist invaders allied with funders increasingly laid down the new rules of the game. Like in the Norman invasion of England, a whole new vocabulary took over quickly with the new aristocracy.

In hindsight, the campaign of 1987-89 and the resulting new order were quite entertaining (even if they did not seem so to the invaded at the time) and brought new cultural devices that were objectively more effective in defining measurable progress, if quite stressful for funding recipients. Personally, I had already started my own journey from somewhat skeptical rationalism to somewhat skeptical empiricism, and would leave the Government-funded research world for the next 12 years, so the conflict was a great opportunity to develop a more distanced view of both the old and the new culture.

The empiricist ascendancy was fortunate (or prescient) in riding the growth of computing resources and text data that also enabled the Web explosion and the flooding of all this work with new resources for funding research, software development, and corpus creation. The metrics religion helped funders sell progress to the holders of the purse strings, and there were real (if not as extensive as sometimes claimed) practical benefits, especially in speech recognition and machine translation. As a result, the research community grew a lot (my top-of-the-head estimate is around 5x from 1990 to 2010).

One curious byproduct of the empiricist ascendancy that is relevant to the present conflict is that measurement became a virtue in itself, sometimes quite independently of what was really being measured. Many empiricist true believers just want the numbers, regardless of whether they correspond to anything relevant to actual language structure and use. Although Penn Treebank metrics are most often brought up for this critique, there are much worse offenders that I will omit in the interests of both not offending by naming without proper arguments, and of not spending my whole weekend on this. In summary, a certain metric fetishism arose that still prevails today, for instance in conference reviewing, with the result that interesting models and observations are dismissed unless they improve one of the blessed metrics. Metrics became publishing gatekeepers, easy to apply without thinking, and promoting a kind of p-hacking culture that demeaned explanation and error analysis. Worst, for a practitioner, was that all the metrics are averages, when large deviations are what really matter if you are responsible for a product that should have a very low chance of doing something really bad.
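
To make the point about averages concrete, here is a small Python sketch with entirely made-up numbers (they come from no real system or benchmark). Two hypothetical systems have the same average error, but only one of them ever does something really bad:

```python
from statistics import mean

# Made-up per-example error scores (0 = perfect, 1 = disastrous) for two
# hypothetical systems; neither comes from a real benchmark.
steady = [0.10] * 10
spiky = [0.0] * 9 + [1.0]


def summarize(errors, bad_threshold=0.5):
    """Return (average error, worst error, fraction of examples above threshold)."""
    bad = sum(1 for e in errors if e > bad_threshold)
    return mean(errors), max(errors), bad / len(errors)


if __name__ == "__main__":
    for name, errors in [("steady", steady), ("spiky", spiky)]:
        avg, worst, bad_rate = summarize(errors)
        print(f"{name}: average={avg:.2f} worst={worst:.2f} bad_rate={bad_rate:.2f}")
```

A leaderboard that reports only the average cannot tell these two apart, while the person responsible for the product cares almost exclusively about the last two columns.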

Which brings us to the final act.

Act Three: The Invaders get Invaded or, The Revenge of the Spherical Cows

Empiricist practice has a deep flaw that is rarely discussed. At its origins, cryptanalysts were working on sources very restricted in content and vocabulary. After all, it was not so likely that Enigma traffic would have much outside the military order of the day, weather, and the like, or go full Jabberwocky. When you get enough token counts for traffic like that, you pretty much know everything you can know. It's what Harris profoundly noted in the differences between technical languages and general language. Popular tasks of the empiricist era, from ATIS to PTB, were similarly restricted (travel, business news, ...). What this means is that typical count-based empiricist methods do much better on their own benchmarks than in real life. Just try to parse the Web (let alone social media or chat) with a PTB parser to see what I mean.

Where a lot of training data can be collected in the wild — most notably, parallel translation corpora — empiricist practice with enough counts limps along the long tail, although it's touch-and-go when the counts get small, as they always do.

Another way to see this is that empiricist dogma was protected from its own demise by a rather convenient choice of evaluations. Those of us who have worked hard to apply these methods to real data know well the struggle with small token counts, and the dismaying realization that fancier statistical methods (such as latent-variable models) are most often a waste of effort because in practical situations, a flat count-based model (or linear models) can do as well as can be hoped with the data at hand. Those of us who thought a bit more about this started to realize that token counting and its variants could not generalize effectively across “similar” tokens. We tried many different recipes to alleviate the problem (e.g. class-based language models), but they were all ineffective or computationally infeasible (I know, I co-authored quite a few papers in that mode, including at least a couple of best papers, which goes to show the limited horizons of program committees).
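
A toy illustration of the generalization problem, as a minimal Python sketch. The corpus is invented, and the estimator is plain unsmoothed maximum likelihood, which is a deliberate simplification and not any particular published model:

```python
from collections import Counter

# Toy corpus, invented for illustration only.
corpus = "the cat sleeps . the dog sleeps . the cat eats .".split()

# Count each token as a context (every token except the last one)
# and count each adjacent pair of tokens.
contexts = Counter(corpus[:-1])
bigrams = Counter(zip(corpus, corpus[1:]))


def bigram_prob(prev, word):
    """Maximum-likelihood estimate of P(word | prev) from raw counts."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]


if __name__ == "__main__":
    for prev, word in [("cat", "sleeps"), ("cat", "eats"), ("dog", "sleeps"), ("dog", "eats")]:
        print(f"P({word} | {prev}) = {bigram_prob(prev, word):.2f}")
```

The counts know that "cat" and "dog" both sleep, but nothing in them lets "dog eats" borrow strength from "cat eats", so that pair gets probability zero. Smoothing and class-based models were attempts to patch exactly this gap.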

It's here that deep learners carrying warming GPUs descended from the Northern wastes to lay siege to the empiricist (bean) counters and their sacred metrics. First language modeling, then machine translation fell, in no little measure thanks to their ability to learn usage and meaning generalizations much better than counting could. The modularity of NN models made it easier to explore model design space. Recurrent gated models (thank you Hochreiter and Schmidhuber!) managed history in a much more flexible and adaptable way than any of the history-counting tricks of the previous two decades. It was a rout.

The excitement of the advance was irresistible. Researchers involved, experiments, and papers grew fast, in my estimate 4x from 2010 to 2017. Publication venues overflowed, and researchers burning ever more fossil fuel with their GPUs (morality tale warning!) turned to arXiv to plant ever more flags on their marches through newly conquered lands (not the most culturally apt of behaviors, it must be admitted).

But was the invasion so glorious? Very few of the standard tasks have the very large training sets of language modeling or translation that large-scale SGD depends on. Some tasks with carefully created training sets, like parsing, showed significant but not as striking gains with deep learning. There are exciting results in transfer learning (such as the zero-shot translation results), but they rely on starting from models trained on a whole lot of data. However, when we get to tasks for which we have only evaluation data, where count-based models can still do decent work (clustering, generative models), deep learning does not yet have a superior answer.

For continuous outputs, GANs have made a lot of progress. At least, the pictures are stunning. But as I discovered when I worked on distributional clustering, stunning is very much in the eye of the beholder. Proxies like the word association tasks so popular in evaluating word embeddings are almost embarrassingly low-discrimination compared with the size of the models being evaluated. Sensible ways of evaluating GANs for text are even scarcer.

In defense of the Northern hordes, the empiricist burn-it-to-the-ground campaign left little standing that could promote a new way of life. Once the famous empiricist redoubts are conquered or at least laid siege to, how does the campaign continue?

Idea! Let's go back to toy problems where we can create the test conditions easily, like the rationalists did back then (even if we don't realize we are imitating them). After all, Atari is not real life, but it still demonstrates remarkable RL progress. Let's make the Ataris of natural language!

But now the rationalists converted to empiricism (with the extra enthusiasm of the convert) complain bitterly. Not fair, Atari is not real life!

Of course it is not. But neither is PTB, nor any of the standard empiricist tasks, which try strenuously to imitate wild language (their funding depends on it!) but really fail, as Harris predicted back in the 1950s. Or even the best of descriptive linguistics, which leaves in the murk all those messy deviations from the nice combinatorics of the descriptive model.


The mysticism of Mozart's The Magic Flute makes me queasy, and honestly the opera is longer than it should be (at least on an uncomfortable concert hall seat). But the music, and the ultimate message! The main protagonists struggle for and eventually reach enlightenment along their different paths. We are very far from Dann ist die Erd' ein Himmelreich, und Sterbliche den Göttern gleich (thank you neural MT for checking my quotation from the original), but we have been struggling long enough in our own ways to recognize the need for coming together with better ways of plotting our progress.

Friday, May 16, 2014


Slowly, I've been trying to get into the video 21st century. I've finally got a Blu-ray video player (Oppo BDP-103), just wired it to our TV and audio system. The only video I could find at home to test it was a DVD of the backcountry ski movie Sanctified. Its opening with words by Bob Athey struck me with surprising force for its old-school feel, even though it's only nine years old. The late Doug Coombs makes an appearance, and many of the words are about keeping the slopes wild, interspersed with the inevitable perfect backcountry powder runs. Maybe it's my frustratingly slow recovery from the return of my twisted lower back after unprecedented (for me) skiing 29 days out of 80 in four countries, three continents, mostly with no or just partial lift help, maybe it is the sadness of missing the usually glorious spring skiing of the Sierra and the Cascades because of my bad back and of the exceptionally dry winter, or maybe it's just nostalgia for the innocent silliness of the defunct TelemarkTips bboard, but this movie makes it all seem simpler, less ambiguous, and makes me long for those days when I had just become good enough on skis to become totally besotted with the mountains and the sport, and managed to enjoy places and conditions that I'd not even bother to consider today.

I want to go skiing, but I have no idea when my back will let me.