Wednesday, February 18, 2009

Decision by Vetocracy

Decision by Vetocracy: Few would mistake the process of academic paper review for a fair process, but sometimes the unfairness seems particularly striking. (Via Machine Learning (Theory).)

I've also seen quite a few instances of bad, even unprofessional, reviewing in the last few years. However, I don't need to hypothesize a mechanism involving an obsessive vetoing minority to explain the deterioration of conference reviewing. Instead of individually malicious agents, the trends can be explained more globally by system overload. The demand for reviewing services is growing faster than the supply of mature, experienced reviewers. That's simply an effect of the field's demographic pyramid: increasingly large cohorts of relatively inexperienced researchers submit papers to be reviewed by the much smaller cohorts of their seniors. As a consequence, either an increasing number of reviews are assigned to relatively unqualified reviewers, or the senior cohort is overwhelmed and produces lower-quality reviews. Network effects compound this: relatively senior PC members and area chairs know best the researchers in their own and neighboring cohorts, and are more likely to ask them for reviewing favors, increasing the chance that successive versions of the same paper will be assigned to the same reviewer drawn from that relatively small pool, bidding system or not. The problem is exacerbated by the very high peak reviewing load demanded by having a few large conferences where all the reviews have to be done in a month or so. Basically, we have a very congested network causing a lot of retries and lost pa{pers|ckets}.

The standard solution for this problem is that subfields split off and start their own meetings and journals. The new subfields, because they seem risky, attract fewer newcomers to start with, so reviewing quality tends to be higher. Also, subfield founders have a strong sense of ownership and responsibility towards their babies (sometimes too much), so they will work really hard on reviewing and other field-building activities. I saw this pattern when statistical natural language processing started its own series of meetings (such as EMNLP), and also earlier with logic programming and with learning theory.

An interesting question is whether there are ways to scale up an area of research that do not require fission. For instance, if we were to move to open online paper-and-response systems, as Leon Bottou, Yann LeCun and others have suggested, maybe network effects would work for us in bringing the most discussed ideas to the top rather than against us in creating terminal reviewing congestion. Discussants would choose which papers to review, but because they would not be anonymous, torpedoing a paper would collide with social and professional norms. The worst a malicious agent wanting to stay anonymous could do is to arrange for sock puppets to diss a paper, but if the paper is good, many others would jump to its defense, and a mild level of moderation, editorial or distributed, would likely be sufficient to dampen flame wars.

At the very least, a fast turn-around electronic journal for short communications, with mechanisms for supporting material and for commentary, might do better than a conference because unlike a conference, a journal has institutional memory in its persistent editorial board. Such a journal could then organize a highlights (main session) and discussion (posters and workshops) conference based on the previous year's accepted papers. I understand that the VLDB community is seriously considering such a model.

Update: John Langford on his blog notes that my argument above requires superlinear growth. In fact, it just needs a relatively short period of superlinear growth (inflation) such that experienced reviewers are those who came into the field before the end of inflation. Eventually, as growth rates flatten out, the ratio of submissions to experienced reviewers will stabilize, or even decrease if the field loses vitality. I've seen subfields in all stages of this trajectory: initial slow growth, inflation if the field takes off, eventual maturity with stable growth, and finally slowdown. This trajectory is also supported by Gordon Tullock's cranky but very insightful analysis The Organization of Inquiry. I don't have the numbers at hand, unfortunately, but anecdotally I believe that NIPS had an inflationary period in the 90s, but now growth has flattened.
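The inflation argument can be made concrete with a toy model (every parameter here is hypothetical, chosen only for illustration): entering cohorts grow geometrically for ten years and then flatten; submissions come from the newest few cohorts, while experienced reviewers are members of cohorts at least five years old. The quantity of interest is the ratio of submissions to experienced reviewers.

```python
# Toy model of the reviewing-load argument. All parameters are hypothetical:
# 20% annual cohort growth during a 10-year inflationary period, flat after;
# submissions proportional to the three newest cohorts; a researcher counts
# as an experienced reviewer after SENIORITY years in the field.

SENIORITY = 5

def cohort_size(year):
    """Size of the cohort entering the field in a given year."""
    inflation_end = 10
    if year <= inflation_end:
        return 100 * 1.2 ** year     # superlinear (geometric) growth
    return 100 * 1.2 ** inflation_end  # growth flattens out

def load_ratio(year):
    """Submissions per experienced reviewer in a given year."""
    # Recent cohorts generate the submissions...
    submissions = sum(cohort_size(y) for y in range(year - 2, year + 1))
    # ...reviewed by everyone who entered at least SENIORITY years ago.
    reviewers = sum(cohort_size(y) for y in range(0, year - SENIORITY + 1))
    return submissions / reviewers

for year in (6, 10, 14, 20):
    print(year, round(load_ratio(year), 2))
```

Under these assumptions the ratio stays elevated as long as superlinear growth continues, because submissions track the large new cohorts while the reviewer pool lags by the seniority delay; once growth flattens, submissions stop growing but the experienced pool keeps absorbing the big inflationary cohorts, and the ratio falls, matching the stabilize-then-decrease trajectory described above.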
