By John Timmer | Published: October 18, 2007 – 09:01PM CT
The study used Wikipedia as a model open contribution system, drawing its raw data from the French and Dutch versions. It focused on the difference between anonymous users and those with registered accounts; only the latter could hope to cash in contributions for reputation. Because of that potential for accumulating reputation, the researchers hypothesized that registered contributors (whom they termed “zealots”) would put effort into the quality of their contributions, resulting in an increase in quality proportional to the number of contributions.
In contrast, anonymous contributors were expected to fall into two groups. The group with higher rates of contribution would include spammers and vandals, so its quality was expected to be low. But the infrequent anonymous contributors should include what the authors termed “good samaritans”—those who make a rare contribution on a subject of expertise, solely because they care about the topic. The authors suggested that this should result in the anonymous contributors with the fewest edits having very high quality.
Based on their methodology (which I’ll discuss below), this is precisely what they found. Good samaritans with fewer than 100 edits made higher-quality contributions than registered contributors with equal amounts of content. In fact, anonymous contributors with a single edit had the highest quality of any group. But quality declined steadily from there, and the more frequent anonymous contributors were anything but samaritans; their contributions generally didn’t survive editing. In contrast, the zealots’ quality improved in proportion to their contributions, so that those credited with over 100 edits consistently outpaced the quality of equivalent anonymous editors.
Problems of method
The conclusions seem fairly clear and match intuitive expectations well. The methodology of the study, however, calls the whole enterprise into question. For one, anonymous users were identified across contributions by the IP address logged with each edit. In most markets, including the Netherlands, IP addresses are commonly assigned dynamically, meaning that a single address could have been handed to a variety of users during the period studied. This approach would also pool all contributors on a network that shares a single point of access to the Internet, such as a free hotspot or a university.
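To make the identity problem concrete, here’s a minimal sketch of what keying contributors on IP address does to an edit log. The log, field names, and addresses below are all hypothetical illustrations, not data from the study:

```python
from collections import defaultdict

# Hypothetical edit log: (ip_address, actual_person, article_edited).
# None of this comes from the study; it only illustrates how keying
# on IP both merges and splits real contributors.
edits = [
    ("145.10.0.7", "alice", "Tulip"),      # shared university gateway...
    ("145.10.0.7", "bob",   "Rembrandt"),  # ...pools distinct people
    ("82.95.33.2", "carol", "Delft"),      # dynamic lease, day 1
    ("82.95.40.9", "carol", "Vermeer"),    # same person, new lease
]

by_ip = defaultdict(list)
for ip, person, article in edits:
    by_ip[ip].append(person)

for ip, people in by_ip.items():
    print(ip, "->", set(people))
# 145.10.0.7 -> {'alice', 'bob'}   two people counted as one "contributor"
# 82.95.33.2 -> {'carol'}          one person counted as two contributors
# 82.95.40.9 -> {'carol'}
```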
The assessment of quality is also problematic: it was divorced from any semantic content and simply evaluated in terms of the persistence of a user’s contribution at the character level. Thus, radical changes in meaning caused by later edits would still allow a contributor to obtain a high “quality” score. To give a concrete example, inserting the word “not” in a sentence could completely reverse the original meaning while leaving all of the original contribution intact. The authors also recognize that contributions in the form of stubs on obscure topics might survive unaltered indefinitely, inflating the apparent value of single contributions. I’d also note that it’s frequently easier to rewrite existing material than to generate it from scratch, a phenomenon that would artificially inflate quality scores.
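A rough illustration of why character-level persistence can’t stand in for semantic quality. The difflib-based matching and the scoring function here are my assumptions, not the paper’s actual algorithm, which tracks text across full revision histories:

```python
import difflib

def persistence_score(contribution: str, later_revision: str) -> float:
    """Fraction of a contributor's characters that survive into a later
    revision, measured purely at the character level (no semantics).
    Hypothetical sketch, not the study's algorithm."""
    matcher = difflib.SequenceMatcher(None, contribution, later_revision)
    surviving = sum(block.size for block in matcher.get_matching_blocks())
    return surviving / len(contribution) if contribution else 0.0

# The "not" example from above: the meaning reverses, yet every
# original character persists, so the "quality" score stays perfect.
original = "The drug is effective against the virus."
edited = "The drug is not effective against the virus."
print(persistence_score(original, edited))  # prints 1.0
```

Despite the edit reversing the claim, every character of the original contribution survives, so this kind of metric awards it a perfect score.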
Objective ratings of quality are difficult, and it’s hard to fault the authors for attempting to find an easily measured proxy for it. In the absence of correlation with an independent measure, however, it’s not clear that the measurement used actually works as a proxy. Combined with the concerns about anonymous contributor identity, there are enough problems with this study that the original question should probably be considered unanswered, regardless of how intuitively satisfying the results are.