By John Timmer | Published: February 15, 2007 – 12:30PM CT
Its first effort (currently under testing) is called WikiProteins and targets the biology community. In contrast to other efforts, WikiProteins will have useful and accurate content from the second it opens its virtual doors. That's because its creator has imported and data-mined content from a number of the world's leading databases of biological information, including PubMed, UniProt, and the National Library of Medicine. Thus, the entry for every gene will already contain information on topics such as its functional domains, areas of expression, and publications that mention it. According to WikiProteins, the combined databases yielded 2 million factual relationships for mining, which in turn produced over 5 billion relationship pairs.
All of that is before anyone has been given a chance to edit anything. In contrast to some of the other efforts at improving wiki content, WikiProteins will allow anyone to add or edit content, but any edit will set off a chain of events with some interesting consequences. Let's say that you discover your favorite gene (YFG) helps nerve cells migrate in the spinal cord. When you add that information to WikiProteins, a background process will immediately index its content, intelligently recognizing key terms regardless of their exact phrasing. Even though the edit occurs at the entry for YFG, the WikiProteins entries on migration, neurons, and the spinal cord will all be updated.
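The cross-updating behavior described above can be sketched roughly as follows. This is purely illustrative, not WikiProteins' actual code; the concept list, data structures, and the naive substring matching are all assumptions standing in for the system's real semantic indexer.

```python
# Hypothetical sketch: when an edit to one entry mentions other known
# concepts, the new fact is cross-indexed into their entries as well.

KNOWN_CONCEPTS = {"migration", "neuron", "spinal cord"}

# entry name -> list of facts recorded under it
entries = {concept: [] for concept in KNOWN_CONCEPTS}

def add_fact(entry, text):
    """Record a fact on one entry, then index it under every known
    concept the text mentions (a stand-in for semantic recognition)."""
    entries.setdefault(entry, []).append(text)
    for concept in KNOWN_CONCEPTS:
        if concept in text.lower():
            entries[concept].append(f"(via {entry}) {text}")

# Editing only the YFG entry also updates migration, neuron, and
# spinal cord, as in the article's example.
add_fact("YFG", "YFG helps neuron migration in the spinal cord")
```

After the single edit to YFG, the `migration`, `neuron`, and `spinal cord` entries each carry a cross-reference back to it, without anyone touching those pages directly.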
As with Wikipedia, users can sign up for alerts when content changes, but a key difference comes in how these alerts are managed. Instead of tying them to changes in a specific page, WikiProteins will send out an alert to anyone interested in a topic, no matter where within the system those changes took place. In the example above, anyone interested in migration will receive an alert of the changes to the YFG page, even if they’d never heard of the gene previously. Not only is this potentially more useful to users, but it should improve the quality of content, as the community of researchers interested in migration can police content they might never be aware of in a standard wiki system.
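The difference between page-based and concept-based alerts can be made concrete with a small sketch. Again, this is a hypothetical illustration under assumed names and data structures, not the actual WikiProteins alert system.

```python
# Hypothetical sketch of concept-based alerts: users subscribe to a
# topic rather than a page, so an edit anywhere in the system that
# touches that topic reaches them.

from collections import defaultdict

subscriptions = defaultdict(set)  # concept -> set of subscriber names

def subscribe(user, concept):
    subscriptions[concept].add(user)

def notify_on_edit(page, concepts_touched):
    """Return everyone who should be alerted for an edit on `page`,
    based on the concepts it touches, not on the page itself."""
    alerted = set()
    for concept in concepts_touched:
        alerted |= subscriptions.get(concept, set())
    return alerted

subscribe("alice", "migration")

# Alice never subscribed to the YFG page, yet an edit there that
# touches "migration" still reaches her.
print(notify_on_edit("YFG", {"migration", "neuron"}))  # {'alice'}
```

In a page-based system, Alice would see nothing unless she had already watchlisted YFG; here, the community interested in migration polices edits wherever they occur.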
Of course, it’s also possible that such alerts may be one more item in the already-overloaded inboxes of most biologists. Annotated and curated information has a long history in biology, but it tends to come in one of two forms: special, one-time projects done by researchers, such as the annotation work performed when a genome is completed, or long-term updating performed by full-time curation staff, such as those who maintain projects like WormBase and FlyBase. It’s not clear whether the most knowledgeable researchers will devote the energy to the sort of long-term updating that WikiProteins will require to maintain its quality.
It’s also not clear whether the WikiProteins model can be applied to areas outside the sciences, which may not have equivalent amounts of information already in the public domain. Part of the appeal of WikiProteins is that it will provide valuable information from day one and offer a single interface to multiple sources of data. Without that sort of backing information to provide a foundation, other fields might find that the WikiProfessional model doesn’t work.