The idea of building a better search engine sounds almost laughable on the surface.
After all, isn’t there already a massively successful internet search player with a seemingly insurmountable market share? But to hear Jimmy Wales, co-founder of Wikipedia and chairman of the for-profit wiki site Wikia, describe his vision of a totally transparent social search engine — one built with open-source software and inspired by the collaborative spirit of wikis — you realize that his plan just might work.
Wales’ plan for the Search Wikia project is to put ordinary users in charge of ranking search results. Heavy lifting such as indexing and raw ranking will still be done by machines, but the more nuanced work of deciding how search results are displayed will be completed by humans.
Google, the current King of Search, ranks search results based on the perceived trust of the web community at large — the more links a page receives, the more it’s trusted as an authoritative source of information, and the higher the rank. However, this method is open to tinkering, trickery and hacks, all of which damage the relevancy of results.
If successful, Wales’ project, which launches in early 2007, will be able to filter out such irrelevant results. Operating much the same way as Wales’ Wikipedia, both the software algorithms powering Search Wikia and the changes applied by the community will be made transparent on the project’s website.
Wired News spoke to Jimmy Wales about Search Wikia. We discussed the ins and outs of how the model will likely work, what it will take to build it, and what sorts of criticisms it will face.
Wired News: Can you describe the new search engine in your own words?
Jimmy Wales: The core of the concept is the open-source nature of everything we’re intending to do — making all of the algorithms public, making all of the data public and trying to achieve the maximum possible transparency. Developers, users, or anyone who wants to can come and see how we’re doing things and give us advice and information about how to make things better.
Additionally, we want to bring in some of the concepts of the wiki model — building a genuine community for discussion and debate to add that human element to the project.
I mention “community” to distinguish us as something different. A lot of times, when people talk about these kinds of (projects), they’re not thinking about communities. They’re thinking about users randomly voting, and that action turning into something larger. I really don’t like the term “crowdsourcing.” We’re really more about getting lots of people engaged in conversations about how things should be done.
WN: How are the communities going to be managed?
Wales: I don’t know! (laughter) If you asked me how the Wikipedia community is managed, I wouldn’t know the answer to that, either. I don’t think it makes sense to manage a community.
It’s about building a space where good people can come in and manage themselves and manage each other. They can have a distinct and clear purpose — a moral purpose — that unites people and brings them together to do something useful.
WN: How will the human-powered ranking element work?
Wales: We don’t know. That’s something that’s really very open-ended at this moment. It’s really up to the community, and I suspect that there won’t be a one-size-fits-all answer. It will depend on the topic and the type of search being conducted.
One of the things that made Wikipedia successful was a really strong avoidance of a priori thinking about exactly “how.” We all have a pretty good intuitive sense of what a good search result is. A variety of different factors make a search result “good,” qualitatively speaking. How we get those kinds of results for the most possible searches depends on a lot of factors.
A lot of the earlier social search projects fell apart because they were committed a priori to some very specific concept of how it should work. When that worked in some cases but not others, they were too stuck in one mold rather than seeing that a variety of approaches depending on the particular topic is really the way to do it.
WN: I’m envisioning that Search Wikia will incorporate some sort of voting system, and that users will be able to adjust and rank lists of results. Is this the case?
Wales: Yes, but how exactly and under what circumstances that would work is really an empirical question that we’ll experiment with. At Wikipedia and in the wiki world, one of the things we’ve always pushed hard against is voting. Voting is usually not the best way to get a correct answer by consensus. Voting can be gamed, it can be played with. It’s a crutch of a tool that you can use when you don’t have anything better to use. Sometimes, there is no better way. You have to say, “We’ve tried to get a consensus and we couldn’t, so we took a vote.”
In general, envisioning some sort of pre-built algorithm for counting people’s votes is just not a good idea.
WN: Speaking of gaming, what methodologies do you think Search Wikia will employ to fight gaming?
Wales: I think the most important thing to use to fight against gaming is genuine human community. Those kinds of gaming behaviors pop up when there is an algorithm that works in some mechanical way, and then people find a way to exploit it. It’s pretty hard to do that within a community of people who know each other. Basically, if you’re being a jerk, they’ll tell you to knock it off and you’ll be blocked from the site. It’s pretty simple for humans to see through that sort of thing. The real way to fight it is to have a group of people who trust each other, with that trust having been built over a period of time.
WN: Will there be some sort of validation that happens when results are ranked by users? Will knowledgeable contributors get the chance to vet changes?
Wales: Yes. The keys of good design here have to do with transparency — everybody can see what everyone else has done. The communities will have the ability to effect and modify changes as they see fit.
WN: What forms of open-source software are you applying to this search project, and why do you think those would be more successful than proprietary search software?
Wales: Here’s the main thing. If we publish all the software — and we’ll be starting with Lucene and Nutch, which are these open-source projects that are out there and already quite good — and do all of our modifications transparently in public, then other programmers can come and check the code. If you see things that aren’t working well, you can contribute. People who are coders can contribute in one way, and ordinary people using the site can also contribute in other ways.
It’s mostly about the trust that you get from that transparency. You can see for yourself, if you choose to dig into it, how things are ranked and why certain results are ranked the way they are. You can also choose to download the whole thing and do tests or tweak it to make it better in certain areas. That kind of transparency helps if you see a problem with search in some area that you care about, like some technical field, for example. There’s no good way for you to go and tell Google that their search is broken in this area, or that they need to disambiguate these terms — or whatever.
By having an up-front commitment to transparency, I think you can do that.
WN: One of the key arguments in favor of a new search model is that traditional search engines like Google are subjected to spam more and more often. How can a wiki-powered search engine better fight search spam?
Wales: Again, I think it’s that human element. Humans can recognize that a domain is not returning good results, and if you have a good community of people to discuss it, you can just kick them out of the search engine. It seems pretty simple to me — it’s an editorial judgment. You just have to have a broad base of people who can do that.
WN: How are you going to build this broad base? Will there be an outreach, or are you expecting people to just come to you?
Wales: I think people will come. If we’re doing interesting work and people find it fun, then people will come.
WN: When do you expect to see Search Wikia up and running?
Wales: The project to build the community to build the search engine is launching in the first quarter of 2007, not the search engine itself. We may have something up pretty quickly, maybe some sort of demo or test for people to start playing with. But we don’t want to build up expectations that people can come in three months and check out this Google-killing search engine that we’ve written from scratch. It’s not going to happen that fast.
What we want to do now is get the community going and get the transparent algorithms going so we can start the real work. It’s going to be a couple of years before this really turns into something interesting.