Great strides made in search engine privacy, says report

By Jacqui Cheng | Published: August 09, 2007 – 11:27AM CT

“Privacy” is the name of the game among US search engines these days, and the Center for Democracy and Technology (CDT) is pleased with the progress that has been made so far. In a report released yesterday, titled “Search Privacy Practices: A Work in Progress” (PDF), the CDT outlined some of the changes that the five major search engines have made in order to be more conscious of privacy concerns. The organization also pointed out that the major search engines are beginning to aggressively compete with each other in order to provide the “best” privacy protections for their users.

“We hope this signals the emergence of a new competitive marketplace for privacy,” said CDT president Leslie Harris in a statement. “By themselves, these recent changes represent only a small step toward providing users the full range of privacy protections they need and deserve, but if this competitive push continues it can only stand to benefit consumers.”

The report reiterated many of the recent search policy changes that have made headlines in the last several months. In June, Google agreed to anonymize its search records after 18 months instead of 24 (or, before that, never). That announcement was followed in July by one from Ask.com, which said that it would also anonymize its data after 18 months, and then by one from Microsoft just days later, also for 18 months. Both AOL and Yahoo! have also agreed to shorten the length of time they keep records around, undercutting the others by anonymizing records after just 13 months.

The report also cited Ask.com’s new AskEraser tool as offering a level of user control that the others do not. AskEraser is a preference that users can set on the site, ensuring that absolutely no search records will be retained for that user past a few hours. CDT praises Ask.com for AskEraser and points out that while the others offer options to their users to extend the length of time their search records are stored, no others currently allow users to choose not to have records retained. CDT recommends that other search engines “continue to work towards providing controls that allow users to not only extend but also limit the information stored about them.”

The CDT provides other recommendations as well. While the organization acknowledges that some search engines have legitimate reasons for keeping data around for advertising purposes, it says that those companies need to store the data securely (hello, AOL) and provide notice to their users about what is being stored and for how long. The CDT also says that the search engines should work together to promote privacy protections “across the board” with smaller partners.

Despite the progress that has been made, however, the CDT still feels that there is a need for stronger privacy legislation. “No amount of self-regulation in the search privacy space can replace the need for a comprehensive federal privacy law to protect consumers from bad actors,” the report says. “With consumers sharing more data than ever before online, the time has come to harmonize our nation’s privacy laws into a simple, flexible framework.”

Wikipedia founder to create user-driven search engine

By Ryan Paul | Published: December 25, 2006 – 05:51PM CT

Wikia, a for-profit corporation created by Wikipedia founder Jimmy Wales, is preparing to launch a search engine that will leverage the user-driven model that has contributed to the massive success of Wikipedia.

Wikia plans to launch the search engine with financial backing from online retailer Amazon and a handful of other technology companies. (Some sites are erroneously reporting that Amazon is involved in the development process, but at this point it is only providing financial resources.) Wales hopes to generate profit from the service with advertising. In an interview with Times Online, Wales says that “the revenue model for search is advertising,” a truism demonstrated by competitors Google and Yahoo. Does Wikia have what it takes to beat the best at their own game? Wales is hoping that the reputation of Wikipedia and the transparency of the user-driven approach will be enough to attract users.

According to Wales, conventional search engine ranking algorithms lack the efficacy of human intervention. “Essentially, if you consider one of the basic tasks of a search engine, it is to make a decision: ‘this page is good, this page sucks’,” says Wales. “Computers are notoriously bad at making such judgments, so algorithmic search has to go about it in a roundabout way.” Wales also complains about poor results from mainstream search engines, commenting: “Google is very good at many types of search, but in many instances it produces nothing but spam and useless crap.”

Although many consider Wikipedia to be a useful tool, Wales himself is one of many who insist that the web-based community encyclopedia shouldn’t be treated as an authoritative source. The quality and accuracy of Wikipedia content has been questioned on numerous occasions and the site has stirred up controversy more than a few times in the past. Most Wikipedia contributors have seen edit wars and outright manipulation transpire even within articles that don’t address controversial topics. A prank by television comedian Stephen Colbert, for instance, led to mass vandalism earlier this year. When one considers the competitive advantages of high search engine placement and the growing number of search engine “optimization” firms that specialize in improving a site’s page rank, one begins to wonder how Wikia plans to prevent the system from being exploited.

The commercial nature of the new service could also potentially deter users. It is worth noting that Wikipedia users overwhelmingly rejected the use of advertising on Wikipedia when it was suggested as a potential means of funding future growth for the site. Users may be reluctant to contribute to the betterment of a commercial site that may end up being bought by a bigger company. Consider, for example, the tragic death of TV Tome, a comprehensive community-driven television content guide that was eventually bought by CNET and transformed into a garish, excessively commercialized Web 2.0 monstrosity of significantly less value to users.

Even commercially successful user-driven web services have challenges of their own. Digg.com, for instance, battles spammers who attempt to mass-“vote” content onto the front page. At the other end of the spectrum, the most popular items on Digg are typically bizarre novelties; of the top fifteen Digg stories from the last 30 days, four relate to Digg itself, four relate to the tragic death of James Kim, and the rest are mostly humor. My point is that popular content is not synonymous with useful or informative content (no, discovering YouParkLikeAnAsshole.com on Digg hasn’t tangibly improved my life in any way), so a search engine that ranks pages by popularity rather than some measure of value (Google’s algorithm is an attempt at this) may produce useless results.

A user-driven search engine service has a lot of potential if it is done correctly. Despite all of its problems, Wikipedia is still an extremely valuable resource and a site that I personally visit an average of three to five times a day. To compete with Google, Wikia will have to keep the advertising simple and focus on making Wikipedia’s neutral-point-of-view policy work for search rankings. If that happens, Google might have some real competition on its hands.

Wikia acquires Grub distributed search indexing system

By Ryan Paul | Published: July 30, 2007 – 08:17AM CT

Wikia, the company created by Wikipedia founder Jimmy Wales, has acquired the Grub distributed indexing system from LookSmart and is preparing to distribute Grub’s code under an open-source license. Wikia plans to use Grub for its user-driven search engine, which is still under development.

Originally created in 2000, Grub leverages the distributed computing model to crawl the web and index pages. Users install a specialized client application on their computer, which then automatically performs indexing while idle and transmits page data back to a centralized repository. In this manner, volunteers will contribute the raw computing power that performs the indexing.
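As a rough illustration of the model described above, the following Python sketch simulates a central repository that hands out URLs and volunteer clients that index pages only while their machine is idle. It is not Grub's actual code or protocol: the names `Repository`, `Client`, `FAKE_WEB`, and the `is_idle` callback are all hypothetical, and the "web" is an in-memory dictionary rather than real HTTP fetches.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical stand-in for the web (URL -> page text).
# A real client would fetch pages over HTTP instead.
FAKE_WEB = {
    "http://example.com/a": "open source search engine",
    "http://example.com/b": "distributed search crawler",
}


@dataclass
class Repository:
    """Centralized repository: hands out URLs and merges client results."""
    pending: deque
    index: dict = field(default_factory=dict)  # term -> set of URLs

    def next_url(self):
        # Return the next URL to crawl, or None when the queue is empty.
        return self.pending.popleft() if self.pending else None

    def submit(self, url, terms):
        # Record that each term appears on the given page.
        for term in terms:
            self.index.setdefault(term, set()).add(url)


class Client:
    """Volunteer client: does indexing work only while the machine is idle."""

    def __init__(self, repo, is_idle):
        self.repo = repo
        self.is_idle = is_idle  # assumed callback reporting idle state

    def run_once(self):
        """Process one URL if idle; return True if any work was done."""
        if not self.is_idle():
            return False
        url = self.repo.next_url()
        if url is None:
            return False
        text = FAKE_WEB.get(url, "")
        self.repo.submit(url, set(text.lower().split()))
        return True


repo = Repository(pending=deque(FAKE_WEB))
clients = [Client(repo, is_idle=lambda: True) for _ in range(2)]

# Let every client take a turn until no client has work left.
while any([c.run_once() for c in clients]):
    pass

print(sorted(repo.index["search"]))
```

The point of the sketch is the division of labor: the repository only coordinates and stores results, while the crawling and term extraction happen on the volunteers' machines.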

Wikia is resurrecting Grub as an open source project and hopes to work with the open source software community to create ports of the Grub client—which currently only runs on Windows—to other operating systems. Wikia hopes that the modular nature of Grub and the availability of source code will make it possible for users to add features and help improve the system’s performance.

In addition to leveraging volunteer computing power for automated indexing, Wikia’s search engine will also attempt to take advantage of human power for index editing and refinement. According to Wales, users of the Wikia search engine will be involved in adding and removing links, removing spam, and policing other users, much as in the participatory model used by Wikipedia today.

“The desire to collaborate and support a transparent and open platform for search is clearly deeply exciting to both open source and businesses,” said Wales in a statement. “Look for other exciting announcements in the coming months as we collectively work to free the judgment of information from invisible rules inside an algorithmic black box.”

Wikia’s search engine isn’t yet available for use, but the project’s mission is articulated on the Wikia search page. With goals like transparency, community, quality, privacy, and interoperability, Wikia’s search service seems promising at first glance, but despite the potential value, there are many problems that the company will face when the search engine launches.

Search engine ranking has significant financial implications for many companies, so it’s likely that Wikia’s user-driven search engine will face constant attempts at manipulation. Keeping the spammers and search engine optimization hackers at bay is sure to be a taxing endeavor. Considering the vehemence with which Wikipedia users have traditionally opposed using ads rather than donations to fund Wikipedia, it’s not entirely clear that an ad-based commercial project like Wikia’s search engine will attract the same degree of user involvement.

Distributed computing is a highly unusual approach to indexing, but it’s also consistent with Wikia’s participatory model. Regardless of whether or not Wikia’s search engine succeeds, the company’s willingness to experiment with unconventional approaches could spur innovation and change the landscape of the search engine market.
