
Posts Tagged ‘search’

Deconstructing Real Google Searches: Why Powerset Matters

Written by Guest Author / January 9, 2008 1:07 AM / 13 Comments


This is a guest post by Nitin Karandikar, author of the Software Abstractions blog.

Recently I was looking at the log files for my blog, as I regularly do, and I was suddenly struck by the variety of search queries in Google from which users were being referred to my posts. I write often about the different varieties of search – including vertical search, parametric search, semantic search, and so on – so users with queries about search often land on my blog. But do they always find what they’re looking for?

All the major search engines currently rely on the proximity of keywords and search terms to match results. But that approach can be misleading, causing the search engine to systematically produce incorrect results under certain conditions.

To demonstrate, let us take a look at three general use cases.

[Note: The examples given below are all drawn from Google. To be fair, all the major search engines use similar algorithms, and all suffer from similar problems. For its part, Google handles billions of queries every day, usually very competently. As the reigning market leader, though, Google is the obvious target – it goes with the territory!]

1. Difficulty in Finding Long Tail Results

Take Britney Spears. Given the current popularity of articles, news, pictures, and videos of the superstar singer, the results for practically any query with the word “spears” in it will be loaded with matches about her – especially if the search involves television or entertainment in any way.

Let’s say you’re watching the movie Zulu and you start wondering what material the large spears that all the extras are waving about are made of. So, you go to Google and type in “movie spears material” – this is an obviously insufficient description, as the screen shot below shows.

What happens if you expand the query further – say, “what are movie spears made out of?” Does it help?

The general issue here is that articles about very popular subjects accumulate high levels of PageRank and then totally overwhelm long tail results. This makes it very difficult for a user to find information about unusual topics that happen to lie near these subjects (at least based on keywords).

2. Keyword Ordering

Since the major search engines focus only on the proximity of keywords without context, a user search that’s similar to a popular concept gets swamped with those results, even if the order of keywords in the query has been reversed. For example, a tragic occurrence that’s common in modern life is that of a bicycle getting hit by a car. Much less common is the possibility of a car getting hit by a bicycle, although it does happen. How would you search for the latter? Try typing “car hit by bicycle” into Google; here’s a screen shot of what you get. [Note the third result, which is actually relevant to this search!]
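To see why proximity-only matching can’t tell these two queries apart, here’s a tiny sketch of a bag-of-words relevance score in Python. It is purely illustrative – no real engine scores documents this crudely – but it shows the core problem: both orderings of the query produce exactly the same score against a story about the common event.

from collections import Counter

def bag_of_words_score(query: str, document: str) -> int:
    """Toy relevance score: count how many query terms appear in the
    document, ignoring word order and grammatical role entirely."""
    stopwords = {"by", "a", "the", "was"}
    query_terms = [w for w in query.lower().split() if w not in stopwords]
    doc_counts = Counter(document.lower().split())
    return sum(doc_counts[term] for term in query_terms)

document = "A cyclist was injured when a bicycle was hit by a car on Main Street"

# Both orderings of the query get an identical score against this document,
# even though they describe opposite events.
print(bag_of_words_score("car hit by bicycle", document))   # 3
print(bag_of_words_score("bicycle hit by car", document))   # 3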

3. Keyword Relationships

Since the major search engines focus only on the keywords in the search phrase, all sense of the relationship between the search terms is lost. For example, users commonly change the meaning of search terms by using negations and prepositions; it is also fairly common to look for the less common members of a set.

This takes us into the realm of natural language processing (NLP). Without NLP, the nuances of these query modifications are totally invisible to the search algorithms.

For example, a query such as “Famous science fiction writers other than Isaac Asimov” is doomed to failure. A screen shot of this search in Google is presented below. Most of the returned results are about Isaac Asimov, even when the user is explicitly trying to exclude him from the list of authors found.

All of the searches shown above look like gimmicks – queries designed intentionally to mislead Google’s search algorithms. And in a sense, they are; these specific queries can be easily fixed by tweaking the search engine. Nevertheless, they do point to a real need: the value of understanding the meaning behind both the query and the content indexed.

Semantic Search

That’s where the concept of semantic search comes in. I attended a media event earlier this year at stealth search startup Powerset (see: Powerset is Not a Google-killer!), at which they showcased a live demo of their search engine, currently in closed alpha, that highlighted solutions to exactly this type of issue.

For example, type “What was said about Jesus” into a major search engine, and you usually get a whole list of results that consist of the teachings of Jesus; this means that the search engine entirely missed the concepts of passive voice and “about.” The Powerset results, on the other hand, were consistently on target (for the demo, anyway!).

In other words, when you look at just the keywords in the query, you don’t really understand what the user is looking for; by looking at them within context, by taking into account the qualifiers, the prepositions, the negatives, and other such nuances, you can create a semantic graph of the query. The same case can be made for semantic parsing of the content indexed. Put the two together, as Powerset does, and you can get a much better feel for relevance of results.
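As a rough illustration of what looking at keywords in context can mean in practice, here is a small sketch using the open-source spaCy library (my choice for the example; Powerset has said nothing about its internal tooling, which is certainly far more sophisticated). A plain dependency parse already distinguishes who did what to whom, the very information a keyword match throws away.

# Assumes spaCy is installed (pip install spacy) along with the small
# English model (python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")

for query in ["car hit by bicycle", "bicycle hit by car"]:
    doc = nlp(query)
    print(query)
    # The passive subject and the agent swap places between the two queries,
    # even though the bag of keywords is identical.
    for token in doc:
        print(f"  {token.text:<8} {token.dep_:<10} head: {token.head.text}")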

What about Google? I’m sure the smart folks in Google’s search-quality team are busily working on this problem as well. I look forward to the time when the major search engines handle long tail queries more accurately and make search a better experience for all of us.

Update: for an expanded version of this article with real-life user queries, see my blog.


Yahoo to Enable Custom Semantic Search Engines

Written by Marshall Kirkpatrick / February 11, 2009 9:14 AM / 2 Comments


Yahoo is bringing together two of its most interesting projects today, Yahoo BOSS (Build Your Own Search Service) and SearchMonkey, its semantic indexing and search result enhancement service. There were a number of different parts of the announcement – but the core of the story is simple.

Developers will now be able to build their own search engines using the Yahoo! index and search processing infrastructure via BOSS and include the semantic markup added to pages in both results parsing and the display of those results. There’s considerable potential here for some really dazzling results.
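To give a sense of what building on BOSS looks like in practice, here is a rough sketch of a BOSS v1 web-search call in Python. The endpoint, parameters, and response shape below are reconstructed from memory of the v1 documentation and should be treated as assumptions – check Yahoo’s developer site for the authoritative interface and terms.

# Hedged sketch of a Yahoo BOSS v1 web search call; endpoint, parameters,
# and response keys are assumptions based on recollection of the v1 docs.
import json
import urllib.parse
import urllib.request

APP_ID = "YOUR_BOSS_APP_ID"  # hypothetical placeholder for a real key

def boss_web_search(query, count=10):
    url = (
        "http://boss.yahooapis.com/ysearch/web/v1/"
        + urllib.parse.quote(query)
        + f"?appid={APP_ID}&format=json&count={count}"
    )
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    # The v1 JSON response nested hits under ysearchresponse/resultset_web.
    return data["ysearchresponse"]["resultset_web"]

for hit in boss_web_search("semantic search"):
    print(hit["title"], hit["url"])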

We wrote about the genesis of SearchMonkey here this spring; it’s an incredibly ambitious project. The end result is richer search results, where additional dynamic data from marked-up fields can be displayed on the search results page itself. So searching for a movie will show not just web pages associated with that movie, but additional details from those pages, like movie ratings, stars, etc. There are all kinds of possibilities for all kinds of data.

Is anyone using Yahoo! BOSS yet? Anyone who will be able to leverage Search Monkey for a better experience right away? Yahoo is encouraging developers to tag their projects bossmashup in Delicious. As you can see for yourself, there are a number of interesting proofs of concept there but not a whole lot of products. Of the products that are there, very few seem terribly compelling to us so far.

We must admit that the most compelling BOSS implementation so far is over at the site of our competitors TechCrunch. Their new blog network search implementation of BOSS is beautiful – you can see easily, for example, that TechCrunch network blogs have used the word ReadWriteWeb 7 times in the last 6 months. (In case you were wondering.)

Speaking of TechCrunch, that site’s Mark Hendrickson covered the Yahoo BOSS/SearchMonkey announcement today as well, and having worked closely on the implementation there, he’s got an interesting perspective on it. He points out that the new pricing model, free up to 10,000 queries a day, will likely only impact a handful of big sites – not BOSS add-ons like TechCrunch search or smaller projects.

The other interesting part of the announcement is that BOSS developers will now be allowed to use third-party ads on their pages leveraging BOSS – not just Yahoo ads. That’s hopeful.

Can Yahoo do it? Can these two projects, brought together, lead to awesome search mashups all over the web? We’ve had very high hopes in the past. Now the proof will be in the pudding.


How Vulnerable is Google on Search?

Written by Marshall Kirkpatrick / February 21, 2008 10:45 AM / 22 Comments


A new wrinkle in the search landscape emerged this morning with the announcement that Ask.com is now offering Compete traffic stats inline for the sites on results pages. (Disclosure: Compete is an RWW advertiser.) This move itself may not shake up search, but it does raise the question: how much room for meaningful innovation is there in search, and to what degree is Google vulnerable in the market it so dominates?

Ask.com comes up with interesting features all the time that tend not to get a big reaction. This move’s impact is mitigated by the fact that Compete traffic data is limited to US site visitors and that the stats aren’t yet available on Ask’s fantastic blog search. Nonetheless, I think it’s an interesting case that demonstrates just how open the future of search remains.

In addition to offering value adds like traffic data, search by semantic or natural language meaning is an option for search that’s widely discussed. Social search is yet another. Researchers at Stanford posted an interesting study this week on the role social bookmarking could play in augmenting search.

On Google

I find myself consistently impressed with a lot of what Google does, but the fact remains that Google web search isn’t changing much. They are folding all the various search engines into one, but the experience isn’t changing dramatically. Does it need to? Check out this rant below from Doc Searls, from a recent episode of the excellent NewsGang Podcast. Searls calls Google “the Windows of search.”

I think Google is vulnerable in search. Google hasn’t changed search in 7 or 8 years; they are fat and happy. There are so many ways search can be improved. Google is way too locked into Larry and Sergey’s original vision, which has hardly changed at all; it’s not the only canonical way to do search. There’s so many ways to granulate search and make it conditional and do a much better job. Google’s search is lame in a lot of ways, it’s very minimal – it’s just become common but that doesn’t mean it’s perfect. It is the Windows of search. There’s a huge vulnerability there.

I was talking to someone who used to work at Google who said that the reason Google Blogsearch has been moribund for years…is because Larry thinks that Google ought to have one search experience and that search experience should never change. Since Larry wants it that way, Google Blogsearch is just sitting there and may actually go away. It’s inexcusable, I don’t care how much research they are doing – they are blowing smoke up their own ass if they think that there is only one good experience we can have with search. It is not enough. There is enormous room for other people to compete with that…Get out of your shell where you think the whole world is these companies and what they bring to the table now.

 

Ask’s integration with Compete is just one small example of what’s possible. Searls doesn’t take into consideration Google’s mindshare in the passage above but I agree with the basic premise that some major new feature, algorithm or user experience could prove very compelling for searchers at large. Here at the ReadWriteWeb network, we’ve got a whole blog about alternative search engines.

Google isn’t the most lovable brand in the world and no one can be the coolest cat in school forever.


Social Graph & Beyond: Tim Berners-Lee’s Graph is The Next Level

Written by Richard MacManus / November 22, 2007 5:55 PM / 12 Comments


Tim Berners-Lee, inventor of the World Wide Web, today published a blog post about what he terms the Graph, which is similar (if not identical) to his Semantic Web vision. Referencing both Brad Fitzpatrick’s influential post earlier this year on Social Graph, and our own Alex Iskold’s analysis of Social Graph concepts, Berners-Lee went on to position the Graph as the third main “level” of computer networks. First there was the Internet, then the Web, and now the Graph – which Sir Tim labeled (somewhat tongue in cheek) the Giant Global Graph!

Note that Berners-Lee wasn’t specifically talking about the Social Graph, which is the term Facebook has been heavily promoting, but something more general. In a nutshell, this is how Berners-Lee envisions the 3 levels (a.k.a. layers of abstraction):

1. The Internet: links computers
2. The Web: links documents
3. The Graph: links relationships between people and/or documents — “the things documents are about,” as Berners-Lee put it.

The Graph is all about connections and re-use of data. Berners-Lee wrote that Semantic Web technologies will enable this:

“So, if only we could express these relationships, such as my social graph, in a way that is above the level of documents, then we would get re-use. That’s just what the graph does for us. We have the technology — it is Semantic Web technology, starting with RDF, OWL and SPARQL. Not magic bullets, but the tools which allow us to break free of the document layer.”

Sir Tim also notes that as we go up each level, we lose more control but gain more benefits: “…at each layer — Net, Web, or Graph — we have ceded some control for greater benefits.” The benefits are what happens when documents and data are connected – for example being able to re-use our personal and friends data across multiple social networks, which is what Google’s OpenSocial aims to achieve.

What’s more, says Berners-Lee, the Graph has major implications for the Mobile Web. He said that longer term “thinking in terms of the graph rather than the web is critical to us making best use of the mobile web, the zoo of wildly differing devices which will give us access to the system.” The following scenario sums it up very nicely:

“Then, when I book a flight it is the flight that interests me. Not the flight page on the travel site, or the flight page on the airline site, but the URI (issued by the airlines) of the flight itself. That’s what I will bookmark. And whichever device I use to look up the bookmark, phone or office wall, it will access a situation-appropriate view of an integration of everything I know about that flight from different sources. The task of booking and taking the flight will involve many interactions. And all throughout them, that task and the flight will be primary things in my awareness, the websites involved will be secondary things, and the network and the devices tertiary.”
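To make the idea of bookmarking the flight itself, rather than a page about it, a little more concrete, here is a small sketch using the Python rdflib library. The flight URI, the property names, and the data values are all invented for illustration; the point is simply that statements about one URI can come from different sources and still be queried together with SPARQL.

# Illustrative sketch with rdflib (pip install rdflib); the flight URI and
# the property vocabulary are made up for the example.
from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/travel/")
flight = URIRef("http://airline.example.com/flights/BA245/2007-11-22")

g = Graph()
# Statements that might come from the airline's site...
g.add((flight, EX.departureAirport, Literal("LHR")))
g.add((flight, EX.departureTime, Literal("2007-11-22T09:40")))
# ...and statements that might come from a separate travel-booking site.
g.add((flight, EX.bookingReference, Literal("XYZ123")))

# Because every statement is about the same URI, one SPARQL query pulls
# together everything known about the flight, whichever site said it.
results = g.query("""
    SELECT ?p ?o WHERE {
        <http://airline.example.com/flights/BA245/2007-11-22> ?p ?o .
    }
""")
for p, o in results:
    print(p, o)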

Conclusion

I’m very pleased Tim Berners-Lee has appropriated the concept of the Social Graph and married it to his own vision of the Semantic Web. What Berners-Lee wrote today goes way beyond Facebook, OpenSocial, or social networking in general. It is about how we interact with data on the Web (whether it be mobile or PC or a device like the Amazon Kindle) and the connections that we can take advantage of using the network. This is also why Semantic Apps are so interesting right now, as they take data connection to the next level on the Web.

Overall, unlike Nick Carr, I’m not concerned whether mainstream people accept the term ‘Graph’ or ‘Social Graph’. It really doesn’t matter, so long as the web apps that people use enable them to participate in this ‘next level’ of the Web. That’s what Google, Facebook, and a lot of other companies are trying to achieve.

Incidentally, it’s great to see Tim Berners-Lee ‘re-using’ concepts like the Social Graph, or simply taking inspiration from them. He never really took to the Web 2.0 concept, perhaps because it became too hyped and commercialized, but the fact is that the Consumer Web has given us many innovations over the past few years. Everything from Google to YouTube to MySpace to Facebook. So even though Sir Tim has always been about graphs (as he noted in his post, the Graph is essentially the same as the Semantic Web), it’s fantastic he is reaching out to the ‘web 2.0’ community and citing people like Brad Fitzpatrick and Alex Iskold.

Related: check out Alex Iskold’s Social Graph: Concepts and Issues for an overview of the theory behind Social Graph. This is the post Tim Berners-Lee referenced. Also check out Alex’s latest post today: R/WW Thanksgiving: Thank You Google for Open Social (Or, Why Open Social Really Matters).


Semantic Travel Search Engine UpTake Launches

Written by Josh Catone / May 14, 2008 6:00 AM / 8 Comments


According to a comScore study done last year, booking travel over the Internet has become something of a nightmare for people. It’s not that using any of the booking engines is difficult, it’s just that there is so much information out there that planning a vacation is overwhelming. According to the comScore study, the average online vacation plan comes together through 12 travel-related searches and visits to 22 different web sites over the course of 29 days. Semantic search startup UpTake (formerly Kango) aims to make that process easier.

UpTake is a vertical search engine that has assembled what it says is the largest database of US hotels and activities — over 400,000 of them — from more than 1,000 different travel sites. Using a top-down approach, UpTake looks at its database of over 20 million reviews, opinions, and descriptions of hotels and activities in the US and semantically extracts information about those destinations. You can think of it as Metacritic for the travel vertical, but rather than just arriving at an aggregate rating (which it does), UpTake also attempts to figure out some basic concepts about a hotel or activity based on what it learns from the information it reads. Things such as whether the hotel is family friendly, whether it would be good for a romantic getaway, whether it is eco-friendly, and so on.

“UpTake matches a traveler with the most useful reviews, photos, etc. for the most relevant hotels and activities through attribute and sentiment analysis of reviews and other text, the analysis is guided by our travel ontology to extract weighted meta-tags,” said President Yen Lee, who was co-founder of the CitySearch San Francisco office and a former GM of Travel at Yahoo!

What UpTake isn’t, is a booking engine like Expedia, a meta price search engine like Kayak, or a travel community. UpTake is strictly about aggregation of reviews and semantic analysis and doesn’t actually do any booking. According to the company only 14% of travel searches start at a booking engine, which indicates that people are generally more interested in doing research about a destination before trying to locate the best prices. Many listings on the site have a “Check Rates” button, however, which gets hotel rates from third party partner sites — that’s actually how UpTake plans to make money.

The way UpTake works is by applying its specially created travel ontology, which contains concepts, relationships between those concepts, and rules about how they fit together, to the 20 million reviews in its database. The ontology allows UpTake to extract meaning from structured or semi-structured data by telling their search engine things like “a pool is a type of hotel amenity and kids like pools.” That means hotels with pools score some points when evaluating if a hotel is “kid friendly.” The ontology also knows, though, that a nude pool might be inappropriate for kids, and thus that would take points away when evaluating for kid friendliness.

A simplified example ontology is depicted below.

In addition to figuring out where destinations fit into vacation themes — like romantic getaway, family vacation, girls getaway, or outdoor — the site also does sentiment matching to determine if users liked a particular hotel or activity. The search engine looks for sentiment words such as “like,” “love,” “hate,” “cramped,” or “good view,” and knows what they mean and how they relate to the theme of the hotel and how people felt about it. It figures that information into the score it assigns each destination.
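Here is a toy illustration, in plain Python, of how amenity rules and sentiment words might be combined into a theme score. The rules, weights, and vocabulary are invented to mirror the examples above; they are not taken from UpTake’s actual ontology.

# Toy sketch of ontology-guided scoring plus sentiment words for a
# "kid friendly" theme; rules and weights are invented for illustration.
THEME_RULES = [
    # (phrase in a review or description, score adjustment)
    ("nude pool", -5),   # checked before the generic "pool" rule
    ("pool", +3),        # a pool is a hotel amenity, and kids like pools
    ("kids club", +4),
]
SENTIMENT_WORDS = {"love": +2, "like": +1, "cramped": -2, "hate": -3}

def kid_friendly_score(text: str) -> int:
    text = text.lower()
    score = 0
    for phrase, weight in THEME_RULES:
        if phrase in text:
            score += weight
            if phrase == "nude pool":
                # Don't also credit the generic "pool" rule for this mention.
                text = text.replace("nude pool", "")
    for word, weight in SENTIMENT_WORDS.items():
        score += text.count(word) * weight
    return score

print(kid_friendly_score("Our kids love the pool and the kids club"))          # 9
print(kid_friendly_score("Cramped rooms, and the nude pool is not for kids"))  # -7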

Conclusion

Yesterday, we looked at semantic, natural language processing search engine Powerset and found in some quick early testing that the results weren’t much different from Google’s. “If Google remains ‘good enough,’ Powerset will have a hard time convincing people to switch,” we wrote. But while semantic search may feel rather clunky for the broader global web, it makes a lot of sense in specific verticals. The ontology is a lot more focused, and the site also isn’t trying to answer specific questions, but rather attempting to semantically determine general concepts, such as romantic appeal or overall quality. The upshot is that the results are tangible and useful.

I asked Yen Lee what UpTake thought about the top-down vs. the traditional bottom-up approach. Lee told me that he thinks the top-down approach is a great way to lead into the bottom-up Semantic Web. Lee thinks that top-down efforts to derive meaning from unstructured and semi-structured data, as well as efforts such as Yahoo!’s move to index semantic markup, will provide an incentive for content publishers to start using semantic markup on their data. Lee said that many of UpTake’s partners have already begun to ask how to make it easier for the site to read and understand their content.

Vertical search engines like UpTake might also provide the consumer face for the Semantic Web that can help sell it to consumers. Being able to search millions of reviews and opinions and have a computer understand how they relate to the type of vacation you want to take is the sort of palpable evidence needed to sell the Semantic Web idea. As these technologies get better, and data becomes more structured, then we might see NLP search engines like Powerset start to come up with better results than Google (though don’t think for a minute that Google would sit idly by and let that happen…).

What do you think of UpTake? Let us know in the comments below.


10 Semantic Apps to Watch

Written by Richard MacManus / November 29, 2007 12:30 AM / 39 Comments


One of the highlights of October’s Web 2.0 Summit in San Francisco was the emergence of ‘Semantic Apps’ as a force. Note that we’re not necessarily talking about the Semantic Web, which is the W3C initiative led by Tim Berners-Lee that touts technologies like RDF, OWL and other standards for metadata. Semantic Apps may use those technologies, but not necessarily. This was a point made by the founder of one of the Semantic Apps listed below, Danny Hillis of Freebase (who is as much a tech legend as Berners-Lee).

The purpose of this post is to highlight 10 Semantic Apps. We’re not touting this as a ‘Top 10’, because there is no way to rank these apps at this point – many are still non-public apps, e.g. in private beta. It reflects the nascent status of this sector, even though people like Hillis and Spivack have been working on their apps for years now.

What is a Semantic App?

Firstly let’s define “Semantic App”. A key element is that the apps below all try to determine the meaning of text and other data, and then create connections for users. Another of the founders mentioned below, Nova Spivack of Twine, noted at the Summit that data portability and connectibility are keys to these new semantic apps – i.e. using the Web as platform.

In September Alex Iskold wrote a great primer on this topic, called Top-Down: A New Approach to the Semantic Web. In that post, Alex Iskold explained that there are two main approaches to Semantic Apps:

1) Bottom Up – involves embedding semantic annotations (meta-data) right into the data.
2) Top Down – relies on analyzing existing information; the ultimate top-down solution would be a full-blown natural language processor, which is able to understand text like people do. (A small sketch contrasting the two approaches follows below.)
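As a contrived sketch of the difference (with an invented HTML snippet and a tiny gazetteer standing in for real NLP), the bottom-up path just reads annotations the publisher already embedded, while the top-down path has to infer meaning from the raw text:

import re

page = """
<p>Review of <span property="schema:name">Hotel Paradiso</span> in
<span property="schema:addressLocality">Rome</span>: we loved the rooftop pool.</p>
"""

# Bottom-up: the publisher embedded annotations, so extraction is just
# reading them back out of the markup.
bottom_up = dict(re.findall(r'property="([^"]+)">([^<]+)<', page))
print(bottom_up)   # {'schema:name': 'Hotel Paradiso', 'schema:addressLocality': 'Rome'}

# Top-down: no annotations assumed; meaning must be inferred from the text
# itself (a crude gazetteer lookup stands in for a natural language processor).
KNOWN_CITIES = {"Rome", "Paris", "London"}
plain_text = re.sub(r"<[^>]+>", " ", page)
top_down = [w for w in re.findall(r"[A-Z][a-z]+", plain_text) if w in KNOWN_CITIES]
print(top_down)    # ['Rome']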

Now that we know what Semantic Apps are, let’s take a look at some of the current leading (or promising) products…

Freebase

Freebase aims to “open up the silos of data and the connections between them”, according to founder Danny Hillis at the Web 2.0 Summit. Freebase is a database that has all kinds of data in it and an API. Because it’s an open database, anyone can enter new data in Freebase. An example page in the Freebase db looks pretty similar to a Wikipedia page. When you enter new data, the app can make suggestions about content. The topics in Freebase are organized by type, and you can connect pages with links and semantic tags. So in summary, Freebase is all about shared data and what you can do with it.
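For a feel of what working with that shared data looks like, here is a rough sketch of an MQL read call against the Freebase API in Python. The endpoint, the query envelope, and the schema path are recollections of the early mqlread service and are offered only as assumptions; consult Freebase’s developer documentation for the real interface.

# Hedged sketch of a Freebase MQL read; endpoint, envelope, and schema
# paths are assumptions from memory, not a verified reference.
import json
import urllib.parse
import urllib.request

mql_query = {
    "query": {
        "type": "/music/artist",   # assumed Freebase type path
        "name": "The Police",
        "album": [],               # empty list asks Freebase to fill in albums
    }
}

url = (
    "http://api.freebase.com/api/service/mqlread?query="
    + urllib.parse.quote(json.dumps(mql_query))
)
with urllib.request.urlopen(url) as response:
    result = json.load(response)

print((result.get("result") or {}).get("album"))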

Powerset

Powerset (see our coverage here and here) is a natural language search engine. The system relies on semantic technologies that have only become available in the last few years. It can make “semantic connections”, which help build its semantic database. The idea is that Powerset extracts meaning and knowledge automatically from the content it indexes. The product isn’t yet public, but it has been riding a wave of publicity over 2007.

Twine

Twine claims to be the first mainstream Semantic Web app, although it is still in private beta. See our in-depth review. Twine automatically learns about you and your interests as you populate it with content – a “Semantic Graph”. When you put in new data, Twine picks out and tags certain content with semantic tags – e.g. the name of a person. An important point is that Twine creates new semantic and rich data. But it’s not all user-generated. They’ve also done machine learning against Wikipedia to ‘learn’ about new concepts. And they will eventually tie into services like Freebase. At the Web 2.0 Summit, founder Nova Spivack compared Twine to Google, saying it is a “bottom-up, user generated crawl of the Web”.

AdaptiveBlue

AdaptiveBlue are makers of the Firefox plugin, BlueOrganizer. They also recently launched a new version of their SmartLinks product, which allows web site publishers to add semantically charged links to their site. SmartLinks are browser ‘in-page overlays’ (similar to popups) that add additional contextual information to certain types of links, including links to books, movies, music, stocks, and wine. AdaptiveBlue supports a large list of top web sites, automatically recognizing and augmenting links to those properties.

SmartLinks works by understanding specific types of information (in this case links) and wrapping them with additional data. SmartLinks takes unstructured information and turns it into structured information by understanding a basic item on the web and adding semantics to it.

[Disclosure: AdaptiveBlue founder and CEO Alex Iskold is a regular RWW writer]

Hakia

Hakia is one of the more promising Alt Search Engines around, with a focus on natural language processing methods to try and deliver ‘meaningful’ search results. Hakia attempts to analyze the concept of a search query, in particular by doing sentence analysis. Most other major search engines, including Google, analyze keywords. The company told us in a March interview that the future of search engines will go beyond keyword analysis – search engines will talk back to you and in effect become your search assistant. One point worth noting here is that, currently, Hakia has limited post-editing/human interaction for the editing of hakia Galleries, but the rest of the engine is 100% computer powered.

Hakia has two main technologies:

1) QDEX Infrastructure (which stands for Query Detection and Extraction) – this does the heavy lifting of analyzing search queries at a sentence level.

2) SemanticRank Algorithm – this is essentially the science they use, made up of ontological semantics that relate concepts to each other.

Talis

Talis is a 40-year-old UK software company which has created a semantic web application platform. They are a bit different from the other 9 companies profiled here, as Talis has released a platform and not a single product. The Talis platform is kind of a mix between Web 2.0 and the Semantic Web, in that it enables developers to create apps that allow for sharing, remixing and re-using data. Talis believes that Open Data is a crucial component of the Web, yet there is also a need to license data in order to ensure its openness. Talis has developed its own content license, called the Talis Community License, and recently they funded some legal work around the Open Data Commons License.

According to Dr Paul Miller, Technology Evangelist at Talis, the company’s platform emphasizes “the importance of context, role, intention and attention in meaningfully tracking behaviour across the web.” To find out more about Talis, check out their regular podcasts – the most recent one features Kaila Colbin (an occasional AltSearchEngines correspondent) and Branton Kenton-Dau of VortexDNA.

UPDATE: Marshall Kirkpatrick published an interview with Dr Miller the day after this post. Check it out here.

TrueKnowledge

Venture-funded UK semantic search engine TrueKnowledge unveiled a demo of its private beta earlier this month. It reminded Marshall Kirkpatrick of the still-unlaunched Powerset, but it’s also reminiscent of the very real Ask.com “smart answers”. TrueKnowledge combines natural language analysis, an internal knowledge base and external databases to offer immediate answers to various questions. Instead of just pointing you to web pages where the search engine believes it can find your answer, it will offer you an explicit answer and explain the reasoning path by which that answer was arrived at. There’s also an interesting-looking API at the center of the product. “Direct answers to humans and machine questions” is the company’s tagline.

Founder William Tunstall-Pedoe said he’s been working on the software for the past 10 years, really putting time into it since coming into initial funding in early 2005.

TripIt

TripIt is an app that manages your travel planning. Emre Sokullu reviewed it when it presented at TechCrunch40 in September. With TripIt, you forward incoming booking emails to plans@tripit.com and the system manages the rest. Their patent-pending “itinerator” technology is a baby step toward the semantic web – it extracts useful information from these emails and produces a well-structured, organized presentation of your travel plan. It pulls in information from Wikipedia for the places you visit. It uses microformats – notably the iCal format, which is well integrated into Google Calendar and other calendar software.
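Since the itinerary ends up in the iCal format, here is a small sketch of what producing such an entry could look like with the Python icalendar package. The flight details are invented, and this is of course not TripIt’s own pipeline – just an illustration of the format it targets.

# Illustrative sketch with the icalendar package (pip install icalendar).
from datetime import datetime
from icalendar import Calendar, Event

cal = Calendar()
cal.add("prodid", "-//Example itinerary sketch//EN")
cal.add("version", "2.0")

flight = Event()
flight.add("summary", "Flight SFO -> JFK (confirmation ABC123)")  # hypothetical booking
flight.add("dtstart", datetime(2008, 10, 1, 8, 30))
flight.add("dtend", datetime(2008, 10, 1, 17, 5))
flight.add("location", "San Francisco International Airport")
cal.add_component(flight)

# The resulting .ics text is what Google Calendar and other calendar
# software can import.
print(cal.to_ical().decode("utf-8"))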

The company claimed at TC40 that “instead of dealing with 20 pages of planning, you just print out 3 pages and everything is done for you”. Their future plans include a recommendation engine which will tell you where to go and who to meet.

Clear Forest

ClearForest is one of the companies in the top-down camp. We profiled the product in December ’06 and at that point ClearForest was applying its core natural language processing technology to facilitate next generation semantic applications. In April 2007 the company was acquired by Reuters. The company has both a Web Service and a Firefox extension that leverages an API to deliver the end-user application.

The Firefox extension is called Gnosis and it enables you to “identify the people, companies, organizations, geographies and products on the page you are viewing.” With one click from the menu, a webpage you view via Gnosis is filled with various types of annotations. For example, it recognizes Companies, Countries, Industry Terms, Organizations, People, Products and Technologies. Each word that Gnosis recognizes gets colored according to its category.

Also, ClearForest’s Semantic Web Service offers a SOAP interface for analyzing text, documents and web pages.

Spock

Spock is a people search engine that got a lot of buzz when it launched. Alex Iskold went so far as to call it “one of the best vertical semantic search engines built so far.” According to Alex, there are four things that make their approach special:

  • The person-centric perspective of a query
  • Rich set of attributes that characterize people (geography, birthday, occupation, etc.)
  • Usage of tags as links or relationships between people
  • Self-correcting mechanism via user feedback loop

As a vertical engine, Spock knows important attributes that people have: name, gender, age, occupation and location, just to name a few. Perhaps the most interesting aspect of Spock is its use of tags – all frequent phrases that Spock extracts via its crawler become tags, and users can add tags as well. So Spock leverages a combination of automated tags and people power for tagging.

Conclusion

What have we missed? 😉 Please use the comments to list other Semantic Apps you know of. It’s an exciting sector right now, because Semantic Web and Web 2.0 technologies alike are being used to create new semantic applications. One gets the feeling we’re only at the beginning of this trend.


