
Archive for the ‘metaweb’ Category

Semantic Travel Search Engine UpTake Launches

Written by Josh Catone / May 14, 2008 6:00 AM / 8 Comments


According to a comScore study done last year, booking travel over the Internet has become something of a nightmare. It’s not that using any of the booking engines is difficult; it’s just that there is so much information out there that planning a vacation is overwhelming. According to the study, the average online vacation plan comes together through 12 travel-related searches and visits to 22 different web sites over the course of 29 days. Semantic search startup UpTake (formerly Kango) aims to make that process easier.

UpTake is a vertical search engine that has assembled what it says is the largest database of US hotels and activities — over 400,000 of them — from more than 1,000 different travel sites. Using a top-down approach, UpTake looks at its database of over 20 million reviews, opinions, and descriptions of hotels and activities in the US and semantically extracts information about those destinations. You can think of it as Metacritic for the travel vertical, but rather than just arriving at an aggregate rating (which it does), UpTake also attempts to infer some basic concepts about a hotel or activity from what it reads: is the hotel family-friendly, would it be good for a romantic getaway, is it eco-friendly, and so on.

“UpTake matches a traveler with the most useful reviews, photos, etc. for the most relevant hotels and activities through attribute and sentiment analysis of reviews and other text; the analysis is guided by our travel ontology to extract weighted meta-tags,” said President Yen Lee, who co-founded the CitySearch San Francisco office and was formerly GM of Travel at Yahoo!

What UpTake isn’t is a booking engine like Expedia, a meta price search engine like Kayak, or a travel community. UpTake is strictly about aggregation of reviews and semantic analysis and doesn’t actually do any booking. According to the company, only 14% of travel searches start at a booking engine, which indicates that people are generally more interested in researching a destination before trying to locate the best prices. Many listings on the site do have a “Check Rates” button, however, which pulls hotel rates from third-party partner sites — that’s actually how UpTake plans to make money.

The way UpTake works is by applying its specially created travel ontology, which contains concepts, relationships between those concepts, and rules about how they fit together, to the 20 million reviews in its database. The ontology allows UpTake to extract meaning from structured or semi-structured data by telling their search engine things like “a pool is a type of hotel amenity and kids like pools.” That means hotels with pools score some points when evaluating if a hotel is “kid friendly.” The ontology also knows, though, that a nude pool might be inappropriate for kids, and thus that would take points away when evaluating for kid friendliness.

A simplified example ontology is depicted below. [Image: simplified travel ontology diagram]
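To make the idea concrete in code, here is a minimal sketch of how rules like the pool example might translate into theme scores. The rule table, weights, and attribute names are hypothetical illustrations, not UpTake’s actual ontology.

```python
# Hypothetical sketch of ontology-guided theme scoring; the rules and
# weights below are invented for illustration, not UpTake's ontology.

ONTOLOGY_RULES = {
    # amenity -> (vacation theme it affects, score adjustment)
    "pool":             ("kid_friendly", +2),   # a pool is an amenity, and kids like pools
    "nude pool":        ("kid_friendly", -5),   # inappropriate for kids
    "kids club":        ("kid_friendly", +3),
    "candlelit dining": ("romantic_getaway", +2),
}

def score_hotel(amenities):
    """Aggregate per-theme scores from a hotel's extracted amenities."""
    scores = {}
    for amenity in amenities:
        if amenity in ONTOLOGY_RULES:
            theme, delta = ONTOLOGY_RULES[amenity]
            scores[theme] = scores.get(theme, 0) + delta
    return scores

print(score_hotel(["pool", "kids club"]))
# {'kid_friendly': 5}
print(score_hotel(["nude pool", "candlelit dining"]))
# {'kid_friendly': -5, 'romantic_getaway': 2}
```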

In addition to figuring out where destinations fit into vacation themes — like romantic getaway, family vacation, girls’ getaway, or outdoors — the site also does sentiment matching to determine whether users liked a particular hotel or activity. The search engine looks for sentiment words such as “like,” “love,” “hate,” “cramped,” or “good view,” and knows what they mean, how they relate to the theme of the hotel, and how people felt about it. It factors that information into the score it assigns each destination.
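Sentiment matching of this sort is often approximated with a weighted lexicon; a toy version (the phrases and weights are invented for illustration, not UpTake’s actual lexicon) might look like this:

```python
# Toy sentiment lexicon; the phrases and weights are invented for illustration.
SENTIMENT = {"love": 2, "like": 1, "good view": 1, "cramped": -1, "hate": -2}

def review_score(text):
    """Sum lexicon weights for every phrase found in the review text."""
    text = text.lower()
    return sum(weight for phrase, weight in SENTIMENT.items() if phrase in text)

print(review_score("We loved the good view, but the room felt cramped."))
# "love" (+2) + "good view" (+1) + "cramped" (-1) = 2
```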

Conclusion

Yesterday, we looked at Powerset, a semantic, natural language processing search engine, and found in some quick early testing that the results weren’t much different from Google’s. “If Google remains ‘good enough,’ Powerset will have a hard time convincing people to switch,” we wrote. But while semantic search may feel rather clunky on the broader web, it makes a lot of sense in specific verticals. The ontology is much more focused, and the site isn’t trying to answer specific questions but rather to semantically determine general concepts, such as romantic appeal or overall quality. The upshot is that the results are tangible and useful.

I asked Yen Lee what UpTake thought about the top-down vs. the traditional bottom-up approach. Lee told me that he thinks the top-down approach is a great way to lead into the bottom-up Semantic Web. Lee thinks that top-down efforts to derive meaning from unstructured and semi-structured data, as well as efforts such as Yahoo!’s move to index semantic markup, will provide an incentive for content publishers to start using semantic markup on their data. Lee said that many of UpTake’s partners have already begun to ask how to make it easier for the site to read and understand their content.

Vertical search engines like UpTake might also provide the consumer face of the Semantic Web. Being able to search millions of reviews and opinions and have a computer understand how they relate to the type of vacation you want to take is the sort of palpable evidence needed to sell the Semantic Web idea. As these technologies get better and data becomes more structured, we might see NLP search engines like Powerset start to come up with better results than Google (though don’t think for a minute that Google would sit idly by and let that happen…).

What do you think of UpTake? Let us know in the comments below.


Semantic Web: Making Advertising More Relevant to Consumers

Written by Lidija Davis / October 17, 2008 1:10 AM / 35 Comments


Amiad Solomon, CEO of Peer39, kicked off the Web 3.0 Conference & Expo in Santa Clara, CA on Thursday with a keynote discussing the Semantic Web and how it relates to advertising. He told the audience that this is one of the key business opportunities in the Web 3.0 era. “I believe the simplest definition of Web 3.0 is the monetization and commercialization of Web 2.0,” he said.

To fully appreciate how Web 3.0 can offer better advertising solutions, Solomon suggested that we start by analyzing the Web’s transformations since Tim Berners-Lee and Robert Cailliau wrote the official proposal for the World Wide Web in 1990.

The Evolution of the Web According to Solomon

Web 1.0 was basic connection via the Internet, where information flowed one way and was rarely updated. Web 1.0 ended in 2001 with the dot-com crash, which some estimate cost in excess of $5 trillion. The Web 1.0 lesson: Cash, not content, is king.

Web 2.0 marked the beginning of the ‘two-sided Internet,’ where we started using the Internet to talk to one another. This interactivity generated billions of dollars in data – virtually for free. The Web 2.0 lesson: Sustainable revenues are possible.

Web 3.0 offers detailed data exchange to every point on the Internet, a ‘machine in the middle,’ with three main characteristics:

1. Smart internetworking

The Internet itself will get smarter and become a gathering tool to execute relatively complex tasks and analyze collective online behavior.

2. Seamless applications

Web 3.0 theories suggest that all applications will fit together; a continuation of open source where all applications will be able to communicate. APIs will read data from any platform and provide a single point of reference.

3. Distributed databases

Web 3.0 will need somewhere to store very complex and memory-intensive information. It will require ontologies to establish relationships between information sources, search millions of nodes, and scan billions of data records at once.

How Does This Make Money?

“This is where the semantic Web comes in,” Solomon explained. “Businesses finally understand the Internet, and recognize that advertising is a good business model – if you can make it work.”

According to Solomon, there are two approaches to advertising in use today: contextual advertising and behavioral targeting.

Contextual advertising systems scan website text for keywords that trigger the system to send predetermined ads. Used on search engine results pages, contextual systems show ads based on users’ search words; unfortunately, these ads aren’t always relevant, as words can have several meanings. While the errors occasionally make for a good laugh, they expose a serious weakness: companies investing in contextual ads are wasting advertising budgets and undermining brand promotion and sentiment.

Behavioral targeting systems collect information on a person’s Web browsing history, usually by way of cookies. Given the European Union’s Directive 2002/58 on privacy and electronic communications, and pending US legislation restricting the use of cookies, behavioral targeting campaigns via cookies can no longer be seen as a valuable investment. Additionally, home computers are often shared, and if cookies are enabled, users see ads directed by other users’ cookies. Again, badly targeted advertising is a nuisance for the user and a waste of advertising dollars.

The Way of the Future: Semantic Advertising

Successful advertising means showing the right product to the right person at the right time. The semantic Web puts data into semantic formats on the fly, and targets ads based on the meaning of data with a high degree of accuracy.

This is good news for the user – no more embarrassing keyword results, no more Hooters ads on sites about feminism, and an end to annoying cookies.
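As a rough illustration of the difference (a hypothetical toy, not Peer39’s actual technology): keyword matching fires on a string wherever it appears, while a sense-aware matcher first resolves what the page is about.

```python
# Illustrative contrast between keyword-triggered and sense-aware ad matching.
# The disambiguation step is a crude stand-in for real semantic analysis.

ADS = {"finance": "brokerage ad", "fruit": "grocery ad"}

def keyword_match(page_text):
    # Contextual approach: the word "apple" triggers a finance ad everywhere.
    return ADS["finance"] if "apple" in page_text.lower() else "house ad"

def semantic_match(page_text):
    # Resolve which sense of "apple" the page uses before choosing an ad.
    text = page_text.lower()
    if "apple" not in text:
        return "house ad"
    finance_cues = ("shares", "nasdaq", "earnings", "stock")
    sense = "finance" if any(cue in text for cue in finance_cues) else "fruit"
    return ADS[sense]

page = "Apple pie recipes your family will love"
print(keyword_match(page))   # brokerage ad -- mismatched
print(semantic_match(page))  # grocery ad -- matched to the page's meaning
```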

Do you agree that the Semantic Web will bring even more effective advertising to the Web?

ReadWriteWeb is a media sponsor of the Web 3.0 Conference & Expo


Hakia Takes On Google With Semantic Technologies

Written by Richard MacManus / March 23, 2007 12:14 PM / 17 Comments


This week I spoke to Hakia founder and CEO Dr. Riza C. Berkan and COO Melek Pulatkonak. Hakia is one of the more promising Alt Search Engines around, with a focus on natural language processing methods to try and deliver ‘meaningful’ search results. Alex Iskold profiled Hakia for R/WW at the beginning of December and concluded, after a number of search experiments, that Hakia was intriguing – but not yet at a level to compete with Google. It is important to note that Hakia is a relatively early beta product and is still in development. But given the speed of Internet time, 3.5 months is long enough to check back and see how Hakia is progressing…

What is Hakia?

Riza and Melek first told me what makes Hakia different from Google. Hakia attempts to analyze the concept of a search query, in particular by doing sentence analysis. Most other major search engines, including Google, analyze keywords. Riza and Melek told me that the future of search engines will go beyond keyword analysis – search engines will talk back to you and in effect become your search assistant.

One point worth noting here is that, currently, Hakia still has some human post-editing going on – so it isn’t 100% computer powered at this point.

Hakia has two main technologies:

1) QDEX Infrastructure (which stands for Query Detection and Extraction)  – this does the heavy lifting of analyzing search queries at a sentence level.

2) SemanticRank Algorithm – this is essentially the science they use, made up of ontological semantics that relate concepts to each other.
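QDEX and SemanticRank are proprietary, but the basic contrast between keyword indexing and sentence-level analysis can be sketched with a toy example (the crude regex “parse” below stands in for real NLP and is not Hakia’s algorithm):

```python
# Toy contrast between keyword indexing and sentence-level query analysis.
# The crude regex "parse" stands in for real natural language processing.
import re

def keywords(query):
    # Keyword approach: a bag of terms; the question's structure is discarded.
    return set(re.findall(r"[a-z]+", query.lower())) - {"why", "is", "the", "a"}

def parse_question(query):
    # Sentence-level approach: recover what is being asked about what.
    m = re.match(r"why is (?P<subject>.+?) (?P<predicate>\w+)\??$", query.lower())
    return m.groupdict() if m else None

q = "Why is the sky blue?"
print(sorted(keywords(q)))   # ['blue', 'sky']
print(parse_question(q))     # {'subject': 'the sky', 'predicate': 'blue'}
```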

If you’re interested in the tech aspects, also check out hakia-Lab – which features their latest technology R&D.

How is Hakia different from Ask.com?

Hakia most reminds me of Ask.com, which uses more of a natural language approach than the other big search engines (‘ask’ a question, get an answer) – and Ask.com uses human editing too, as Hakia does. [I interviewed Ask.com back in November]. So I asked Riza and Melek: what is the difference between Hakia and Ask.com?

Riza told me that Ask.com is an indexing search engine with no semantic analysis. Going one level deeper, he says, look at the basis of their results: Ask.com bolds keywords (i.e. it works at the keyword level), whereas Hakia, Riza said, understands the sentence. He also said that Ask.com categories are not meaning-based – they are “canned or prefixed”. Hakia, he said, understands the semantic relationships.

Hakia vs Google

I next referred Riza and Melek to Read/WriteWeb’s interview with Matt Cutts of Google, in which Matt told me that Google is essentially already using semantic technologies, because the sheer amount of data that Google has “really does help us understand the meanings of words and synonyms”. Riza’s view on that is that Google works with popularity algorithms and so it can “never have enough statistical material to handle the Long Tail”. He says a search engine has to understand the language, in order to properly serve the Long Tail.

Moreover, Hakia’s view is that the vastness of data that Google has doesn’t solve the semantic problem – Riza and Melek think there needs to be that semantic connection present.

Their bigger claim, though, is that the big search companies are still thinking within an indexing framework (personalization, etc.). Hakia thinks that indexing has plateaued and that semantic technologies will take over for the next generation of search. They say that semantic technologies allow you to analyze content, which they think is ‘outside the box’ of what the big search companies are doing. Riza admitted that it was possible Google was investigating semantic technologies behind closed doors. Nevertheless, he was adamant that the future is understanding information, not merely finding it – a very difficult problem to solve, he said, but it’s Hakia’s mission.

Semantic web and Tim Berners-Lee

Throughout the interview, I noticed the word “semantic” was being used a lot – but their interpretation seemed to be different to that of Tim Berners-Lee, whose notion of a Semantic Web is generally what Web people think about when uttering the ‘S’ word. Riza confirmed that their concept of semantic technology is indeed different. He said that Tim Berners-Lee is banking on certain standards being accepted by web authors and writers – which Riza said is “such a big assumption to start this technology”. He said that it forces people to be linguists, which is not a common skill.

Furthermore, Riza told me that Berners-Lee’s Semantic Web is about “imposing a structure that assumes people will obey [and] follow”. He said that the “entire Semantic Web concept relies on utilizing semantic tagging, or labeling, which requires people to know it.” Hakia, he said, doesn’t depend on such structures. Hakia is all about analyzing the normal language of people – so a web author “doesn’t need to mess with that”.

Competitors

Apart from Google and the other big ‘indexing’ search engines, Hakia is competing against other semantic search engines like Powerset and hybrids like Wikia. Perhaps also Freebase – although Riza thinks the latter may be “old semantic web” (but he says there’s not enough information about it to say for sure).

Conclusion

Hakia plans to launch its version 1.0 (i.e. get out of beta) by the end of 2007. As of now my assessment is the same as Alex’s was in December – it’s a very promising, but as yet largely unproven, technology.

I also suspect that Google is much more advanced in search technology than Mountain View is letting on. We know that Google’s scale is a huge advantage, but their experiments with things like personalization and structured data (Google Base) show me that Google is also well aware of the need to implement next-generation search technologies. Also, as Riza noted during the interview, who knows what Google is doing behind closed doors.

Will semantic technologies and ‘sentence analysis’ be the next wave of search? It seems very plausible. So with a bit more development, Hakia could well become compelling to a mass market. Therefore how and when Google responds to Hakia will be something to watch carefully.



Report: Semantic Web Companies Are, or Will Soon Begin, Making Money

Written by Marshall Kirkpatrick / October 3, 2008 5:13 PM / 14 Comments


Semantic Web entrepreneur David Provost has published a report about the state of business in the Semantic Web, and it’s a good read for anyone interested in the sector. It’s titled On the Cusp: A Global Review of the Semantic Web Industry. We also mentioned it in our post Where Are All The RDF-based Semantic Web Apps?.

The Semantic Web is a collection of technologies that makes the meaning of content online understandable by machines. After surveying 17 Semantic Web companies, Provost concludes that Semantic science is being productized, differentiated, invested in by mainstream players and increasingly sought after in the business world.

Provost aims to use real-world examples to articulate the value proposition of the Semantic Web in accessible, non-technical language. That there are enough examples available for him to do this is great. His conclusions don’t always seem as well supported by his evidence as he’d like – but the profiles he writes of 17 Semantic Web companies are very interesting to read.

What are these companies doing? Provost writes:

“…some companies are beginning to focus on specific uses of Semantic technology to create solutions in areas like knowledge management, risk management, content management and more. This is a key development in the Semantic Web industry because until fairly recently, most vendors simply sold development tools.”

 

The report surveys companies ranging from the innovative but unlaunched Anzo for Excel from Cambridge Semantics, to well-known big players like Dow Jones Client Solutions and RWW sponsor Reuters’ Calais Initiative, to relatively unknown big players like the already very commercialized Expert System. Ten of the companies were from the US, six from Europe and one from South Korea.

[Chart from Provost’s report.]

We’ve been wanting to learn more about “under the radar” but commercialized semantic web companies ever since doing a briefing with Expert System a few months ago. We had never heard of the Italian company before, but they believe they already have a richer, deeper semantic index than anyone else online. They told us their database at the time contained 350k English words and 2.8m relationships between them, including geographic representations. They power Microsoft’s spell checker and the natural language processing (NLP) in the BlackBerry. They also sell NLP software to the US military and Department of Homeland Security, which didn’t seem like anything to brag about to us but presumably makes up a significant part of the $12 million+ in revenue they told Provost they made last year.

And some people say the Semantic Web only exists inside the laboratories of Web 3.0 eggheads!

Shortcomings of the Report

Provost writes that “the vendors [in] this report have all the appearances of thriving, emerging technology companies and they have shown their readiness to cross borders, continents, and oceans to reach customers.” You’d think they turned water into wine. Those are strong words for a study in which only 4 of 17 companies were willing to report their revenue and several hadn’t launched products yet.

The logic here is sometimes pretty amazing.

The above examples [there were two discussed – RWW] are just a brief sampling of the commercial success that the Semantic Web has been experiencing. In broad terms, it’s easy to point out the longevity of many companies in this industry and use that as a proxy for commercial success [wow – RWW]. With more time (and space in this report), additional examples could be described but the most interesting prospect pertains to what the industry landscape will look like in twelve months. [hmmm…-RWW]

 

In fact, while Provost has glowingly positive things to say about all the companies he surveyed, the absence of engagement with any of their shortcomings makes the report read more like marketing material than an objective take on what’s supposed to be world-changing technology.

This is a Fun Read

The fact is, though, that Provost writes a great introduction to many companies working to sell software in a field still too widely believed to be ephemeral. The stories of each of the 17 companies profiled are fun to read, and many of Provost’s points of analysis are both intuitive and thought-provoking.

He says the sector is “on the cusp” of major penetration into existing markets currently served by non-semantic software. Provost argues that the Semantic Web struggles to explain itself because the World Wide Web is so intensely visual and semantics are not. He says that reselling business partners in specific distribution channels are combining their domain knowledge with the science of the software developers to bring these tools to market. He tells a great, if unattributed, story about what Linked Data could mean to the banking industry.

We hadn’t heard of several of the companies profiled in the report, and a handful of them had never been mentioned by the 34 semantic web specialist blogs we track, either.

There’s something here for everyone. You can read the full report here.



The Semantic Desktop? SDS Brings Semantics To Excel

Written by Sarah Perez / August 13, 2008 6:30 AM / 6 Comments


When you hear the word “semantic” you likely think of the semantic web – the supposed next iteration of the World Wide Web that features structured data and specific protocols that aim to bring about an “intelligent” web. But the concept of semantics doesn’t necessarily apply just to the web – it can apply to other things as well, like your desktop…or even your Excel spreadsheets, according to Ian Goldsmid, founder of Semantic Business Intelligence, whose new app, SDS, brings a semantic system to spreadsheets.

Semantic Spreadsheets

The problem their system is trying to address concerns deriving data from multiple spreadsheets (two or more). Although it’s easy enough to perform sorts, build macros, and create formulas within one spreadsheet, comparing values across multiple spreadsheets is considerably more difficult.

The company’s app, the Semantic Discovery System for Excel, or SDS for short, looks for similar columns or rows between the sheets and then “semantically” connects them. They don’t appear to just be throwing that term around, either – the app uses the same W3C Semantic Web technologies (RDF, OWL, SPARQL) to help you capture “meaning, intelligence, and knowledge” from the data saved in your spreadsheets.
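The post doesn’t show SDS’s internals, so here is only a minimal sketch of the general idea using the same W3C stack: rows from two sheets become RDF triples keyed on a shared column, and a single SPARQL query joins them. The column names and namespace are invented for the example.

```python
# Minimal sketch of the general idea (not SDS itself): two spreadsheets
# joined via RDF and SPARQL. Requires rdflib; all names are invented.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/sheet#")
g = Graph()

sales   = [{"customer": "Acme", "total": 1200}]
support = [{"customer": "Acme", "open_tickets": 3}]

# Each row becomes triples; the shared "customer" column keys both sheets.
for row in sales:
    g.add((EX[row["customer"]], EX.total, Literal(row["total"])))
for row in support:
    g.add((EX[row["customer"]], EX.openTickets, Literal(row["open_tickets"])))

# One SPARQL query now spans both "sheets".
query = """
PREFIX ex: <http://example.org/sheet#>
SELECT ?customer ?total ?tickets WHERE {
    ?customer ex:total ?total ;
              ex:openTickets ?tickets .
}
"""
for customer, total, tickets in g.query(query):
    print(customer, total, tickets)
```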

Do We Need Semantic Desktop Apps?

Does SDS solve a business problem that is not yet being addressed through current technologies? In my experience, the short answer to this question is “no.” (But wait, there’s more…)

Typically, when a business needs to compare and analyze large amounts of data, the solution is to turn to a database product that can be queried and from which custom reports can be pulled. And a business doesn’t need to spend a lot of money on a robust solution to do so – even a smaller business can create a database using inexpensive desktop software.

However, the difference between using a database technology and “semantically connecting” some spreadsheets comes down to for whom this product is being built. In the past, databases and other business intelligence apps were built as if the creators knew that the only person using them would be an I.T. guy or gal. SDS, instead, aims to satisfy the needs of the non-technical end user.

Is this another example of tech populism at work? It certainly looks like it. Yet in this case the market is small – a non-technical user who is also an Excel power user? There’s usually some overlap there. Not to mention that by the time you’ve achieved “power user” status, you’ve often also figured out how to do more complicated things in Excel… like, say, formulas that work across spreadsheets – the very pain point this app is trying to address.

Still, it’s an interesting concept to think of taking the semantic web capabilities and integrating them into everyday programs to add a layer of intelligence to these programs as well. Done correctly, it could improve the capabilities of our favorite software apps without making the programs overly complex, which is what typically happens when you add more features.

What do you think? Is the Semantic Desktop (that is, semantically-enabled desktop apps) right around the corner? Or is this product and those like it too niche to find an audience? Let us know what you think in the comments.


10 Semantic Apps to Watch

Written by Richard MacManus / November 29, 2007 12:30 AM / 39 Comments


One of the highlights of October’s Web 2.0 Summit in San Francisco was the emergence of ‘Semantic Apps’ as a force. Note that we’re not necessarily talking about the Semantic Web, the W3C initiative led by Tim Berners-Lee that touts technologies like RDF, OWL and other metadata standards. Semantic Apps may use those technologies, but not necessarily. This was a point made by the founder of one of the Semantic Apps listed below, Danny Hillis of Freebase (who is as much a tech legend as Berners-Lee).

The purpose of this post is to highlight 10 Semantic Apps. We’re not touting this as a ‘Top 10’, because there is no way to rank these apps at this point – many are still non-public apps, e.g. in private beta. It reflects the nascent status of this sector, even though people like Hillis and Spivack have been working on their apps for years now.

What is a Semantic App?

Firstly let’s define “Semantic App”. A key element is that the apps below all try to determine the meaning of text and other data, and then create connections for users. Another of the founders mentioned below, Nova Spivack of Twine, noted at the Summit that data portability and connectibility are keys to these new semantic apps – i.e. using the Web as platform.

In September Alex Iskold wrote a great primer on this topic, called Top-Down: A New Approach to the Semantic Web. In that post, he explained that there are two main approaches to Semantic Apps:

1) Bottom-up – involves embedding semantic annotations (metadata) directly into the data.
2) Top-down – relies on analyzing existing information; the ultimate top-down solution would be a full-blown natural language processor able to understand text the way people do.
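A crude way to see the contrast in code (purely illustrative; neither approach is this simple in practice):

```python
# Purely illustrative: the same fact obtained bottom-up vs. top-down.
import re

# Bottom-up: the author embedded the metadata, so we just read it out.
annotated = {"text": "Casablanca is my favorite film.",
             "meta": {"Casablanca": "Film"}}
print(annotated["meta"])                              # {'Casablanca': 'Film'}

# Top-down: no annotations exist; meaning must be inferred from raw text.
raw = "Casablanca is my favorite film."
match = re.match(r"(?P<entity>\w+) is my favorite (?P<kind>\w+)", raw)
print({match["entity"]: match["kind"].capitalize()})  # {'Casablanca': 'Film'}
```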

Now that we know what Semantic Apps are, let’s take a look at some of the current leading (or promising) products…

Freebase

Freebase aims to “open up the silos of data and the connections between them”, according to founder Danny Hillis at the Web 2.0 Summit. Freebase is a database that has all kinds of data in it, plus an API. Because it’s an open database, anyone can enter new data in Freebase. An example page in the Freebase db looks pretty similar to a Wikipedia page. When you enter new data, the app can make suggestions about content. The topics in Freebase are organized by type, and you can connect pages with links and semantic tags. So in summary, Freebase is all about shared data and what you can do with it.

Powerset

Powerset (see our coverage here and here) is a natural language search engine. The system relies on semantic technologies that have only become available in the last few years. It can make “semantic connections”, which help build its semantic database. The idea is that meaning and knowledge are extracted automatically. The product isn’t yet public, but it has been riding a wave of publicity over 2007.

Twine

Twine claims to be the first mainstream Semantic Web app, although it is still in private beta. See our in-depth review. Twine automatically learns about you and your interests as you populate it with content – a “Semantic Graph”. When you put in new data, Twine picks out and tags certain content with semantic tags – e.g. the name of a person. An important point is that Twine creates new semantic and rich data. But it’s not all user-generated. They’ve also done machine learning against Wikipedia to ‘learn’ about new concepts. And they will eventually tie into services like Freebase. At the Web 2.0 Summit, founder Nova Spivack compared Twine to Google, saying it is a “bottom-up, user generated crawl of the Web”.

AdaptiveBlue

AdaptiveBlue are makers of the Firefox plugin, BlueOrganizer. They also recently launched a new version of their SmartLinks product, which allows web site publishers to add semantically charged links to their site. SmartLinks are browser ‘in-page overlays’ (similar to popups) that add additional contextual information to certain types of links, including links to books, movies, music, stocks, and wine. AdaptiveBlue supports a large list of top web sites, automatically recognizing and augmenting links to those properties.

SmartLinks works by understanding specific types of information (in this case links) and wrapping them with additional data. SmartLinks takes unstructured information and turns it into structured information by understanding a basic item on the web and adding semantics to it.

[Disclosure: AdaptiveBlue founder and CEO Alex Iskold is a regular RWW writer]

Hakia

Hakia is one of the more promising Alt Search Engines around, with a focus on natural language processing methods to try and deliver ‘meaningful’ search results. Hakia attempts to analyze the concept of a search query, in particular by doing sentence analysis. Most other major search engines, including Google, analyze keywords. The company told us in a March interview that the future of search engines will go beyond keyword analysis – search engines will talk back to you and in effect become your search assistant. One point worth noting here is that, currently, Hakia has limited post-editing/human interaction for the editing of hakia Galleries, but the rest of the engine is 100% computer powered.

Hakia has two main technologies:

1) QDEX Infrastructure (which stands for Query Detection and Extraction) – this does the heavy lifting of analyzing search queries at a sentence level.

2) SemanticRank Algorithm – this is essentially the science they use, made up of ontological semantics that relate concepts to each other.

Talis

Talis is a 40-year-old UK software company which has created a semantic web application platform. They are a bit different from the other nine companies profiled here, as Talis has released a platform and not a single product. The Talis platform is kind of a mix between Web 2.0 and the Semantic Web, in that it enables developers to create apps that allow for sharing, remixing and re-using data. Talis believes that Open Data is a crucial component of the Web, yet there is also a need to license data in order to ensure its openness. Talis has developed its own content license, called the Talis Community License, and recently funded some legal work around the Open Data Commons License.

According to Dr Paul Miller, Technology Evangelist at Talis, the company’s platform emphasizes “the importance of context, role, intention and attention in meaningfully tracking behaviour across the web.” To find out more about Talis, check out their regular podcasts – the most recent one features Kaila Colbin (an occasional AltSearchEngines correspondent) and Branton Kenton-Dau of VortexDNA.

UPDATE: Marshall Kirkpatrick published an interview with Dr Miller the day after this post. Check it out here.

TrueKnowledge

Venture-funded UK semantic search engine TrueKnowledge unveiled a demo of its private beta earlier this month. It reminded Marshall Kirkpatrick of the still-unlaunched Powerset, but it’s also reminiscent of the very real Ask.com “smart answers”. TrueKnowledge combines natural language analysis, an internal knowledge base and external databases to offer immediate answers to various questions. Instead of just pointing you to web pages where the search engine believes it can find your answer, it will offer you an explicit answer and explain the reasoning path by which that answer was arrived at. There’s also an interesting-looking API at the center of the product. “Direct answers to humans and machine questions” is the company’s tagline.

Founder William Tunstall-Pedoe said he’s been working on the software for the past 10 years, really putting time into it since coming into initial funding in early 2005.

TripIt

TripIt is an app that manages your travel planning. Emre Sokullu reviewed it when it presented at TechCrunch40 in September. With TripIt, you forward incoming bookings to plans@tripit.com and the system manages the rest. Their patent-pending “itinerator” technology is a baby step toward the semantic web – it extracts useful information from these emails and produces a well-structured, organized presentation of your travel plan. It pulls in information from Wikipedia for the places that you visit. It uses microformats – specifically the iCal format, which is well integrated into Google Calendar and other calendar software.

The company claimed at TC40 that “instead of dealing with 20 pages of planning, you just print out 3 pages and everything is done for you”. Their future plans include a recommendation engine which will tell you where to go and who to meet.
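The “itinerator” itself is patent-pending and unpublished, but the output side is an open standard; here is a hand-rolled sketch (flight details invented) of the kind of iCal event a calendar app can ingest:

```python
# Sketch: rendering one extracted booking as an iCal VEVENT (details invented).
def to_ical(uid, summary, start, end):
    return "\r\n".join([
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//itinerary-sketch//EN",
        "BEGIN:VEVENT",
        f"UID:{uid}",
        "DTSTAMP:20081120T000000Z",
        f"SUMMARY:{summary}",
        f"DTSTART:{start}",
        f"DTEND:{end}",
        "END:VEVENT",
        "END:VCALENDAR",
    ])

print(to_ical("demo-1@example.org", "Flight SFO to JFK",
              "20081120T080000Z", "20081120T163000Z"))
```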

ClearForest

ClearForest is one of the companies in the top-down camp. We profiled the product in December ’06 and at that point ClearForest was applying its core natural language processing technology to facilitate next generation semantic applications. In April 2007 the company was acquired by Reuters. The company has both a Web Service and a Firefox extension that leverages an API to deliver the end-user application.

The Firefox extension is called Gnosis, and it enables you to “identify the people, companies, organizations, geographies and products on the page you are viewing.” With one click from the menu, a webpage you view via Gnosis is filled with various types of annotations. For example, it recognizes Companies, Countries, Industry Terms, Organizations, People, Products and Technologies. Each word that Gnosis recognizes gets colored according to its category.
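The extraction itself happens in ClearForest’s service, but the visible effect is like running text through a category-keyed entity tagger; a toy version (with a tiny hand-made gazetteer standing in for ClearForest’s real NLP models) might look like:

```python
# Toy entity tagger in the spirit of Gnosis; the gazetteer is hand-made
# for illustration and stands in for ClearForest's real NLP models.
GAZETTEER = {
    "Reuters": "Company",
    "Italy": "Country",
    "Firefox": "Product",
}

def tag_entities(text):
    """Return (word, category) pairs for every recognized word."""
    return [(w, GAZETTEER[w]) for w in text.split() if w in GAZETTEER]

print(tag_entities("Reuters ships Gnosis as an extension for Firefox"))
# [('Reuters', 'Company'), ('Firefox', 'Product')]
```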

Also, ClearForest’s Semantic Web Service offers a SOAP interface for analyzing text, documents and web pages.

Spock

Spock is a people search engine that got a lot of buzz when it launched. Alex Iskold went so far as to call it “one of the best vertical semantic search engines built so far.” According to Alex, there are four things that make their approach special:

  • The person-centric perspective of a query
  • Rich set of attributes that characterize people (geography, birthday, occupation, etc.)
  • Usage of tags as links or relationships between people
  • Self-correcting mechanism via user feedback loop

As a vertical engine, Spock knows important attributes that people have: name, gender, age, occupation and location just to name a few. Perhaps the most interesting aspect of Spock is its usage of tags – all frequent phrases that Spock extracts via its crawler become tags; and also users can add tags. So Spock leverages a combination of automated tags and people power for tagging.

Conclusion

What have we missed? 😉 Please use the comments to list other Semantic Apps you know of. It’s an exciting sector right now, because Semantic Web and Web 2.0 technologies alike are being used to create new semantic applications. One gets the feeling we’re only at the beginning of this trend.


10 More Semantic Apps to Watch

Written by Richard MacManus / November 20, 2008 10:00 AM / 16 Comments


In November 2007, we listed 10 Semantic apps to watch and yesterday we published an update on what each had achieved over the past year. All of them are still alive and well – a couple are thriving, some are experimenting and a few are still finding their way.

Now we’re going to list 10 more Semantic apps to watch. These are all apps that have gotten onto our radar over 2008. We’ve reviewed all but one of them, so click through to the individual reviews for more detail. It should go without saying, but this is by no means an exhaustive list – so if we haven’t mentioned your favorite, please add it in the comments.

BooRah

BooRah is a restaurant review site that we first reviewed earlier this year. One of BooRah’s most interesting aspects is that it uses semantic analysis and natural language processing to aggregate reviews from food blogs. Because of this, BooRah can recognize praise and criticism in these reviews and rate restaurants accordingly. BooRah also gathers reviews from Citysearch, TripAdvisor and other large review sites.

BooRah also announced last month the availability of an API that will allow other web sites and businesses to offer online reviews and ratings from BooRah to their customers. The API will surface most of BooRah’s data about a given restaurant, including ratings, menus, discounts, and coupons.

Swotti

Swotti is a semantic search engine that aggregates opinions about products to help you make purchasing decisions. We reviewed the product back in March. Swotti aggregates opinions from product review sites, forums and discussion boards, web sites and blogs, then categorizes each review by the feature or aspect of the product being discussed, tags it accordingly, and rates it as positive or negative.

Dapper MashupAds

Earlier this month we wrote about the recent improvement in Dapper MashupAds, a product we first spotted over a year ago. The idea is that publishers can tell Dapper: this is the place on my web page where the title of a movie will appear; now serve up a banner ad that’s related to whatever movie this page happens to be about. That could be movies, books, travel destinations – anything. We remarked that the UI for this has grown much more sophisticated in the past year.

How this works: in the back end, Dapper will be analyzing the fields that publishers identify and will apply a layer of semantic classification on top of them. The company believes that its new ad network will provide monetary incentive for publishers to have their websites marked up semantically. Dapper also has a product called Semantify, for SEO – see our review of that.

For more on Semantic advertising, see our write-up of a panel on this topic from the Web 3.0 Conference.

Inform.com

Inform.com analyzes content from online publishers and inserts links from a publisher’s own content archives, affiliated sites, or the web at large, to augment content being published. We reviewed it in January, when at the time the company had more than 100 clients – including CNN.com, WashingtonPost.com and the Economist.

Inform says its technology determines the semantic meaning of key words in millions of news stories around the web every day in order to recommend related content. The theory is that by automating the process of relevant link discovery and inclusion, Inform can easily add substantial value to a publisher’s content. Inform also builds out automatic topic pages, something you can see around WashingtonPost and CNN.com.

Siri

We have met our share of secretive startups over the years, but few have been as secretive about their plans as Siri, which was founded in December 2007 and did not even have an official name until October this year. Siri was spun out of SRI International and its core technology is based on the highly ambitious CALO artificial intelligence project.

In our October post on Siri, we discovered that Siri is working on a “personalized assistant that learns.” We expect Siri to have a strong information management aspect, combined with some novel interface ideas. Based on our discussion with founders Dag Kittlaus and Adam Cheyer in October, we think that there will be a strong mobile aspect to Siri’s product and at least some emphasis on location awareness. Siri plans to launch in the first half of 2009.

Evri

Evri is a semantic search engine, backed by Paul Allen (of Microsoft fame), that launched into a limited beta in June. Evri is a search engine, though it adds a very sophisticated semantic layer on top of its results that emphasizes the relationships between different search terms. It especially prides itself on having developed a system that can distinguish between grammatical roles such as subjects, verbs, and objects to create these connections. You can check out a tour of Evri here.

UpTake

Semantic search startup UpTake (formerly Kango) aims to make the process of booking travel online easier. In our review in May, we explained that UpTake is a vertical search engine that has assembled what it says is the largest database of US hotels and activities – over 400,000 of them – from more than 1,000 different travel sites. Using a top-down approach, UpTake looks at its database of over 20 million reviews, opinions, and descriptions of hotels and activities in the US and semantically extracts information about those destinations.

Imindi

Imindi is essentially a mind-mapping tool, although it markets itself as a “Thought Engine”. Imindi was recommended to us in the comments on our previous post by Yihong Ding, who called it “an untraditional Semantic Web service”. Yihong said that Semantic Web services traditionally employ machines to understand humans; Imindi’s approach, however, is to encourage humans to better understand each other via machines.

Imindi has met with a fair amount of skepticism so far – and indeed it appears to be aiming high with its AI associations. However, we think it’s worth watching, if for no other reason than to see if it can live up to the description on its About page: “By capturing the free form associations of user’s logic and intuition, IMINDI is building a global mind index which is an entirely new resource for building collective intelligence and leveraging human creativity and subjectivity on the web.”

See also: Thinkbase: Mapping the World’s Brain

Juice

We’ve all been there. You started reading something on the Web, saw something interesting in the article, searched for it, wound up somewhere else, and after about 12 hops you’ve forgotten exactly what it was you were looking for. If only there were some way to select that topic midstream and have the information automagically appear for you, without disrupting your workflow or sending you traipsing off into the wilds of the Web.

If that sounds familiar, you may need a shot of Juice, a new Firefox 3 add-on from Linkool Labs, currently in public beta, that makes researching Web content as easy as click-and-drag. In our review of Juice, we concluded that it avoids some of the more traditional stumbling blocks of Semantic apps by taking a very top-down approach focused on a distinct data set.

Faviki

Faviki is a new social bookmarking tool which we reviewed back in May. It offers something that services like Ma.gnolia, del.icio.us and Diigo do not – semantic tagging capabilities. What this means is that instead of having users haphazardly enter tags to describe the links they save, Faviki will suggest tags to be used instead. However, unlike other services, Faviki’s suggestions don’t just come from a community of users and their tagging history, but from structured information extracted straight out of the Wikipedia database.

Because Faviki uses structured tagging, there is more that can be learned about a particular tag, its properties, and its connections to other tags. The system will automatically know what tags belong together and how they relate to others.
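Mechanically, the suggestion step amounts to mapping a free-text tag onto a canonical concept; a toy version (with a hand-made lookup table standing in for the structured Wikipedia data) might look like:

```python
# Toy canonical-tag suggestion in the spirit of Faviki; the table is
# hand-made and stands in for structured data extracted from Wikipedia.
CANONICAL = {
    "nyc": "New_York_City",
    "new york": "New_York_City",
    "big apple": "New_York_City",
    "sf": "San_Francisco",
}

def suggest_tag(raw_tag):
    """Map a user's free-text tag to a canonical concept when known."""
    return CANONICAL.get(raw_tag.strip().lower(), raw_tag)

for t in ["NYC", "new york", "Big Apple"]:
    print(t, "->", suggest_tag(t))   # all three resolve to New_York_City
```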

Conclusion

The Semantic Web continues to inch closer to reality, by being used in products such as BooRah, Inform.com and Juice. Let us know your thoughts on the above 10 products, and of course any that we missed this time round.

