
Top-Down: A New Approach to the Semantic Web

Written by Alex Iskold / September 20, 2007 4:22 PM / 17 Comments


Earlier this week we wrote about the classic approach to the semantic web and the difficulties with that approach. While the original vision of a layer on top of the current web, annotating information in a way that is “understandable” by computers, is compelling, there are technical, scientific and business issues that have been difficult to address.

One of the technical difficulties that we outlined was the bottom-up nature of the classic semantic web approach. Specifically, each web site needs to annotate information in RDF, OWL, etc. in order for computers to be able to “understand” it.

As things stand today, there is little reason for web site owners to do that. The tools that would leverage the annotated information do not exist, and no clear business or consumer value has been articulated. This means there is no incentive for sites to invest money in becoming compatible with the semantic web of the future.

But there are alternative approaches. We will argue that a more pragmatic, top-down approach to the semantic web not only makes sense, but is already well on the way toward becoming a reality. Many companies have been leveraging existing, unstructured information to build vertical, semantic services. Unlike the original vision, which is rather academic, these emergent solutions are driven by business and market potential.

In this post, we will look at the solution that we call the top-down approach to the semantic web, because instead of requiring developers to change or augment the web, this approach leverages and builds on top of current web as-is.

Why Do We Need The Semantic Web?

The complexity of the original vision of the semantic web, combined with the lack of clear consumer benefits, makes the whole project unrealistic. The simple question of why we need computers to understand semantics remains largely unanswered.

While some of us think that building AI is cool, the majority of people think that AI is a little bit silly, or perhaps even unsettling. And they are right. AI for the sake of AI does not make any sense. If we are talking about building intelligent machines, and if we need to spend money and energy annotating all the information in the world for them, then there needs to be a very clear benefit.

Stated the way it is, the semantic web becomes a vision in search of a reason. What if the problem was restated from the consumer point of view? Here is what we are really looking forward to with the semantic web:

 

  • Spend less time searching
  • Spend less time looking at things that do not matter
  • Spend less time explaining what we want to computers

 

A consumer focus and clear benefit for businesses needs to be there in order for the semantic web vision to be embraced by the marketplace.

What If The Problem Is Not That Hard?

If all we are trying to do is help people improve their online experiences, perhaps full “understanding” of semantics by computers is not even necessary. The best online search tool today is Google, which is based, essentially, on statistical frequency analysis rather than semantics. Solutions that attempt to improve on Google by focusing on generalized semantics have so far struggled to do so.

The truth is that the understanding of natural language by computers is a really hard problem. We have the language ingrained in our genes. We learn language as we grow up. We learn things iteratively. We have the chance to clarify things when we do not understand them. None of this is easily replicated with computers.

But what if it is not even necessary to build the first generation of semantic tools? What if instead of trying to teach computers natural language, we hard-wired into computers the concepts of everyday things like books, music, movies, restaurants, stocks and even people. Would that help us be more productive and find things faster?

Simple Semantics: Nouns And Verbs

When we think about a book, we think about a handful of things: title and author, maybe genre and the year it was published. Typically, though, we couldn't care less about the publisher, edition or number of pages. Similarly, recipes provoke thoughts about cuisine and ingredients, while movies make us think about the plot, director, and stars.

When we think of people, we also think about a handful of things: birthday, where they live, how we're related to them, and so on. The profiles found on popular social networks are great examples of simple semantics based around people.

Books, people, recipes and movies are all examples of nouns. The things that we do on the web around these nouns, such as looking up similar books, finding more people who work for the same company, getting more recipes from the same chef and looking up pictures of movie stars, are similar to verbs in everyday language. These are contextual actions based on an understanding of the noun.

What if semantic applications hard-wired recognition and understanding of the nouns, and then also hard-wired the verbs that make sense? We are actually well on our way to doing just that. Vertical search engines like Spock, Retrevo and ZoomInfo, the page-annotating technology from ClearForest, Dapper, and the Map+ extension for Firefox are just a few examples of top-down semantic web services.
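The idea of hard-wiring a noun and its verbs can be sketched in a few lines of Python. This is purely an illustration: the `Book` fields and the `similar_books` matching rule are invented here, not taken from any of the services named above.

```python
from dataclasses import dataclass

@dataclass
class Book:
    """A hard-wired 'noun': the handful of fields people actually care about."""
    title: str
    author: str
    genre: str
    year: int

def similar_books(book, catalog):
    """A hard-wired 'verb': find books sharing an author or a genre."""
    return [b for b in catalog if b is not book
            and (b.author == book.author or b.genre == book.genre)]

catalog = [
    Book("Emma", "Jane Austen", "novel", 1815),
    Book("Persuasion", "Jane Austen", "novel", 1817),
    Book("On Liberty", "J.S. Mill", "essay", 1859),
]
print([b.title for b in similar_books(catalog[0], catalog)])  # ['Persuasion']
```

The point of the sketch is that nothing here "understands" literature; a few fixed fields plus a fixed action are enough to deliver a useful feature.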

The Top-Down Semantic Web Service

The essence of a top-down semantic web service is simple: leverage existing web information, apply specific, vertical semantic knowledge, and then redeliver the results via a consumer-centric application. Consider the vertical search engine Spock, which scans the web for information about people. It knows how to recognize names in HTML pages, and it looks for attributes that all people share: birthdays, locations, marital status, and so on. In addition, Spock “understands” that people relate to each other. If you look up Bush, then Clinton will show up as a predecessor. If you look up Steve Jobs, then Bill Gates will come up as a rival.

In other words, Spock takes simple, everyday semantics about people and applies them to information that already exists online. The result? A unique and useful vertical search engine for people. Further, note that Spock does not require the information to be re-annotated in RDF or OWL. Instead, the company builds adapters that use heuristics to extract the data. The engine does not actually have a full understanding of semantics about people, however. For example, it does not know that people like different kinds of ice cream, but it doesn't need to. The point is that by focusing on simple semantics, Spock is able to deliver a useful end-user service.

Another, much simpler, example is the Map+ add-on for Firefox. This application recognizes addresses and provides a map popup using Yahoo! Maps. It is precisely the simplicity of this application that conveys the power of simple semantics. The add-on “knows” what addresses look like. Sure, sometimes it makes mistakes, but most of the time it tags addresses in online documents properly. So it leverages existing information and then provides direct end-user utility by mashing it up with Yahoo! Maps.
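A minimal sketch of the kind of pattern matching an address recognizer might use is below. The regular expression is an invented illustration, far cruder than whatever Map+ actually ships, but it shows how "knowing what addresses look like" can be mostly pattern, not understanding.

```python
import re

# Rough US street-address pattern: a number, one to three capitalized
# words, then a common street type. Real recognizers handle far more.
ADDRESS = re.compile(
    r"\b\d{1,5}\s+(?:[A-Z][a-z]+\s+){1,3}"
    r"(?:St|Ave|Blvd|Rd|Dr|Lane|Way)\b"
)

text = "Visit us at 1600 Pennsylvania Ave, or email us instead."
for match in ADDRESS.finditer(text):
    print(match.group())  # 1600 Pennsylvania Ave
```

As the post notes, such heuristics sometimes misfire, but they work often enough to be useful without any RDF annotation on the page.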

The Challenges Facing The Top-Down Approach

Despite being effective, the somewhat simplistic top-down approach has several problems. First, it is not really the semantic web as originally defined; instead, it's a group of semantic web services and applications that create utility by leveraging simple semantics. So proponents of the classic approach would protest, and they would be right. Another issue is that these services do not always get semantics right, because of ambiguities. Because the recognition is algorithmic rather than based on an underlying RDF representation, it is not perfect.

It seems to me that it is better to have simpler solutions that work 90% of the time than complex ones that never arrive. The key questions here are: How exactly are mistakes handled? And, is there a way for the user to correct the problem? The answers will be left up to the individual application. In life we are used to other people being unpredictable, but with computers, at least in theory, we expect things to work the same every time.

Yet another issue is that these simple solutions may not scale well. If the underlying unstructured data changes, can the algorithms be updated quickly enough? This is always an issue with things that sit on top of other things without an API. Of course, if more web sites had APIs, as we have previously suggested, building the top-down semantic web would be much easier and more certain.

Conclusion

While the original vision of the semantic web is grandiose and inspiring, in practice it has been difficult to achieve because of engineering, scientific and business challenges. The lack of a specific, simple consumer focus makes it mostly an academic exercise. In the meantime, existing data is being leveraged by applying simple heuristics and making assumptions about particular verticals. What we have dubbed top-down semantic web applications have been appearing online, improving end-user experiences by leveraging semantics to deliver real, tangible services.

Will the bottom-up semantic web ever happen? Possibly. But at the moment the precise path to get there is not clear. In the meantime, we can all enjoy a better online experience and get where we need to go faster thanks to simple top-down semantic web services.



Yahoo to Enable Custom Semantic Search Engines

Written by Marshall Kirkpatrick / February 11, 2009 9:14 AM / 2 Comments


Yahoo is bringing together two of its most interesting projects today, Yahoo BOSS (Build Your Own Search Service) and SearchMonkey, its semantic indexing and search result enhancement service. There were a number of different parts of the announcement – but the core of the story is simple.

Developers will now be able to build their own search engines using the Yahoo! index and search processing infrastructure via BOSS and include the semantic markup added to pages in both results parsing and the display of those results. There’s considerable potential here for some really dazzling results.

We wrote about the genesis of SearchMonkey this spring; it's an incredibly ambitious project. The end result is rich search results, where additional dynamic data from marked-up fields can be displayed on the search results page itself. So searching for a movie will show not just web pages associated with that movie, but additional details from those pages, like movie ratings, stars, etc. There are all kinds of possibilities for all kinds of data.

Is anyone using Yahoo! BOSS yet? Anyone who will be able to leverage Search Monkey for a better experience right away? Yahoo is encouraging developers to tag their projects bossmashup in Delicious. As you can see for yourself, there are a number of interesting proofs of concept there but not a whole lot of products. Of the products that are there, very few seem terribly compelling to us so far.

We must admit that the most compelling BOSS implementation so far is over at the site of our competitors TechCrunch. Their new blog network search implementation of BOSS is beautiful – you can see easily, for example, that TechCrunch network blogs have used the word ReadWriteWeb 7 times in the last 6 months. (In case you were wondering.)

Speaking of TechCrunch, that site’s Mark Hendrickson covered the Yahoo BOSS/Search Monkey announcement today as well, and having worked closely on the implementation there he’s got an interesting perspective on it. He points out that the new pricing model, free up to 10,000 queries a day, will likely only impact a handful of big sites – not BOSS add-ons like TechCrunch search or smaller projects.

The other interesting part of the announcement is that BOSS developers will now be allowed to use 3rd party ads on their pages leveraging BOSS, not just Yahoo ads. That's hopeful.

Can Yahoo do it? Can these two projects, brought together, lead to awesome search mashups all over the web? We've had very high hopes in the past. Now the proof will be in the pudding.



Report: Semantic Web Companies Are, or Will Soon Begin, Making Money

Written by Marshall Kirkpatrick / October 3, 2008 5:13 PM / 14 Comments


Semantic Web entrepreneur David Provost has published a report about the state of business in the Semantic Web, and it's a good read for anyone interested in the sector. It's titled On the Cusp: A Global Review of the Semantic Web Industry. We also mentioned it in our post Where Are All The RDF-based Semantic Web Apps?.

The Semantic Web is a collection of technologies that makes the meaning of content online understandable by machines. After surveying 17 Semantic Web companies, Provost concludes that Semantic science is being productized, differentiated, invested in by mainstream players and increasingly sought after in the business world.

Provost aims to use real-world examples to articulate the value proposition of the Semantic Web in accessible, non-technical language. That there are enough examples available for him to do this is great. His conclusions don’t always seem as well supported by his evidence as he’d like – but the profiles he writes of 17 Semantic Web companies are very interesting to read.

What are these companies doing? Provost writes:

“..some companies are beginning to focus on specific uses of Semantic technology to create solutions in areas like knowledge management, risk management, content management and more. This is a key development in the Semantic Web industry because until fairly recently, most vendors simply sold development tools.”

 

The report surveys companies ranging from the innovative but unlaunched Anzo for Excel from Cambridge Semantics, to well-known big players like Dow Jones Client Solutions and RWW sponsor Reuters' Calais Initiative, to relatively unknown big players like the already very commercialized Expert System. Ten of the companies were from the US, six from Europe and one from South Korea.

Above: Chart from Provost's report.

We've been wanting to learn more about “under the radar” but commercialized semantic web companies ever since doing a briefing with Expert System a few months ago. We had never heard of the Italian company before, but they believe they already have a richer, deeper semantic index than anyone else online. They told us their database at the time contained 350k English words and 2.8m relationships between them, including geographic representations. They power Microsoft's spell checker and the Natural Language Processing (NLP) in the BlackBerry. They also sell NLP software to the US military and Department of Homeland Security, which didn't seem like anything to brag about to us, but presumably makes up a significant part of the $12 million+ in revenue they told Provost they made last year.

And some people say the Semantic Web only exists inside the laboratories of Web 3.0 eggheads!

Shortcomings of the Report

Provost writes that “the vendors [in] this report have all the appearances of thriving, emerging technology companies and they have shown their readiness to cross borders, continents, and oceans to reach customers.” You’d think they turned water into wine. Those are strong words for a study in which only 4 of 17 companies were willing to report their revenue and several hadn’t launched products yet.

The logic here is sometimes pretty amazing.

The above examples [there were two discussed – RWW] are just a brief sampling of the commercial success that the Semantic Web has been experiencing. In broad terms, it’s easy to point out the longevity of many companies in this industry and use that as a proxy for commercial success [wow – RWW]. With more time (and space in this report), additional examples could be described but the most interesting prospect pertains to what the industry landscape will look like in twelve months. [hmmm…-RWW]

 

In fact, while Provost has glowingly positive things to say about all the companies he surveyed, the absence of engagement with any of their shortcomings makes the report read more like marketing material than an objective take on what's supposed to be world-changing technology.

This is a Fun Read

The fact is, though, that Provost writes a great introduction to many companies working to sell software in a field still too widely believed to be ephemeral. The stories of each of the 17 companies profiled are fun to read and many of Provost’s points of analysis are both intuitive and thought provoking.

He says the sector is “on the cusp” of major penetration into existing markets currently served by non-semantic software. Provost argues that the Semantic Web struggles to explain itself because the World Wide Web is so intensely visual and semantics are not. He says that reselling business partners in specific distribution channels are combining their domain knowledge with the science of the software developers to bring these tools to market. He tells a great, if unattributed, story about what Linked Data could mean to the banking industry.

We hadn’t heard of several of the companies profiled in the report, and a handful of them had never been mentioned by the 34 semantic web specialist blogs we track, either.

There’s something here for everyone. You can read the full report here.


Google: “We’re Not Doing a Good Job with Structured Data”

Written by Sarah Perez / February 2, 2009 7:32 AM / 9 Comments


During a talk at the New England Database Day conference at the Massachusetts Institute of Technology, Google’s Alon Halevy admitted that the search giant has “not been doing a good job” presenting the structured data found on the web to its users. By “structured data,” Halevy was referring to the databases of the “deep web” – those internet resources that sit behind forms and site-specific search boxes, unable to be indexed through passive means.

Google’s Deep Web Search

Halevy, who heads the “Deep Web” search initiative at Google, described the “Shallow Web” as containing about 5 million web pages while the “Deep Web” is estimated to be 500 times the size. This hidden web is currently being indexed in part by Google’s automated systems that submit queries to various databases, retrieving the content found for indexing. In addition to that aspect of the Deep Web – dubbed “vertical searching” – Halevy also referenced two other types of Deep Web Search: semantic search and product search.

Google wants to also be able to retrieve the data found in structured tables on the web, said Halevy, citing a table on a page listing the U.S. presidents as an example. There are 14 billion such tables on the web, and, after filtering, about 154 million of them are interesting enough to be worth indexing.
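Harvesting such tables can be sketched with Python's standard-library HTML parser. This is a toy illustration of the extraction step only; Google's actual pipeline (and its filtering of the 14 billion tables down to the interesting ones) is certainly far more involved.

```python
from html.parser import HTMLParser

class TableExtractor(HTMLParser):
    """Pull rows of cell text out of <table> markup, the kind of
    structured data Halevy says Google wants to index."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], None, False
    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True
    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False
    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

html = """<table>
<tr><th>President</th><th>Took office</th></tr>
<tr><td>George Washington</td><td>1789</td></tr>
</table>"""
p = TableExtractor()
p.feed(html)
print(p.rows)  # [['President', 'Took office'], ['George Washington', '1789']]
```

The extracted header row plus data rows are exactly the kind of (attribute, value) structure a search engine can index and surface directly in results.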

Can Google Dig into the Deep Web?

The question that remains is whether or not Google's current search engine technology is going to be adept at doing all the different types of Deep Web indexing or if they will need to come up with something new. As of now, Google uses the BigTable database and MapReduce framework for everything search-related, notes Alex Esterkin, Chief Architect at Infobright, Inc., a company delivering open source data warehousing solutions. During the talk, Halevy listed a number of analytical database application challenges that Google is currently dealing with: schema auto-complete, synonym discovery, creating entity lists, association between instances and aspects, and data-level synonym discovery. These challenges are addressed by Infobright's technology, said Esterkin, but “Google will have to solve these problems the hard way.”

Also mentioned during the speech was how Google plans to organize “aspects” of search queries. The company wants to be able to separate exploratory queries (e.g., “Vietnam travel”) from ones where a user is in search of a particular fact (“Vietnam population”). The former query should deliver information about visa requirements, weather and tour packages, etc. In a way, this is like what the search service offered by Kosmix is doing. But Google wants to go further, said Halevy. “Kosmix will give you an ‘aspect,’ but it’s attached to an information source. In our case, all the aspects might be just Web search results, but we’d organize them differently.”

Yahoo Working on Similar Structured Data Retrieval

The challenges facing Google today are also being addressed by their nearest competitor in search, Yahoo. In December, Yahoo announced that they were taking their SearchMonkey technology in-house to automate the extraction of structured information from large classes of web sites. The results of that in-house extraction technique will allow Yahoo to augment their Yahoo Search results with key information returned alongside the URLs.

In this aspect of web search, it's clear that no single company has yet come to dominate. However, even if a non-Google company surges ahead, it may not be enough to get people to switch engines. Today, “Google” has become synonymous with web search, just as “Kleenex” means tissue, “Band-Aid” means adhesive bandage, and “Xerox” means making photocopies. Once that psychological mark has been made on our collective psyche and the habit formed, people tend to stick with what they know, regardless of who does it better. That's a bit troublesome: if better technology for indexing the Deep Web emerges outside of Google, the world may not end up using it until Google either duplicates or acquires the invention.

Still, it’s far too soon to write Google off yet. They clearly have a lead when it comes to search and that came from hard work, incredibly smart people, and innovative technical achievements. No doubt they can figure out this Deep Web thing, too. (We hope).



Murdoch Calls Google, Yahoo Copyright Thieves — Is He Right?

By David Kravets / April 03, 2009 5:00 PM / Categories: Intellectual Property

Rupert Murdoch, the owner of News Corp. and The Wall Street Journal, says Google and Yahoo are giant copyright scofflaws that steal the news.

“The question is, should we be allowing Google to steal all our copyright … not steal, but take,” Murdoch says. “Not just them, but Yahoo.”

But whether search-engine news aggregation is theft or a protected fair use under copyright law is unclear, even as Google and Yahoo profit tremendously from linking to news. So maybe Murdoch is right.

Murdoch made his comments late Thursday during an address at the Cable Show, an industry event held in Washington. He seemingly was blaming the web, and search engines, for the news media’s ills.

“People reading news for free on the web, that’s got to change,” he said.

Real estate magnate Sam Zell made similar comments in 2007 when he took over the Tribune Company and ran it into bankruptcy.

We suspect Zell and Murdoch are just blowing smoke. If they were not, perhaps they could demand Google and Yahoo remove their news content. The search engines would kindly oblige.

Better yet, if Murdoch and Zell are so set on monetizing their web content, they should sue the search engines and claim copyright violations in a bid to get the engines to pay for the content.

The outcome of such a lawsuit is far from clear.

It's unsettled whether search engines have a valid fair use claim under the Digital Millennium Copyright Act. The news headlines are copied verbatim, as are some of the snippets that go along with them.

Fred von Lohmann of the Electronic Frontier Foundation points out that “There’s not a rock-solid ruling on the question.”

Should the search engines pay up for the content? Tell us what you think.


 


 

Special Report / April 9, 2007, 12:01AM EST

Q&A with Tim Berners-Lee

The inventor of the Web explains how the new Semantic Web could have profound effects on the growth of knowledge and innovation

Tim Berners-Lee is far from finished with the World Wide Web. Having invented the Web in 1989, he’s now working on ways to make it a whole lot smarter.

For the last decade or so, as director of the World Wide Web Consortium (W3C), Berners-Lee has been working on an effort he’s dubbed the “Semantic Web.” At the heart of the Semantic Web is technology that makes it easier for people to find and correlate the information they need, whether that data resides on a Web site, in a corporate database, or in desktop software.

The Semantic Web, as Berners-Lee envisions it, represents a change so profound that it’s not always easy for others to grasp. This isn’t the first time he’s encountered that problem. “It was really hard explaining the Web before people just got used to it because they didn’t even have words like click and jump and page,” Berners-Lee says. In a recent conversation with BusinessWeek.com writer Rachael King, Berners-Lee discussed his vision for the Semantic Web and how it can alter the way companies operate. Edited excerpts follow.

It seems one of the problems the Semantic Web can solve is helping unlock information in various silos, in different software applications, and different places that currently cannot be connected easily.

Exactly. When you use the word “silos,” that’s the word we hear when somebody in the enterprise talks about the stovepipe problem. Different words for the same problem: that business information inside the company is managed by different sorts of software, and you have to go to a different person and learn a different program to see it. Any enterprise CEO really ought to be able to ask a question that involves connecting data across the organization, be able to run a company effectively, and especially to be able to respond to unexpected events. Most organizations are missing this ability to connect all the data together.

Even outside data can be integrated, as I understand it.

Absolutely. Anybody making real decisions uses data from many sources, produced by many sorts of organizations, and we’re stymied. We tend to have to use backs of envelopes to do this and people have to put data in spreadsheets, which they painfully prepare. In a way, the Semantic Web is a bit like having all the databases out there as one big database. It’s difficult to imagine the power that you’re going to have when so many different sorts of data are available.
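Berners-Lee's "one big database" idea can be sketched as a toy triple store: facts from different silos expressed in one uniform shape and queried together. All names and facts below are invented for illustration.

```python
# Facts from different "silos," all reduced to (subject, predicate, object).
triples = [
    ("alice",  "works_for", "acme"),    # from an HR database
    ("bob",    "works_for", "acme"),
    ("acme",   "sold_to",   "globex"),  # from a sales spreadsheet
    ("globex", "based_in",  "Berlin"),  # from a public registry
]

def query(s=None, p=None, o=None):
    """Match triples against a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# A cross-silo question: who works for a company that sold
# to a customer based in Berlin?
for company, _, buyer in query(p="sold_to"):
    if query(s=buyer, p="based_in", o="Berlin"):
        print([person for person, _, _ in query(p="works_for", o=company)])
```

Because every source is reduced to the same triple shape, a single query can span HR, sales and registry data without anyone painfully re-keying it into a spreadsheet, which is the point of the answer above.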

It seems to me that we’re overwhelmed with data and this might be a good way to help us find the data we need.

When you can treat something as data, your querying can be much more powerful.

In your speech at Princeton last year, you said that maybe you had made a mistake in naming it the Semantic Web. Do you think the name confuses some people?

I don’t think it’s a very good name but we’re stuck with it now. The word semantics is used by different groups to mean different things. But now people understand that the Semantic Web is the Data Web. I think we could have called it the Data Web. It would have been simpler. I got in a lot of trouble for calling the World Wide Web “www” because it was so long and difficult to pronounce. At the end, when people understand what it is, they understand that it connects all applications together or gives them access to data across the company when they see a few general Semantic Web applications.

Some of the early work with the Semantic Web seems to have been done by government agencies such as the Defense Advanced Research Projects Agency and the National Aeronautics & Space Administration. Why do you think the government has been an early adopter of this technology?

I understand that DARPA had its own serious problems with huge amounts of data from all different sources about all sorts of things. So, they saw the Semantic Web rightly as something that was aimed directly at solving the problems they had on a large scale. I know that DARPA then funded some of the early development.

You have touched on the idea that the Semantic Web will make it easier to discover cures for diseases. How will it do that?

Well, when a drug company looks at a disease, they take the specific symptoms that are connected with specific proteins inside a human cell which might lead to those symptoms. So the art of finding the drug is to find the chemical that will interfere with the bad things happening and encourage the good things happening inside the cell, which involves understanding the genetics and all the connections between the proteins and the symptoms of the disease.

It also requires looking at all the other connections, whether there are federal regulations about the use of the protein and how it’s been used before. We’ve got government regulatory information, clinical trial data, the genomics data, and the proteomics data that are all in different departments and different pieces of software. A scientist who is going through that creative process of brainstorming to find something that could possibly solve the disease has to somehow keep everything in their head at the same time or be able to explore all these different axes in a connected way. The Semantic Web is a technology designed to specifically do that—to open up the boundaries between the silos, to allow scientists to explore hypotheses, to look at how things connect in new combinations that have never before been dreamt of.

The Semantic Web makes it so much easier to find and correlate information about nearly anything, including people. What happens if that information gets into the wrong hands? Is there anything that can be done to safeguard privacy?

Here at [MIT], we are doing research and building systems that are aware of the social issues. They are aware of privacy constraints, of the appropriate uses of information. We think it’s important to build systems that help you do the right thing, but also we’re building systems that, when they take data from many, many sources and combine it and allow you to come to a conclusion, are transparent in the sense that you can ask them what they based their decision on and they can go back and you can check if these are things that are appropriate to use and that you feel are trustworthy.

Developing Semantic Web standards has taken years. Has it taken a long time because the Semantic Web is so complex?

The Semantic Web isn’t inherently complex. The Semantic Web language, at its heart, is very, very simple. It’s just about the relationships between things.


WIRED MAGAZINE: 16.03


Free! Why $0.00 Is the Future of Business

By Chris Anderson / 02.25.08 12:00 AM

At the age of 40, King Gillette was a frustrated inventor, a bitter anticapitalist, and a salesman of cork-lined bottle caps. It was 1895, and despite ideas, energy, and wealthy parents, he had little to show for his work. He blamed the evils of market competition. Indeed, the previous year he had published a book, The Human Drift, which argued that all industry should be taken over by a single corporation owned by the public and that millions of Americans should live in a giant city called Metropolis powered by Niagara Falls. His boss at the bottle cap company, meanwhile, had just one piece of advice: Invent something people use and throw away.

One day, while he was shaving with a straight razor that was so worn it could no longer be sharpened, the idea came to him. What if the blade could be made of a thin metal strip? Rather than spending time maintaining the blades, men could simply discard them when they became dull. A few years of metallurgy experimentation later, the disposable-blade safety razor was born. But it didn't take off immediately. In its first year, 1903, Gillette sold a total of 51 razors and 168 blades. Over the next two decades, he tried every marketing gimmick he could think of. He put his own face on the package, making him both legendary and, some people believed, fictional. He sold millions of razors to the Army at a steep discount, hoping the habits soldiers developed at war would carry over to peacetime. He sold razors in bulk to banks so they could give them away with new deposits (“shave and save” campaigns). Razors were bundled with everything from Wrigley's gum to packets of coffee, tea, spices, and marshmallows. The freebies helped to sell those products, but the tactic helped Gillette even more. By giving away the razors, which were useless by themselves, he was creating demand for disposable blades.
A few billion blades later, this business model is now the foundation of entire industries: Give away the cell phone, sell the monthly plan; make the videogame console cheap and sell expensive games; install fancy coffeemakers in offices at no charge so you can sell managers expensive coffee sachets.


Thanks to Gillette, the idea that you can make money by giving something away is no longer radical. But until recently, practically everything “free” was really just the result of what economists would call a cross-subsidy: You’d get one thing free if you bought another, or you’d get a product free only if you paid for a service.

Over the past decade, however, a different sort of free has emerged. The new model is based not on cross-subsidies — the shifting of costs from one product to another — but on the fact that the cost of products themselves is falling fast. It’s as if the price of steel had dropped so close to zero that King Gillette could give away both razor and blade, and make his money on something else entirely. (Shaving cream?)

You know this freaky land of free as the Web. A decade and a half into the great online experiment, the last debates over free versus pay online are ending. In 2007 The New York Times went free; this year, so will much of The Wall Street Journal. (The remaining fee-based parts, new owner Rupert Murdoch announced, will be “really special … and, sorry to tell you, probably more expensive.” This calls to mind one version of Stewart Brand’s original aphorism from 1984: “Information wants to be free. Information also wants to be expensive … That tension will not go away.”)

Once a marketing gimmick, free has emerged as a full-fledged economy. Offering free music proved successful for Radiohead, Trent Reznor of Nine Inch Nails, and a swarm of other bands on MySpace that grasped the audience-building merits of zero. The fastest-growing parts of the gaming industry are ad-supported casual games online and free-to-try massively multiplayer online games. Virtually everything Google does is free to consumers, from Gmail to Picasa to GOOG-411.

The rise of “freeconomics” is being driven by the underlying technologies that power the Web. Just as Moore’s law dictates that a unit of processing power halves in price every 18 months, the price of bandwidth and storage is dropping even faster. Which is to say, the trend lines that determine the cost of doing business online all point the same way: to zero.
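The shape of those trend lines is easy to sketch numerically. Here is a minimal illustration of costs that halve on a fixed schedule; the starting prices and halving periods are hypothetical assumptions, chosen only to show how quickly repeated halving approaches zero:

```python
# Illustrative only: project unit costs that halve on a fixed schedule.
# Starting costs ($100) and halving periods are hypothetical assumptions.

def projected_cost(start_cost, halving_months, months_elapsed):
    """Cost after repeated halvings (a Moore's-law-style decline)."""
    return start_cost * 0.5 ** (months_elapsed / halving_months)

# Assume processing halves every ~18 months; storage is assumed faster here.
for years in (0, 5, 10):
    months = years * 12
    cpu = projected_cost(100.0, 18, months)
    storage = projected_cost(100.0, 12, months)
    print(f"year {years:2d}: compute ${cpu:8.2f}, storage ${storage:8.2f}")
```

Whatever the starting price, the exponential decay dominates: within a decade each line is a rounding error away from zero.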

But tell that to the poor CIO who just shelled out six figures to buy another rack of servers. Technology sure doesn’t feel free when you’re buying it by the gross. Yet if you look at it from the other side of the fat pipe, the economics change. That expensive bank of hard drives (fixed costs) can serve tens of thousands of users (marginal costs). The Web is all about scale, finding ways to attract the most users for centralized resources, spreading those costs over larger and larger audiences as the technology gets more and more capable. It’s not about the cost of the equipment in the racks at the data center; it’s about what that equipment can do. And every year, like some sort of magic clockwork, it does more and more for less and less, bringing the marginal costs of technology in the units that we individuals consume closer to zero.

Photo Illustration: Jeff Mermelstein

As much as we complain about how expensive things are getting, we’re surrounded by forces that are making them cheaper. Forty years ago, the principal nutritional problem in America was hunger; now it’s obesity, for which we have the Green Revolution to thank. Forty years ago, charity was dominated by clothing drives for the poor. Now you can get a T-shirt for less than the price of a cup of coffee, thanks to China and global sourcing. So too for toys, gadgets, and commodities of every sort. Even cocaine has pretty much never been cheaper (globalization works in mysterious ways).

Digital technology benefits from these dynamics and from something else even more powerful: the 20th-century shift from Newtonian to quantum machines. We’re still just beginning to exploit atomic-scale effects in revolutionary new materials — semiconductors (processing power), ferromagnetic compounds (storage), and fiber optics (bandwidth). In the arc of history, all three substances are still new, and we have a lot to learn about them. We are just a few decades into the discovery of a new world.

What does this mean for the notion of free? Well, just take one example. Last year, Yahoo announced that Yahoo Mail, its free webmail service, would provide unlimited storage. Just in case that wasn’t totally clear, that’s “unlimited” as in “infinite.” So the market price of online storage, at least for email, has now fallen to zero (see “Webmail Windfall”). And the stunning thing is that nobody was surprised; many had assumed infinite free storage was already the case.

For good reason: It’s now clear that practically everything Web technology touches starts down the path to gratis, at least as far as we consumers are concerned. Storage now joins bandwidth (YouTube: free) and processing power (Google: free) in the race to the bottom. Basic economics tells us that in a competitive market, price falls to the marginal cost. There’s never been a more competitive market than the Internet, and every day the marginal cost of digital information comes closer to nothing.

One of the old jokes from the late-’90s bubble was that there are only two numbers on the Internet: infinity and zero. The first, at least as it applied to stock market valuations, proved false. But the second is alive and well. The Web has become the land of the free.

The result is that we now have not one but two trends driving the spread of free business models across the economy. The first is the extension of King Gillette’s cross-subsidy to more and more industries. Technology is giving companies greater flexibility in how broadly they can define their markets, allowing them more freedom to give away products or services to one set of customers while selling to another set. Ryanair, for instance, has disrupted its industry by defining itself more as a full-service travel agency than a seller of airline seats (see “How Can Air Travel Be Free?”).

The second trend is simply that anything that touches digital networks quickly feels the effect of falling costs. There’s nothing new about technology’s deflationary force, but what is new is the speed at which industries of all sorts are becoming digital businesses and thus able to exploit those economics. When Google turned advertising into a software application, a classic services business formerly based on human economics (things get more expensive each year) switched to software economics (things get cheaper). So, too, for everything from banking to gambling. The moment a company’s primary expenses become things based in silicon, free becomes not just an option but the inevitable destination.

WASTE AND WASTE AGAIN
Forty years ago, Caltech professor Carver Mead identified the corollary to Moore’s law of ever-increasing computing power. Every 18 months, Mead observed, the price of a transistor would halve. And so it did, going from tens of dollars in the 1960s to approximately 0.000001 cent today for each of the transistors in Intel’s latest quad-core. This, Mead realized, meant that we should start to “waste” transistors.

Waste is a dirty word, and that was especially true in the IT world of the 1970s. An entire generation of computer professionals had been taught that their job was to dole out expensive computer resources sparingly. In the glass-walled facilities of the mainframe era, these systems operators exercised their power by choosing whose programs should be allowed to run on the costly computing machines. Their role was to conserve transistors, and they not only decided what was worthy but also encouraged programmers to make the most economical use of their computer time. As a result, early developers devoted as much code as possible to running their core algorithms efficiently and gave little thought to user interface. This was the era of the command line, and the only conceivable reason someone might have wanted to use a computer at home was to organize recipe files. In fact, the world’s first personal computer, a stylish kitchen appliance offered by Honeywell in 1969, came with integrated counter space.


And here was Mead, telling programmers to embrace waste. They scratched their heads — how do you waste computer power? It took Alan Kay, an engineer working at Xerox’s Palo Alto Research Center, to show them. Rather than conserve transistors for core processing functions, he developed a computer concept — the Dynabook — that would frivolously deploy silicon to do silly things: draw icons, windows, pointers, and even animations on the screen. The purpose of this profligate eye candy? Ease of use for regular folks, including children. Kay’s work on the graphical user interface became the inspiration for the Xerox Alto, and then the Apple Macintosh, which changed the world by opening computing to the rest of us. (We, in turn, found no shortage of things to do with it; tellingly, organizing recipes was not high on the list.)

Of course, computers were not free then, and they are not free today. But what Mead and Kay understood was that the transistors in them — the atomic units of computation — would become so numerous that on an individual basis, they’d be close enough to costless that they might as well be free. That meant software writers, liberated from worrying about scarce computational resources like memory and CPU cycles, could become more and more ambitious, focusing on higher-order functions such as user interfaces and new markets such as entertainment. And that meant software of broader appeal, which brought in more users, who in turn found even more uses for computers. Thanks to that wasteful throwing of transistors against the wall, the world was changed.

What’s interesting is that transistors (or storage, or bandwidth) don’t have to be completely free to invoke this effect. At a certain point, they’re cheap enough to be safely disregarded. The Greek philosopher Zeno wrestled with this concept in a slightly different context. In Zeno’s dichotomy paradox, you run toward a wall. As you run, you halve the distance to the wall, then halve it again, and so on. But if you continue to subdivide space forever, how can you ever actually reach the wall? (The answer is that you can’t: Once you’re within a few nanometers, atomic repulsion forces become too strong for you to get any closer.)

In economics, the parallel is this: If the unitary cost of technology (“per megabyte” or “per megabit per second” or “per thousand floating-point operations per second”) is halving every 18 months, when does it come close enough to zero to say that you’ve arrived and can safely round down to nothing? The answer: almost always sooner than you think.
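That “sooner than you think” claim can be checked with a few lines of arithmetic. This sketch (the starting price and the “negligible” threshold are hypothetical) counts how long a cost halving every 18 months takes to round down to nothing:

```python
def halvings_until(start_cost, threshold, halving_months=18):
    """Months until a cost, halving on a fixed schedule, falls below a threshold."""
    months = 0
    cost = start_cost
    while cost >= threshold:
        cost /= 2
        months += halving_months
    return months

# Hypothetical: a unit that costs $10 today, "negligible" below a hundredth of a cent.
months = halvings_until(10.0, 0.0001)
print(f"{months} months ({months / 12:.1f} years)")
```

Even a five-order-of-magnitude drop takes only a couple of dozen years of steady halving, which is why the pioneers of each technology generation keep being surprised by how soon "cheap" becomes "free."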

What Mead understood is that a psychological switch should flip as things head toward zero. Even though they may never become entirely free, as the price drops there is great advantage to be had in treating them as if they were free. Not too cheap to meter, as Atomic Energy Commission chief Lewis Strauss said in a different context, but too cheap to matter. Indeed, the history of technological innovation has been marked by people spotting such price and performance trends and getting ahead of them.

From the consumer’s perspective, though, there is a huge difference between cheap and free. Give a product away and it can go viral. Charge a single cent for it and you’re in an entirely different business, one of clawing and scratching for every customer. The psychology of “free” is powerful indeed, as any marketer will tell you.

This difference between cheap and free is what venture capitalist Josh Kopelman calls the “penny gap.” People think demand is elastic and that volume falls in a straight line as price rises, but the truth is that zero is one market and any other price is another. In many cases, that’s the difference between a great market and none at all.

The huge psychological gap between “almost zero” and “zero” is why micropayments failed. It’s why Google doesn’t show up on your credit card. It’s why modern Web companies don’t charge their users anything. And it’s why Yahoo gives away disk drive space. The question of infinite storage was not if but when. The winners made their stuff free first.

Traditionalists wring their hands about the “vaporization of value” and “demonetization” of entire industries. The success of craigslist’s free listings, for instance, has hurt the newspaper classified ad business. But that lost newspaper revenue is certainly not ending up in the craigslist coffers. In 2006, the site earned an estimated $40 million from the few things it charges for. That’s about 12 percent of the $326 million by which classified ad revenue declined that year.

But free is not quite as simple — or as stupid — as it sounds. Just because products are free doesn’t mean that someone, somewhere, isn’t making huge gobs of money. Google is the prime example of this. The monetary benefits of craigslist are enormous as well, but they’re distributed among its tens of thousands of users rather than funneled straight to Craig Newmark Inc. To follow the money, you have to shift from a basic view of a market as a matching of two parties — buyers and sellers — to a broader sense of an ecosystem with many parties, only some of which exchange cash.

The most common of the economies built around free is the three-party system. Here a third party pays to participate in a market created by a free exchange between the first two parties. Sound complicated? You’re probably experiencing it right now. It’s the basis of virtually all media.

In the traditional media model, a publisher provides a product free (or nearly free) to consumers, and advertisers pay to ride along. Radio is “free to air,” and so is much of television. Likewise, newspaper and magazine publishers don’t charge readers anything close to the actual cost of creating, printing, and distributing their products. They’re not selling papers and magazines to readers, they’re selling readers to advertisers. It’s a three-way market.

In a sense, what the Web represents is the extension of the media business model to industries of all sorts. This is not simply the notion that advertising will pay for everything. There are dozens of ways that media companies make money around free content, from selling information about consumers to brand licensing, “value-added” subscriptions, and direct ecommerce (see How-To Wiki for a complete list). Now an entire ecosystem of Web companies is growing up around the same set of models.

A TAXONOMY OF FREE
Between new ways companies have found to subsidize products and the falling cost of doing business in a digital age, the opportunities to adopt a free business model of some sort have never been greater. But which one? And how many are there? Probably hundreds, but the priceless economy can be broken down into six broad categories:

· “Freemium”
What’s free: Web software and services, some content. Free to whom: users of the basic version.

This term, coined by venture capitalist Fred Wilson, is the basis of the subscription model of media and is one of the most common Web business models. It can take a range of forms: varying tiers of content, from free to expensive, or a premium “pro” version of some site or software with more features than the free version (think Flickr and the $25-a-year Flickr Pro).

Again, this sounds familiar. Isn’t it just the free sample model found everywhere from perfume counters to street corners? Yes, but with a pretty significant twist. The traditional free sample is the promotional candy bar handout or the diapers mailed to a new mother. Since these samples have real costs, the manufacturer gives away only a tiny quantity — hoping to hook consumers and stimulate demand for many more.


But for digital products, this ratio of free to paid is reversed. A typical online site follows the 1 Percent Rule — 1 percent of users support all the rest. In the freemium model, that means for every user who pays for the premium version of the site, 99 others get the basic free version. The reason this works is that the cost of serving the 99 percent is close enough to zero to call it nothing.
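The 1 Percent Rule arithmetic can be made concrete with a toy model. All of the figures below are hypothetical assumptions, chosen only to show why a near-zero cost of serving free users lets one subscriber carry ninety-nine free riders:

```python
# Toy freemium model: does 1 paying user cover 99 free ones?
# All figures are hypothetical assumptions for illustration.

def monthly_margin(users, paying_share, sub_price, cost_per_user):
    """Revenue from the paying minority minus the cost of serving everyone."""
    payers = users * paying_share
    revenue = payers * sub_price
    costs = users * cost_per_user
    return revenue - costs

# 1 million users, 1% paying $2/month, serving each user costs a fifth of a cent.
print(monthly_margin(1_000_000, 0.01, 2.00, 0.002))
```

With physical free samples the equivalent cost-per-user term would swamp the revenue term, which is exactly why the traditional sampler gives away a tiny quantity and the freemium site gives away almost everything.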

· Advertising
What’s free: content, services, software, and more. Free to whom: everyone.

Broadcast commercials and print display ads have given way to a blizzard of new Web-based ad formats: Yahoo’s pay-per-pageview banners, Google’s pay-per-click text ads, Amazon’s pay-per-transaction “affiliate ads,” and site sponsorships were just the start. Then came the next wave: paid inclusion in search results, paid listing in information services, and lead generation, where a third party pays for the names of people interested in a certain subject. Now companies are trying everything from product placement (PayPerPost) to pay-per-connection on social networks like Facebook. All of these approaches are based on the principle that free offerings build audiences with distinct interests and expressed needs that advertisers will pay to reach.

· Cross-subsidies
What’s free: any product that entices you to pay for something else. Free to whom: everyone willing to pay eventually, one way or another.

When Wal-Mart charges $15 for a new hit DVD, it’s a loss leader. The company is offering the DVD below cost to lure you into the store, where it hopes to sell you a washing machine at a profit. Expensive wine subsidizes food in a restaurant, and the original “free lunch” was a gratis meal for anyone who ordered at least one beer in San Francisco saloons in the late 1800s. In any package of products and services, from banking to mobile calling plans, the price of each individual component is often determined by psychology, not cost. Your cell phone company may not make money on your monthly minutes — it keeps that fee low because it knows that’s the first thing you look at when picking a carrier — but your monthly voicemail fee is pure profit.

On a busy corner in São Paulo, Brazil, street vendors pitch the latest “tecnobrega” CDs, including one by a hot band called Banda Calypso. Like CDs from most street vendors, these did not come from a record label. But neither are they illicit. They came directly from the band. Calypso distributes masters of its CDs and CD liner art to street vendor networks in towns it plans to tour, with full agreement that the vendors will copy the CDs, sell them, and keep all the money. That’s OK, because selling discs isn’t Calypso’s main source of income. The band is really in the performance business — and business is good. Traveling from town to town this way, preceded by a wave of supercheap CDs, Calypso has filled its shows and paid for a private jet.

The vendors generate literal street cred in each town Calypso visits, and its omnipresence in the urban soundscape means that it gets huge crowds to its rave/dj/concert events. Free music is just publicity for a far more lucrative tour business. Nobody thinks of this as piracy.

· Zero marginal cost
What’s free: things that can be distributed without an appreciable cost to anyone. Free to whom: everyone.

This describes nothing so well as online music. Between digital reproduction and peer-to-peer distribution, the real cost of distributing music has truly hit bottom. This is a case where the product has become free because of sheer economic gravity, with or without a business model. That force is so powerful that laws, guilt trips, DRM, and every other barrier to piracy the labels can think of have failed. Some artists give away their music online as a way of marketing concerts, merchandise, licensing, and other paid fare. But others have simply accepted that, for them, music is not a moneymaking business. It’s something they do for other reasons, from fun to creative expression. Which, of course, has always been true for most musicians anyway.

· Labor exchange
What’s free: Web sites and services. Free to whom: all users, since the act of using these sites and services actually creates something of value.

You can get free porn if you solve a few captchas, those scrambled text boxes used to block bots. What you’re actually doing is giving answers to a bot used by spammers to gain access to other sites — which is worth more to them than the bandwidth you’ll consume browsing images. Likewise for rating stories on Digg, voting on Yahoo Answers, or using Google’s 411 service (see “How Can Directory Assistance Be Free?”). In each case, the act of using the service creates something of value, either improving the service itself or creating information that can be useful somewhere else.

· Gift economy
What’s free: the whole enchilada, be it open source software or user-generated content. Free to whom: everyone.

From Freecycle (free secondhand goods for anyone who will take them away) to Wikipedia, we are discovering that money isn’t the only motivator. Altruism has always existed, but the Web gives it a platform where the actions of individuals can have global impact. In a sense, zero-cost distribution has turned sharing into an industry. In the monetary economy it all looks free — indeed, in the monetary economy it looks like unfair competition — but that says more about our shortsighted ways of measuring value than it does about the worth of what’s created.

THE ECONOMICS OF ABUNDANCE
Enabled by the miracle of abundance, digital economics has turned traditional economics upside down. Read your college textbook and it’s likely to define economics as “the social science of choice under scarcity.” The entire field is built on studying trade-offs and how they’re made. Milton Friedman himself reminded us time and time again that “there’s no such thing as a free lunch.”

But Friedman was wrong in two ways. First, a free lunch doesn’t necessarily mean the food is being given away or that you’ll pay for it later — it could just mean someone else is picking up the tab. Second, in the digital realm, as we’ve seen, the main feedstocks of the information economy — storage, processing power, and bandwidth — are getting cheaper by the day. Two of the main scarcity functions of traditional economics — the marginal costs of manufacturing and distribution — are rushing headlong to zip. It’s as if the restaurant suddenly didn’t have to pay any food or labor costs for that lunch.

Surely economics has something to say about that?

It does. The word is externalities, a concept that holds that money is not the only scarcity in the world. Chief among the others are your time and respect, two factors that we’ve always known about but have only recently been able to measure properly. The “attention economy” and “reputation economy” are too fuzzy to merit an academic department, but there’s something real at the heart of both. Thanks to Google, we now have a handy way to convert from reputation (PageRank) to attention (traffic) to money (ads). Anything you can consistently convert to cash is a form of currency itself, and Google plays the role of central banker for these new economies.

There is, presumably, a limited supply of reputation and attention in the world at any point in time. These are the new scarcities — and the world of free exists mostly to acquire these valuable assets for the sake of a business model to be identified later. Free shifts the economy from a focus on only that which can be quantified in dollars and cents to a more realistic accounting of all the things we truly value today.

FREE CHANGES EVERYTHING
Between digital economics and the wholesale embrace of King Gillette’s experiment in price shifting, we are entering an era when free will be seen as the norm, not an anomaly. How big a deal is that? Well, consider this analogy: In 1954, at the dawn of nuclear power, Lewis Strauss, head of the Atomic Energy Commission, promised that we were entering an age when electricity would be “too cheap to meter.” Needless to say, that didn’t happen, mostly because the risks of nuclear energy hugely increased its costs. But what if he’d been right? What if electricity had in fact become virtually free?

The answer is that everything electricity touched — which is to say just about everything — would have been transformed. Rather than balance electricity against other energy sources, we’d use electricity for as many things as we could — we’d waste it, in fact, because it would be too cheap to worry about.

All buildings would be electrically heated, never mind the thermal conversion rate. We’d all be driving electric cars (free electricity would be incentive enough to develop the efficient battery technology to store it). Massive desalination plants would turn seawater into all the freshwater anyone could want, irrigating vast inland swaths and turning deserts into fertile acres, many of them making biofuels as a cheaper store of energy than batteries. Relative to free electrons, fossil fuels would be seen as ludicrously expensive and dirty, and so carbon emissions would plummet. The phrase “global warming” would have never entered the language.

Today it’s digital technologies, not electricity, that have become too cheap to meter. It took decades to shake off the assumption that computing was supposed to be rationed for the few, and we’re only now starting to liberate bandwidth and storage from the same poverty of imagination. But a generation raised on the free Web is coming of age, and they will find entirely new ways to embrace waste, transforming the world in the process. Because free is what you want — and free, increasingly, is what you’re going to get.

Chris Anderson (canderson@wired.com) is the editor in chief of Wired and author of The Long Tail. His next book, FREE, will be published in 2009 by Hyperion.
