
From Logic to Ontology: The limit of “The Semantic Web”



(Some posts are written in both English and Spanish.)




If you read the following posts on this blog:

Semantic Web

The Semantic Web

What is the Semantic Web, Actually?

The Metaweb: Beyond Weblogs. From the Metaweb to the Semantic Web: A Roadmap

Semantics to the people! ontoworld

What’s next for the Internet

Web 3.0: Update

How the Wikipedia 3.0: The End of Google? article reached 2 million people in 4 days!

Google vs Web 3.0

Google dont like Web 3.0 [sic] Why am I not surprised?

Designing a better Web 3.0 search engine

From semantic Web (3.0) to the WebOS (4.0)

Search By Meaning

A Web That Thinks Like You


The long-promised “semantic” web is starting to take shape

Start-Up Aims for Database to Automate Web Searching

Metaweb: a semantic wiki startup


The Semantic Web, Collective Intelligence and Hyperdata.

Informal logic 

Logical argument

Consistency proof 

Consistency proof and completeness: Gödel’s incompleteness theorems

Computability theory (computer science): The halting problem

Gödel’s incompleteness theorems: Relationship with computability

Non-formal or Inconsistency Logic: LACAN’s LOGIC. Gödel’s incompleteness theorems

you will notice the internal relationship among them, a thread that runs from logic to ontology.

I am now writing an article about the existence of the Semantic Web.

I will prove that it does not exist at all, and that it is impossible to build with machines such as computers.

This does not depend on the software or hardware you use to build it: it cannot be done at all!

You will notice the internal relations among the posts above, and the connecting thread is the title of this post: “From logic to ontology.”

I will prove that no such construction exists, that it cannot be built with machines, and that this does not depend on the hardware or software used.

More precisely, the limits of the Semantic Web are not set by the machines or biological systems that might be used to build it, but by the logic used to construct it. That logic does not contemplate the concept of time: it is purely formal and metonymic, and it lacks metaphor. This is what Gödel’s theorems point to: the final tautology of every metonymic (mathematical) construction or language, which leads to inconsistencies.

This consistent logic is the opposite of the inconsistent logic that makes use of time, which is characteristic of the human unconscious. The use of time, however, is built on lack: not on anything positive, but on negations and absences. That is impossible to reproduce in a machine, because perceiving a lack requires the self-awareness that is acquired through absence.

The problem is that we are trying to build an intelligent system to replace our way of thinking, at least in information search, but what is distinctive about the human mind is its use of time, which is what lets human beings reach a conclusion. That is why the halting problem, the failure of a calculation to stop, does not exist in the human mind.

So all efforts directed toward the Semantic Web are doomed to failure a priori if the aim is to extend our human way of thinking into machines. Machines lack metaphorical speech, because they are only a mathematical construction, which will always be tautological and metonymic and which lacks the use of time that leads to the conclusion or “stop”.

The demonstration is by counterexample. Suppose it is possible to construct the Semantic Web as a language with capabilities similar to those of human language, which has the use of time. If we treat that as a general theorem, a single counterexample proves it false, and the counterexample is given by the particular case of the Turing machine and “the halting problem”.

Since the necessary and sufficient condition of the theorem is therefore not fulfilled, we are left with the necessary condition: if a language uses time, it lacks formal logic, the logic it uses is inconsistent, and therefore it has no halting problem.

This is a necessary condition for the Semantic Web, but it is not sufficient. Therefore no machine, whether a Turing machine, a computer, or a random device such as a black body in physics, can handle any language other than the mathematical one, and that language is bound to run into the halting problem, a consequence of Gödel’s theorem.

From Logic to Ontology: The Limit of the “Semantic Web”

If you read the following articles on this blog:


Wikipedia 3.0: The End of Google (Spanish translation)


Consistent Logic and Completeness: Gödel’s Incompleteness Theorems (Spanish)

Logical consistency (Spanish)

Computability theory. Computer science.

Gödel’s incompleteness theorems and computability theory: The halting problem

Inconsistent logic and incompleteness: LACANIAN LOGICS and Gödel’s incompleteness theorems (Spanish)

Jacques Lacan (Encyclopædia Britannica Online)

you may notice the internal relations among them; the connecting thread is the title of this very post: “from logic to ontology”.

I will prove that no such construction exists at all, that it cannot be built with machines, and that this does not depend on the hardware or the software used.

To qualify the point: the limit of the semantic web is set not by the machines or biological systems that might be used, but by the fact that the logic with which it is being built lacks the use of time. Formal logic is purely metonymic and lacks metaphor, and that is what Gödel’s theorems mark out: the final tautology of every metonymic (mathematical) construction or language, which leads to contradictions.

This consistent logic is opposed to the inconsistent logic that makes use of time, the logic proper to the human unconscious. But the use of time is built on lack, not around the positive but on negations and absences, and that is impossible to reflect in a machine, because perceiving the lack requires the self-awareness that is acquired through absence.

The problem is that we are trying to build an intelligent system that replaces our thinking, at least in information searches, but the distinctive feature of human thought is the use of time, which is what allows it to conclude. That is why the human mind does not have the halting problem, the failure of a calculation to stop, or, what amounts to the same thing, the absence of a moment of concluding.

So all efforts directed toward the semantic web are doomed to failure a priori if the aim is to extend our human thinking into machines. Machines lack metaphorical discourse, since they are only a mathematical construction, which will always be tautological and metonymic, and which also lacks the use of time that leads to the cut, the conclusion, or the “stop”.

The proof is by counterexample. Suppose it is possible to build the semantic web as a language with capabilities similar to those of human language, which has the use of time. If that is a general theorem, a single counterexample brings it down, and the counterexample is given by the particular case of the Turing machine and the “halting problem”.

The necessary and sufficient condition of the theorem is therefore not fulfilled. We are left with the necessary condition: if a language has the use of time, it lacks formal logic, uses inconsistent logic, and therefore does not have the halting problem. That is a necessary condition for the semantic web, but not a sufficient one, and so no machine, whether a Turing machine, a computer, or a random device such as a black body in physics, can attain the use of any language other than the mathematical one, with its halting paradox, a consequence of Gödel’s theorem.






By Lifeboat Foundation Scientific Advisory Board member Nova Spivack. To maximize propagation of this meme, its text is distributed under the Creative Commons Deed. Distributed versions should include a link to Minding the Planet.

Many years ago, in the late 1980s, while I was still a college student, I visited my late grandfather, Peter F. Drucker, at his home in Claremont, California. He lived near the campus of Claremont College where he was a professor emeritus. On that particular day, I handed him a manuscript of a book I was trying to write, entitled “Minding the Planet,” about how the Internet would enable the evolution of higher forms of collective intelligence.
My grandfather read my manuscript and later that afternoon we sat together on the outside back porch and he said to me, “One thing is certain: Someday, you will write this book.” We both knew that the manuscript I had handed him was not that book, a fact that was later verified when I tried to get it published. I gave up for a while and focused on college, where I was studying philosophy with a focus on artificial intelligence. And soon I started working in the fields of artificial intelligence and supercomputing at companies like Kurzweil, Thinking Machines, and Individual.
A few years later, I co-founded one of the early Web companies, EarthWeb, where among other things we built many of the first large commercial Websites and later helped to pioneer Java by creating several large knowledge-sharing communities for software developers. Along the way I continued to think about collective intelligence. EarthWeb and the first wave of the Web came and went. But this interest and vision continued to grow. In 2000 I started researching the necessary technologies to begin building a more intelligent Web. And eventually that led me to start my present company, Radar Networks, where we are now focused on enabling the next-generation of collective intelligence on the Web, using the new technologies of the Semantic Web. 
But ever since that day on the porch with my grandfather, I remembered what he said: “Someday, you will write this book.” I’ve tried many times since then to write it. But it never came out the way I had hoped. So I tried again. Eventually I let go of the book form and created this article instead. This paper is the first one that meets my own standards for what I really wanted to communicate. And so I dedicate this to my grandfather, who inspired me to keep writing this, and who gave me his prediction that I would one day complete it.
This is an article about a new generation of technology that is sometimes called the Semantic Web, and which could also be called the Intelligent Web, or the global mind. But what is the Semantic Web, and why does it matter, and how does it enable collective intelligence? And where is this all headed? And what is the long-term far future going to be like? Is the global mind just science fiction? Will a world that has a global mind be a good place to live in, or will it be some kind of technological nightmare?
I’ve often joked that it is ironic that a term that contains the word “semantic” has such an ambiguous meaning for most people. Most people just have no idea what this means: they have no context for it, and it is not connected to their experience and knowledge. This is a problem that people who are deeply immersed in the trenches of the Semantic Web have not been able to solve adequately. They have not found the words to communicate what they can clearly see, what they are working on, and why it matters for everyone.
In this article I have tried to provide, and hopefully succeeded in providing, a detailed introduction and context for the Semantic Web for non-technical people. But even technical people working in the field may find something of interest here as I piece together the fragments into a Big Picture and a vision for what might be called “Semantic Web 2.0.”
I hope the reader will bear with me as I bounce around across different scales of technology and time, and from the extremes of core technology to wild speculation in order to tell this story.
If you are looking for the cold hard science of it all, this article will provide an understanding but will not satisfy your need for seeing the actual code; there are other places where you can find that level of detail and rigor. But if you want to understand what it all really means and what the opportunity and the future look like, this may be what you are looking for.
I should also note that all of this is my personal view of what I’ve been working on, and what it really means to me. It is not necessarily the official view of the mainstream academic Semantic Web community — although there are certainly many places where we all agree. But I’m sure that some readers will certainly disagree or raise objections to some of my assertions, and certainly to my many far-flung speculations about the future. I welcome those different perspectives; we’re all trying to make sense of this and the more of us who do that together, the more we can collectively start to really understand it. So please feel free to write your own vision or response, and please let me know so I can link to it!
So with this Prelude in mind, let’s get started…
The Semantic Web is a set of technologies which are designed to enable a particular vision for the future of the Web — a future in which all knowledge exists on the Web in a format that software applications can understand and reason about. By making knowledge more accessible to software, software will essentially become able to understand knowledge, think about knowledge, and create new knowledge. In other words, software will be able to be more intelligent — not as intelligent as humans perhaps, but more intelligent than say, your word processor is today.
The dream of making software more intelligent has been around almost as long as software itself. And although it is taking longer to materialize than past experts had predicted, progress towards this goal is being steadily made.
At the same time, the shape of this dream is changing. It is becoming more realistic and pragmatic. The original dream of artificial intelligence was that we would all have personal robot assistants doing all the work we don’t want to do for us. That is not the dream of the Semantic Web. Instead, today’s Semantic Web is about facilitating what humans do — it is about helping humans do things more intelligently. It’s not a vision in which humans do nothing and software does everything.
The Semantic Web vision is not just about helping software become smarter — it is about providing new technologies that enable people, groups, organizations and communities to be smarter.
For example, by providing individuals with tools that learn about what they know, and what they want, search can be much more accurate and productive.
Using software that is able to understand and automatically organize large collections of knowledge, groups, organizations and communities can reach higher levels of collective intelligence and they can cope with volumes of information that are just too great for individuals or even groups to comprehend on their own.
Another example: more efficient marketplaces can be enabled by software that learns about products, services, vendors, transactions and market trends and understands how to connect them together in optimal ways.
In short, the Semantic Web aims to make software smarter, not just for its own sake, but in order to help make people, and groups of people, smarter. In the original Semantic Web vision this fact was under-emphasized, leading to the impression that Semantic Web was only about automating the world. In fact, it is really about facilitating the world.

The Semantic Web is a blue ocean waiting to be explored.
The Semantic Web is one of the most significant things to happen since the Web itself.
But it will not appear overnight. It will take decades. It will grow in a bottom-up, grassroots, emergent, community-driven manner just like the Web itself. Many things have to converge for this trend to really take off.
The core open standards already exist, but the necessary development tools have to mature, the ontologies that define human knowledge have to come into being and mature, and most importantly we need a few real “killer apps” to prove the value and drive adoption of the Semantic Web paradigm. The first generation of the Web had its Mozilla, Netscape, Internet Explorer, and Apache — and it also had HTML, HTTP, a bunch of good development tools, and a few killer apps and services such as Yahoo! and thousands of popular Web sites. The same things are necessary for the Semantic Web to take off.
And this is where we are today: this is all just about to start emerging. There are several companies racing to get this technology, or applications of it, to market in various forms.
Within a year or two you will see mass-consumer Semantic Web products and services hit the market, and within 5 years there will be at least a few “killer apps” of the Semantic Web. Ten years from now the Semantic Web will have spread into many of the most popular sites and applications on the Web. Within 20 years all content and applications on the Internet will be integrated with the Semantic Web. This is a sea-change. A big evolutionary step for the Web.
The Semantic Web is an opportunity to redefine, or perhaps to better define, all the content and applications on the Web. That’s a big opportunity. And within it there are many business opportunities and a lot of money to be made. It’s not unlike the opportunity of the first generation of the Web. There are platform opportunities, content opportunities, commerce opportunities, search opportunities, community and social networking opportunities, and collaboration opportunities in this space. There is room for a lot of players to compete and at this point the field is wide open.
The Semantic Web is a blue ocean waiting to be explored. And like any unexplored ocean it also has its share of reefs, pirate islands, hidden treasure, shoals, whirlpools, sea monsters and typhoons. But there are new worlds out there to be discovered, and they exert an irresistible pull on the imagination. This is an exciting frontier, and also one fraught with hard technical and social challenges that have yet to be solved. For early ventures in the Semantic Web arena, it’s not going to be easy, but the intellectual and technological challenges, and the potential financial rewards, glory, and benefit to society, are worth the effort and risk. And this is what all great technological revolutions are made of.
Some people who have heard the term “Semantic Web” thrown around too much may think it is a buzzword, and they are right. But it is not just a buzzword — it actually has some substance behind it. That substance hasn’t emerged yet, but it will.
Early critiques of the Semantic Web were right: the early vision did not leverage concepts such as folksonomy and user-contributed content at all. But that is largely because when the Semantic Web was originally conceived, Web 2.0 hadn’t happened yet. The early experiments that came out of research labs were geeky, to put it lightly, and impractical, but they are already being followed up by more pragmatic, user-friendly approaches.
Today’s Semantic Web, what we might call “Semantic Web 2.0,” is a kinder, gentler, more social Semantic Web. It combines the best of the original vision with what we have all learned about social software and community in the last 10 years. Although much of this is still in the lab, it is already starting to trickle out. For example, recently Yahoo! started a pilot of the Semantic Web behind their food vertical. Other organizations are experimenting with using Semantic Web technology in parts of their applications, or to store or map data. But that’s just the beginning.
Entrepreneurs, venture capitalists and technologists are increasingly starting to see these opportunities. Who will be the “Google of the Semantic Web?” — will it be Google itself? That’s doubtful. Like any entrenched incumbent, Google is heavily tied to a particular technology and worldview. And in Google’s case it is anything but semantic today. It would be easier for an upstart to take this position than for Google to port their entire infrastructure and worldview to a Semantic Web way of thinking.
If it is going to be Google it will most likely be by acquisition rather than by internal origination. And this makes more sense anyway — for Google is in a position where they can just wait and buy the winner, at almost any price, rather than competing in the playing field.
One thing to note however is that Google has at least one product offering that shows some potential for becoming a key part of the Semantic Web. I am speaking of Google Base, Google’s open database which is meant to be a registry for structured data so that it can be found in Google search. But Google Base does not conform to or make use of the many open standards of the Semantic Web community. That may or may not be a good thing, depending on your perspective.
Of course the downside of Google waiting to join the mainstream Semantic Web community until after the winner is announced is very large — once there is a winner it may be too late for Google to beat them. The winner of the Semantic Web race could very well unseat Google. The strategists at Google are probably not yet aware of this but as soon as they see significant traction around a major Semantic Web play it will become of interest to them.
In any case, I think there won’t be just one winner, there will be several major Semantic Web companies in the future, focusing on different parts of the opportunity. And you can be sure that if Google gets into the game, every major portal will need to get into this space at some point or risk becoming irrelevant. There will be demand and many acquisitions. In many ways the Semantic Web will not be controlled by just one company — it will be more like a fabric that connects them all together.

Context is king!
It should be clear by now that the Semantic Web is all about enabling software (and people) to work with knowledge more intelligently. But what is knowledge?
Knowledge is not just information. It is meaningful information — it is information plus context. For example, if I simply say the word “sem” to you, it is just raw information, it is not knowledge. It probably has no meaning to you other than a particular set of letters that you recognize and a sound you can pronounce, and the mere fact that this information was stated by me.
But if I tell you that “sem” is the Tibetan word for “mind” then suddenly, “sem means mind in Tibetan” to you. If I further tell you that Tibetans have about as many words for “mind” as Eskimos have for “snow”, this is further meaning. This is context, in other words, knowledge, about the sound sem. The sound is raw information. When it is given context it becomes a word, a word that has meaning, a word that is connected to concepts in your mind — it becomes knowledge. By connecting raw information to context, knowledge is formed.
Once you have acquired a piece of knowledge such as “sem means mind in Tibetan,” you may then also form further knowledge about it. For example, you may form the memory, “Nova said that ‘sem means mind in Tibetan.'” You might also connect the word “sem” to networks of further concepts you have about Tibet and your understanding of what the word “mind” means.
The mind is the organ of meaning — mind is where meaning is stored, interpreted and created. Meaning is not “out there” in the world, it is purely subjective, it is purely mental. Meaning is almost equivalent to mind in fact. For the two never occur separately. Each of our individual minds has some way of internally representing meaning — when we read or hear a word that we know, our minds connect that to a network of concepts about it and at that moment it means something to us.
Digging deeper, if you are really curious, or you happen to know Greek, you may also find that a similar sound occurs in the Greek word, sēmantikós — which means “having meaning” and in turn is the root of the English word “semantic” which means “pertaining to or arising from meaning.”
That’s an odd coincidence! “Sem” occurs in the Tibetan word for mind, and in the English and Greek words that relate to the concepts of “meaning” and “mind.” Even stranger is that not only do these words have a similar sound, they have a similar meaning.
With all this knowledge at your disposal, when you then see the term “Semantic Web” you may be able to infer that it has something to do with adding “meaning” to the Web. However, if you were a Tibetan, perhaps you might instead think the term had something to do with adding “mind” to the Web. In either case you would be right!

The Semantic Web will improve the connections between knowledge on the web and software.
We’ve discovered a new connection — namely that there is an implicit connection between “sem” in Greek, English and Tibetan: they all relate to meaning and mind. It’s not a direct, explicit connection — it’s not evident unless you dig for it. But it’s a useful tidbit of knowledge once it’s found. Unlike the direct migration of the sound “sem” from Greek to English, there may not have ever been a direct transfer of this sound from Greek to Sanskrit to Tibetan. But in a strange and unexpected way, they are all connected. This connection wasn’t necessarily explicitly stated by anyone before, but was uncovered by exploring our network of concepts and making inferences.
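To make this kind of discovery concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `entries` list, the `related_concepts` set and the function names are hypothetical, and real Semantic Web tools use standard formats and far richer reasoning. It only shows how a connection that nobody stated explicitly can fall out of a small network of stated facts.

```python
from itertools import combinations

# Hypothetical statements of the form (term, language, concept it relates to).
entries = [
    ("sem", "Tibetan", "mind"),
    ("sēmantikós", "Greek", "meaning"),
    ("semantic", "English", "meaning"),
]

# Assumed background knowledge: "mind" and "meaning" are closely related concepts.
related_concepts = {frozenset({"mind", "meaning"})}


def concepts_related(a, b):
    """Two concepts are related if they are identical or listed as related."""
    return a == b or frozenset({a, b}) in related_concepts


def implicit_connections(statements):
    """Pair up terms from different languages whose concepts are related."""
    found = []
    for (t1, l1, c1), (t2, l2, c2) in combinations(statements, 2):
        if l1 != l2 and concepts_related(c1, c2):
            found.append((t1, t2))
    return found


print(implicit_connections(entries))
```

None of the three pairs this returns was stated directly; each one is inferred by combining the statements with the background knowledge.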
The sequence of thought about “sem” above is quite similar to the kind of intellectual reasoning and discovery that the actual Semantic Web seeks to enable software to do automatically.  How is this kind of reasoning and discovery enabled?
The Semantic Web provides a set of technologies for formally defining the context of information. Just as the Web relies on a standard formal specification for “marking up” information with formatting codes, so that any application that understands those codes formats the information in the same way, the Semantic Web relies on new standards for “marking up” information with statements about its context (its meaning), so that any application can understand, and reason about, the meaning of those statements in the same way.
By applying semantic reasoning agents to large collections of semantically enhanced content, all sorts of new connections may be inferred, leading to new knowledge, unexpected discoveries and useful additional context around content. This kind of reasoning and discovery is already taking place in fields from drug discovery and medical research, to homeland security and intelligence. The Semantic Web is not the only way to do this — but it certainly will improve the process dramatically.
And of course, with this improvement will come new questions about how to assess and explain how various inferences were made, and how to protect privacy as our inferencing capabilities begin to extend across ever more sources of public and private data. I don’t have the answers to these questions, but others are working on them and I have confidence that solutions will be arrived at over time.
By marking up information with metadata that formally codifies its context, we can make the data itself “smarter”. The data becomes self-describing. When you get a piece of data you also get the necessary metadata for understanding it. For example, if I sent you a document containing the word “sem” in it, I could add markup around that word indicating that it is the word for “mind” in the Tibetan language.
Similarly, a document containing mentions of “Radar Networks” could contain metadata indicating that “Radar Networks” is an Internet company, not a product or a type of radar technology. A document about a person could contain semantic markup indicating that they are residents of a certain city, experts on Italian cooking, and members of a certain profession. All of this could be encoded as metadata in a form that software could easily understand. The data carries more information about its own meaning.
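As a rough illustration of self-describing data, the sketch below records the examples above as simple subject, predicate, object statements in Python. The `Statement` type, the predicate names and the `describe` helper are all hypothetical, chosen only for this sketch; they are not taken from any Semantic Web standard.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical vocabulary: these predicate names are illustrative only.
metadata = [
    Statement("sem", "is_word_in_language", "Tibetan"),
    Statement("sem", "means", "mind"),
    Statement("Radar Networks", "is_a", "Internet company"),
]


def describe(term, statements):
    """Return every (predicate, object) pair recorded about a term."""
    return [(s.predicate, s.obj) for s in statements if s.subject == term]


print(describe("sem", metadata))
print(describe("Radar Networks", metadata))
```

With statements like these attached, a program that knows nothing about Tibetan or about Radar Networks can still answer what kind of thing each term is.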
The alternative to smart data would be for software to actually read and understand natural language as well as humans. But that’s really hard. To correctly interpret raw natural language, software would have to be developed that knew as much as a human being.
But think about how much teaching and learning is required to raise a human being to the point where they can read at an adult level. It is likely that similar training would be necessary to build software that could do that. So far that goal has not been achieved, although some attempts have been made. While decent progress in natural language understanding has been made, most software that can do this is limited to particular vertical domains, and it’s brittle: it doesn’t do a good job of making sense of terms and forms of speech that it wasn’t trained to parse and make sense of.
Instead of trying to make software a million times smarter than it is today, it is much easier to just encode more metadata about what our information means. That turns out to be less work in the end. And there’s an added benefit to this approach — the meaning exists with the data and travels with it. It is independent of any one software program — all software can access it. And because the meaning of information is stored with the information itself, rather than in the software, the software doesn’t have to be enormous to be smart. It just has to know the basic language for interpreting the semantic metadata it finds on the information it works with.
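The idea that meaning travels with the data can be sketched as follows. This is only an illustration under stated assumptions: the document text, the `annotations` layout and the `types_mentioned` function are invented for the example, and JSON stands in for whatever standard format would actually be used.

```python
import json

# Hypothetical document: the text and its annotations are made up for illustration.
document = {
    "text": "A note that mentions Radar Networks.",
    "annotations": [
        {"span": "Radar Networks", "type": "Internet company"},
    ],
}

# The metadata is serialized together with the content, so it travels with it.
wire = json.dumps(document)


def types_mentioned(serialized):
    """A small reader that only needs to know the annotation layout."""
    doc = json.loads(serialized)
    return {a["span"]: a["type"] for a in doc["annotations"]}


print(types_mentioned(wire))
```

The reader stays small because the knowledge lives in the annotations, not in the program; any other program that understands the same layout could use them too.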
Smart data enables relatively dumb software to be smarter with less work. That’s an immediate benefit. And in the long-term as software actually gets smarter, smart data will make it easier for it to start learning and exploring on its own. So it’s a win-win approach. Start by adding semantic metadata to data, end up with smarter software.
Metadata comes down to making statements about the world in a manner that machines, and perhaps even humans, can understand unambiguously. The same piece of metadata should be interpreted in the same way by different applications and readers.
There are many kinds of statements that can be made about information to provide it with context. For example, you can state a definition such as “person” means “a human being or a legal entity.” You can state an assertion such as “Sue is a human being.” You can state a rule such as “if x is a human being, then x is a person.”
From these statements it can then be inferred that “Sue is a person.” This inference is so obvious to you and me that it seems trivial, but most software today cannot do this. It doesn’t know what a person is, let alone what a name is. But if software could do this, then it could for example, automatically organize documents by the people they are related to, or discover connections between people who were mentioned in a set of documents, or it could find documents about people who were related to particular topics, or it could give you a list of all the people mentioned in a set of documents, or all the documents related to a person.
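The inference about Sue can be written down as a minimal forward-chaining sketch in Python. The fact and rule encodings and the `infer` function are assumptions made for this illustration; production reasoners work over standard ontology languages and are considerably more capable.

```python
# Assertion from the text: "Sue is a human being."
facts = {("Sue", "is_a", "human being")}

# Rule from the text: "if x is a human being, then x is a person."
# Hypothetical encoding: (premise class, conclusion class).
rules = [("human being", "person")]


def infer(known_facts, known_rules):
    """Apply the rules repeatedly until no new facts can be derived."""
    derived = set(known_facts)
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in list(derived):
            if predicate != "is_a":
                continue
            for premise, conclusion in known_rules:
                new_fact = (subject, "is_a", conclusion)
                if obj == premise and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


print(sorted(infer(facts, rules)))
```

The loop keeps running until a full pass adds nothing new, so chains of rules (a person is a legal entity, and so on) would be followed as well.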
Of course this is a very basic example. But imagine if your software didn’t just know about people — it knew about most of the common concepts that occur in your life. Your software would then be able to help you work with your documents just about as intelligently as you are able to do by yourself, or perhaps even more intelligently, because you are just one person and you have limited time and energy but your software could work all the time, and in parallel, to help you.
How could the existence of the Semantic Web and all the semantic metadata that defines it be really useful to everyone in the near-term?
Well, for example, the problem of email spam would finally be cured: your software would be able to look at a message and know whether it was meaningful and/or relevant to you or not.
Similarly, you would never have to file anything by hand again. Your software could automate all filing and information organization tasks for you because it would understand your information and your interests. It would be able to figure out when to file something in a single folder, multiple folders, or new ones. It would organize everything — documents, photos, contacts, bookmarks, notes, products, music, video, data records — and it would do it even better and more consistently than you could on your own. Your software wouldn’t just organize stuff, it would turn it into knowledge by connecting it to more context. It could do this not just for individuals, but for groups, organizations and entire communities.
Another example: search would be vastly better: you could search conversationally by typing in everyday natural language and you would get precisely what you asked for, or even what you needed but didn’t know how to ask for correctly, and nothing else. Your search engine could even ask you questions to help you narrow what you want. You would finally be able to converse with software in ordinary speech and it would understand you.
The process of discovery would be easier too. You could have a software agent that worked as your personal recommendation agent. It would constantly be looking in all the places you read or participate in for things that are relevant to your past, present and potential future interests and needs. It could then alert you in a contextually sensitive way, knowing how to reach you and how urgently to mark things. As you gave it feedback it could learn and do a better job over time.
Going even further with this, semantically-aware software — software that is aware of context, software that understands knowledge — isn’t just for helping you with your information, it can also help to enrich and facilitate, and even partially automate, your communication and commerce (when you want it to).
So for example, your software could help you with your email. It would be able to recommend responses to messages for you, or automate the process. It would be able to enrich your messaging and discussions by automatically cross-linking what you are speaking about with related messages, discussions, documents, Web sites, subject categories, people, organizations, places, events, etc.
Shopping and marketplaces would also become better — you could search precisely for any kind of product, with any specific attributes, and find it anywhere on the Web, in any store.
You could post classified ads and automatically get relevant matches according to your priorities, from all over the Web, or only from specific places and parties that match your criteria for who you trust. You could also easily invent a new custom data structure for posting classified ads for a new kind of product or service and publish it to the Web in a format that other Web services and applications could immediately mine and index without having to necessarily integrate with your software or data schema directly.
You could publish an entire database to the Web and other applications and services could immediately start to integrate your data with their data, without having to migrate your schema or their own. You could merge data from different data sources together to create new data sources without having to ever touch or look at an actual database schema.

The above examples illustrate the potential of the Semantic Web today, but the reality on the ground is that the technology is still in the early phases of evolution. Even for experienced software engineers and Web developers, it is difficult to apply in practice. The main obstacles are twofold:
(1) The Tools Problem:
There are very few commercial-grade tools for doing anything with the Semantic Web today — Most of the tools for building semantically-aware applications, or for adding semantics to information are still in the research phase and were designed for expert computer scientists who specialize in knowledge representation, artificial intelligence, and machine learning.
These tools require a large learning curve to work with and they don’t generally support large-scale applications — they were designed mainly to test theories and frameworks, not to actually apply them. But if the Semantic Web is ever going to become mainstream, it has to be made easier to apply — it has to be made more productive and accessible for ordinary software and content developers.
Fortunately, the tools problem is already on the verge of being solved. Companies such as my own venture, Radar Networks, are developing the next generation of tools for building Semantic Web applications and Semantic Web sites. These tools will hide most of the complexity, enabling ordinary mortals to build applications and content that leverage the power of semantics without needing PhDs in knowledge representation.
(2) The Ontology Problem:
The Semantic Web provides frameworks for defining systems of formally defined concepts called “ontologies”, that can then be used to connect information to context in an unambiguous way. Without ontologies, there really can be no semantics. The ontologies ARE the semantics, they define the meanings that are so essential for connecting information to context.
But there are still few widely used or standardized ontologies. And getting people to agree on common ontologies is not generally easy. Everyone has their own way of describing things, their own worldview, and let’s face it nobody wants to use somebody else’s worldview instead of their own. Furthermore, the world is very complex and to adequately describe all the knowledge that comprises what is thought of as “common sense” would require a very large ontology (and in fact, such an ontology exists — it’s called Cyc and it is so large and complex that only experts can really use it today).
Even to describe the knowledge of just a single vertical domain, such as medicine, is extremely challenging. To make matters worse, the tools for authoring ontologies are still very hard to use: one has to master both the OWL language and difficult, buggy ontology authoring tools.
Domain experts who are non-technical and not trained in formal reasoning or knowledge representation may find the process of designing ontologies frustrating using current tools. What is needed are commercial quality tools for building ontologies that hide the underlying complexity so that people can just pour their knowledge into them as easily as they speak. That’s still a ways off, but not far off. Perhaps ten years at the most.
Of course the difficulty of defining ontologies would be irrelevant if the necessary ontologies already existed. Perhaps experts could define them and then everyone else could just use them?
There are numerous ontologies already in existence, both on the general level as well as about specific verticals. However in my own opinion, having looked at many of them, I still haven’t found one that has the right balance of coverage of the necessary concepts most applications need, and accessibility and ease-of-use by non-experts. That kind of balance is a requirement for any ontology to really go mainstream.
Furthermore, regarding the present crop of ontologies, what is still lacking is standardization. Ontologists have not agreed on which ontologies to use. As a result it’s anybody’s guess which ontology to use when writing a semantic application and thus there is a high degree of ontology diversity today. Diversity is good, but too much diversity is chaos.
Applications that use different ontologies about the same things don’t automatically interoperate unless their ontologies have been integrated. This is similar to the problem of database integration in the enterprise. In order to interoperate, different applications that use different data schemas for records about the same things, have to be mapped to each other somehow — either at the application-level or the data-level. This mapping can be direct or through some form of middleware.
Ontologies can be used as a form of semantic middleware, enabling applications to be mapped at the data-level instead of the applications-level. Ontologies can also be used to map applications at the applications level, by making ontologies of Web services and capabilities, by the way. This is an area in which a lot of research is presently taking place.
The OWL language can express mappings between concepts in different ontologies. But if there are many ontologies, and many of them partially overlap, it is a non-trivial task to actually make the mappings between their concepts.
Even though concept A in ontology one and concept B in ontology two may have the same names, and even some of the same properties, in the context of the rest of the concepts in their respective ontologies they may imply very different meanings. So simply mapping them as equivalent on the basis of their names is not adequate, their connections to all the other concepts in their respective ontologies have to be considered as well. It quickly becomes complex.
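A rough sketch of why name matching alone is not enough: the Python below compares two concepts by the concepts they are connected to, and only proposes a mapping when that context overlaps sufficiently. The toy ontologies, the “bank” and “person” examples, the Jaccard measure and the 0.5 threshold are all assumptions chosen for illustration. Real ontology alignment is considerably more involved.

```python
# Each toy ontology maps a concept name to the set of concepts it is
# connected to. All of this data is hypothetical.
finance = {
    "bank": {"account", "loan", "customer"},
    "person": {"name", "account", "employer"},
}
geography = {"bank": {"river", "shore", "erosion"}}
human_resources = {"person": {"name", "employer", "salary"}}

def context_overlap(neighbors_a, neighbors_b):
    """Jaccard similarity of two sets of connected concepts."""
    union = neighbors_a | neighbors_b
    if not union:
        return 0.0
    return len(neighbors_a & neighbors_b) / len(union)

def propose_mapping(name, first, second, threshold=0.5):
    """Suggest an equivalence only if the name matches AND the context agrees."""
    if name not in first or name not in second:
        return False
    return context_overlap(first[name], second[name]) >= threshold

print(propose_mapping("bank", finance, geography))          # False
print(propose_mapping("person", finance, human_resources))  # True
```

Both pairs share a name, but only “person” shares enough surrounding concepts to be treated as a candidate match. A human ontologist would still need to confirm it.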
There are some potential ways to automate the construction of mappings between ontologies, but they are still experimental. Today, integrating ontologies requires the help of expert ontologists, and to be honest, I’m not sure even the experts have it figured out. It’s more of an art than a science at this point.

All that is needed for mainstream adoption to begin is for a large body of mainstream content to become semantically tagged and accessible. This will cause whatever ontology is behind that content to become popular.
When developers see that there is significant content and traction around a particular ontology, they will use that ontology for their own applications about similar concepts, or at least they will do the work of mapping their own ontology to it, and in this way the world will converge in a Darwinian fashion around a few main ontologies over time.
These main ontologies will then be worth the time and effort necessary to integrate them on a semantic level, resulting in a cohesive Semantic Web. We may in fact see Darwinian natural selection take place not just at the ontology level, but at the level of pieces of ontologies. A certain ontology may do a good job of defining what a person is, while another may do a good job of defining what a company is. These definitions may be used for a lot of content, and gradually they will become common parts of an emergent meta-ontology comprised of the most-popular pieces from thousands of ontologies. This could be great or it could be a total mess. Nobody knows yet. It’s a subject for further research.
Since ontologies are so important, it is helpful to actually understand what an ontology really is, and what it looks like. An ontology is a system of formally defined related concepts. For example, a simple ontology is a set of statements such as this:
A human is a living thing.
A person is a human.
A person may have a first name.
A person may have a last name.
A person must have one and only one date of birth.
A person must have a gender.
A person may be socially related to another person.
A friendship is a kind of social relationship.
A romantic relationship is a kind of friendship.
A marriage is a kind of romantic relationship.
A person may be in a marriage with only one other person at a time.
A person may be employed by an employer.
An employer may be a person or an organization.
An organization is a group of people.
An organization may have a product or a service.
A company is a type of organization.
We’ve just built a simple ontology about a few concepts: humans, living things, persons, names, social relationships, marriages, employment, employers, organizations, groups, products and services. Within this system of concepts there is particular logic, some constraints, and some structure. It may or may not correspond to your worldview, but it is a worldview that is unambiguously defined, can be communicated, and is internally logically consistent, and that is what is important.
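As a hedged illustration, part of this ontology can be written down as plain data in Python: the “is a” and “is a kind of” statements become a parent table, and the “may have” and “must have” statements become cardinality limits. The table names and helper functions are assumptions made for this sketch. An actual OWL ontology would express the same ideas in OWL syntax.

```python
# "X is a Y" / "X is a kind of Y" statements as a child -> parent table.
SUBCLASS_OF = {
    "human": "living thing",
    "person": "human",
    "friendship": "social relationship",
    "romantic relationship": "friendship",
    "marriage": "romantic relationship",
    "company": "organization",
}

def ancestors(concept):
    """Walk up the parent table and return every more general concept."""
    chain = []
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        chain.append(concept)
    return chain

def is_a(concept, candidate):
    """True if `concept` equals `candidate` or is a specialization of it."""
    return concept == candidate or candidate in ancestors(concept)

# "may have" -> minimum 0; "must have" -> minimum 1;
# "one and only one" -> maximum 1. None means no upper limit.
PERSON_PROPERTIES = {
    "first_name": (0, None),
    "last_name": (0, None),
    "date_of_birth": (1, 1),
    "gender": (1, None),
}

def violations(record):
    """Return the properties of a person record that break a constraint."""
    problems = []
    for prop, (low, high) in PERSON_PROPERTIES.items():
        count = len(record.get(prop, []))
        if count < low or (high is not None and count > high):
            problems.append(prop)
    return problems

print(is_a("marriage", "social relationship"))  # True
print(violations({"gender": ["female"]}))       # ['date_of_birth']
```

The second call reports a missing date of birth, which is exactly the kind of constraint the statement “a person must have one and only one date of birth” encodes.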
The Semantic Web approach provides an open-standard language, OWL, for defining ontologies. OWL also provides for a way to define instances of ontologies. Instances are assertions within the worldview that a given ontology provides. In other words OWL provides a means to make statements that connect information to the ontology so that software can understand its meaning unambiguously. For example, below is a set of statements based on the above ontology:
There exists a person x.
Person x has a first name “Sue”
Person x has a last name “Smith”
Person x has a full name “Sue Smith”
Sue Smith was born on June 1, 2005
Sue Smith has a gender: female
Sue Smith has a friend: Jane, who is another person.
Sue Smith is married to: Bob, another person.
Sue Smith is employed by Acme, Inc, a company.
Acme Inc. has a product, Widget 2.0.
The set of statements above, plus the ontology they are connected to, collectively comprise a knowledge base that, if represented formally in the OWL markup language, could be understood by any application that speaks OWL in the precise manner that it was intended to be understood.
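The statements above can be sketched as subject, predicate, object triples, which is the general shape that RDF-based data takes. The Python below is only an approximation under stated assumptions: the identifiers (`person_x`, `acme_inc`), the predicate names and the helper functions are invented for illustration, and a real knowledge base would use OWL/RDF syntax and URIs rather than bare strings.

```python
# The instance statements as (subject, predicate, object) triples.
# All identifiers and predicate names are hypothetical.
triples = [
    ("person_x", "type", "person"),
    ("person_x", "first_name", "Sue"),
    ("person_x", "last_name", "Smith"),
    ("person_x", "date_of_birth", "2005-06-01"),
    ("person_x", "gender", "female"),
    ("person_x", "friend", "jane"),
    ("person_x", "married_to", "bob"),
    ("person_x", "employed_by", "acme_inc"),
    ("acme_inc", "type", "company"),
    ("acme_inc", "product", "Widget 2.0"),
]

def objects(subject, predicate):
    """All values stated for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def products_of_employer(person):
    """Follow employed_by, then product: two hops through the knowledge base."""
    return [
        product
        for employer in objects(person, "employed_by")
        for product in objects(employer, "product")
    ]

print(objects("person_x", "first_name"))  # ['Sue']
print(products_of_employer("person_x"))   # ['Widget 2.0']
```

The second query answers “what does Sue’s employer sell?” without any statement saying so directly, because the two facts are connected through the shared identifier for Acme.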
The OWL language provides a way to markup any information such as a data record, an email message or a Web page with metadata in the form of statements that link particular words or phrases to concepts in the ontology. When software applications that understand OWL encounter the information they can then reference the ontology and figure out exactly what the information means — or at least what the ontology says that it means.
But something has to add these semantic metadata statements to the information — and if it doesn’t add them or adds the wrong ones, then software applications that look at the information will get the wrong idea. And this is another challenge — how will all this metadata get created and added into content? People certainly aren’t going to add it all by hand!
Fortunately there are many ways to make this easier. The best approach is to automate it using special software that goes through information, analyzes the meaning and adds semantic metadata automatically. This works today, but the software has to be trained or provided with rules and that takes some time. It also doesn’t scale cost-effectively to vast data-sets.
Alternatively, individuals can be provided with ways to add semantics themselves as they author information. When you post your resume in a semantically-aware job board, you could fill out a form about each of your past jobs, and the job board would connect that data to appropriate semantic concepts in an underlying employment ontology. As an end-user you would just fill out a form like you are used to doing; under-the-hood the job board would add the semantics for you.
Another approach is to leverage communities to get the semantics. We already see communities that are adding basic metadata “tags” to photos, news articles and maps. Already a few simple types of tags are being used pseudo-semantically: subject tags and geographical tags. These are primitive forms of semantic metadata. Although they are not expressed in OWL or connected to formal ontologies, they are at least semantically typed with prefixes or by being entered into fields or specific namespaces that define their types.

There may also be another solution to the problem of how to add semantics to content in the not too distant future. Once a suitable amount of content has been marked up with semantic metadata, it may be possible, through purely statistical forms of machine learning, for software to begin to learn how to do a pretty good job of marking up new content with semantic metadata.
For example, if the string “Nova Spivack” is often marked up with semantic metadata stating that it indicates a person, and not just any person but a specific person that is abstractly represented in a knowledge base somewhere, then when software applications encounter a new non-semantically enhanced document containing strings such as “Nova Spivack” or “Spivack, Nova” they can make a reasonably good guess that this indicates that same specific person, and they can add the necessary semantic metadata to that effect automatically.
As more and more semantic metadata is added to the Web and made accessible it constitutes a statistical training set that can be learned and generalized from. Although humans may need to jump-start the process with some manual semantic tagging, it might not be long before software could assist them and eventually do all the tagging for them. Only in special cases would software need to ask a human for assistance, for example when totally new terms or expressions were encountered for the first several times.
The technology for doing this learning already exists — and actually it’s not very different from how search engines like Google measure the community sentiment around web pages. Each time something is semantically tagged with a certain meaning that constitutes a “vote” for it having that meaning. The meaning that gets the most votes wins. It’s an elegant, Darwinian, emergent approach to learning how to automatically tag the Web.
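The voting idea can be sketched in a few lines of Python. This is a toy version under loud assumptions: the `normalize` rule for “Last, First” names, the meaning labels such as `person:nova_spivack`, and the sample votes are all made up for illustration, and real systems would use far richer statistics than a simple majority.

```python
from collections import Counter, defaultdict

def normalize(name):
    """Treat 'Spivack, Nova' and 'Nova Spivack' as the same string."""
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    return " ".join(name.lower().split())

# For each normalized string, count how often each meaning was assigned.
votes = defaultdict(Counter)

def record_tag(text, meaning):
    """One human (or machine) tag counts as one vote for that meaning."""
    votes[normalize(text)][meaning] += 1

def guess_meaning(text):
    """Return the meaning with the most votes, or None if never tagged."""
    counts = votes.get(normalize(text))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical training data: two votes for a person, one stray vote.
record_tag("Nova Spivack", "person:nova_spivack")
record_tag("Nova Spivack", "person:nova_spivack")
record_tag("Nova Spivack", "organization:unknown")

print(guess_meaning("Spivack, Nova"))  # person:nova_spivack
print(guess_meaning("Unseen Term"))    # None
```

A `None` result corresponds to the special case described above, where the software has no basis for a guess and would need to ask a human.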
One thing is certain, if communities were able to tag things with more types of tags, and these tags were connected to ontologies and knowledge bases, that would result in a lot of semantic metadata being added to content in a completely bottom-up, grassroots manner, and this in turn would enable this process to start to become automated or at least machine-augmented.
But making the user experience of semantic tagging easy (and immediately beneficial) enough that regular people will do it is a challenge that has yet to be solved. However, it will be solved shortly. It has to be, because the Semantic Web cannot be jump-started without it. Many companies and researchers know this and are working on it right now.
I believe that the Tools Problem — the lack of commercial grade tools for building semantic applications — is essentially solved already (although the products have not hit the market yet; they will within a few years at most).
The Ontology Problem is further from being solved. I think the way this problem will be solved is through a few “killer apps” that result in the building up of a large amount of content around particular ontologies within particular online services.
Where might we see this content initially arising? In my opinion it will most likely be within vertical communities of interest, communities of practice, and communities of purpose. Within such communities there is a need to create a common body of knowledge and to make that knowledge more accessible, connected and useful.
The Semantic Web can really improve the quality of knowledge and user-experience within these domains. Because they are communities, not just static content services, these organizations are driven by user-contributed content: users play a key role in building content and tagging it. We already see this process starting to take place in communities such as Flickr, del.icio.us, the Wikipedia and Digg. We know that communities of people do tag content, and consume tagged content, if it is easy and beneficial enough for them to do so.
In the near future we may see miniature Semantic Webs arising around particular places, topics and subject areas, projects, and other organizations. Or perhaps, like almost every form of new media in recent times, we may see early adoption of the Semantic Web around online porn — what might be called “the sementic web”.
Whether you like it or not, it is a fact that pornography was one of the biggest drivers of early mainstream adoption of personal video technology, CD-ROMs, and also of the Internet and the Web. But I think it probably is not necessary this time around. While, I’m sure that the so-called “sementic web” could become better from the Semantic Web, it isn’t going to be the primary driver of adoption of the Semantic Web. That’s probably a good thing — the world can just skip over that phase of development and benefit from this technology with both hands so to speak.
In some ways one could think of the Semantic Web as “the world wide database”: it does for the meaning of data records what the Web did for the formatting of documents. But that’s just the beginning. It actually turns documents into richer data records. It turns unstructured data into structured data. All data becomes structured data in fact. The structure is not merely defined structurally, but it is defined semantically.
In other words, it’s not merely that for example, a data record or document can be defined in such a way as to specify that it contains a certain field of data with a certain label at a certain location — it defines what that field of data actually means in an unambiguous, machine understandable way. If all you want is a Web of data, XML is good enough. But if you want to make that data interoperable and machine understandable then you need RDF and OWL — the Semantic Web.
Like any database, the Semantic Web, or rather the myriad mini-semantic-webs that will comprise it, have to overcome the challenge of data integration. Ontologies provide a better way to describe and map data, but the data still has to be described and mapped, and this does take some work. It’s not a magic bullet. The Semantic Web makes it easier to integrate data, but it doesn’t remove the data integration problem altogether. I think the eventual solution to this problem will combine technology with community-driven, folksonomy-oriented approaches.


The Semantic Web will lower the cost of processing knowledge to the same degree as the printing press lowered the cost of distributing knowledge.

Let’s transition now and zoom out to see the bigger picture. The Semantic Web provides technologies for representing and sharing knowledge in new ways. In particular, it makes knowledge more accessible to software, and thus to other people.
Another way of saying this is that it liberates knowledge from particular human minds and organizations — it provides a way to make knowledge explicit, in a standardized format that any application can understand. This is quite significant. Let’s put this in historical perspective.
Before the invention of the printing press, there were two ways to spread knowledge — one was orally, the other was in some symbolic form such as art or written manuscripts. The oral transmission of knowledge had limited range and a high error-rate, and the only way to learn something was to meet someone who knew it and get them to tell you.
The other option, symbolic communication through art and writing, provided a means to communicate knowledge independently of particular people — but it was only feasible to produce a few copies of any given artwork or manuscript because they had to be copied by hand. So the transmission of knowledge was limited to small groups or at least small audiences. Basically, the only way to get access to this knowledge was to be one of the lucky few who could acquire one of its rare physical copies.
The invention of the printing press changed this — for the first time knowledge could be rapidly and cost-effectively mass-produced and mass-distributed. Printing made it possible to share knowledge with ever-larger audiences. This enabled a huge transformation for human knowledge, society, government, technology — really every area of human life was transformed by this innovation.
The World Wide Web made the replication and distribution of knowledge even easier. With the Web you don’t even have to physically print or distribute knowledge anymore, the cost of distribution is effectively zero, and everyone has instant access to everything from anywhere, anytime. That’s a lot better than having to lug around a stack of physical books.
Everyone potentially has whatever knowledge they need with no physical barriers. This has been another huge transformation for humanity — and it has affected every area of human life. Like the printing press, the Web fundamentally changed the economics of knowledge.
The Semantic Web is the next big step in this process — it will make all the knowledge of the human race accessible to software. For the first time, non-human beings (software applications) will be able to start working with human knowledge to do things (for humans) on their own. This is a big leap — a leap like the emergence of a new species, or the symbiosis of two existing species into a new form of life.
The printing press and the Web changed the economics of replicating, distributing and accessing knowledge. The Semantic Web changes the economics of processing knowledge. Unlike the printing press and the Web, the Semantic Web enables knowledge to be processed by non-human things.
In other words, humans don’t have to do all the thinking on their own, they can be assisted by software. Of course we humans have to at least first create the software (until we someday learn to create software that is smart enough to create software too), and we have to create the ontologies necessary for the software to actually understand anything (until we learn to create software that is smart enough to create ontologies too), and we have to add the semantic metadata to our content in various ways (until our software is smart enough to do this for us, which it almost is already).
But once we do the initial work of making the ontologies and software, and adding semantic metadata, the system starts to pick up speed on its own, and over time the amount of work we humans have to do to make it all function decreases. Eventually, once the system has encoded enough knowledge and intelligence, it starts to function without needing much help, and when it does need our help, it will simply ask us and learn from our answers.
This may sound like science fiction today, but in fact a lot of this is already built and working in the lab. The big hurdle is figuring out how to get this technology to mass-market. That is probably as hard as inventing the technology in the first place. But I’m confident that someone will solve it eventually.
Once this happens the economics of processing knowledge will truly be different than it is today. Instead of needing an actual real-live expert, the knowledge of that expert will be accessible to software that can act as their proxy — and anyone will be able to access this virtual expert, anywhere, anytime. It will be like the Web — but instead of just information being accessible, the combined knowledge and expertise of all of humanity will also be accessible, and not just to people but also to software applications.
The Semantic Web literally enables humans to share their knowledge with each other and with machines. It enables the virtualization of human knowledge and intelligence. With respect to machines, in doing this, it will lend machines “minds” in a certain sense — namely in that they will at least be able to correctly interpret the meaning of information and replicate the expertise of experts.
But will these machine-minds be conscious? Will they be aware of the meanings they interpret, or will they just be automatons that are simply following instructions without any awareness of the meanings they are processing?
I doubt that software will ever be conscious, because from what I can tell consciousness (or what might be called the sentient awareness of awareness itself as well as other things that are sensed) is an immaterial phenomenon that is as fundamental as space, time and energy, or perhaps even more fundamental. But this is just my personal opinion after having searched for consciousness through every means possible for decades. It just cannot be found to be something, yet it is definitely and undeniably taking place.
Consciousness can be exemplified through the analogy of space (but unlike space, consciousness has this property of being aware, it’s not a mere lifeless void). We all agree space is there, but nobody can actually point to it somewhere, and nobody can synthesize space. Space is immaterial and fundamental. It is primordial. So is electricity. Nobody really knows what electricity is ultimately, but if you build the right kind of circuit you can channel it and we’ve learned a lot about how to do that.
Perhaps we may figure out how to channel consciousness like we channel electricity with some sort of synthetic device someday, but I think that is highly unlikely. I think if you really want to create consciousness it’s much easier and more effective to just have children. That’s something ordinary mortals can do today with the technology they were born with.
Of course when you have children you don’t really “create” their consciousness, it seems to be there on its own. We don’t really know what it is or where it comes from, or when it arises there. We know very little about consciousness today. Considering that it is the most fundamental human experience of all, it is actually surprising how little we know about it!
In any case, until we truly delve far more deeply into the nature of the mind, consciousness will be barely understood or recognized, let alone explained or synthesized by anyone. In many eastern civilizations there are multi-thousand year traditions that focus quite precisely on the nature of consciousness. The major religions have all universally concluded that consciousness is beyond the reach of science, beyond the reach of concepts, beyond the mind entirely. All those smart people analyzing consciousness for so long, and with such precision, and so many methods of inquiry, may have a point worth listening to.
Whether or not machines will ever actually “know” or be capable of being conscious of that meaning or expertise is a big debate, but at least we can all agree that they will be able to interpret the meaning of information and rules if given the right instructions. Without having to be conscious, software will be able to process semantics quite well — this has already been proven. It’s working today.
While consciousness is and may always be a mystery that we cannot synthesize — the ability for software to follow instructions is an established fact. In its most reduced form, the Semantic Web just makes it possible to provide richer kinds of instructions. There’s no magic to it. Just a lot of details. In fact, to play on a famous line, “it’s semantics all the way down”.
The Semantic Web does not require that we make conscious software. It just provides a way to make slightly more intelligent software. There’s a big difference. Intelligence is simply a form of information processing, for the most part. It does not require consciousness — the actual awareness of what is going on — which is something else altogether.
While highly intelligent software may need to sense its environment and its own internal state and reason about these, it does not actually have to be conscious to do this. These operations are for the most part simple procedures applied vast numbers of times and in complex patterns. Nowhere in them is there any consciousness, nor does consciousness suddenly emerge when suitable levels of complexity are reached.
Consciousness is something quite special and mysterious. And fortunately for humans, it is not necessary for the creation of more intelligent software, nor is it a byproduct of the creation of more intelligent software, in my opinion.

So the real point of the Semantic Web is that it enables the Web to become more intelligent. At first this may seem like a rather outlandish statement, but in fact the Web is already becoming intelligent, even without the Semantic Web.
Although the intelligence of the Web is not very evident at first glance, nonetheless it can be found if you look for it. This intelligence doesn’t exist across the entire Web yet, it only exists in islands that are few and far between compared to the vast amount of information on the Web as a whole. But these islands are growing, and more are appearing every year, and they are starting to connect together. And as this happens the collective intelligence of the Web is increasing.
Perhaps the premier example of an “island of intelligence” is the Wikipedia, but there are many others: the Open Directory, portals such as Yahoo and Google, vertical content providers such as CNET and WebMD, commerce communities such as Craigslist and Amazon, content-oriented communities such as LiveJournal, Slashdot, Flickr and Digg and of course the millions of discussion boards scattered around the Web, and social communities such as MySpace and Facebook. There are also large numbers of private islands of intelligence on the Web within enterprises, for example the many online knowledge and collaboration portals that exist within businesses, non-profits, and governments.
What makes these islands “intelligent” is that they are places where people (and sometimes applications as well) are able to interact with each other to help grow and evolve collections of knowledge. When you look at them close-up they appear to be just like any other Web site, but when you look at what they are doing as a whole, these services are thinking. They are learning, self-organizing, sensing their environments, interpreting, reasoning, understanding, introspecting, and building knowledge. These are the activities of minds, of intelligent systems.
The intelligence of a system such as the Wikipedia exists on several levels — the individuals who author and edit it are intelligent, the groups that help to manage it are intelligent, and the community as a whole — which is constantly growing, changing, and learning — is intelligent.
Flickr and Digg also exhibit intelligence. Flickr’s growing system of tags is the beginnings of something resembling a collective visual sense organ on the Web. Images are perceived, stored, interpreted, and connected to concepts and other images. This is what the human visual system does. Similarly, Digg is a community that collectively detects, focuses attention on, and interprets current news. It’s not unlike a primitive collective analogue to the human facility for situational awareness.
There are many other examples of collective intelligence emerging on the Web. The Semantic Web will add one more form of intelligent actor to the mix — intelligent applications. In the future, after the Wikipedia is connected to the Semantic Web, as well as humans, it will be authored and edited by smart applications that constantly look for new information, new connections, and new inferences to add to it.
Although the knowledge on the Web today is still mostly organized within different islands of intelligence, these islands are starting to reach out and connect together. They are forming trade-routes, connecting their economies, and learning each other’s languages and cultures.
The next step will be for these islands of knowledge to begin to share not just content and services, but also their knowledge — what they know about their content and services. The Semantic Web will make this possible, by providing an open format for the representation and exchange of knowledge and expertise.
When applications integrate their content using the Semantic Web they will also be able to integrate their context, their knowledge — this will make the content much more useful and the integration much deeper.
For example, when an application imports photos from another application it will also be able to import semantic metadata about the meaning and connections of those photos. Everything that the community and application know about the photos in the service that provides the content (the photos) can be shared with the service that receives the content.
Better yet, there will be no need for custom application integration in order for this to happen: as long as both services conform to the open standards of the Semantic Web the knowledge is instantly portable and reusable.
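As a rough illustration of the photo example above, here is a minimal Python sketch in which knowledge is held as (subject, predicate, object) triples, the basic shape of data on the Semantic Web. The identifiers (photo:123, ex:depicts and so on) and the two helper functions are made up for illustration; they are not part of any standard or real service.

```python
# Hypothetical sketch: each service keeps its knowledge as a set of
# (subject, predicate, object) triples. All names here are invented.

def export_knowledge(store, subject):
    """Return every triple the providing service holds about `subject`."""
    return {triple for triple in store if triple[0] == subject}


def import_knowledge(store, triples):
    """Merge imported triples into the receiving service's store."""
    return store | set(triples)


# What the photo service and its community know.
photo_service = {
    ("photo:123", "ex:depicts", "place:GoldenGateBridge"),
    ("photo:123", "ex:takenBy", "person:alice"),
    ("photo:123", "ex:tag", "sunset"),
    ("photo:456", "ex:tag", "cat"),
}

# What the receiving service knows before the import.
blog_service = {("post:9", "ex:author", "person:alice")}

# Importing the photo brings its metadata along with it.
blog_service = import_knowledge(
    blog_service, export_knowledge(photo_service, "photo:123")
)
```

Because both sides agree on the triple shape, no photo-specific integration code is needed: the receiving service simply merges what it is given.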
Today much of the real value of the Web (and in the world) is still locked away in the minds of individuals, the cultures of groups and organizations, and application-specific data-silos. The emerging Semantic Web will begin to unlock the intelligence in these silos by making the knowledge and expertise they represent more accessible and understandable.
It will free knowledge and expertise from the narrow confines of individual minds, groups and organizations, and applications, and make them not only more interoperable, but more portable. It will be possible for example for a person or an application to share everything they know about a subject of interest as easily as we share documents today. In essence the Semantic Web provides a common language (or at least a common set of languages) for sharing knowledge and intelligence as easily as we share content today.
The Semantic Web also provides standards for searching and reasoning more intelligently. The SPARQL query language enables any application to ask for knowledge from any other application that speaks SPARQL. Instead of mere keyword search, this enables semantic search. Applications can search for specific types of things that have particular attributes and relationships to other things.
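To make the difference from keyword search concrete, the sketch below matches a small graph pattern against a handful of triples in plain Python. It is only a toy stand-in for a real SPARQL engine; the data, the ex: names and the match and query helpers are assumptions made for illustration. The SPARQL query it imitates is shown as a string for comparison and is not executed.

```python
# The kind of SPARQL query being imitated (not executed here; the ex:
# vocabulary is hypothetical).
SPARQL_EQUIVALENT = """
PREFIX ex: <http://example.org/ns#>
SELECT ?photo WHERE {
  ?photo a ex:Photo ;
         ex:depicts ?place .
  ?place ex:locatedIn ex:SanFrancisco .
}
"""


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding


def query(triples, patterns):
    """Return all variable bindings that satisfy every pattern."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                result = match(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings


triples = [
    ("ex:photo1", "rdf:type", "ex:Photo"),
    ("ex:photo1", "ex:depicts", "ex:GoldenGateBridge"),
    ("ex:GoldenGateBridge", "ex:locatedIn", "ex:SanFrancisco"),
    ("ex:photo2", "rdf:type", "ex:Photo"),
    ("ex:photo2", "ex:depicts", "ex:EiffelTower"),
    ("ex:EiffelTower", "ex:locatedIn", "ex:Paris"),
]

results = query(triples, [
    ("?photo", "rdf:type", "ex:Photo"),
    ("?photo", "ex:depicts", "?place"),
    ("?place", "ex:locatedIn", "ex:SanFrancisco"),
])
photos = [binding["?photo"] for binding in results]
```

The query asks for a type of thing (a photo) with a particular relationship to another thing (a place in San Francisco), which no keyword match on the word “photo” could express.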
In addition, standards such as SWRL provide formalisms for representing and sharing axioms, or rules, as well. Rules are a particular kind of knowledge — and there is a lot of it to represent and share, for example procedural knowledge, and logical structures about the world. An ontology provides a means to describe the basic entities, their attributes and relations, but rules enable you to also make logical assertions and inferences about them. Without going into a lot of detail about rules and how they work here, the important point to realize is that they are also included in the framework. All forms of knowledge can be represented by the Semantic Web.
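A small sketch may help show what a rule adds on top of plain facts. The Python below applies one rule, “if x is located in y and y is located in z, then x is located in z”, repeatedly until nothing new can be inferred. This is not SWRL syntax and not how a production rule engine works; the function name and the ex: vocabulary are assumptions chosen for illustration.

```python
def apply_transitivity(triples, predicate):
    """Forward-chain the rule (x p y) and (y p z) => (x p z) to a fixed point."""
    facts = set(triples)
    while True:
        inferred = {
            (x, predicate, z)
            for (x, p1, y1) in facts if p1 == predicate
            for (y2, p2, z) in facts if p2 == predicate and y1 == y2
        }
        if inferred <= facts:
            # Nothing new was derived, so the rule has been fully applied.
            return facts
        facts |= inferred


asserted = {
    ("ex:GoldenGateBridge", "ex:locatedIn", "ex:SanFrancisco"),
    ("ex:SanFrancisco", "ex:locatedIn", "ex:California"),
    ("ex:California", "ex:locatedIn", "ex:USA"),
}

all_facts = apply_transitivity(asserted, "ex:locatedIn")
newly_inferred = all_facts - asserted
```

Only three facts were stated, yet the rule lets an application conclude, for example, that the bridge is in the USA without anyone having asserted it.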
So far in this article, I’ve spent a lot of time talking about plumbing — the pipes, fluids, valves, fixtures, specifications and tools of the Semantic Web. I’ve also spent some time on illustrations of how it might be useful in the very near future to individuals, groups and organizations. But where is it heading after this? What is the long-term potential of this and what might it mean for the human race on a historical time-scale?
For those of you who would prefer not to speculate, stop reading here. For the rest of you, I believe that the true significance of the Semantic Web, on a long-term time scale is that it provides an infrastructure that will enable the evolution of increasingly sophisticated forms of collective intelligence.
Ultimately this will result in the Web itself becoming more and more intelligent, until one day the entire human species together with all of its software and knowledge will function as something like a single worldwide distributed mind — a global mind.
Just like the mind of a single human individual, the global mind will be very chaotic, yet out of that chaos will emerge cohesive patterns of thought and decision. Just like in an individual human mind, there will be feedback between different levels of order: from individuals to groups to systems of groups and back down from systems of groups to groups to individuals. Because of these feedback loops the system will adapt to its environment, and to its own internal state.
The coming global mind will collectively exhibit forms of cognition and behavior that are the signs of higher forms of intelligence. It will form and react to concepts about its “self”, just like an individual human mind. It will learn and introspect and explore the universe. The thoughts it thinks may sometimes be too big for any one person to understand or even recognize; they will be composed of shifting patterns of millions of pieces of knowledge.
Every person on the Internet will be a part of the global mind. And collectively they will function as its consciousness. I do not believe some new form of consciousness will suddenly emerge when the Web passes some threshold of complexity. I believe that humanity IS the consciousness of the Web and until and unless we ever find a way to connect other lifeforms to the Web, or we build conscious machines, humans will be the only form of consciousness of the Web.
When I say that humans will function as the consciousness of the Web I mean that we will be the things in the system that know. The knowledge of the Semantic Web is what is known, but what knows that knowledge has to be something other than knowledge. A thought is knowledge, but what knows that thought is not knowledge, it is consciousness, whatever that is. We can figure out how to enable machines to represent and use knowledge, but we don’t know how to make them conscious, and we don’t have to. Because we are already conscious.
As we’ve discussed earlier in this article, we don’t need conscious machines, we just need more intelligent machines. Intelligence — at least basic forms of it — does not require consciousness. It may be the case that the very highest forms of intelligence require or are capable of consciousness.
This may mean that software will never achieve the highest levels of intelligence and probably guarantees that humans (and other conscious things) will always play a special role in the world; a role that no computer system will be able to compete with. We provide the consciousness to the system. There may be all sorts of other intelligent, non-conscious software applications and communities on the Web; in fact there already are, with varying degrees of intelligence. But individual humans, and groups of humans, will be the only consciousness on the Web.
Although the software of the Semantic Web will not be conscious, we can say that the system as a whole contains or is conscious to the extent that human consciousnesses are part of it. And like most conscious entities, it may also start to be self-conscious.
If the Web ever becomes a global mind as I am predicting, will it have a “self”? Will there be a part of the Web that functions as its central self-representation? Perhaps someone will build something like that someday, or perhaps it will evolve. Perhaps it will function by collecting reports from applications and people in real-time — a giant collective zeitgeist.
In the early days of the Web portals such as Yahoo! provided this function — they were almost real-time maps of the Web and what was happening. Today making such a map is nearly impossible, but services such as Google Zeitgeist at least attempt to provide approximations of it. Perhaps through random sampling it can be done on a broader scale.
My guess is that the global mind will need a self-representation at some point. All forms of higher intelligence seem to have one. It’s necessary for understanding, learning and planning. It may evolve at first as a bunch of competing self-representations within particular services or subsystems within the collective. Eventually they will converge or at least narrow down to just a few major perspectives. There may also be millions of minor perspectives that can be drilled down into for particular viewpoints from these top-level “portals”.
The collective self will function much like the individual self, as a mirror of sorts. Its function is simply to reflect. As soon as it exists the entire system will make a shift to a greater form of intelligence, because for the first time it will be able to see itself, to measure itself, as a whole. It is at this phase transition, when the first truly global collective self-mirroring function evolves, that we can say that the transition from a bunch of cooperating intelligent parts to a new intelligent whole in its own right has taken place.
I think that the collective self, even if it converges on a few major perspectives that group and summarize millions of minor perspectives, will be community-driven and highly decentralized. At least I hope so, because the self-concept is the most important part of any mind and it should be designed in a way that protects it from being manipulated for nefarious ends.

We hope the global brain will not have the collective equivalent of psycho-social disorders.

On the other hand, there are times when a little bit of adjustment or guidance is warranted — just as in the case of an individual mind, the collective self doesn’t merely reflect, it effectively guides the interpretation of the past and present, and planning for the future.
One way to change the direction of the collective mind is to change what appears in the mirror of the collective self. This is a form of programming on a vast scale. When this programming is dishonest or used for negative purposes it is called “propaganda”, but there are cases where it can be done for beneficial purposes as well. An example of this today is public service advertising and educational public television programming. All forms of mass media today are in fact collective social programming. When you realize this, it is not surprising that our present culture is violent and messed up: just look at our mass media!
In terms of the global mind, ideally one would hope that it would be able to learn and improve over time. One would hope that it would not have the collective equivalent of psycho-social disorders.
To facilitate this, just like any form of higher intelligence, it may need to be taught, and even parented a bit. It also may need a form of therapy now and then. These functions could be provided by the people who participate in it. Again, I believe that humans serve a vital and irreplaceable role in this process.
Now how is this all going to unfold? I believe that there are a number of key evolutionary steps that the Semantic Web will go through as the Web evolves towards a true global mind:
1. Representing individual knowledge. The first step is to make individuals’ knowledge accessible to themselves. As individuals become inundated with increasing amounts of information, they will need better ways of managing it, keeping track of it, and re-using it. They will (or already do) need “personal knowledge management”.
2. Connecting individual knowledge. Next, once individual knowledge is represented, it becomes possible to start connecting it and sharing it across individuals. This stage could be called “interpersonal knowledge management”.
3. Representing group knowledge. Groups of individuals also need ways of collectively representing their knowledge, making sense of it, and growing it over time. Wikis and community portals are just the beginning. The Semantic Web will take these “group minds” to the next level — it will make the collective knowledge of groups far richer and more re-usable.
4. Connecting group knowledge. This step is analogous to connecting individual knowledge. Here, groups become able to connect their knowledge together to form larger collectives, and it becomes possible to more easily access and share knowledge between different groups in very different areas of interest.
5. Representing the knowledge of the entire web. This stage — what might be called “the global mind” — is still in the distant future, but at this point in the future we will begin to be able to view, search, and navigate the knowledge of the entire web as a whole. The distinction here is that instead of a collection of interoperating but separate intelligent applications, individuals and groups, the entire web itself will begin to function as one cohesive intelligent system. The crucial step that enables this to happen is the formation of a collective self-representation. This enables the system to see itself as a whole for the first time.
I believe the global mind will be organized mainly in the form of bottom-up and lateral, distributed emergent computation and community — but it will be facilitated by certain key top-down services that help to organize and make sense of it as a whole. I think this future Web will be highly distributed, but will have certain large services within it as well — much like the human brain itself, which is organized into functional sub-systems for processes like vision, hearing, language, planning, memory, learning, etc.
As the Web gets more complex there will come a day when nobody understands it anymore — after that point we will probably learn more about how the Web is organized by learning about the human mind and brain — they will be quite similar in my opinion. Likewise we will probably learn a tremendous amount about the functioning of the human brain and mind by observing how the Web functions, grows and evolves over time, because they really are quite similar in at least an abstract sense.
The internet and its software and content is like a brain, and the state of its software and the content is like its mind. The people on the Internet are like its consciousness. Although these are just analogies, they are actually useful, at least in helping us to envision and understand this complex system.
As the field of general systems theory has shown us in the past, systems at very different levels of scale tend to share the same basic characteristics and obey the same basic laws of behavior. Not only that, but evolution tends to converge on similar solutions for similar problems. So these analogies may be more than just rough approximations, they may be quite accurate in fact.
The future global brain will require tremendous computing and storage resources — far beyond even what Google provides today. Fortunately as Moore’s Law advances the cost of computing and storage will eventually be low enough to do this cost-effectively.
However even with much cheaper and more powerful computing resources it will still have to be a distributed system. I doubt that there will be any central node because quite simply no central solution will be able to keep up with all the distributed change taking place. Highly distributed problems require distributed solutions and that is probably what will eventually emerge on the future Web.
Someday perhaps it will be more like a peer-to-peer network, comprised of applications and people who function sort of like the neurons in the human brain. Perhaps they will be connected and organized by higher-level super-peers or super-nodes which bring things together, make sense of what is going on and coordinate mass collective activities.
But even these higher-level services will probably have to be highly distributed as well. It really will be difficult to draw boundaries between parts of this system, they will all be connected as an integral whole.
In fact it may look very much like a grid computing architecture — in which all the services are dynamically distributed across all the nodes such that at any one time any node might be working on a variety of tasks for different services. My guess is that because this is the simplest, most fault-tolerant, and most efficient way to do mass computation, it is probably what will evolve here on Earth.

Compared to the global mind, we are an early form of hominid.
Where we are today in this evolutionary process is perhaps equivalent to the rise of early forms of hominids: perhaps Australopithecus or Cro-Magnon, or maybe the first Homo sapiens. Compared to early man, the global mind is like the rise of 21st century mega-cities. A lot of evolution has to happen to get there. But it probably will happen, unless humanity self-destructs first, which I sincerely hope we somehow manage to avoid. And this brings me to a final point. This vision of the future global mind is highly technological; however, I don’t think we’ll ever accomplish it without a new focus on ecology.
Ecology probably conjures up images of hippies and biologists, or maybe hippies who are biologists, or at least organic farmers, for most people, but in fact it is really the science of living systems and how they work. And any system that includes living things is a living system.
This means that the Web is a living system and the global mind will be a living system too. As a living system, the Web is an ecosystem and is also connected to other ecosystems. In short, ecology is absolutely essential to making sense of the Web, let alone helping to grow and evolve it.
In many ways the Semantic Web and the collective minds, and the global mind, that it enables, can be seen as an ecosystem of people, applications, information and knowledge. This ecosystem is very complex, much like natural ecosystems in the physical world. An ecosystem isn’t built, it’s grown, and evolved.
And similarly the Semantic Web, and the coming global mind, will not really be built, they will be grown and evolved. The people and organizations that end up playing a leading role in this process will be the ones that understand and adapt to the ecology most effectively.
In my opinion ecology is going to be the most important science and discipline of the 21st century — it is the science of healthy systems. What nature teaches us about complex systems can be applied to every kind of system — and especially the systems we are evolving on the Web. In order to ever have a hope of evolving a global mind, and all the wonderful levels of species-level collective intelligence that it will enable, we have to not destroy the planet before we get there. Ecology is the science that can save us, not the Semantic Web (although perhaps by improving collective intelligence, it can help).
Ecology is essentially the science of community — whether biological, technological or social. And community is a key part of the Semantic Web at every level: communities of software, communities of people, and communities of groups. In the end the global mind is the ultimate human community. It is the reward we get for finally learning how to live together in peace and balance with our environment.
The point of this discussion of the relevance of ecology to the future of the Web, and my vision for the global mind, is that I think that it is clear that if the global mind ever emerges it will not be in a world that is anything like what we might imagine. It won’t be like the Borg in Star Trek, it won’t be like living inside of a machine. Humans won’t be relegated to the roles of slaves or drones. Robots won’t be doing all the work. The entire world won’t be coated with silicon. We won’t all live in a virtual reality. It won’t be one of these technological dystopias.
In fact, I think the global mind can only come to pass in a much greener, more organic, healthier, more balanced and sustainable world. Because it will take a long time for the global mind to emerge, if humanity doesn’t figure out how to create that sort of a world, it will wipe itself out sooner or later, but certainly long before the global mind really happens. Not only that, but the global mind will be smart by definition, and hopefully this intelligence will extend to helping humanity manage its resources, civilizations and relationships to the natural environment.
The global mind also needs a global body so to speak. It’s not going to be an isolated homunculus floating in a vat of liquid that replaces the physical world! It will be a smart environment that ubiquitously integrates with our physical world. We won’t have to sit in front of computers or deliberately logon to the network to interact with the global mind. It will be everywhere.
The global mind will be physically integrated into furniture, houses, vehicles, devices, artworks, and even the natural environment. It will sense the state of the world and different ecosystems in real-time and alert humans and applications to emerging threats. It will also be able to allocate resources intelligently to compensate for natural disasters, storms, and environmental damage, much in the way that the air traffic control system allocates and manages airplane traffic. It won’t do it all on its own; humans and organizations will be a key part of the process.
Someday the global mind may even be physically integrated into our bodies and brains, even down to the level of our DNA. It may in fact learn how to cure diseases and improve the design of the human body, extending our lives, sensory capabilities, and cognitive abilities.
We may be able to interact with it by thought alone. At that point it will become indistinguishable from a limited form of omniscience, and everyone may have access to it. Although it will only extend to wherever humanity has a presence in the universe, within that boundary it will know everything there is to know, and everyone will be able to know any of it they are interested in.

Let’s enable a better world! Image courtesy Dark Project Studios.
By enabling greater forms of collective intelligence to emerge we really are helping to make a better world, a world that learns and hopefully understands itself well enough to find a way to survive. We’re building something that someday will be wonderful — far greater than any of us can imagine. We’re helping to make the species and the whole planet more intelligent. We’re building the tools for the future of human community. And that future community, if it ever arrives, will be better, more self-aware, more sustainable than the one we live in today.
I should also mention that knowledge is power, and power can be used for good or evil. The Semantic Web makes knowledge more accessible. This puts more power in the hands of the many, not just the few. As long as we stick to this vision — we stick to making knowledge open and accessible, using open standards, in as distributed a fashion as we can devise, then the potential power of the Semantic Web will be protected against being coopted or controlled by the few at the expense of the many. This is where technologists really have to be socially responsible when making development decisions. It’s important that we build a more open world, not a less open world. It’s important that we build a world where knowledge, integration and unification are balanced with respect for privacy, individuality, diversity and freedom of opinion.
But I am not particularly worried that the Semantic Web and the future global mind will be the ultimate evil — I don’t think it is likely that we will end up with a system of total control dominated by evil masterminds with powerful Semantic Web computer systems to do their dirty work. Statistically speaking, criminal empires don’t last very long because they are run by criminals who tend to be very short-sighted and who also surround themselves with other criminals who eventually unseat them, or they self-destruct. It’s possible that the Semantic Web, like any other technology, may be used by the bad guys to spy on citizens, manipulate the world, and do evil things. But only in the short-term.
In the long-term either our civilization will get tired of endless successions of criminal empires and realize that the only way to actually survive as a species is to invent a form of government that is immune to being taken over by evil people and organizations, or it will self-destruct. Either way, that is a hurdle we have to cross before the global mind that I envision can ever come about. Many civilizations came before ours, and it is likely that ours will not be the last one on this planet. It may in fact be the case that a different form of civilization is necessary for the global mind to emerge, and is the natural byproduct of the emergence of the global mind.
We know that the global mind cannot emerge anytime soon, and therefore, if it ever emerges then by definition it must be in the context of a civilization that has learned to become sustainable. A long-term sustainable civilization is a non-evil civilization. And that is why I think it is a safe bet to be so optimistic about the long-term future of this trend.
Entrepreneurs See a Web Guided by Common Sense by John Markoff, New York Times, November 12, 2006.


Trends in the Living Networks

Ross Dawson’s Trends in the Living Networks blog offers high-level commentary on developments in our intensely networked world, and how it is coming to life. The blog is primarily intended for a general business audience, in identifying critical technology, social, and business trends and their implications.

January 21, 2008

See our latest Trend Map! What to expect in 2008 and beyond….

Nowandnext.com and Future Exploration Network have once again collaborated to create a trend map for 2008 and beyond.

Our Trend Map for 2007+ had a major impact, with over 40,000 downloads, fantastic feedback (“The World’s Best Trend Map. Ever.” “I got shivers” “Amazing” “Fascinating” “Magnifique” etc. etc.), and inspired several other trend maps including Information Architects’ first map of web trends.

While last year’s map was based on the London tube map, the 2008 map is derived from Shanghai’s underground routes. Limited to just five lines, the map uncovers key trends across Society, Politics, Demographics, Economy, and Technology.


Posted by Ross Dawson at 2:06 AM






December 11, 2003

The Metaweb: Beyond Weblogs

The Metaweb is not just the set of all Weblog posts, it is much more than that. As much as I love to blog I think many old-timers would have us view the entire Net through “blog colored glasses.” But Weblog postings are just one kind of microcontent. There will be many others.

The Metaweb is the set of all microcontent on the Web. See my previous articles for a definition of microcontent. The structure of a Weblog posting is typically defined by RSS (and soon Atom): it comprises a few metatags and some text making up a headline, a date, an author, a summary, and a few other sections of content depending on what flavor of RSS one uses. This structure is sufficient for describing basic Web resources such as a blog posting, a news article, a web page perhaps. But it is woefully inadequate for describing other types of things such as events, people, places, products. RSS and its descendants will be used for much more than just syndicating weblog postings and web site headlines; they will be used to syndicate event listings, product listings, classified ads, reviews, pictures, audio tracks, contact records, and many other types of information. But to do that RSS must be extended to provide metadata for those types of things.
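To picture what such an extension might look like, here is a minimal Python sketch that builds a single RSS-style item with the standard library and adds two event-specific fields in a separate XML namespace. The namespace URI, the ev prefix, the startDate and location element names and the sample values are all hypothetical, invented for this example rather than taken from any published RSS module.

```python
import xml.etree.ElementTree as ET

# Hypothetical extension namespace for event metadata (an assumption,
# not a real published RSS module).
EV = "http://example.org/ns/event#"
ET.register_namespace("ev", EV)

item = ET.Element("item")

# The basic fields a weblog posting typically carries.
ET.SubElement(item, "title").text = "Semantic Web meetup"
ET.SubElement(item, "link").text = "http://example.org/events/42"
ET.SubElement(item, "description").text = "An evening of talks on microcontent."

# Extended metadata that plain RSS has no slot for.
ET.SubElement(item, f"{{{EV}}}startDate").text = "2004-01-15T18:00:00Z"
ET.SubElement(item, f"{{{EV}}}location").text = "San Francisco"

xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

A reader that knows nothing about the ev namespace can still show the title and description; one that does can place the event on a calendar or a map.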

To be fair, various versions of RSS provide for extensions to be created, but what good is creating and using a bunch of custom metatags if nothing out there can understand them? The problem with all these specs is that while they allow for extensibility they provide little in the way of interoperability of extensions. Furthermore, the current crop of RSS readers provide little or no support for doing anything with extended metadata. This is where I think the Semantic Web becomes increasingly important. It provides a means to rigorously define systems of metatags (using ontologies) such that they can be formally understood by any software that is enabled to use ontologies. So if I ship you some microcontent and it contains custom metadata, your software can see what ontology I am using and via that ontology it can correctly interpret my unique metatags. In any case, whether or not you recognize the value of using ontologies, the important point is that the Metaweb is going to contain innumerable varieties of microcontent about all sorts of things.
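The idea of interpreting an unfamiliar metatag via its ontology can be sketched in a few lines of Python. Here the “ontology” is just a dictionary that says which broader property each custom tag specializes and what datatype its values have; real ontologies (in RDF Schema or OWL, for instance) are far richer. Every URI, the dictionary layout and the interpret function are assumptions made for illustration.

```python
# Hypothetical ontology shipped (or linked) by the sender of the microcontent.
ONTOLOGY = {
    "http://example.org/shop#salePrice": {
        "subPropertyOf": "http://example.org/core#price",
        "datatype": float,
    },
    "http://example.org/core#price": {
        "subPropertyOf": None,
        "datatype": float,
    },
}

# Terms the receiving software already knows how to handle.
KNOWN_TERMS = {"http://example.org/core#price"}


def interpret(tag, raw_value, ontology, known_terms):
    """Map a custom tag to a term the receiver understands, if the ontology allows."""
    term = tag
    # Walk up the sub-property chain until a known term is reached.
    while term is not None and term not in known_terms:
        term = ontology.get(term, {}).get("subPropertyOf")
    if term is None:
        return None  # The ontology gives no route to anything understood.
    datatype = ontology.get(tag, {}).get("datatype", str)
    return term, datatype(raw_value)


understood = interpret(
    "http://example.org/shop#salePrice", "19.99", ONTOLOGY, KNOWN_TERMS
)
not_understood = interpret(
    "http://example.org/other#mood", "happy", ONTOLOGY, KNOWN_TERMS
)
```

The receiver has never seen salePrice before, but the ontology tells it that a sale price is a kind of price, so it can treat the value accordingly. A tag with no such definition stays opaque.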

While weblog postings and resulting conversations will continue to be a big part of the early Metaweb, much of the future Metaweb will be comprised of non-conversational microcontent such as database records of one type or another — most likely product catalog listings, classified ads, content abstracts, calendar events, and the like. The key here is that the concept of “microcontent” is very far reaching and ultimately much more significant than any particular application of it such as blogging. While blogging is certainly the driver of early Metaweb adoption, in the long run it will probably be corporate microcontent that creates the big business models and long-term growth and adoption of the Metaweb by the mainstream.

I should also point out that the concept of “syndication” is not essential to the Metaweb. With all due respect to my friend Doc Searls, the fact that microcontent can be syndicated is not the key contribution of microcontent, although it is useful. The greatest benefit of microcontent will ultimately be the widespread use of metadata to frame content. That’s why I use “Meta” in the term “Metaweb.” As more metadata is added to the Web, the data becomes “smarter” so applications don’t have to work so hard. This will result in better searches, better filtering, more targeted publishing and marketing, more productive information management, easier knowledge discovery, better decision-making and collaboration, and many other benefits.

There will be much Metaweb content that will not necessarily be syndicated — instead such microcontent will reside in databases, on desktops and enterprise applications, and will be embedded in Web sites. By virtue of its metadata, applications such as search engines, web scrapers, RSS readers, Web browsers, and intelligent agents will be able to discover this microcontent and recognize its structure. Whether or not the microcontent is pulled down automatically (e.g. “syndicated”) to an RSS reader or to another site is not so important — what is important is that the metadata exists and is useful.

At present the metaweb hasn’t “crossed the chasm,” but within 3 to 5 years it will. And then we will see new uses of the microcontent paradigm spreading virally through businesses, universities, communities, governments — like wildfire — like Web1, the original Web from which it was born.

At my company, Radar Networks, we are developing a new platform for the Metaweb that will provide the most powerful toolset available for publishing and subscribing to microcontent. We are still in stealth mode.


December 04, 2003

The Birth of “The Metaweb” — The Next Big Thing — What We are All Really Building

Originally developed at Netscape, a new technology called RSS has risen from the dead to ignite the next evolution of the Net. RSS represents the first step in a major new paradigm shift — the birth of “The Metaweb.” The Metaweb is the next evolution of the Web — a new layer of the Web in fact — based on “microcontent.” Microcontent is a new way to publish content that is more granular, modular and portable than traditional content such as files, Web pages, data records, etc.

On the existing Web, information is typically published in large chunks — “sites” comprised of “pages.” In the coming microcontent-driven Metaweb, information will be published in discrete, semantically defined “postings” that can represent an entire site, a page, a part of a page, or an individual idea, picture, file, message, fact, opinion, note, data record, or comment.

Metaweb postings can be hosted like Web pages in particular places and/or they can be shipped around the Net using RSS in a publish-subscribe manner. Webloggers, for example, create microcontent every time they post to their blogs. Each blog posting is a piece of microcontent. End-users can subscribe to get particular pieces of microcontent they are interested in by signing up to track “RSS channels” using “RSS Readers” that poll those channels periodically for new pieces of microcontent.

RSS resembles traditional “publish and subscribe” except that it scales to the entire Internet and is based on new XML open standards. Unlike “push technology,” RSS and the microcontent model are based instead on “pull” — just like the Web itself — RSS Readers periodically poll sources for new RSS content and pull it down instead of having it pushed at them. Thus, unlike push technology, with RSS the control is in the hands of opt-in end-users. These differences, combined with RSS’s use of open HTTP protocols and XML/RDF formats, have led to rapid adoption and viral spread of RSS technologies — principally within the Weblogging and information services communities. But that’s about to change.

RSS is poised to become The Next Big Thing. There are many reasons for this — for one thing, e-mail is no longer useful as a content distribution, alerting and marketing medium. E-Mail’s rapidly eroding signal-to-noise ratio is leading content providers and end-users to seek alternative, more mutually-effective avenues for interacting with one another. Another force that is driving RSS adoption is the rise of Weblogging.

My projections indicate that within 5 years almost every Weblog will provide an RSS channel of its content. In coming years a large percentage of consumers and professionals are expected to begin blogging — Weblogs are the new homepages; everyone should have one.
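
The “pull” model described above, in which a reader periodically polls a channel and keeps only what it has not seen before, can be sketched in a few lines of Python. This is a simplified illustration: the channel is an in-memory list rather than a feed fetched over HTTP, and the class name, item identifiers, and titles are hypothetical.

```python
class RssReader:
    """Minimal pull-style reader: it asks the source for items, nothing is pushed."""

    def __init__(self, fetch_channel):
        # fetch_channel is any callable returning a list of (guid, title) pairs;
        # in a real reader it would download and parse the feed.
        self.fetch_channel = fetch_channel
        self.seen = set()

    def poll(self):
        """Fetch the channel once and return titles not seen on earlier polls."""
        new_items = []
        for guid, title in self.fetch_channel():
            if guid not in self.seen:
                self.seen.add(guid)
                new_items.append(title)
        return new_items


# Simulated channel that gains a posting between polls.
channel = [("post-1", "First posting")]
reader = RssReader(lambda: list(channel))

first = reader.poll()                      # picks up the existing posting
channel.append(("post-2", "Second posting"))
second = reader.poll()                     # picks up only the new posting
third = reader.poll()                      # nothing new since the last poll

print(first, second, third)
```

Because the reader decides when to call `poll`, the subscriber stays in control of what arrives and how often, which is the opt-in property the text contrasts with push technology.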

Within 5 years, if RSS grows as I expect, we will see it supplant e-mail as the primary alerting and marketing channel for “B2C” communications. To put it simply, businesses and their customers both benefit from interacting via RSS instead of e-mail for “1-way” interactions such as content publishing, notifications, etc. Based on that, I predict that every medium to large corporate Web site and every major publication and wire service, as well as an increasing number of enterprise applications and services will publish and subscribe to numerous RSS channels. Already we see the beginning of this with numerous major organizations embracing RSS from IBM, Microsoft and Sun to The New York Times, ABC News and WIRED to name a few examples.

So, 30 million bloggers at 1 feed each + 2 million small, medium and large businesses at an average of 20 feeds each + 2 million web sites and information services providers at an average of 10 feeds each + 10 major portals and online services at an average of 1,000,000 feeds each + 100 million desktop and enterprise applications producing 1 feed each ... you can see where this is headed. To be conservative, let’s assume that the numbers turn out to be less than what I project — that is still 50 million to 100 million feeds online within 5 years. And that’s a growth curve that looks a lot like the first wave of the Web. Just as everyone “had to have” an e-mail account and a Web page, they will also soon need and want to have an RSS reader and their own RSS channel. That’s a big opportunity.
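
The projection above is a simple sum of products, and it is worth writing out because the total is not stated. A short Python sketch, using only the figures given in the paragraph (the variable names are mine):

```python
# (number of publishers, average feeds per publisher), as projected in the text.
FEED_SOURCES = {
    "bloggers": (30_000_000, 1),
    "small, medium and large businesses": (2_000_000, 20),
    "web sites and information services": (2_000_000, 10),
    "major portals and online services": (10, 1_000_000),
    "desktop and enterprise applications": (100_000_000, 1),
}

# Feeds contributed by each group, then the grand total.
feeds = {name: count * per for name, (count, per) in FEED_SOURCES.items()}
total = sum(feeds.values())

for name, n in feeds.items():
    print(f"{name}: {n:,}")
print(f"total: {total:,}")
```

The full projection comes to 200 million feeds, so the “conservative” range of 50 million to 100 million corresponds to a quarter to a half of it.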

But RSS is just the first step in the evolution of the Metaweb. The next step will be the Semantic Web. RSS begins the process of getting end-users and content providers to use metadata. The next step is to make that metadata more interoperable, more understandable, more useful. This takes place using ontologies and emerging tools for working with “semantic metadata” — metadata for which formally defined semantics exist. Just providing metadata is not enough — the meaning of that metadata has to be defined somewhere in a formal, rigorous manner that computers can understand automatically. The Semantic Web transforms data and metadata from “dumb data” to “smart data.” When I say “smart data” I mean data that carries increased amounts of information about its own meaning, structure, purpose, context, policies, etc. The data is “smart” because the knowledge about the data moves with the data, instead of being locked in an application. So the Semantic Web is a web of “smart data” — a Web of semantically defined metadata. The Semantic Web is already evolving naturally from the emerging confluence of Blogs, Wikis, RSS feeds, RDF tools, ontology languages such as OWL, rich ontologies, inferencing engines, triplestores, and a growing range of new tools and services for working with metadata. But the key is that we don’t have to wait for the Semantic Web for metadata to be useful. The Metaweb is already happening. RSS is already useful and it’s happening now.

As I write this on the leading edge of 2004 — a little more than ten years after the Web began — I am aware that we are witnessing the birth of the next generation of the Net. I remember watching the birth of the HTML-Web as a technology analyst/editor at Individual, Inc in the early 90’s. My job was to manage a collection of intelligent agents that scanned hundreds of newswires and content archives to produce filtered strategic newsfeeds for major customers. My beat was “emerging technologies” — every night I had to Q-A the output of my agents by reading around 1400 articles and press releases about new technologies in a 4 hour period.

It was in the midst of that firehose of information that I noticed the birth of HTML and HTTP, the rise of early hypertext systems, the first browsers — and I realized that “something big” was afoot. At the beginning the pattern wasn’t evident from reading individual articles — only by reading 1400 articles a night could one see the early meme-signatures of the HTML-Web flashing across hundreds of media outlets like a sequence of blinking Christmas-tree lights. That recognition led me to leave Individual and co-found EarthWeb in 1994 — because I wanted to be a part of building the Web, not just watching it! Today, just as in 1994 with HTML, the situation is much the same, and I am back to building again — Radar Networks, our stealth venture, is developing a new platform for the Metaweb that will open up a range of new capabilities for sharing metadata.

The baby Metaweb has already been born, but so far only the early-adopters and Web-veterans have noticed it. To those who “were there” the first time around there is a recognizable feeling of momentum — of “something big” happening again. It’s going to be a fun ride!


1. A new syndication format based on RSS is being proposed as an open standard. Called Atom, it promises to provide a vendor-neutral, extensible format for weblogging.

2. Why the term “Metaweb”? A reader suggested that the prefix “Meta” was too technical for consumers. I don’t think so, however — after all, they use the term “Internet” without any problem, and that is not exactly a consumer-friendly word when you think about its meaning and origin. The concept of the Metaweb is that it is a new layer of the existing Web; that’s why the name should really contain “Web” in it.



How to Be Silicon Valley

May 2006

(This essay is derived from a keynote at Xtech.)

Could you reproduce Silicon Valley elsewhere, or is there something unique about it?

It wouldn’t be surprising if it were hard to reproduce in other countries, because you couldn’t reproduce it in most of the US either. What does it take to make a silicon valley even here?

What it takes is the right people. If you could get the right ten thousand people to move from Silicon Valley to Buffalo, Buffalo would become Silicon Valley. [1]

That’s a striking departure from the past. Up till a couple decades ago, geography was destiny for cities. All great cities were located on waterways, because cities made money by trade, and water was the only economical way to ship.

Now you could make a great city anywhere, if you could get the right people to move there. So the question of how to make a silicon valley becomes: who are the right people, and how do you get them to move?

Two Types

I think you only need two kinds of people to create a technology hub: rich people and nerds. They’re the limiting reagents in the reaction that produces startups, because they’re the only ones present when startups get started. Everyone else will move.

Observation bears this out: within the US, towns have become startup hubs if and only if they have both rich people and nerds. Few startups happen in Miami, for example, because although it’s full of rich people, it has few nerds. It’s not the kind of place nerds like.

Whereas Pittsburgh has the opposite problem: plenty of nerds, but no rich people. The top US Computer Science departments are said to be MIT, Stanford, Berkeley, and Carnegie-Mellon. MIT yielded Route 128. Stanford and Berkeley yielded Silicon Valley. But Carnegie-Mellon? The record skips at that point. Lower down the list, the University of Washington yielded a high-tech community in Seattle, and the University of Texas at Austin yielded one in Austin. But what happened in Pittsburgh? And in Ithaca, home of Cornell, which is also high on the list?

I grew up in Pittsburgh and went to college at Cornell, so I can answer for both. The weather is terrible, particularly in winter, and there’s no interesting old city to make up for it, as there is in Boston. Rich people don’t want to live in Pittsburgh or Ithaca. So while there are plenty of hackers who could start startups, there’s no one to invest in them.

Not Bureaucrats

Do you really need the rich people? Wouldn’t it work to have the government invest in the nerds? No, it would not. Startup investors are a distinct type of rich people. They tend to have a lot of experience themselves in the technology business. This (a) helps them pick the right startups, and (b) means they can supply advice and connections as well as money. And the fact that they have a personal stake in the outcome makes them really pay attention.

Bureaucrats by their nature are the exact opposite sort of people from startup investors. The idea of them making startup investments is comic. It would be like mathematicians running Vogue— or perhaps more accurately, Vogue editors running a math journal. [2]

Though indeed, most things bureaucrats do, they do badly. We just don’t notice usually, because they only have to compete against other bureaucrats. But as startup investors they’d have to compete against pros with a great deal more experience and motivation.

Even corporations that have in-house VC groups generally forbid them to make their own investment decisions. Most are only allowed to invest in deals where some reputable private VC firm is willing to act as lead investor.

Not Buildings

If you go to see Silicon Valley, what you’ll see are buildings. But it’s the people that make it Silicon Valley, not the buildings. I read occasionally about attempts to set up “technology parks” in other places, as if the active ingredient of Silicon Valley were the office space. An article about Sophia Antipolis bragged that companies there included Cisco, Compaq, IBM, NCR, and Nortel. Don’t the French realize these aren’t startups?

Building office buildings for technology companies won’t get you a silicon valley, because the key stage in the life of a startup happens before they want that kind of space. The key stage is when they’re three guys operating out of an apartment. Wherever the startup is when it gets funded, it will stay. The defining quality of Silicon Valley is not that Intel or Apple or Google have offices there, but that they were started there.

So if you want to reproduce Silicon Valley, what you need to reproduce is those two or three founders sitting around a kitchen table deciding to start a company. And to reproduce that you need those people.


The exciting thing is, all you need are the people. If you could attract a critical mass of nerds and investors to live somewhere, you could reproduce Silicon Valley. And both groups are highly mobile. They’ll go where life is good. So what makes a place good to them?

What nerds like is other nerds. Smart people will go wherever other smart people are. And in particular, to great universities. In theory there could be other ways to attract them, but so far universities seem to be indispensable. Within the US, there are no technology hubs without first-rate universities– or at least, first-rate computer science departments.

So if you want to make a silicon valley, you not only need a university, but one of the top handful in the world. It has to be good enough to act as a magnet, drawing the best people from thousands of miles away. And that means it has to stand up to existing magnets like MIT and Stanford.

This sounds hard. Actually it might be easy. My professor friends, when they’re deciding where they’d like to work, consider one thing above all: the quality of the other faculty. What attracts professors is good colleagues. So if you managed to recruit, en masse, a significant number of the best young researchers, you could create a first-rate university from nothing overnight. And you could do that for surprisingly little. If you paid 200 people hiring bonuses of $3 million apiece, you could put together a faculty that would bear comparison with any in the world. And from that point the chain reaction would be self-sustaining. So whatever it costs to establish a mediocre university, for an additional half billion or so you could have a great one. [3]


However, merely creating a new university would not be enough to start a silicon valley. The university is just the seed. It has to be planted in the right soil, or it won’t germinate. Plant it in the wrong place, and you just create Carnegie-Mellon.

To spawn startups, your university has to be in a town that has attractions other than the university. It has to be a place where investors want to live, and students want to stay after they graduate.

The two like much the same things, because most startup investors are nerds themselves. So what do nerds look for in a town? Their tastes aren’t completely different from other people’s, because a lot of the towns they like most in the US are also big tourist destinations: San Francisco, Boston, Seattle. But their tastes can’t be quite mainstream either, because they dislike other big tourist destinations, like New York, Los Angeles, and Las Vegas.

There has been a lot written lately about the “creative class.” The thesis seems to be that as wealth derives increasingly from ideas, cities will prosper only if they attract those who have them. That is certainly true; in fact it was the basis of Amsterdam’s prosperity 400 years ago.

A lot of nerd tastes they share with the creative class in general. For example, they like well-preserved old neighborhoods instead of cookie-cutter suburbs, and locally-owned shops and restaurants instead of national chains. Like the rest of the creative class, they want to live somewhere with personality.

What exactly is personality? I think it’s the feeling that each building is the work of a distinct group of people. A town with personality is one that doesn’t feel mass-produced. So if you want to make a startup hub– or any town to attract the “creative class”– you probably have to ban large development projects. When a large tract has been developed by a single organization, you can always tell. [4]

Most towns with personality are old, but they don’t have to be. Old towns have two advantages: they’re denser, because they were laid out before cars, and they’re more varied, because they were built one building at a time. You could have both now. Just have building codes that ensure density, and ban large scale developments.

A corollary is that you have to keep out the biggest developer of all: the government. A government that asks “How can we build a silicon valley?” has probably ensured failure by the way they framed the question. You don’t build a silicon valley; you let one grow.


If you want to attract nerds, you need more than a town with personality. You need a town with the right personality. Nerds are a distinct subset of the creative class, with different tastes from the rest. You can see this most clearly in New York, which attracts a lot of creative people, but few nerds. [5]

What nerds like is the kind of town where people walk around smiling. This excludes LA, where no one walks at all, and also New York, where people walk, but not smiling. When I was in grad school in Boston, a friend came to visit from New York. On the subway back from the airport she asked “Why is everyone smiling?” I looked and they weren’t smiling. They just looked like they were compared to the facial expressions she was used to.

If you’ve lived in New York, you know where these facial expressions come from. It’s the kind of place where your mind may be excited, but your body knows it’s having a bad time. People don’t so much enjoy living there as endure it for the sake of the excitement. And if you like certain kinds of excitement, New York is incomparable. It’s a hub of glamour, a magnet for all the shorter half-life isotopes of style and fame.

Nerds don’t care about glamour, so to them the appeal of New York is a mystery. People who like New York will pay a fortune for a small, dark, noisy apartment in order to live in a town where the cool people are really cool. A nerd looks at that deal and sees only: pay a fortune for a small, dark, noisy apartment.

Nerds will pay a premium to live in a town where the smart people are really smart, but you don’t have to pay as much for that. It’s supply and demand: glamour is popular, so you have to pay a lot for it.

Most nerds like quieter pleasures. They like cafes instead of clubs; used bookshops instead of fashionable clothing shops; hiking instead of dancing; sunlight instead of tall buildings. A nerd’s idea of paradise is Berkeley or Boulder.


It’s the young nerds who start startups, so it’s those specifically the city has to appeal to. The startup hubs in the US are all young-feeling towns. This doesn’t mean they have to be new. Cambridge has the oldest town plan in America, but it feels young because it’s full of students.

What you can’t have, if you want to create a silicon valley, is a large, existing population of stodgy people. It would be a waste of time to try to reverse the fortunes of a declining industrial town like Detroit or Philadelphia by trying to encourage startups. Those places have too much momentum in the wrong direction. You’re better off starting with a blank slate in the form of a small town. Or better still, if there’s a town young people already flock to, that one.

The Bay Area was a magnet for the young and optimistic for decades before it was associated with technology. It was a place people went in search of something new. And so it became synonymous with California nuttiness. There’s still a lot of that there. If you wanted to start a new fad– a new way to focus one’s “energy,” for example, or a new category of things not to eat– the Bay Area would be the place to do it. But a place that tolerates oddness in the search for the new is exactly what you want in a startup hub, because economically that’s what startups are. Most good startup ideas seem a little crazy; if they were obviously good ideas, someone would have done them already.

(How many people are going to want computers in their houses? What, another search engine?)

That’s the connection between technology and liberalism. Without exception the high-tech cities in the US are also the most liberal. But it’s not because liberals are smarter that this is so. It’s because liberal cities tolerate odd ideas, and smart people by definition have odd ideas.

Conversely, a town that gets praised for being “solid” or representing “traditional values” may be a fine place to live, but it’s never going to succeed as a startup hub. The 2004 presidential election, though a disaster in other respects, conveniently supplied us with a county-by-county map of such places. [6]

To attract the young, a town must have an intact center. In most American cities the center has been abandoned, and the growth, if any, is in the suburbs. Most American cities have been turned inside out. But none of the startup hubs has: not San Francisco, or Boston, or Seattle. They all have intact centers. [7] My guess is that no city with a dead center could be turned into a startup hub. Young people don’t want to live in the suburbs.

Within the US, the two cities I think could most easily be turned into new silicon valleys are Boulder and Portland. Both have the kind of effervescent feel that attracts the young. They’re each only a great university short of becoming a silicon valley, if they wanted to.


A great university near an attractive town. Is that all it takes? That was all it took to make the original Silicon Valley. Silicon Valley traces its origins to William Shockley, one of the inventors of the transistor. He did the research that won him the Nobel Prize at Bell Labs, but when he started his own company in 1956 he moved to Palo Alto to do it. At the time that was an odd thing to do. Why did he? Because he had grown up there and remembered how nice it was. Now Palo Alto is suburbia, but then it was a charming college town– a charming college town with perfect weather and San Francisco only an hour away.

The companies that rule Silicon Valley now are all descended in various ways from Shockley Semiconductor. Shockley was a difficult man, and in 1957 his top people– “the traitorous eight”– left to start a new company, Fairchild Semiconductor. Among them were Gordon Moore and Robert Noyce, who went on to found Intel, and Eugene Kleiner, who founded the VC firm Kleiner Perkins. Forty-two years later, Kleiner Perkins funded Google, and the partner responsible for the deal was John Doerr, who came to Silicon Valley in 1974 to work for Intel.

So although a lot of the newest companies in Silicon Valley don’t make anything out of silicon, there always seem to be multiple links back to Shockley. There’s a lesson here: startups beget startups. People who work for startups start their own. People who get rich from startups fund new ones. I suspect this kind of organic growth is the only way to produce a startup hub, because it’s the only way to grow the expertise you need.

That has two important implications. The first is that you need time to grow a silicon valley. The university you could create in a couple years, but the startup community around it has to grow organically. The cycle time is limited by the time it takes a company to succeed, which probably averages about five years.

The other implication of the organic growth hypothesis is that you can’t be somewhat of a startup hub. You either have a self-sustaining chain reaction, or not. Observation confirms this too: cities either have a startup scene, or they don’t. There is no middle ground. Chicago has the third largest metropolitan area in America. As a source of startups it’s negligible compared to Seattle, number 15.

The good news is that the initial seed can be quite small. Shockley Semiconductor, though itself not very successful, was big enough. It brought a critical mass of experts in an important new technology together in a place they liked enough to stay.


Of course, a would-be silicon valley faces an obstacle the original one didn’t: it has to compete with Silicon Valley. Can that be done? Probably.

One of Silicon Valley’s biggest advantages is its venture capital firms. This was not a factor in Shockley’s day, because VC funds didn’t exist. In fact, Shockley Semiconductor and Fairchild Semiconductor were not startups at all in our sense. They were subsidiaries– of Beckman Instruments and Fairchild Camera and Instrument respectively. Those companies were apparently willing to establish subsidiaries wherever the experts wanted to live.

Venture investors, however, prefer to fund startups within an hour’s drive. For one, they’re more likely to notice startups nearby. But when they do notice startups in other towns they prefer them to move. They don’t want to have to travel to attend board meetings, and in any case the odds of succeeding are higher in a startup hub.

The centralizing effect of venture firms is a double one: they cause startups to form around them, and those draw in more startups through acquisitions. And although the first may be weakening because it’s now so cheap to start some startups, the second seems as strong as ever. Three of the most admired “Web 2.0” companies were started outside the usual startup hubs, but two of them have already been reeled in through acquisitions.

Such centralizing forces make it harder for new silicon valleys to get started. But by no means impossible. Ultimately power rests with the founders. A startup with the best people will beat one with funding from famous VCs, and a startup that was sufficiently successful would never have to move. So a town that could exert enough pull over the right people could resist and perhaps even surpass Silicon Valley.

For all its power, Silicon Valley has a great weakness: the paradise Shockley found in 1956 is now one giant parking lot. San Francisco and Berkeley are great, but they’re forty miles away. Silicon Valley proper is soul-crushing suburban sprawl. It has fabulous weather, which makes it significantly better than the soul-crushing sprawl of most other American cities. But a competitor that managed to avoid sprawl would have real leverage. All a city needs is to be the kind of place the next traitorous eight look at and say “I want to stay here,” and that would be enough to get the chain reaction started.


[1] It’s interesting to consider how low this number could be made. I suspect five hundred would be enough, even if they could bring no assets with them. Probably just thirty, if I could pick them, would be enough to turn Buffalo into a significant startup hub.

[2] Bureaucrats manage to allocate research funding moderately well, but only because (like an in-house VC fund) they outsource most of the work of selection. A professor at a famous university who is highly regarded by his peers will get funding, pretty much regardless of the proposal. That wouldn’t work for startups, whose founders aren’t sponsored by organizations, and are often unknowns.

[3] You’d have to do it all at once, or at least a whole department at a time, because people would be more likely to come if they knew their friends were. And you should probably start from scratch, rather than trying to upgrade an existing university, or much energy would be lost in friction.

[4] Hypothesis: Any plan in which multiple independent buildings are gutted or demolished to be “redeveloped” as a single project is a net loss of personality for the city, with the exception of the conversion of buildings not previously public, like warehouses.

[5] A few startups get started in New York, but less than a tenth as many per capita as in Boston, and mostly in less nerdy fields like finance and media.

[6] Some blue counties are false positives (reflecting the remaining power of Democratic party machines), but there are no false negatives. You can safely write off all the red counties.

[7] Some “urban renewal” experts took a shot at destroying Boston’s in the 1960s, leaving the area around city hall a bleak wasteland, but most neighborhoods successfully resisted them.

Thanks to Chris Anderson, Trevor Blackwell, Marc Hedlund, Jessica Livingston, Robert Morris, Greg Mcadoo, Fred Wilson, and Stephen Wolfram for reading drafts of this, and to Ed Dumbill for inviting me to speak.

(The second part of this talk became Why Startups Condense in America.)

