July 11, 2006
This is a more extensive version of the Web 3.0 article with extra sections about the implications of Web 3.0 for Google.
See this follow-up article for the more disruptive ‘decentralized knowledgebase’ version of the model discussed in this article.
Also see this non-Web3.0 version: P2P to Destroy Google, Yahoo, eBay et al
Web 3.0 Developers:
Feb 5, ‘07: The following reference should provide some context regarding the use of rule-based inference engines and ontologies in implementing the Semantic Web + AI vision (aka Web 3.0) but there are better, simpler ways of doing it.
In Web 3.0 (aka Semantic Web), P2P Inference Engines running on millions of users’ PCs and working with standardized domain-specific ontologies (created by Wikipedia, Ontoworld, other organizations or individuals) using Semantic Web tools, including Semantic MediaWiki, will produce an information infrastructure far more powerful than Google (or any current search engine).
The availability of standardized ontologies that are being created by people, organizations, swarms, smart mobs, e-societies, etc., and the near-future availability of P2P Semantic Web Inference Engines that work with those ontologies mean that we will be able to build an intelligent, decentralized, “P2P” version of Google.
Thus, the emergence of P2P Inference Engines and domain-specific ontologies in Web 3.0 (aka Semantic Web) will present a major threat to the central “search” engine model.
Basic Web 3.0 Concepts
A knowledge domain is something like Physics, Chemistry, Biology, Politics, the Web, Sociology, Psychology, History, etc. There can be many sub-domains under each domain, each having its own sub-domains, and so on.
Information vs Knowledge
To a machine, knowledge is comprehended information (aka new information produced through the application of deductive reasoning to existing information). To a machine, information is only data until it is processed and comprehended.
For each domain of human knowledge, an ontology must be constructed, partly by hand [or rather by brain] and partly with the aid of automation tools.
Ontologies are neither knowledge nor information. They are meta-information, that is, information about information. In the context of the Semantic Web, they encode, using an ontology language, the relationships between the various terms within the information. Those relationships, which may be thought of as the axioms (basic assumptions), together with the rules governing the inference process, both enable and constrain the interpretation (and well-formed use) of those terms by the Info Agents, which reason to new conclusions based on existing information, i.e. think. In other words, theorems (formal deductive propositions that are provable from the axioms and the rules of inference) may be generated by the software, thus allowing formal deductive reasoning at the machine level. And given that an ontology, as described here, is a statement of a logical theory, two or more independent Info Agents processing the same domain-specific ontology will be able to collaborate and deduce an answer to a query without being driven by the same software.
In the context of Web 3.0, Inference Engines will combine the latest innovations from the artificial intelligence (AI) field with domain-specific ontologies (created as formal or informal ontologies by, say, Wikipedia, as well as others), domain inference rules, and query structures to enable deductive reasoning at the machine level.
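To make the idea of axioms plus inference rules yielding theorems concrete, here is a minimal Python sketch. It is only an illustration under loud assumptions: the triples, the predicate names (subclass_of, instance_of, serves), the restaurant "Luigis" and the three rules are all hypothetical, and a real ontology would be written in an ontology language rather than as Python tuples.

```python
# Hypothetical toy ontology: each axiom is a (subject, predicate, object) triple.
AXIOMS = {
    ("PizzaRestaurant", "subclass_of", "ItalianRestaurant"),
    ("ItalianRestaurant", "serves", "ItalianCuisine"),
    ("Luigis", "instance_of", "PizzaRestaurant"),
}


def infer(axioms):
    """Forward chaining: apply the rules until no new triple can be derived."""
    known = set(axioms)
    while True:
        new = set()
        for (a, p, b) in known:
            for (c, q, d) in known:
                if b != c:
                    continue  # the two triples do not chain together
                # Rule 1: subclass_of is transitive.
                if p == "subclass_of" and q == "subclass_of":
                    new.add((a, "subclass_of", d))
                # Rule 2: an instance of a class is an instance of its superclasses.
                if p == "instance_of" and q == "subclass_of":
                    new.add((a, "instance_of", d))
                # Rule 3: subclasses and instances inherit what their class serves.
                if p in ("subclass_of", "instance_of") and q == "serves":
                    new.add((a, "serves", d))
        if new <= known:
            return known  # fixed point reached: nothing new was derived
        known |= new


# The "theorems" are the triples that were derived rather than stated.
theorems = infer(AXIOMS) - AXIOMS
for triple in sorted(theorems):
    print(triple)
```

None of the derived triples, such as the statement that Luigis serves Italian cuisine, appears in the axioms; each one follows from the axioms and the rules alone.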
Info Agents are instances of an Inference Engine, each working with a domain-specific ontology. Two or more agents working with a shared ontology may collaborate to deduce answers to questions. Such collaborating agents may be based on differently designed Inference Engines and they would still be able to collaborate.
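One possible reading of this collaboration, sketched in Python under stated assumptions: two agents built on differently designed engines (one precomputes everything, one searches on demand) share a single ontology and cross-check each other's answer. The class names, the `agree` helper and the tiny subclass table are hypothetical, and the sketch assumes the subclass links contain no cycles.

```python
# Hypothetical shared ontology: each class maps to its direct superclass.
SHARED_ONTOLOGY = {
    "Trattoria": "ItalianRestaurant",
    "ItalianRestaurant": "Restaurant",
    "Restaurant": "Business",
}


class ForwardAgent:
    """Precomputes all ancestors of every class, then answers by lookup."""

    def __init__(self, ontology):
        self.closure = {}
        for cls in ontology:
            ancestors, cur = set(), cls
            while cur in ontology:  # walk up until a root class is reached
                cur = ontology[cur]
                ancestors.add(cur)
            self.closure[cls] = ancestors

    def is_a(self, sub, sup):
        return sup in self.closure.get(sub, set())


class BackwardAgent:
    """Starts from the query and follows superclass links on demand."""

    def __init__(self, ontology):
        self.ontology = ontology

    def is_a(self, sub, sup):
        parent = self.ontology.get(sub)
        if parent is None:
            return False
        return parent == sup or self.is_a(parent, sup)


def agree(agents, sub, sup):
    """Ask every agent the same question and require a unanimous answer."""
    answers = {agent.is_a(sub, sup) for agent in agents}
    if len(answers) != 1:
        raise ValueError("agents disagree; the ontology or an engine is faulty")
    return answers.pop()


agents = [ForwardAgent(SHARED_ONTOLOGY), BackwardAgent(SHARED_ONTOLOGY)]
print(agree(agents, "Trattoria", "Business"))
```

The two agents share no code beyond the ontology itself, yet they reach the same answer because the ontology, not the engine, fixes what can be deduced.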
Proofs and Answers
The interesting thing about Info Agents that I did not clarify in the original post is that they will be capable not only of deducing answers from existing information (i.e. generating new information [and gaining knowledge in the process, for those agents with a learning function]) but also of formally testing propositions (represented in some query logic) that are made directly or implied by the user. For example, instead of the example I gave previously (in the Wikipedia 3.0 article), where the user asks “Where is the nearest restaurant that serves Italian cuisine?” and the machine deduces that a pizza restaurant serves Italian cuisine, the user may ask “Is the moon blue?” or state that “the moon is blue” to get a true or false answer from the machine. In this case, a simple Info Agent may answer with “No” but a more sophisticated one may say “the moon is not blue but some humans are fond of saying ‘once in a blue moon’ which seems illogical to me.”
This test-of-truth feature assumes the use of an ontology language (as a formal logic system) and an ontology where all propositions (or formal statements) that can be made can be computed (i.e. proved true or false) and where all such computations are decidable in finite time. The language may be OWL-DL or any language that, together with the ontology in question, satisfies the completeness and decidability conditions.
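A rough Python sketch of such a truth test follows. Everything in it is an assumption made for illustration: the `check_proposition` function, the has_color predicate, the facts themselves, and the use of a "functional" property (one that allows exactly one value per subject) as the means of refuting a statement. It is not how OWL-DL reasoners are implemented.

```python
# Hypothetical ontology fragment. "has_color" is declared functional,
# meaning each subject has exactly one value for it.
FUNCTIONAL = {"has_color"}
FACTS = {
    ("Moon", "has_color", "Gray"),
    ("Sky", "has_color", "Blue"),
}


def check_proposition(subject, predicate, obj):
    """Return True if proved, False if refuted, None if undecided."""
    if (subject, predicate, obj) in FACTS:
        return True  # the proposition is stated outright
    if predicate in FUNCTIONAL:
        for s, p, o in FACTS:
            if s == subject and p == predicate and o != obj:
                # A different value is already known for a single-valued
                # property, so the proposition contradicts the ontology.
                return False
    return None  # the ontology says nothing either way


print(check_proposition("Moon", "has_color", "Blue"))
```

The call for the moon evaluates to False because a different color is already recorded. A proposition about a subject the ontology does not describe, such as the color of Mars here, comes back as None, which is exactly the situation the completeness condition is meant to rule out.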
P2P 3.0 vs Google
If you think of how many processes currently run on all the computers and devices connected to the Internet then that should give you an idea of how many Info Agents can be running at once (as of today), all reasoning collaboratively across the different domains of human knowledge, processing and reasoning about heaps of information, deducing answers and deciding truthfulness or falsehood of user-stated or system-generated propositions.
Web 3.0 will bring with it a shift from centralized search engines to P2P Semantic Web Inference Engines, which will collectively have vastly more deductive power, in both quality and quantity, than Google can ever have. That includes any future AI-enabled version of Google, which will not be able to keep up with the distributed P2P AI matrix enabled by millions of users running free P2P Semantic Web Inference Engine software on their home PCs.
Thus, P2P Semantic Web Inference Engines will pose a huge and escalating threat to Google and other search engines, and can be expected to do to them what P2P file sharing and BitTorrent did to FTP (central-server file transfer) and centralized file hosting in general (e.g. Amazon S3’s use of BitTorrent).
In other words, the coming of P2P Semantic Web Inference Engines, as an integral part of the still-emerging Web 3.0, will threaten to wipe out Google and other existing search engines. It’s hard to imagine how any one company could compete with 2 billion Web users (and counting), all of whom are potential users of the disruptive P2P model described here.
“The Future Has Arrived But It’s Not Evenly Distributed”
Currently, Semantic Web (aka Web 3.0) researchers are working out the technology and human resource issues, and folks like Tim Berners-Lee, the father of the Web, are battling critics and enlightening minds about the coming human-machine revolution.
The Semantic Web (aka Web 3.0) has already arrived, and Inference Engines are working with prototypical ontologies, but the effort is a massive one. That is why I was suggesting that its most likely enabler will be a social, collaborative movement such as Wikipedia, which has the human resources (in the form of thousands of knowledgeable volunteers) to help create the ontologies (most likely as informal ontologies based on semantic annotations). When combined with inference rules for each domain of knowledge and the query structures for the particular schema, those ontologies enable deductive reasoning at the machine level.
On AI and Natural Language Processing
I believe that the first generation of AI that will be used by Web 3.0 (aka Semantic Web) will be based on relatively simple inference engines (employing both algorithmic and heuristic approaches) that will not attempt to perform natural language processing. However, they will still have the formal deductive reasoning capabilities described earlier in this article.
- Wikipedia 3.0: The End of Google?
- Intelligence (Not Content) is King in Web 3.0
- Get Your DBin
- All About Web 3.0
Posted by Marc Fawzi