The People’s Google
In Uncategorized on July 11, 2006 at 10:16 am
Author: Marc Fawzi
This is a follow-up to the Wikipedia 3.0 article.
See this article for a more disruptive ‘decentralized knowledgebase’ version of the model discussed here.
Also see this non-Web3.0 version: P2P to Destroy Google, Yahoo, eBay et al
Web 3.0 Developers:
Feb 5, ‘07: The following reference should provide some context regarding the use of rule-based inference engines and ontologies in implementing the Semantic Web + AI vision (aka Web 3.0) but there are better, simpler ways of doing it.
In Web 3.0 (aka Semantic Web), P2P Inference Engines running on millions of users’ PCs will work with standardized domain-specific ontologies (which may be created by entities like Wikipedia and other organizations using Semantic Web tools). Together they will produce an information infrastructure far more powerful than the one Google currently uses (or any Web 1.0/2.0 search engine, for that matter).
Having the standardized ontologies and the P2P Semantic Web Inference Engines that work with those ontologies will lead to a more intelligent, “Massively P2P” version of Google.
Therefore, the emergence in Web 3.0 of said P2P Inference Engines combined with standardized domain-specific ontologies will present a major threat to the central “search” engine model.
Basic Web 3.0 Concepts
A knowledge domain is something like Physics, Chemistry, Biology, Politics, the Web, Sociology, Psychology, History, etc. There can be many sub-domains under each domain each having their own sub-domains and so on.
Information vs Knowledge
To a machine, knowledge is comprehended information (aka new information that is produced via the application of deductive reasoning to existing information). To a machine, information is only data, until it is reasoned about.
For each domain of human knowledge, an ontology must be constructed, partly by hand and partly with the aid of dialog-driven ontology construction tools.
Ontologies are neither knowledge nor information. They are meta-information. In other words, ontologies are information about information. In the context of the Semantic Web, they encode, using an ontology language, the relationships between the various terms within the information. Those relationships, which may be thought of as the axioms (basic assumptions), together with the rules governing the inference process, both enable and constrain how the Info Agents interpret (and make well-formed use of) those terms when reasoning new conclusions from existing information, i.e. when they think. In other words, theorems (formal deductive propositions that are provable based on the axioms and the rules of inference) may be generated by the software, thus allowing formal deductive reasoning at the machine level. And given that an ontology, as described here, is a statement of a logical theory, two or more independent Info Agents processing the same domain-specific ontology will be able to collaborate and deduce an answer to a query, without being driven by the same software.
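To make the axioms-plus-rules idea concrete, here is a minimal sketch in Python. It is not how any real Semantic Web reasoner is implemented: the triples, the predicate names (`subdomain_of`, `belongs_to`), and the two rules are all made up for illustration. The relationships play the role of axioms, and the loop derives "theorems" by applying the rules until nothing new appears.

```python
# Hypothetical toy ontology: (subject, predicate, object) triples acting as axioms.
FACTS = {
    ("Physics", "subdomain_of", "Science"),
    ("Optics", "subdomain_of", "Physics"),
    ("Snell's law", "belongs_to", "Optics"),
}


def infer(facts):
    """Apply two illustrative rules repeatedly until no new triple is derived."""
    known = set(facts)
    while True:
        new = set()
        for (a, p1, b) in known:
            for (c, p2, d) in known:
                if b != c:
                    continue
                # Rule 1 (assumed): subdomain_of is transitive.
                if p1 == "subdomain_of" and p2 == "subdomain_of":
                    new.add((a, "subdomain_of", d))
                # Rule 2 (assumed): a topic in a sub-domain also belongs to the parent domain.
                if p1 == "belongs_to" and p2 == "subdomain_of":
                    new.add((a, "belongs_to", d))
        if new <= known:  # fixed point reached: nothing new was derived
            return known
        known |= new


# The derived triples are the "theorems": statements not given, but provable.
theorems = infer(FACTS) - FACTS
for triple in sorted(theorems):
    print(triple)
```

None of the three derived triples was stated in the ontology; each follows from the axioms and the rules alone, which is the sense of "new information" used above.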
In the context of Web 3.0, Inference engines will be combining the latest innovations from the artificial intelligence (AI) field together with domain-specific ontologies (created as formal or informal ontologies by, say, Wikipedia, as well as others), domain inference rules, and query structures to enable deductive reasoning on the machine level.
Info Agents are instances of an Inference Engine, each working with a domain-specific ontology. Two or more agents working with a shared ontology may collaborate to deduce answers to questions. Such collaborating agents may be based on differently designed Inference Engines and they would still be able to collaborate.
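A rough sketch of that last point, under heavy simplification: two agents built on different strategies share one ontology and still agree on the answer. The ontology contents, class names, and the `is_under` method are hypothetical, and the ontology is assumed to be a simple acyclic child-to-parent mapping.

```python
# Shared, hypothetical ontology: child domain -> parent domain (assumed acyclic).
SHARED_ONTOLOGY = {
    "Optics": "Physics",
    "Physics": "Science",
    "Genetics": "Biology",
    "Biology": "Science",
}


class ForwardAgent:
    """Eagerly derives every (descendant, ancestor) pair before any query arrives."""

    def __init__(self, ontology):
        self.closure = set()
        for child in ontology:
            node = child
            while node in ontology:
                node = ontology[node]
                self.closure.add((child, node))

    def is_under(self, child, ancestor):
        return (child, ancestor) in self.closure


class BackwardAgent:
    """Starts from the query and walks up the hierarchy only as far as needed."""

    def __init__(self, ontology):
        self.ontology = ontology

    def is_under(self, child, ancestor):
        parent = self.ontology.get(child)
        if parent is None:
            return False
        return parent == ancestor or self.is_under(parent, ancestor)


agents = [ForwardAgent(SHARED_ONTOLOGY), BackwardAgent(SHARED_ONTOLOGY)]
answers = [agent.is_under("Optics", "Science") for agent in agents]
print(answers)
```

The two agents share no code, only the ontology, yet they cannot disagree, because the answer is determined by the ontology and not by the engine.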
Proofs and Answers
The interesting thing about Info Agents that I did not clarify in the original post is that they will be capable of not only deducing answers from existing information (i.e. generating new information [and gaining knowledge in the process, for those agents with a learning function]) but also formally testing propositions (represented in some query logic) that are made directly, or implied, by the user.
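As an illustration of testing a proposition (as opposed to just looking up an answer), the sketch below uses backward chaining over a handful of invented rules. The rule set, the fact names, and the `prove` function are assumptions made for this example. Note that "no proof found" only means the proposition cannot be derived from what the agent knows, not that it has been shown false.

```python
# Hypothetical Horn-style rules: (set of premises, conclusion).
RULES = [
    ({"is_mammal"}, "is_animal"),
    ({"has_fur", "nurses_young"}, "is_mammal"),
]
FACTS = {"has_fur", "nurses_young"}


def prove(goal, facts, rules, seen=frozenset()):
    """Backward chaining: return the proof as a list of steps, or None if none is found."""
    if goal in facts:
        return [f"{goal} (given)"]
    if goal in seen:  # guard against circular rules
        return None
    for premises, conclusion in rules:
        if conclusion != goal:
            continue
        steps = []
        for premise in sorted(premises):
            sub = prove(premise, facts, rules, seen | {goal})
            if sub is None:
                break  # this rule cannot establish the goal; try the next rule
            steps += sub
        else:
            return steps + [f"{goal} (from {', '.join(sorted(premises))})"]
    return None


proof = prove("is_animal", FACTS, RULES)
for step in proof:
    print(step)
print(prove("can_fly", FACTS, RULES))  # no rule or fact supports this proposition
```

The agent returns not just a yes or no but the chain of steps behind it, which is what separates a proof from a search result.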
P2P 3.0 vs Google
If you think of how many processes currently run on all the computers and devices connected to the Internet then that should give you an idea of how many Info Agents can be running at once (as of today), all reasoning collaboratively across the different domains of human knowledge, processing and reasoning about heaps of information, deducing answers and deciding truthfulness or falsehood of user-stated or system-generated propositions.
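To picture agents on different machines pooling what they know, here is a toy simulation with no real networking. Each `Peer` stands in for one user's PC holding facts for one domain, and `ask_network` simply loops over the peers. The class, the function, and the sample facts are all hypothetical, and a real P2P system would also need discovery, routing, and trust mechanisms that this sketch ignores.

```python
class Peer:
    """A stand-in for one user's machine holding facts for one knowledge domain."""

    def __init__(self, domain, facts):
        self.domain = domain
        self.facts = set(facts)

    def answer(self, subject):
        # Return every locally known triple about the subject.
        return {t for t in self.facts if t[0] == subject}


def ask_network(peers, subject):
    """Send the query to every peer and merge whatever comes back."""
    merged = set()
    responders = []
    for peer in peers:
        found = peer.answer(subject)
        if found:
            responders.append(peer.domain)
            merged |= found
    return merged, responders


peers = [
    Peer("Chemistry", {("gold", "has_symbol", "Au")}),
    Peer("History", {("gold", "drove", "California Gold Rush")}),
    Peer("Biology", {("cell", "contains", "nucleus")}),
]
facts, responders = ask_network(peers, "gold")
print(responders)
print(sorted(facts))
```

No single peer holds the full answer; it only exists once the contributions from different domains are merged.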
Web 3.0 will bring with it a shift from centralized search engines to P2P Semantic Web Inference Engines, which will collectively have vastly more deductive power, in both quality and quantity, than Google can ever have. This assumption includes any future AI-enabled version of Google, as it will not be able to keep up with the power of the P2P AI matrix that will be enabled by millions of users running free P2P Semantic Web Inference Engine software on their home PCs.
Thus, P2P Semantic Web Inference Engines will pose a huge and escalating threat to Google and other search engines, and can be expected to do to them what P2P file sharing and BitTorrent did to FTP (central-server file transfer) and centralized file hosting in general (consider, for example, Amazon S3’s use of BitTorrent).
In other words, the coming of P2P Semantic Web Inference Engines, as an integral part of the still-emerging Web 3.0, will threaten to wipe out Google and other existing search engines. It’s hard to imagine how any one company could compete with 2 billion Web users (and counting), all of whom are potential users of the disruptive P2P model described here.
Currently, Semantic Web (aka Web 3.0) researchers are working out the technology and human resource issues, and folks like Tim Berners-Lee, the father of the Web, are battling critics and enlightening minds about the coming Semantic Web revolution.
In fact, the Semantic Web (aka Web 3.0) has already arrived, and Inference Engines are working with prototypical ontologies. But this effort is a massive one, which is why I suggested that its most likely enabler will be a social, collaborative movement such as Wikipedia. Wikipedia has the human resources (in the form of thousands of knowledgeable volunteers) to help create the ontologies (most likely as informal ontologies based on semantic annotations) that, when combined with inference rules for each domain of knowledge and the query structures for the particular schema, enable deductive reasoning at the machine level.
On AI and Natural Language Processing
I believe that the first generation of AI that will be used by Web 3.0 (aka Semantic Web) will be based on relatively simple inference engines that will NOT attempt to perform natural language processing, where current approaches still face too many serious challenges. However, they will still have the formal deductive reasoning capabilities described earlier in this article, and users would interact with these systems through some query language.
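To give a feel for querying without natural language processing, here is a minimal sketch of a pattern-based query over triples. The `?variable` syntax is loosely inspired by the triple patterns used in Semantic Web query languages, but this is not an implementation of any of them; the `query` function and the sample data are assumptions made for illustration.

```python
# Hypothetical knowledge base of (subject, predicate, object) triples.
TRIPLES = [
    ("Tim Berners-Lee", "invented", "World Wide Web"),
    ("World Wide Web", "is_a", "information system"),
    ("Wikipedia", "is_a", "encyclopedia"),
]


def query(pattern, triples):
    """Match one (s, p, o) pattern; terms starting with '?' are variables."""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must bind to the same value everywhere it appears.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break  # a constant term did not match this triple
        else:
            results.append(binding)
    return results


answers = query(("?thing", "is_a", "?kind"), TRIPLES)
for binding in answers:
    print(binding)
```

The user states the shape of the answer they want and the engine fills in the blanks, so no parsing of free-form questions is required.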
- Wikipedia 3.0: The End of Google?
- Intelligence (Not Content) is King in Web 3.0
- Get Your DBin
- All About Web 3.0