
Posts Tagged ‘Semanticweb’

Evolving Trends

Web 3.0

Historically speaking, the first coining of Web 3.0 in conjunction with the Semantic Web and/or AI agents, and in conjunction with Wikipedia and/or Google, was in the Wikipedia 3.0: The End of Google? article, which was published on Evolving Trends (this blog) on June 26, ‘06.

June 28, ‘06: Here’s what a fellow blogger, who had reviewed the Wikipedia 3.0 article, had to say:

“[…] But there it is. That was then. Now, it seems, the rage is Web 3.0. It all started
with this article here addressing the Semantic Web, the idea that a new organizational
structure for the web ought to be based on concepts that can be interpreted. The idea is
to help computers become learning machines, not just pattern matchers and calculators. […]“

June 28, ‘06: A fellow blogger wrote:

“This is the first non-sarcastic reference to Web 3.0 I’ve seen in the wild”

As of Jan 25, ‘07, there were 11,000 links to the Wikipedia 3.0 article on Evolving Trends from blogs, forums and news sites.

Jan 25, ‘07: A fellow blogger wrote:

“In 2004 I with my friend Aleem worked on idea of Semantic Web (as our senior project), and now I have been hearing news of Web 3.0. I decided to work on the idea further in 2005, and may be we could have made a very small scaled 4th generation search engine. Though this has never become reality but now it seems it’s hot time for putting Semantics and AI into web. Reading about Web 3.0 again thrilled me with the idea. [Wikia] has decided to jump into search engines and give Google a tough time :). So I hope may be I get a chance to become part of this Web 3.0 and make information retreival better.”

Alexa graph

According to Alexa, the estimated penetration of the Wikipedia 3.0: The End of Google? article peaked on June 28 at a ratio of 650 per 1,000,000 people. Based on an estimated 1,000,000,000 Web users, this means it reached 650,000 people on June 28, plus hundreds of thousands more on June 26, 27, 29 and 30. This includes people who read the article on the roughly 6,000 sites (according to MSN) that had linked to Evolving Trends. Based on the Alexa graph, we can estimate that the article reached close to 2 million people in the first 4.5 days after its release.
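The arithmetic behind these estimates can be checked directly from the figures quoted above:

```python
# Reach estimate from the Alexa figures quoted above.
peak_ratio = 650 / 1_000_000          # readers per million Web users at the peak
web_users = 1_000_000_000             # assumed total number of Web users
peak_day_readers = int(peak_ratio * web_users)
print(peak_day_readers)  # 650000
```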

Update on Alexa Statistics (Sep. 18, 2008): some people have pointed out (independently, based on their own experience) that Alexa’s statistics are skewed and not very reliable. As for direct hits to the article on this blog, they’re in the 200,000 range as of this writing.


Note: the term “Web 3.0” is the dictionary word “Web” followed by the number “3”, a decimal point and the number “0”. As such, the term itself cannot and should not have any commercial significance in any context.

Update on how the Wikipedia 3.0 vision is spreading:


Update on how Google is hopelessly attempting to co-opt the Wikipedia 3.0 vision:  

Web 3D + Semantic Web + AI as Web 3.0:  

Here is the original article that gave birth to the Web 3.0 vision:

3D Web + Semantic Web + AI *

The above-mentioned 3D Web + Semantic Web + AI vision, which preceded the Wikipedia 3.0 vision, received much less attention because it was not presented in a controversial manner. This fact was noted as the biggest flaw of the social bookmarking site digg, which was used to promote this article.

Developers:

Feb 5, ‘07: The following external reference concerns the use of rule-based inference engines and ontologies in implementing the Semantic Web + AI vision (aka Web 3.0):

  1. Description Logic Programs: Combining Logic Programs with Description Logic (note: there are better, simpler ways of achieving the same purpose.)

Jan 7, ‘07: The following Evolving Trends post discusses the current state of semantic search engines and ways to improve their design:

  1. Designing a Better Web 3.0 Search Engine

The idea described in this article was adopted by Hakia after it was published here, so this article may be considered prior art.

June 27, ‘06: The Semantic MediaWiki project enables the insertion of semantic annotations (or metadata) into Wikipedia content. (This project is now hosted by Wikia, Wikipedia founder Jimmy Wales’ private venture, and may benefit Wikia instead of Wikipedia, which is why I see it as a conflict of interest.)

Bloggers:

This post provides the history behind the use of the term Web 3.0 in the context of the Semantic Web and AI.

This post explains the accidental way in which this article reached 2 million people in 4 days.


Web 3.0 Articles on Evolving Trends

Noteworthy mentions of the Wikipedia 3.0 article:

Tags:

Semantic Web, Web standards, Trends, OWL, Google, inference, inference engine, AI, ontology, Semanticweb, Web 2.0, Web 3.0, Wikipedia, Wikipedia 3.0, Wikipedia AI, P2P 3.0, P2P AI, P2P Semantic Web inference Engine, intelligent findability

Evolving Trends is Powered by +[||||]- 42V


Evolving Trends

July 2, 2006

Digg This! 55,500 hits in ~4 Days

/* (this post was last updated at 10:30am EST, July 3, ‘06, GMT +5)

This post is a follow-up to the previous post For Great Justice, Take Off Every Digg.

According to Alexa.com, the total penetration of the Wikipedia 3.0 article was ~2 million readers (who must have read it on other websites that copied the article).

*/

EDIT: I looked at the graph and did the math again, and as far as I can tell it’s “55,500 in ~4 days,” not “55,000 in 5 days.” That’s 13,875 page views per day.

Stats (approx.) for the “Wikipedia 3.0: The End of Google?” and “For Great Justice, Take Off Every Digg” articles:

These are to the best of my memory for each of the first ~4 days, as verified by the graph.

33,000 page views in day 1 (the first wave)

* Day 1 spans almost one and a half columns on the graph, not one, because I posted the article at ~5:00am and the day (in WordPress’ time zone) ends at 8pm, so the first column covers only ~15 hours.

9,500 page views in day 2

5,000 page views in day 3

8,000 page views in day 4 (the second wave)

Total: 55,500 in ~4 days which is 13,875 page views per day (not server hits) for ~4 days. Now on the 7th day the traffic is expected to be ~1000 page views, unless I get another small spike. That’s a pretty good double-dipping long tail. If you’ve done better with digg let me know how you did it! 🙂
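As a quick check, the day-by-day figures above do add up to the quoted totals:

```python
# Day-by-day page views quoted above (days 1 through 4).
page_views = [33_000, 9_500, 5_000, 8_000]

total = sum(page_views)
per_day = total // 4
print(total, per_day)  # 55500 13875
```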

Experiment

This post is a follow-up to my previous article on digg, where I explained how I had experimented and succeeded in generating 45,000 visits to an article I wrote in the first 3 days of its release (40,000 of which came directly from digg.)

I had posted an article on digg about a bold but well-thought out vision of the future, involving Google and Wikipedia, with the sensational title of “Wikipedia 3.0: The End of Google?” (which may turn out after all to be a realistic proposition.)

Since my previous article on digg, I’ve found out that digg did not ban my IP address; they had deleted my account due to multiple submissions. So I was able to get back in with a new user account and try another experiment: I submitted “AI Matrix vs Google” and “Web 3.0 vs Google” as two separate links for one article (which has since been given the final title of “Web 3.0”). [July 12, ‘06, update: see P2P 3.0: The People’s Google]

Results

Neither ‘sensational’ title worked.

Analysis

I tried to rationalize what happened …

I figured that the crowd wanted a showdown between two major cults (e.g. the Google fans and the Wikipedia fans) and not between Google and some hypothetical entity (e.g. AI Matrix or Web 3.0).

But then I thought about how Valleywag was able to cleverly piggyback on my “Wikipedia 3.0: The End of Google?” article (which had generated all the hype) with an article having the dual title of “Five Reasons Google Will Invent Real AI” on digg and “Five Reasons No One Will Replace Google” on Valleywag.

They used AI in the title, and I did the same in the new experiment, so we should both have gotten lots of diggs. They got about 1,300 diggs. I got about 3. Why didn’t it work in my case?

The answer is that the crowd is not a logical animal; it’s a psychological animal. It does not make mental connections as we do as individuals (because a crowd is a randomized population made up of different people at different times), so it can’t react logically.

Analyzing it from the psychological frame, I concluded that it must have been the Wikipedia fans who “dugg” my original article. The Google fans did “digg” it, but not in the same large percentage as the Wikipedia fans.

Valleywag gave the Google fans the relief they needed after my article with its own article in defense of Google. However, when I went at it again with “Matrix AI vs Google” and “Web 3.0 vs Google,” the error I made was in not knowing that the part of the crowd that “dugg” my original article were the Wikipedia fans, not the Google haters. In fact, Google haters are not very well represented on digg. In other words, I found out that “XYZ vs Google” will not work on digg unless XYZ has a large base of fans on digg.

Escape Velocity

The critical threshold in the digg traffic generation process is to get enough diggs quickly enough, after submitting the post, to get the post onto digg’s popular page. Once the post is on digg’s popular page, both sides (those who like what your post is about and those who will hate you and want to kill you for writing it) will be affected by the psychological manipulation you designed (aka the ‘wave’). However, the majority of those who “digg” it will be from the group that likes it. A smaller number of people will “digg” it from the group that hates it.

Double Dipping

I did have a strong second wave when I went out and explained how ridiculous the whole digg process is.

This is how the second wave was created:

I got lots of “diggs” from Wikipedia fans and traffic from both Google and Wikipedia fans for the original article.

Then I wrote a follow-up on why “digg sucks,” but only got 100 “diggs” for it (because the digg fans on digg kept ‘burying’ it!), so I did not get much traffic to it from digg fans or digg haters (there are not many of the latter on digg).

The biggest traffic came from the bloggers and others who came to see what all the fuss was about regarding the original article. I had linked to the follow-up article (on why I thought digg sucked) from the original article (i.e. like chaining magnets), so when people came to see what the fuss was all about with respect to the original article, they were also told to check out the “digg sucks” article for context.

That worked! The original and second waves, which both had a long tail (see below) generated a total of 55,500 hits in ~4 days. That’s 13,875 page views a day for the first ~4 days.

Long Tail vs Sting

I know that some very observant bloggers have said that digg can only produce a sharp, short-lived pulse of traffic (a sting), as opposed to a long tail, or a double-dipping long tail as in my case, but those observations are for posts that are not themselves memes. When you have a meme you get the long tail (an exponential decay), and when you chain memes as I did (which I guess I could have done faster, as the second wave would have been much bigger) you get a double-dipping long tail, as I’m having now.

Today (which is 7 days after the original experiment) the traffic is over 800 hits so far, still on the strength of the original wave and the second wave (note that the flat line I had before the spike represents levels of traffic between ~100 and ~800, so don’t be fooled by the flatness; it’s relative to the scale of the graph).

In other words, traffic is still going strong from the strength of the long-tail waves generated from the original post and the follow up one.


Links

  1. Wikipedia 3.0: The End of Google?
  2. For Great Justice, Take Off Every Digg
  3. Unwisdom of Crowds
  4. Self-Aware e-Society

Posted by Marc Fawzi

Tags:
Semantic Web, Web standards, Trends, wisdom of crowds, tagging, Startup, mass psychology, Google, cult psychology, inference, inference engine, AI, ontology, Semanticweb, Web 2.0, Web 3.0, Google Base, artificial intelligence, Wikipedia, Wikipedia 3.0, collective consciousness, digg, censorship

15 Comments »

  1. Update this in two weeks, after a Friday, Saturday, and Sunday, and a holiday in the middle of the week in the United States (which means a lot of people are on vacation), and another weekend, and see what happens with traffic trends, including Digg-related traffic. And check out my unscientific research on when the best time and day to post on your blog is, and compare what you find over the course of time, not just a couple of days. I’m curious how days of the week and the informal research I did might be reflected in your information. That will REALLY help us see the reality of your success.

    Still, you’ve gathered a ton of fabulous information. I found it interesting that the post title on your Digg sucks article kept changing every hour or so on the WordPress.com top lists. I think it was “Power of the Schwartz” that really caught my eye. 😉

    I wish you could check out how much traffic came from WordPress.com dashboards and top blog listing comparatively to Digg traffic results, as well as all the other social bookmarking sources which pick up Digg posts, and compare that information as to how directly your traffic was related solely to Digg. It was in the first place, but “then” what happened.

    There is a lot of whack things that go into driving traffic, and I also know that WordPress.com’s built in traffic charts don’t match up exactly and consistently with some of the external traffic reports I’ve checked for my WordPress.com blog, so only time will tell, and this will get more and more interesting as time goes on.

    Good work!

    Comment by Lorelle VanFossen — July 2, 2006 @ 11:19 am

  2. Yeah I caught myself saying “Merchandising Merchandising Merchandising” the other day!:)

    Well I noticed about 1000, 800, 600, 500 hits (in this order) from WordPress for those 4 days …

    Valleywag sent me about 12,000 (in total)

    Marc

    Comment by evolvingtrends — July 2, 2006 @ 11:26 am

  3. Great analysis on digg. It looks like digg or the memes can be somewhat influenced and analyzed. It’s almost like psychoanalyzing a strange new brain.

    I find it very interesting how this all happened. Even if digg gave you a short pulse for a few days, it generated augmented daily traffic until now. I wouldn’t be surprised if new readers discovered you this way. The whole flow of traffic and readers is very fluid in nature. I wonder if it could be mapped in some way or form through fluid dynamics.

    Cheers

    Comment by range — July 3, 2006 @ 1:39 am

  4. It’s highly multi-disciplinary. It can be conquered, but not as fast as you or I would like.

    This is like analyzing a strange new brain … a brain that is influenced greatly by everything except logic.

    I plan on analyzing it in the open for a long time to come, so stick around and add your thoughts to it. 🙂
    They say ‘observing something changes its outcome’ .. So we’ll see how it goes.

    Cheers,

    Marc

    Comment by evolvingtrends — July 3, 2006 @ 2:36 am

  5. […] 1. Digg This! 55,500 Hits in ~4 Days […]Pingback by Evolving Trends » Global Brain vs Google — July 3, 2006 @ 10:37 am
  6. […] This article has a follow-up part: Digg This! 55,500 Hits in ~4 Days […]Pingback by Evolving Trends » For Great Justice, Take Off Every Digg — July 3, 2006 @ 10:57 am
  7. Marc,

    I don’t know if this information helps or skews your research, but a post I wrote in January, titled to get Digg and other traffic attention, Horse Sex and What is Dictating Your Blog’s Content, did not do well at all. That is, until the past three days.

    It’s really started piling up a lot of hits, sitting in the top 10 of my top posts, outreaching the other posts that get consistently high traffic by a huge margin. Until Saturday, that post was not even in the top 50 or 75. I can’t tell where the traffic is suddenly coming from, as WordPress.com doesn’t offer that kind of specific information, and I’m not getting any outstanding traffic from any single source. Nothing from Digg, but something is suddenly driving that post through the roof. Even during a holiday week in the US! Very strange.

    Maybe there’s a new fad in horse sex lately – who knows? 😉

    Still, the point is that this was written in January, and now it is getting attention in July. I’ll be checking to find out what is causing the sudden flush of traffic, but never doubt that your posts are ageless in many respects. So the long term study of Digg patterns and traffic will help all of us over the “long haul”. That’s why I’m really curious about the long term effects of your posts.

    Sometimes you just can’t predict the crowds. 😉 Or what they will suddenly be interested in. I’ve written so many posts and titles that I was sure would skyrocket traffic, only to lay there like empty beer bottles in the playground. Totally useless. And others with sloppy titles and written quickly with little attention to detail skyrocketing like 1000 bottles of coke filled with Mentos. 😉 It’s an interesting process, isn’t it?

    Comment by Lorelle VanFossen — July 3, 2006 @ 9:37 pm

  8. Predicting the weather for the long term is not currently feasible. However, predicting the weather for the short term (1-2 days in advance) is.

    But it’s not all about ‘predicting’ … it’s about studying the phenomenon so that we can make better choices to reduce the effect of uncertainty, not try to eliminate uncertainty.

    Marc

    Comment by evolvingtrends — July 4, 2006 @ 12:02 am

  9. I think then that the obvious question is why you’ve done nothing to monetize those hits, however fickle they might be!;)

    Comment by Sam Jackson — July 4, 2006 @ 4:42 pm

  10. Monetize, Monetize, Monetize!

    Fortunately, that won’t happen 🙂

    Marc

    Comment by evolvingtrends — July 4, 2006 @ 8:28 pm

  11. […] 4 – Digg This! 55,500 hits in ~4 Days A blogger explains how he ‘milked’ Digg for a major spike in traffic. Meme engineering in action; fascinating stuff. (tags: Wikipedia Google visits article post tail long spike scam traffic blogging blog meme Digg) […]Pingback by Velcro City Tourist Board » Blog Archive » Links for 05-07-2006 — July 4, 2006 @ 10:20 pm
  12. Since web traffic is dictated by humans and engines, and not by some exterior force like the weather, I think that there are a lot of possible avenues for analyzing it. The only thing is that the flow and traffic need to be documented. In most cases the traffic might be, but there is a lack of information on past flow. The internet is concentrated on the now, and less on what happened ten days ago on this site and such.

    Mathematical fluid dynamics are probably the way to go, though even though I am a mathematician, I’d have to research it a bit before pronouncing myself completely. These types of analysis can get quite complicated because of the implications of partial differential equations of an order higher than 2, which cannot be solved, only approximated numerically.

    I’m sure I’m not the only one to say this, but I like the types of discussions and content that you put forward, it gets the mind thinking on certain subjects that most of the time users tend to accept without question.

    Comment by range — July 4, 2006 @ 10:54 pm

  13. “the implications of partial differential equations of an order higher than 2, which can not be solved only approximated numerically.”

    Have you looked into Meyer’s methods of “invariant embedding” to convert PDEs to a set of ordinary differential equations, then solve?

    I believe the investigation of hype management is extremely multi-disciplinary and very much like the weather. That means that while it’s deterministic (as everything is in essence, with the exception of non-causal quantum theory), it’s still highly unstable and ultimately hard [in computational terms] to predict.

    In general, uncertainty exists in every system, including maths itself (because of the lack of absolute consistency and completeness), so while you can’t eliminate it you can hope to reduce it.

    But in practical terms, what I’m looking to do is to simply gain a sufficient minimum in insight to allow me to improve my chances at generating and surfing hype waves… I believe I will end up applying a non-formal theory such as framing theory to transform the problem from the computational domain to the cognitive domain (so I may use that 90% of the brain that we supposedly don’t use to carry out the computation with my own internal computational model.)

    Clarity, in simple terms, is what it’s all about.

    However, to reach clarity’s peak you have to climb a mountain of complexity 🙂

    Marc

    Comment by evolvingtrends — July 4, 2006 @ 11:10 pm

  14. Hey Marc!

    I now know what it feels like to be caught in a digg-like wave. Right now, I have had over 141,000 page views because of a post that I did this morning, explaining HDR photography.

    Since digg banned my url for some reason (I don’t know why, I haven’t posted anything to digg in the last 2 months), this was all done through del.icio.us, Reddit and Popurls. It’s like one thing leads to another. I have added an url giving a surface analysis of this situation.

    http://range.wordpress.com/2006/07/15/how-the-memoirs-got-127000-hits-in-a-few-hours-or-a-follow-up-post-to-modern-hdr-photography/

    Naturally, I find myself compelled to continue writing on the subject. I have already posted a follow-up article and I am working on another one right now. I knew I had a spike on weekends, nothing like this however.

    Comment by range — July 15, 2006 @ 7:29 pm

  15. Hey Marc.

    I think the main reason why I didn’t get any higher was the stat problem that WP has been having over the last few days.

    I hope they save this traffic so that I have some nice graphs to show you. They probably do. It felt like the counter was accurate; I checked that I did indeed make it onto a few memediggers, and still am right now.

    And also the stat page was just so slow to catch up with the amount of traffic that was generated. WP couldn’t keep up.

    Hopefully, they will sort it out over the next few days. I think it was most surprising in the afternoon. I kept refreshing the counter, and oups, a few thousand here, ten thousand there. I was really surprised. And I have also started getting some haters, as you must know, with the good comes the bad.

    Comment by range — July 15, 2006 @ 8:49 pm


Evolving Trends

July 11, 2006

P2P 3.0: The People’s Google

/*

This is a more extensive version of the Web 3.0 article with extra sections about the implications of Web 3.0 to Google.

See this follow-up article for the more disruptive ‘decentralized knowledgebase’ version of the model discussed in this article.

Also see this non-Web3.0 version: P2P to Destroy Google, Yahoo, eBay et al 

Web 3.0 Developers:

Feb 5, ‘07: The following reference should provide some context regarding the use of rule-based inference engines and ontologies in implementing the Semantic Web + AI vision (aka Web 3.0) but there are better, simpler ways of doing it. 

  1. Description Logic Programs: Combining Logic Programs with Description Logic

*/

In Web 3.0 (aka Semantic Web), P2P Inference Engines running on millions of users’ PCs, working with standardized domain-specific ontologies (created by Wikipedia, Ontoworld, other organizations or individuals) using Semantic Web tools such as Semantic MediaWiki, will produce an information infrastructure far more powerful than Google (or any current search engine).

The availability of standardized ontologies that are being created by people, organizations, swarms, smart mobs, e-societies, etc, and the near-future availability of P2P Semantic Web Inference Engines that work with those ontologies means that we will be able to build an intelligent, decentralized, “P2P” version of Google.

Thus, the emergence of P2P Inference Engines and domain-specific ontologies in Web 3.0 (aka Semantic Web) will present a major threat to the central “search” engine model.

Basic Web 3.0 Concepts

Knowledge domains

A knowledge domain is something like Physics, Chemistry, Biology, Politics, the Web, Sociology, Psychology, History, etc. There can be many sub-domains under each domain each having their own sub-domains and so on.

Information vs Knowledge

To a machine, knowledge is comprehended information (i.e. new information produced through the application of deductive reasoning to existing information). To a machine, information is only data until it is processed and comprehended.

Ontologies

For each domain of human knowledge, an ontology must be constructed, partly by hand [or rather by brain] and partly with the aid of automation tools.

Ontologies are not knowledge nor are they information. They are meta-information. In other words, ontologies are information about information. In the context of the Semantic Web, they encode, using an ontology language, the relationships between the various terms within the information. Those relationships, which may be thought of as the axioms (basic assumptions), together with the rules governing the inference process, both enable as well as constrain the interpretation (and well-formed use) of those terms by the Info Agents to reason new conclusions based on existing information, i.e. to think. In other words, theorems (formal deductive propositions that are provable based on the axioms and the rules of inference) may be generated by the software, thus allowing formal deductive reasoning at the machine level. And given that an ontology, as described here, is a statement of Logic Theory, two or more independent Info Agents processing the same domain-specific ontology will be able to collaborate and deduce an answer to a query, without being driven by the same software.
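A minimal sketch of this idea, with invented terms and just two hand-coded inference rules (a real system would use an ontology language such as OWL and a proper reasoner):

```python
# Toy ontology: axioms are (subject, relation, object) triples. The terms
# and the two rules below are illustrative only, not from any real ontology.
axioms = {
    ("PizzaRestaurant", "is_a", "ItalianRestaurant"),
    ("ItalianRestaurant", "is_a", "Restaurant"),
    ("ItalianRestaurant", "serves", "ItalianCuisine"),
}

def infer(triples):
    """Derive theorems from the axioms by applying two rules to a fixed point:
    (1) 'is_a' is transitive; (2) 'serves' is inherited along 'is_a'."""
    facts = set(triples)
    while True:
        new = set()
        for (a, r, b) in facts:
            if r != "is_a":
                continue
            for (c, r2, d) in facts:
                if c != b:
                    continue
                if r2 == "is_a":
                    new.add((a, "is_a", d))
                elif r2 == "serves":
                    new.add((a, "serves", d))
        if new <= facts:
            return facts
        facts |= new

# Theorems are the derived facts that were not stated as axioms.
theorems = infer(axioms) - axioms
print(theorems)
```

Here the software “proves” that a pizza restaurant serves Italian cuisine and is a restaurant, even though neither fact was stated explicitly.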

Inference Engines

In the context of Web 3.0, Inference engines will be combining the latest innovations from the artificial intelligence (AI) field together with domain-specific ontologies (created as formal or informal ontologies by, say, Wikipedia, as well as others), domain inference rules, and query structures to enable deductive reasoning on the machine level.

Info Agents

Info Agents are instances of an Inference Engine, each working with a domain-specific ontology. Two or more agents working with a shared ontology may collaborate to deduce answers to questions. Such collaborating agents may be based on differently designed Inference Engines and they would still be able to collaborate.
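To illustrate that last point, here are two deliberately different toy engines (one forward-chaining, one goal-directed) answering the same question from the same shared ontology; all names and triples are invented for the sketch:

```python
# Shared domain ontology: (subject, relation, object) triples.
ONTOLOGY = {
    ("pizza_restaurant", "is_a", "italian_restaurant"),
    ("italian_restaurant", "serves", "italian_cuisine"),
}

class ForwardAgent:
    """Agent A: computes the full closure of the ontology, then looks up the answer."""
    def serves(self, kind, cuisine, facts):
        closure = set(facts)
        changed = True
        while changed:
            changed = False
            for (a, r, b) in list(closure):
                if r != "is_a":
                    continue
                for (c, r2, d) in list(closure):
                    if c == b and (a, r2, d) not in closure:
                        closure.add((a, r2, d))
                        changed = True
        return (kind, "serves", cuisine) in closure

class BackwardAgent:
    """Agent B: a differently designed engine that searches backward from the goal."""
    def serves(self, kind, cuisine, facts):
        if (kind, "serves", cuisine) in facts:
            return True
        parents = [o for (s, r, o) in facts if s == kind and r == "is_a"]
        return any(self.serves(p, cuisine, facts) for p in parents)

# Both agents, driven only by the shared ontology, reach the same conclusion:
answers = [agent.serves("pizza_restaurant", "italian_cuisine", ONTOLOGY)
           for agent in (ForwardAgent(), BackwardAgent())]
print(answers)  # [True, True]
```

The point is that the agreement comes from the shared ontology and rules, not from running the same software.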

Proofs and Answers

The interesting thing about Info Agents that I did not clarify in the original post is that they will be capable of not only deducing answers from existing information (i.e. generating new information [and gaining knowledge in the process, for those agents with a learning function]) but they will also be able to formally test propositions (represented in some query logic) that are made directly or implied by the user. For example, instead of the example I gave previously (in the Wikipedia 3.0 article) where the user asks “Where is the nearest restaurant that serves Italian cuisine” and the machine deduces that a pizza restaurant serves Italian cuisine, the user may ask “Is the moon blue?” or say that the “moon is blue” to get a true or false answer from the machine. In this case, a simple Info Agent may answer with “No” but a more sophisticated one may say “the moon is not blue but some humans are fond of saying ‘once in a blue moon’ which seems illogical to me.”

This test-of-truth feature assumes the use of an ontology language (as a formal logic system) and an ontology where all propositions (or formal statements) that can be made can be computed (i.e. proved true or false) and where all such computations are decidable in finite time. The language may be OWL-DL or any language that, together with the ontology in question, satisfies the completeness and decidability conditions.

P2P 3.0 vs Google

If you think of how many processes currently run on all the computers and devices connected to the Internet then that should give you an idea of how many Info Agents can be running at once (as of today), all reasoning collaboratively across the different domains of human knowledge, processing and reasoning about heaps of information, deducing answers and deciding truthfulness or falsehood of user-stated or system-generated propositions.

Web 3.0 will bring with it a shift from centralized search engines to P2P Semantic Web Inference Engines, which will collectively have vastly more deductive power, in both quality and quantity, than Google can ever have (including any future AI-enabled version of Google, which will not be able to keep up with the distributed P2P AI matrix enabled by millions of users running free P2P Semantic Web Inference Engine software on their home PCs).

Thus, P2P Semantic Web Inference Engines will pose a huge and escalating threat to Google and other search engines, and can be expected to do to them what P2P file sharing and BitTorrent did to FTP (central-server file transfer) and centralized file hosting in general (e.g. Amazon S3’s use of BitTorrent).

In other words, the coming of P2P Semantic Web Inference Engines, as an integral part of the still-emerging Web 3.0, will threaten to wipe out Google and other existing search engines. It’s hard to imagine how any one company could compete with 2 billion Web users (and counting), all of whom are potential users of the disruptive P2P model described here.
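One way to picture the decentralized model described above is a query fanned out to peers, each holding a shard of a domain ontology, with the partial answers merged at the asking node; the peers, triples, and query shape here are all hypothetical:

```python
# Each peer holds a shard of the domain ontology (a set of triples).
PEERS = {
    "peer_a": {("pizza_restaurant", "is_a", "italian_restaurant")},
    "peer_b": {("italian_restaurant", "serves", "italian_cuisine")},
}

def ask_peers(subject, predicate):
    """Fan a (subject, predicate, ?) query out to every peer and merge the answers.
    In a real P2P network each lookup would be a network round-trip."""
    answers = set()
    for shard in PEERS.values():
        for (s, p, o) in shard:
            if s == subject and p == predicate:
                answers.add(o)
    return answers

print(ask_peers("italian_restaurant", "serves"))  # {'italian_cuisine'}
```

No single node holds the whole knowledge base, which is the property that distinguishes this model from a centralized search index.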

“The Future Has Arrived But It’s Not Evenly Distributed”

Currently, Semantic Web (aka Web 3.0) researchers are working out the technology and human resource issues, and folks like Tim Berners-Lee, the father of the Web, are battling critics and enlightening minds about the coming human-machine revolution.

The Semantic Web (aka Web 3.0) has already arrived, and Inference Engines are working with prototypical ontologies, but this effort is a massive one, which is why I was suggesting that its most likely enabler will be a social, collaborative movement such as Wikipedia, which has the human resources (in the form of the thousands of knowledgeable volunteers) to help create the ontologies (most likely as informal ontologies based on semantic annotations) that, when combined with inference rules for each domain of knowledge and the query structures for the particular schema, enable deductive reasoning at the machine level.

Addendum

On AI and Natural Language Processing

I believe that the first generation of AI that will be used by Web 3.0 (aka Semantic Web) will be based on relatively simple inference engines (employing both algorithmic and heuristic approaches) that will not attempt to perform natural language processing. However, they will still have the formal deductive reasoning capabilities described earlier in this article.

Related

  1. Wikipedia 3.0: The End of Google?
  2. Intelligence (Not Content) is King in Web 3.0
  3. Get Your DBin
  4. All About Web 3.0

Posted by Marc Fawzi



Evolving Trends

June 11, 2006

P2P Semantic Web Engines

No Comments »


Evolving Trends

    June 30, 2006

    Web 3.0: Basic Concepts

    /* (this post was last updated at 1:20pm EST, July 19, ‘06)

    You may also wish to see Wikipedia 3.0: The End of Google? (The original ‘Web 3.0/Semantic Web’ article) and P2P 3.0: The People’s Google (a more extensive version of this article showing the implication of P2P Semantic Web Engines to Google.)

    Web 3.0 Developers:

    Feb 5, ‘07: The following reference should provide some context regarding the use of rule-based inference engines and ontologies in implementing the Semantic Web + AI vision (aka Web 3.0) but there are better, simpler ways of doing it. 

    1. Description Logic Programs: Combining Logic Programs with Description Logic

    */

    Basic Web 3.0 Concepts

    Knowledge domains

    A knowledge domain is something like Physics, Chemistry, Biology, Politics, the Web, Sociology, Psychology, History, etc. There can be many sub-domains under each domain, each with its own sub-domains, and so on.

    Information vs Knowledge

    To a machine, knowledge is comprehended information (i.e. new information produced through the application of deductive reasoning to existing information). To a machine, information is merely data until it is processed and comprehended.
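As a minimal sketch of this distinction, consider facts stored as triples plus one inference rule. All names and structures here are invented for illustration; they are not drawn from any real Semantic Web toolkit.

```python
# Raw triples are only "information" (data); applying a deductive rule
# produces a new triple -- the machine's "knowledge".
facts = {("Socrates", "is_a", "human")}

# One rule: if ?x is_a human, then ?x is_a mortal.
rules = [(("?x", "is_a", "human"), ("?x", "is_a", "mortal"))]

def deduce(facts, rules):
    """Apply each rule to every matching fact, producing new facts."""
    derived = set(facts)
    for (_, p, o), (_, cp, co) in rules:
        for (fs, fp, fo) in facts:
            if fp == p and fo == o:          # premise matches on predicate/object
                derived.add((fs, cp, co))    # bind ?x to the fact's subject
    return derived

knowledge = deduce(facts, rules)
print(("Socrates", "is_a", "mortal") in knowledge)  # True: new information produced
```

The derived triple never appeared in the stored data; it exists only because the rule was applied, which is the sense in which comprehension goes beyond storage.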

    Ontologies

    For each domain of human knowledge, an ontology must be constructed, partly by hand [or rather by brain] and partly with the aid of automation tools.

    Ontologies are neither knowledge nor information; they are meta-information. In other words, ontologies are information about information. In the context of the Semantic Web, they encode, using an ontology language, the relationships between the various terms within the information. Those relationships, which may be thought of as the axioms (basic assumptions), together with the rules governing the inference process, both enable and constrain the interpretation (and well-formed use) of those terms by the Info Agents, allowing them to reason new conclusions based on existing information, i.e. to think. In other words, theorems (formal deductive propositions that are provable based on the axioms and the rules of inference) may be generated by the software, thus allowing formal deductive reasoning at the machine level. And given that an ontology, as described here, is a statement of Logic Theory, two or more independent Info Agents processing the same domain-specific ontology will be able to collaborate and deduce an answer to a query without being driven by the same software.
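A toy illustration of an ontology as meta-information: the axioms below say nothing about any particular restaurant, only about how the terms relate. The class names are hypothetical, chosen to echo the article's restaurant example.

```python
# An ontology fragment: subclass axioms relate terms, not instances.
subclass_of = {
    "PizzaRestaurant":   "ItalianRestaurant",   # axiom: every pizza place is Italian
    "ItalianRestaurant": "Restaurant",
}

def ancestors(term):
    """Transitive closure of subclass_of: every class `term` is a kind of."""
    result = []
    while term in subclass_of:
        term = subclass_of[term]
        result.append(term)
    return result

print(ancestors("PizzaRestaurant"))  # ['ItalianRestaurant', 'Restaurant']
```

Any two programs that load these axioms will agree that a PizzaRestaurant is a Restaurant, even if their internals differ, which is the point of putting the axioms in shared meta-information rather than in application code.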

    Inference Engines

    In the context of Web 3.0, Inference Engines will combine the latest innovations from the artificial intelligence (AI) field with domain-specific ontologies (created as formal or informal ontologies by, say, Wikipedia, as well as others), domain inference rules, and query structures to enable deductive reasoning at the machine level.
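The combination of ontology, rules, and a query can be sketched as a tiny forward-chaining step. This is an illustrative toy, not any real engine's API; the instance and class names are invented.

```python
# Ontology (subclass axiom) + instance data + a query the engine can
# answer only by inference, not by keyword matching.
ontology = {("PizzaRestaurant", "subClassOf", "ItalianRestaurant")}
data     = {("Marios", "type", "PizzaRestaurant")}

def infer_types(ontology, data):
    """Forward-chain subclass axioms over instance data until fixpoint."""
    facts = set(data)
    changed = True
    while changed:
        changed = False
        for (sub, _, sup) in ontology:
            for (inst, rel, cls) in list(facts):
                if rel == "type" and cls == sub and (inst, "type", sup) not in facts:
                    facts.add((inst, "type", sup))
                    changed = True
    return facts

# Query structure: which instances are Italian restaurants?
answers = {s for (s, p, o) in infer_types(ontology, data)
           if p == "type" and o == "ItalianRestaurant"}
print(answers)  # {'Marios'}
```

A keyword search over the raw data would never return Marios for "Italian restaurant"; the answer exists only after the inference step.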

    Info Agents

    Info Agents are instances of an Inference Engine, each working with a domain-specific ontology. Two or more agents working with a shared ontology may collaborate to deduce answers to questions. Such collaborating agents may be based on differently designed Inference Engines and still be able to work together.
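A rough sketch of why differently built agents can still agree: the shared ontology, not the software, fixes what the terms mean. Both agent classes below are invented for illustration.

```python
# Two independently written "Info Agents" sharing only an ontology.
SHARED_ONTOLOGY = {("PizzaRestaurant", "subClassOf", "ItalianRestaurant")}

class AgentA:
    """One engine design: scans the axioms directly."""
    def entails(self, sub_class, super_class):
        return any(sub == sub_class and sup == super_class
                   for (sub, _, sup) in SHARED_ONTOLOGY)

class AgentB:
    """A differently built engine: precomputes a lookup table."""
    def __init__(self):
        self.table = {sub: sup for (sub, _, sup) in SHARED_ONTOLOGY}
    def entails(self, sub_class, super_class):
        return self.table.get(sub_class) == super_class

# Despite different internals, both agents reach the same deduction:
for agent in (AgentA(), AgentB()):
    print(agent.entails("PizzaRestaurant", "ItalianRestaurant"))  # True, True
```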

    Proofs and Answers

    The interesting thing about Info Agents that I did not clarify in the original post is that they will be capable of not only deducing answers from existing information (i.e. generating new information [and gaining knowledge in the process, for those agents with a learning function]) but they will also be able to formally test propositions (represented in some query logic) that are made directly or implied by the user. For example, instead of the example I gave previously (in the Wikipedia 3.0 article) where the user asks “Where is the nearest restaurant that serves Italian cuisine” and the machine deduces that a pizza restaurant serves Italian cuisine, the user may ask “Is the moon blue?” or say that the “moon is blue” to get a true or false answer from the machine. In this case, a simple Info Agent may answer with “No” but a more sophisticated one may say “the moon is not blue but some humans are fond of saying ‘once in a blue moon’ which seems illogical to me.”

    This test-of-truth feature assumes the use of an ontology language (as a formal logic system) and an ontology where all propositions (or formal statements) that can be made can be computed (i.e. proved true or false) and where all such computations are decidable in finite time. The language may be OWL-DL or any language that, together with the ontology in question, satisfies the completeness and decidability conditions.

    “The Future Has Arrived But It’s Not Evenly Distributed”

    Currently, Semantic Web (aka Web 3.0) researchers are working out the technology and human resource issues, and folks like Tim Berners-Lee, the father of the Web, are battling critics and enlightening minds about the coming human-machine revolution.

    The Semantic Web (aka Web 3.0) has already arrived, and Inference Engines are already working with prototypical ontologies. But the effort required is massive, which is why I have suggested that its most likely enabler will be a social, collaborative movement such as Wikipedia, which has the human resources (thousands of knowledgeable volunteers) to help create the ontologies (most likely as informal ontologies based on semantic annotations) that, when combined with inference rules for each domain of knowledge and the query structures for the particular schema, enable deductive reasoning at the machine level.

    Addendum

    On AI and Natural Language Processing

    I believe that the first generation of artificial intelligence (AI) that will be used by Web 3.0 (aka the Semantic Web) will be based on relatively simple inference engines (employing both algorithmic and heuristic approaches) that will not attempt to perform natural language processing. However, they will still have the formal deductive reasoning capabilities described earlier in this article.

    Related

    1. Wikipedia 3.0: The End of Google?
    2. P2P 3.0: The People’s Google
    3. All About Web 3.0
    4. Semantic MediaWiki
    5. Get Your DBin

    Posted by Marc Fawzi




    July 12, 2006

    Semantic MediaWiki

    Filed under: Semantic MediaWiki, Semantic Web, SemanticWeb, Web 3.0, Wikipedia 3.0, ontology, ontoworld — evolvingtrends @ 6:01 am
    What is it? Semantic MediaWiki is an ongoing open source project to develop a Semantic Wiki Engine.

    In other words, it is one of the important early innovations leading up to the Wikipedia 3.0 (Web 3.0) vision.

    • The project and software are called “Semantic MediaWiki”
    • ontoworld.org is just one site using the technology
    • Wikipedia might become another site using the technology
    • Some more sites using the technology are found here

    Related

    1. Wikipedia 3.0: The End of Google?
    2. Web 3.0: Basic Concepts
    3. P2P 3.0: The People’s Google
    4. Semantic MediaWiki project website

    Posted by Marc Fawzi




    July 12, 2006

    Wikipedia 3.0: The End of Google (translation)


    Translation kindly provided by Eric Rodriguez

    /*

    Developers: This is the new open source Semantic MediaWiki project.

    Bloggers: This post explains the curious story of how this article reached 33,000 readers in just the first 24 hours after its publication, via digg. This post explains what is wrong with digg and Web 2.0 and how to fix it.

    Related:

    1. All About Web 3.0
    2. P2P 3.0: The People’s Google
    3. Google Dont Like Web 3.0 [sic]
    4. For Great Justice, Take Off Every Digg
    5. Reality as a Service (RaaS): The Case for GWorld
    6. From Mediocre to Visionary

    */

    by Marc Fawzi of Evolving Trends

    Spanish version (by Eric Rodriguez of Toxicafunk)

    The Semantic Web (or Web 3.0) promises to “organize the world's information” in a dramatically more logical way than Google could ever achieve with its current engine design. This is certainly true from the standpoint of machine comprehension versus human comprehension. The Semantic Web requires the use of a declarative ontological language, such as OWL, to produce domain-specific ontologies that machines can use to reason about information and thereby reach new conclusions, rather than simply matching keywords.

    However, the Semantic Web, which is still at a stage of development where researchers are trying to define which model is best and which has the greatest usability, would require the participation of thousands of experts in different fields for an indefinite period of time in order to produce the domain-specific ontologies necessary for it to function.

    Machines (or rather machine-based reasoning, also known as AI software or ‘info agents’) could then use those laboriously, though not entirely manually, constructed ontologies to build a view (or formal model) of how the individual terms in a given body of information relate to one another. Those relationships can be thought of as axioms (basic premises), which, together with the rules governing the inference process, both enable and constrain the interpretation (and well-formed use) of those terms by the info agents, allowing them to reason new conclusions based on existing information, i.e., to think. In other words, software could be used to generate theorems (formal deductive propositions provable from the axioms and the rules of inference), thus allowing formal deductive reasoning at the machine level. And given that an ontology, as described here, is a statement of Logic Theory, two or more info agents processing the same domain-specific ontology will be able to collaborate and deduce the answer to a query (a search or database inquiry), without being driven by the same software.

    Thus, as stated, in the Semantic Web machine-based agents (or a collaborating group of agents) will be able to understand and use information by translating concepts and deducing new information, rather than merely matching keywords.

    Once machines can understand and use information, using a standard ontology language, the world will never be the same. It will be possible to have an info agent (or several) among your AI-enhanced virtual ‘workforce’, each with access to different domain-specific comprehension spaces, and all communicating with one another to form a collective consciousness.

    You will be able to ask your info agent (or agents) to find you the nearest restaurant serving Italian cuisine, even if the restaurant nearest to you advertises itself as a pizza place rather than an Italian restaurant. But that is only a very simple example of the deductive reasoning machines will be able to perform over existing information.

    Far more astonishing implications will be seen when you consider that every area of human knowledge will automatically be within the comprehension space of your info agents. That is because each agent can communicate with other info agents specialized in different domains of knowledge to produce a collective consciousness (to use the Borg metaphor) spanning all human knowledge. The collective “mind” of those Borg-like agents will form the Ultimate Answer Machine, easily displacing Google from that position, which it does not fully occupy.

    The problem with the Semantic Web, apart from the fact that researchers are still debating which ontology language model design and implementation (and associated technologies) is best and most usable, is that it would take thousands, or even thousands of thousands, of knowledgeable people many years to render human knowledge into domain-specific ontologies.

    However, if at some point we were to take the Wikipedia community and give it the right tools and standards to work with (whether existing ones or ones to be developed in the future), so that reasonably capable individuals could reduce human knowledge into domain-specific ontologies, then the time needed to do so would shrink to a few years, possibly two.

    The emergence of a Wikipedia 3.0 (in reference to Web 3.0, the name given to the Semantic Web) based on the Semantic Web model would herald the end of Google as the Ultimate Answer Machine. It would be replaced by “WikiMind”, which would not be a mere search engine like Google but a true Global Brain: a powerful domain inference engine, with a vast set of ontologies (à la Wikipedia 3.0) covering all domains of human knowledge, capable of reasoning out and deducing answers instead of merely throwing back raw information through the outdated search-engine concept.

    Notes
    After writing the original post I discovered that the Wikipedia application, also known as MediaWiki (not to be confused with Wikipedia.org), has already been used to implement ontologies. The name they have chosen is Ontoworld. I think WikiMind or WikiBorg would have been a catchier name, but I like Ontoworld too, as in “and it descended onto the world,” (1) since it can be taken as a reference to the global mind that a Semantic-Web-enabled Ontoworld would give rise to.

    In just a few years the search engine technology that provides Google with nearly all of its revenue/capital would be obsolete… unless they had an agreement with Ontoworld allowing them to connect to its ontology database, thereby adding inference-engine capability to Google searches.

    But the same is true for Ask.com, MSN and Yahoo.

    I for one would love to see more competition in this field, rather than see Google or any other company establish itself as the leader over the others.

    The question, in Churchillian terms, is whether the combination of Wikipedia with the Semantic Web means the beginning of the end for Google or the end of the beginning. Obviously, with many billions of dollars of investors' money at stake, I would say it is the latter. However, I would indeed love to see someone overtake them (which in my opinion is possible).

    (1) Translator's note: the author is playing on the prefix “onto-” in “ontology”, which sounds like the English adverb “unto”. The original phrase is “and it descended onto the world.”

    Clarification
    Please note that Ontoworld, which currently implements the ontologies, is based on the “Wikipedia” application (also known as MediaWiki), which is not the same as Wikipedia.org.

    Likewise, I hope that Wikipedia.org will use its volunteer workforce to reduce the sum of human knowledge that has been entered into its database into domain-specific ontologies for the Semantic Web (Web 3.0), and hence “Wikipedia 3.0”.

    Response to Readers' Comments
    My argument is that Wikipedia already has the volunteer resources to produce the ontologies, for each of the knowledge domains it currently covers, that the Semantic Web so badly needs, whereas Google does not have those resources, and so it would come to depend on Wikipedia.

    The ontologies, together with all the information on the Web, will be accessible to Google and the rest, but it is Wikipedia that will be in charge of those ontologies, since Wikipedia already covers an enormous number of knowledge domains, and that is where I see the shift in power.

    Neither Google nor the other companies have the human resources (the thousands of volunteers Wikipedia has) needed to create the ontologies for all the knowledge domains Wikipedia already covers. Wikipedia does have those resources, and it is positioned in such a way that it can do the job better and more effectively than anyone else. It is hard to see how Google could manage to create those ontologies (which are constantly growing in both number and size) given the amount of work required. Wikipedia, by contrast, can advance much faster thanks to its massive, dedicated force of expert volunteers.

    I believe the competitive advantage will belong to whoever controls the creation of ontologies for the largest number of knowledge domains (i.e., Wikipedia), not to whoever merely accesses them (i.e., Google).

    There are many knowledge domains that Wikipedia does not yet cover. There Google would have an opportunity, but only if the people and organizations that produce the information also built their own ontologies, so that Google could access them through its future Semantic Web engine. I believe that will happen in the future, but only little by little, and that Wikipedia can have the ontologies ready for all the knowledge domains it already covers much sooner, with the enormous added advantage of being in charge of those ontologies (the base layer for enabling AI).

    It is still not clear, of course, whether the combination of Wikipedia with the Semantic Web heralds the end of Google or the end of the beginning. As I mentioned in the original article, I believe it is the latter, and that the question in this post's title is, in the present context, merely rhetorical. However, my judgment could be wrong, and Google may give way to Wikipedia as the world's ultimate answer machine.

    After all, Wikipedia has “us”. Google does not. Wikipedia derives its power from “us”. Google derives its power from its technology and its inflated market valuation. Whom would you count on to change the world?

    Response to Basic Questions from Readers
    The reader divotdave asked a few questions that seem basic (that is, important) in nature. I believe more people will be wondering about the same issues, so I include them here along with their answers.

    Question:
    How do you distinguish between good information and bad? How do you determine which parts of human knowledge to accept and which to reject?

    Answer:
    There is no need to distinguish between good information and bad (not to be confused with well-formed vs. badly-formed) if a trusted source of information (with its associated trusted ontologies) is used. That is, if the information or knowledge being sought can be derived from Wikipedia 3.0, then the information is assumed to be trustworthy.

    However, when it comes to connecting the dots when returning information or deducing answers from the vast sea of information that lies beyond Wikipedia, the question becomes very relevant: how would one distinguish good information from bad so as to produce good knowledge (i.e. comprehended information, or new information produced through deductive reasoning based on existing information)?

    Question:
    Who, or what as the case may be, determines which information is irrelevant to me as an end user?

    Answer:
    That is a good question, and one that must be answered by the researchers working on the AI engines for Web 3.0.

    Certain assumptions will have to be made about what is being asked. Just as I had to make certain assumptions about what you were really asking when reading your question, so will the AI engines, based on a cognitive process very similar to our own, which is a subject for another post, but one that has been studied by many AI researchers.

    Question:
    Does this ultimately mean that an all-powerful standard will emerge to which all of humanity will have to adhere (for lack of alternative information)?

    Answer:
    There is no need for a standard, except with regard to the language in which the ontologies will be written (i.e. OWL, OWL-DL, OWL Full, etc.). Semantic Web researchers are trying to determine the best, and most usable, choice, taking into account human and machine performance in building and (in the latter case only) interpreting those ontologies.

    Two or more info agents working with the same domain-specific ontology but with different software (different AI engines) can collaborate with one another. The only standard needed is the ontology language and the associated production tools.

    Addendum

    On AI and Natural Language Processing

    I believe that the first generation of AI that will be used by Web 3.0 (aka the Semantic Web) will be based on relatively simple inference engines (employing both algorithmic and heuristic approaches) that will not attempt any kind of natural language processing. However, they will retain the formal deductive reasoning capabilities described in this article.

    On the Debate about the Nature and Definition of AI

    AI will first be introduced into cyberspace through inference engines (using algorithms and heuristics) that collaborate in P2P fashion and use standardized ontologies. The parallel interaction among hundreds of millions of AI agents, running inside P2P AI engines on users' PCs, will give rise to the complex behavior of the future global brain.

    2 Comments »

    1. […] Here is a direct excerpt from the translation of the original article. (I wasted a lot of time trying to understand it; does it show?) by Marc Fawzi of Evolving Trends […] Pingback by DxZone 2.0 (beta) – DxBlog » Blog Archive » Web 3.0? — August 7, 2006 @ 9:03 pm
    2. Very interesting. I think the Wikipedia article on Web 2.0 complements this work very well:

      One could well speak of Web 3.0 when referring to the Semantic Web. But a fundamental difference between the two versions of the web (2.0 and 3.0) is the type of participant. The main protagonist of 2.0 is the human user who writes articles on a blog or collaborates on a wiki. The requirement is that, besides publishing in HTML, the user emits part of his or her contributions in XML/RDF (RSS, ATOM, etc.). 3.0, however, is oriented toward the leading role of machine processors that understand description logic in OWL. 3.0 is conceived so that machines do the work of people when it comes to processing the avalanche of information published on the Web.

      The key is here at the end: Web 3.0 will be led by intelligent robots and ubiquitous devices. O'Reilly has already said something about this.

      Of course I agree with the author: the semantic Wikipedia will be the bomb, but I fear it will be a subset of the social or folksonomic one, because semantics has its limitations. I should explain this in an article somewhere. Perhaps I will do so on the pages of our Wikiesfera project, since a wiki is sexier than a blog for that. 😉

      Thanks for the translation.

      Comment by Joseba — November 30, 2006 @ 1:19 am



