
Posts Tagged ‘Semanticweb’

Evolving Trends

Web 3.0

Historically speaking, the term Web 3.0 was first coined in conjunction with the Semantic Web and/or AI agents, and in conjunction with Wikipedia and/or Google, in the Wikipedia 3.0: The End of Google? article, which was published on Evolving Trends (this blog) on June 26, ‘06.

June 28, ‘06: Here’s what a fellow blogger, who had reviewed the Wikipedia 3.0 article, had to say:

“[…] But there it is. That was then. Now, it seems, the rage is Web 3.0. It all started with this article here addressing the Semantic Web, the idea that a new organizational structure for the web ought to be based on concepts that can be interpreted. The idea is to help computers become learning machines, not just pattern matchers and calculators. […]”

June 28, ‘06: A fellow blogger wrote:

“This is the first non-sarcastic reference to Web 3.0 I’ve seen in the wild”

As of Jan 25, ‘07, there are 11,000 links to Evolving Trends from blogs, forums and news sites pointing to the Wikipedia 3.0 article.

Jan 25, ‘07: A fellow blogger wrote:

“In 2004 I with my friend Aleem worked on idea of Semantic Web (as our senior project), and now I have been hearing news of Web 3.0. I decided to work on the idea further in 2005, and may be we could have made a very small scaled 4th generation search engine. Though this has never become reality but now it seems it’s hot time for putting Semantics and AI into web. Reading about Web 3.0 again thrilled me with the idea. [Wikia] has decided to jump into search engines and give Google a tough time :). So I hope may be I get a chance to become part of this Web 3.0 and make information retreival better.”

Alexa graph

According to Alexa, the estimated penetration of the Wikipedia 3.0: The End of Google? article peaked on June 28 at a ratio of 650 per 1,000,000 people. Based on an estimated 1,000,000,000 Web users, this means it reached 650,000 people on June 28, and hundreds of thousands more on June 26, 27, 29 and 30. This includes people who read the article on the roughly 6,000 sites (according to MSN) that had linked to Evolving Trends. Based on the Alexa graph, we can estimate that the article reached close to 2 million people in the first 4.5 days after its release.
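For what it's worth, the arithmetic behind that estimate is easy to check (a two-line sketch; the 1,000,000,000-user figure is the rough estimate quoted above, not a measured value):

```python
# Rough check of the reach estimate quoted above.
peak_ratio = 650 / 1_000_000        # Alexa peak: 650 readers per million Web users
web_users = 1_000_000_000           # rough 2006 estimate of total Web users
print(int(peak_ratio * web_users))  # -> 650000 readers on the peak day alone
```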

Update on Alexa Statistics (Sep. 18, 2008): some people have pointed out (independently, based on their own experience) that Alexa’s statistics are skewed and not very reliable. As for direct hits to the article on this blog, they’re in the 200,000 range as of this writing.


Note: the term “Web 3.0” is the dictionary word “Web” followed by the number “3”, a decimal point, and the number “0”. As such, the term itself cannot and should not have any commercial significance in any context.


Update on how the Wikipedia 3.0 vision is spreading:


Update on how Google is hopelessly attempting to co-opt the Wikipedia 3.0 vision:  


3D Web + Semantic Web + AI as Web 3.0:

Here is the original article that gave birth to the Web 3.0 vision:

3D Web + Semantic Web + AI *

The above-mentioned 3D Web + Semantic Web + AI vision, which preceded the Wikipedia 3.0 vision, received much less attention because it was not presented in a controversial manner. This was noted as the biggest flaw of the social bookmarking site digg, which was used to promote this article.

Developers:

Feb 5, ‘07: The following external reference concerns the use of rule-based inference engines and ontologies in implementing the Semantic Web + AI vision (aka Web 3.0); a minimal illustrative sketch follows the reference below:

  1. Description Logic Programs: Combining Logic Programs with Description Logic (note: there are better, simpler ways of achieving the same purpose.)
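To make the reference above concrete, here is a minimal sketch (in Python) of what a rule-based inference engine does: it applies if-then rules to a fact base until nothing new can be derived. The facts, the single rule, and the tuple syntax are invented for illustration; a real system would express them in OWL or a DLP rule language:

```python
# A toy forward-chaining inference engine: apply rules to a fact base
# until a fixed point is reached (no new facts can be derived).

facts = {("serves", "marios", "pizza")}

# One DLP-style rule, written as (premise, conclusion); "?x" is a variable:
# whatever serves pizza serves Italian cuisine.
rules = [(("serves", "?x", "pizza"), ("serves", "?x", "italian_cuisine"))]

def match(pattern, fact):
    """Return a variable binding if pattern matches fact, else None."""
    binding = {}
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return binding

def substitute(pattern, binding):
    """Instantiate a pattern by replacing variables with bound values."""
    return tuple(binding.get(p, p) for p in pattern)

def forward_chain(facts, rules):
    derived, changed = set(facts), True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for fact in list(derived):
                b = match(premise, fact)
                if b is not None and substitute(conclusion, b) not in derived:
                    derived.add(substitute(conclusion, b))
                    changed = True
    return derived

print(forward_chain(facts, rules))
# {('serves', 'marios', 'pizza'), ('serves', 'marios', 'italian_cuisine')}
```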

Jan 7, ‘07: The following Evolving Trends post discusses the current state of semantic search engines and ways to improve their design:

  1. Designing a Better Web 3.0 Search Engine

The idea described in this article was adopted by Hakia after it was published here, so the article may be considered prior art.

June 27, ‘06: the Semantic MediaWiki project, which enables the insertion of semantic annotations (or metadata) into Wikipedia content. (This project is now hosted by Wikia, Wikipedia founder Jimmy Wales’ private venture, and may benefit Wikia instead of Wikipedia, which is why I see it as a conflict of interest.)

Bloggers:

This post provides the history behind the use of the term Web 3.0 in the context of the Semantic Web and AI.

This post explains the accidental way in which the article reached 2 million people in 4 days.


Web 3.0 Articles on Evolving Trends

Noteworthy mentions of the Wikipedia 3.0 article:

Tags:

Semantic Web, Web standards, Trends, OWL, Google, inference, inference engine, AI, ontology, Semanticweb, Web 2.0, Web 3.0, Wikipedia, Wikipedia 3.0, Wikipedia AI, P2P 3.0, P2P AI, P2P Semantic Web Inference Engine, intelligent findability

Evolving Trends is Powered by +[||||]- 42V

Read Full Post »

Evolving Trends

July 2, 2006

Digg This! 55,500 hits in ~4 Days

/* (this post was last updated at 10:30am EST, July 3, ‘06, GMT +5)

This post is a follow-up to the previous post For Great Justice, Take Off Every Digg

According to Alexa.com, the total penetration of the Wikipedia 3.0 article was ~2 million readers (most of whom must have read it on other websites that copied the article)

*/

EDIT: I looked at the graph and did the math again, and as far as I can tell it’s “55,500 in ~4 days,” not “55,000 in 5 days.” So that’s 13,875 page views per day.

Stats (approx.) for the “Wikipedia 3.0: The End of Google?” and “For Great Justice, Take Off Every Digg” articles:

These figures are, to the best of my memory, from each of the first ~4 days, as verified by the graph.

33,000 page views in day 1 (the first wave)

* Day 1 spans almost one and a half columns on the graph, not one, because I posted the article at ~5:00 am and the day (in WordPress’s time zone) ends at 8 pm, so the first column covers only ~15 hours.

9,500 page views in day 2

5,000 page views in day 3

8,000 page views in day 4 (the second wave)

Total: 55,500 page views (not server hits) in ~4 days, which is 13,875 page views per day. Now, on the 7th day, traffic is expected to be ~1,000 page views, unless I get another small spike. That’s a pretty good double-dipping long tail. If you’ve done better with digg, let me know how you did it! 🙂
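A two-line sanity check of those figures:

```python
daily_views = [33_000, 9_500, 5_000, 8_000]    # days 1 through 4, as listed above
print(sum(daily_views), sum(daily_views) / 4)  # -> 55500 13875.0
```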

Experiment

This post is a follow-up to my previous article on digg, where I explained how I experimented with digg and succeeded in generating 45,000 visits in the first 3 days to an article I wrote (40,000 of which came directly from digg).

I had posted an article on digg about a bold but well-thought-out vision of the future, involving Google and Wikipedia, with the sensational title of “Wikipedia 3.0: The End of Google?” (which may turn out, after all, to be a realistic proposition).

Since my previous article on digg I’ve found out that digg did not ban my IP address; they had deleted my account due to multiple submissions. So I was able to come back with a new user account and try another experiment: I submitted “AI Matrix vs Google” and “Web 3.0 vs Google” as two separate links for one article (which has since been given the final title of “Web 3.0”). [July 12, ‘06, update: see P2P 3.0: The People’s Google]

Results

Neither ‘sensational’ title worked.

Analysis

I tried to rationalize what happened …

I figured that the crowd wanted a showdown between two major cults (e.g., the Google fans and the Wikipedia fans) and not between Google and some hypothetical entity (e.g., AI Matrix or Web 3.0).

But then I thought about how Valleywag was able to cleverly piggyback on my “Wikipedia 3.0: The End of Google?” article (which had generated all the hype) with an article having the dual title of “Five Reasons Google Will Invent Real AI” on digg and “Five Reasons No One Will Replace Google” on Valleywag.

They used AI in the title, and I did the same in the new experiment, so we should both have gotten lots of diggs. They got about 1,300 diggs. I got about 3. Why didn’t it work in my case?

The answer is that the crowd is not a logical animal; it’s a psychological animal. It does not make mental connections the way we do as individuals (because a crowd is a randomized population, made up of different people at different times), so it can’t react logically.

Analyzing it from the psychological frame, I concluded that it must have been the Wikipedia fans who “dugg” my original article. The Google fans did “digg” it but not in the same large percentage as the Wikipedia fans.

Valleywag gave the Google fans the relief they needed after my article, with its own article in defense of Google. However, when I went at it again with “Matrix AI vs Google” and “Web 3.0 vs Google,” the error I made was in not knowing that the part of the crowd that “dugg” my original article were the Wikipedia fans, not the Google haters. In fact, Google haters are not very well represented on digg. In other words, I found out that “XYZ vs Google” will not work on digg unless XYZ has a large base of fans on digg.

Escape Velocity

The critical threshold in the digg traffic generation process is to get enough diggs quickly enough, after submitting the post, to get the post onto digg’s popular page. Once the post is on digg’s popular page, both sides (those who like what your post is about and those who will hate you and want to kill you for writing it) will be affected by the psychological manipulation you design (aka the ‘wave’). However, the majority of those who “digg” it will be from the group that likes it; a lesser number of people will “digg” it from the group that hates it.

Double Dipping

I did have a strong second wave when I went out and explained how ridiculous the whole digg process is.

This is how the second wave was created:

I got lots of “diggs” from Wikipedia fans and traffic from both Google and Wikipedia fans for the original article.

Then I wrote a follow up on why “digg sucks” but only got 100 “diggs” for it (because all the digg fans on digg kept ‘burying’ it!) so I did not get much traffic to it from digg fans or digg haters (not that many of the latter on digg.)

The biggest traffic to it came from the bloggers and others who came to see what all the fuss was about concerning the original article. I had linked to the follow-up article (on why I thought digg sucked) from the original article (i.e., like chaining magnets), so when people came to see what the fuss was all about with respect to the original article, they were also told to check out the “digg sucks” article for context.

That worked! The original and second waves, which both had a long tail (see below), generated a total of 55,500 hits in ~4 days. That’s 13,875 page views a day for the first ~4 days.

Long Tail vs Sting

I know that some very observant bloggers have said that digg can only produce a sharp, short-lived pulse of traffic (a sting), as opposed to a long tail, or a double-dipping long tail as in my case. But those observations apply to posts that are not themselves memes. When you have a meme you get the long tail (an exponential decay), and when you chain memes as I did (which I guess I could have done faster, as the second wave would have been much bigger), you get a double-dipping long tail, as I’m having now.

Today (7 days after the original experiment) the traffic is over 800 hits so far, still on the strength of the original wave and the second wave. (Note that the flat line I had before the spike represents traffic levels between ~100 and ~800, so don’t be fooled by the flatness; it’s relative to the scale of the graph.)

In other words, traffic is still going strong from the strength of the long-tail waves generated from the original post and the follow up one.


Links

  1. Wikipedia 3.0: The End of Google?
  2. For Great Justice, Take Off Every Digg
  3. Unwisdom of Crowds
  4. Self-Aware e-Society

Posted by Marc Fawzi

Tags:
Semantic Web, Web standards, Trends, wisdom of crowds, tagging, Startup, mass psychology, Google, cult psychology, inference, inference engine, AI, ontology, Semanticweb, Web 2.0, Web 3.0, Google Base, artificial intelligence, Wikipedia, Wikipedia 3.0, collective consciousness, digg, censorship

15 Comments »

  1. Update this in two weeks, after a Friday, Saturday, and Sunday, and a holiday in the middle of the week in the United States, which means a lot of people are on vacation, and another weekend, and see what happens with traffic trends, including Digg-related traffic. And check out my unscientific research on when the best time and day to post on your blog is, and compare what you find over the course of time, not just a couple of days. I’m curious how days of the week and the informal research I did might be reflected in your information. That will REALLY help us see the reality of your success. Still, you’ve gathered a ton of fabulous information. I found it interesting that the post title on your “digg sucks” article kept changing every hour or so on the WordPress.com top lists. I think it was “Power of the Schwartz” that really caught my eye. 😉

    I wish you could check out how much traffic came from WordPress.com dashboards and top blog listing comparatively to Digg traffic results, as well as all the other social bookmarking sources which pick up Digg posts, and compare that information as to how directly your traffic was related solely to Digg. It was in the first place, but “then” what happened.

    There are a lot of whacky things that go into driving traffic, and I also know that WordPress.com’s built-in traffic charts don’t match up exactly and consistently with some of the external traffic reports I’ve checked for my WordPress.com blog, so only time will tell, and this will get more and more interesting as time goes on.

    Good work!

    Comment by Lorelle VanFossen — July 2, 2006 @ 11:19 am

  2. Yeah I caught myself saying “Merchandising Merchandising Merchandising” the other day!:)

    Well I noticed about 1000, 800, 600, 500 hits (in this order) from WordPress for those 4 days …

    Valleywag sent me about 12,000 (in total)

    Marc

    Comment by evolvingtrends — July 2, 2006 @ 11:26 am

  3. Great analysis on digg. It looks like digg or the memes can be somewhat influenced and analyzed. It’s almost like psychoanalyzing a strange new brain. I find it very interesting how this all happened. Even if digg gave you a short pulse for a few days, it has generated augmented daily traffic until now. I wouldn’t be surprised if new readers discovered you this way. The whole flow of traffic and readers is very fluid in nature. I wonder if it could be mapped in some way or form through fluid dynamics.

    Cheers

    Comment by range — July 3, 2006 @ 1:39 am

  4. It’s highly multi-disciplinary. It can be conquered, but not as fast as you or I would like. This is like analyzing a strange new brain … a brain that is influenced greatly by everything except logic.

    I plan on analyzing it in the open for a long time to come, so stick around and add your thoughts to it. 🙂
    They say ‘observing something changes its outcome’ .. So we’ll see how it goes.

    Cheers,

    Marc

    Comment by evolvingtrends — July 3, 2006 @ 2:36 am

  5. […] 1. Digg This! 55,500 Hits in ~4 Days […] Pingback by Evolving Trends » Global Brain vs Google — July 3, 2006 @ 10:37 am
  6. […] This article has a follow-up part: Digg This! 55,500 Hits in ~4 Days […] Pingback by Evolving Trends » For Great Justice, Take Off Every Digg — July 3, 2006 @ 10:57 am
  7. Marc, I don’t know if this information helps or skews your research, but a post I wrote in January, titled to get Digg and other traffic attention, Horse Sex and What is Dictating Your Blog’s Content, did not do well at all. That is, until the past three days.

    It’s really started piling up a lot of hits, sitting in the top 10 of my top posts, outreaching the other posts that get consistently high traffic by a huge margin. Until Saturday, that post was not even in the top 50 or 75. I can’t tell where the traffic is suddenly coming from, as WordPress.com doesn’t offer that kind of specific information, and I’m not getting any outstanding traffic from any single source. Nothing from Digg, but something is suddenly driving that post through the roof. Even during a holiday week in the US! Very strange.

    Maybe there’s a new fad in horse sex lately – who knows? 😉

    Still, the point is that this was written in January, and now it is getting attention in July. I’ll be checking to find out what is causing the sudden flush of traffic, but never doubt that your posts are ageless in many respects. So the long term study of Digg patterns and traffic will help all of us over the “long haul”. That’s why I’m really curious about the long term effects of your posts.

    Sometimes you just can’t predict the crowds. 😉 Or what they will suddenly be interested in. I’ve written so many posts and titles that I was sure would skyrocket traffic, only to have them lie there like empty beer bottles in the playground. Totally useless. And others, with sloppy titles and written quickly with little attention to detail, skyrocketing like 1000 bottles of coke filled with Mentos. 😉 It’s an interesting process, isn’t it?

    Comment by Lorelle VanFossen — July 3, 2006 @ 9:37 pm

  8. Predicting the weather for the long term is not currently feasible. However, predicting the weather for the short term is (1-2 days in advance). But it’s not all about ‘predicting’ … it’s about studying the phenomenon so that we can make better choices to reduce the effect of uncertainty, not try to eliminate uncertainty.

    Marc

    Comment by evolvingtrends — July 4, 2006 @ 12:02 am

  9. I think then that the obvious question is why you’ve done nothing to monetize those hits, however fickle they might be! ;)

    Comment by Sam Jackson — July 4, 2006 @ 4:42 pm

  10. Monetize, Monetize, Monetize! Fortunately, that won’t happen 🙂

    Marc

    Comment by evolvingtrends — July 4, 2006 @ 8:28 pm

  11. […] 4 – Digg This! 55,500 hits in ~4 Days A blogger explains how he ‘milked’ Digg for a major spike in traffic. Meme engineering in action; fascinating stuff. (tags: Wikipedia Google visits article post tail long spike scam traffic blogging blog meme Digg) […] Pingback by Velcro City Tourist Board » Blog Archive » Links for 05-07-2006 — July 4, 2006 @ 10:20 pm
  12. Since web traffic is dictated by humans and engines and not by some exterior force like the weather, I think that there are a lot of possible avenues for analyzing it. The only thing is that the flow and traffic need to be documented. In most cases the traffic might be, but information on past flow is lacking. The internet is concentrated on the now, and less on what happened ten days ago on this site and such. Mathematical fluid dynamics is probably the way to go, though even though I am a mathematician, I’d have to research it a bit before pronouncing myself completely. These types of analysis can get quite complicated because of the implications of partial differential equations of an order higher than 2, which cannot be solved, only approximated numerically.

    I’m sure I’m not the only one to say this, but I like the types of discussions and content that you put forward, it gets the mind thinking on certain subjects that most of the time users tend to accept without question.

    Comment by range — July 4, 2006 @ 10:54 pm

  13. “the implications of partial differential equations of an order higher than 2, which cannot be solved, only approximated numerically.” Have you looked into Meyer’s methods of “invariant embedding” to convert PDEs into a set of ordinary differential equations and then solve?

    I believe the investigation of hype management is extremely multi-disciplinary and very much like the weather. That means that while it’s deterministic (as everything is, in essence, with the exception of non-causal quantum theory), it’s still highly unstable and ultimately hard [in computational terms] to predict.

    In general, uncertainty exists in every system, including maths itself (because of lack of absolute consistency and incompleteness), so while you can’t eliminate it you can hope to reduce it.

    But in practical terms, what I’m looking to do is to simply gain a sufficient minimum in insight to allow me to improve my chances at generating and surfing hype waves… I believe I will end up applying a non-formal theory such as framing theory to transform the problem from the computational domain to the cognitive domain (so I may use that 90% of the brain that we supposedly don’t use to carry out the computation with my own internal computational model.)

    Clarity, in simple terms, is what it’s all about.

    However, to reach clarity’s peak you have to climb a mountain of complexity 🙂

    Marc

    Comment by evolvingtrends — July 4, 2006 @ 11:10 pm

  14. Hey Marc! I now know what it feels like to be caught in a digg-like wave. Right now, I have had over 141,000 page views because of a post that I did this morning, explaining HDR photography.

    Since digg banned my URL for some reason (I don’t know why, I haven’t posted anything to digg in the last 2 months), this was all done through del.icio.us, Reddit and Popurls. It’s like one thing leads to another. I have added a URL giving a surface analysis of this situation.

    http://range.wordpress.com/2006/07/15/how-the-memoirs-got-127000-hits-in-a-few-hours-or-a-follow-up-post-to-modern-hdr-photography/

    Naturally, I find myself compelled to continue writing on the subject. I have already posted a follow-up article and I am working on another one right now. I knew I had a spike on weekends, nothing like this however.

    Comment by range — July 15, 2006 @ 7:29 pm

  15. Hey Marc. I think the main reason why I didn’t get any higher was because of the stat problem that WP has been having over the last few days.

    I hope they save this traffic so that I have some nice graphs to show you. They probably do. It felt like the counter was accurate; I checked that I did indeed make it onto a few memediggers, and still am right now.

    And also the stat page was just so slow to catch up with the amount of traffic that was generated. WP couldn’t keep up.

    Hopefully, they will sort it out over the next few days. I think it was most surprising in the afternoon. I kept refreshing the counter, and oops, a few thousand here, ten thousand there. I was really surprised. And I have also started getting some haters; as you must know, with the good comes the bad.

    Comment by range — July 15, 2006 @ 8:49 pm

Read Full Post »

Evolving Trends

July 11, 2006

P2P 3.0: The People’s Google

/*

This is a more extensive version of the Web 3.0 article, with extra sections about the implications of Web 3.0 for Google.

See this follow-up article for the more disruptive ‘decentralized knowledge base’ version of the model discussed in this article.

Also see this non-Web3.0 version: P2P to Destroy Google, Yahoo, eBay et al 

Web 3.0 Developers:

Feb 5, ‘07: The following reference should provide some context regarding the use of rule-based inference engines and ontologies in implementing the Semantic Web + AI vision (aka Web 3.0) but there are better, simpler ways of doing it. 

  1. Description Logic Programs: Combining Logic Programs with Description Logic

*/

In Web 3.0 (aka Semantic Web), P2P Inference Engines running on millions of users’ PCs and working with standardized domain-specific ontologies (created by Wikipedia, Ontoworld, other organizations or individuals) using Semantic Web tools, including Semantic MediaWiki, will produce an information infrastructure far more powerful than Google (or any current search engine).

The availability of standardized ontologies that are being created by people, organizations, swarms, smart mobs, e-societies, etc, and the near-future availability of P2P Semantic Web Inference Engines that work with those ontologies means that we will be able to build an intelligent, decentralized, “P2P” version of Google.

Thus, the emergence of P2P Inference Engines and domain-specific ontologies in Web 3.0 (aka Semantic Web) will present a major threat to the central “search” engine model.

Basic Web 3.0 Concepts

Knowledge domains

A knowledge domain is something like Physics, Chemistry, Biology, Politics, the Web, Sociology, Psychology, History, etc. There can be many sub-domains under each domain each having their own sub-domains and so on.

Information vs Knowledge

To a machine, knowledge is comprehended information (i.e., new information produced through the application of deductive reasoning to existing information). To a machine, information is only data until it is processed and comprehended.

Ontologies

For each domain of human knowledge, an ontology must be constructed, partly by hand [or rather by brain] and partly with the aid of automation tools.

Ontologies are not knowledge nor are they information. They are meta-information. In other words, ontologies are information about information. In the context of the Semantic Web, they encode, using an ontology language, the relationships between the various terms within the information. Those relationships, which may be thought of as the axioms (basic assumptions), together with the rules governing the inference process, both enable as well as constrain the interpretation (and well-formed use) of those terms by the Info Agents to reason new conclusions based on existing information, i.e. to think. In other words, theorems (formal deductive propositions that are provable based on the axioms and the rules of inference) may be generated by the software, thus allowing formal deductive reasoning at the machine level. And given that an ontology, as described here, is a statement of Logic Theory, two or more independent Info Agents processing the same domain-specific ontology will be able to collaborate and deduce an answer to a query, without being driven by the same software.
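To make the ‘information about information’ point concrete, here is a minimal sketch (class names invented for illustration; a real ontology would state these axioms in OWL). The axioms mention no particular restaurant; they only relate terms, yet they let software prove a statement (a theorem) that was never explicitly recorded:

```python
# An ontology fragment: axioms about *terms*, not about any particular thing.
# Read ("a", "b") as "a is a subclass of b".
subclass_axioms = {
    ("pizza_restaurant", "italian_restaurant"),
    ("italian_restaurant", "restaurant"),
}

def entails_subclass(axioms, sub, sup):
    """Theorem check for one inference rule, subclass transitivity.
    Assumes the axiom set is acyclic, so the recursion terminates."""
    if (sub, sup) in axioms:
        return True
    return any(entails_subclass(axioms, mid, sup)
               for s, mid in axioms if s == sub)

# Never stated directly, but provable from the axioms:
print(entails_subclass(subclass_axioms, "pizza_restaurant", "restaurant"))  # True
```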

Inference Engines

In the context of Web 3.0, Inference engines will be combining the latest innovations from the artificial intelligence (AI) field together with domain-specific ontologies (created as formal or informal ontologies by, say, Wikipedia, as well as others), domain inference rules, and query structures to enable deductive reasoning on the machine level.
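Continuing the toy sketch above (same invented names), an ‘inference engine’ in this minimal sense is just the component that combines the ontology’s axioms, the domain inference rules, and a query structure to answer questions over instance data:

```python
# Instance data, kept separate from the ontology: (thing, its class).
instance_facts = {("marios", "pizza_restaurant"), ("big_belly_deli", "deli")}

def query(target_class):
    """Which things are instances of target_class? Uses the subclass
    axioms from the previous sketch to generalize beyond stored facts."""
    return {thing for thing, cls in instance_facts
            if cls == target_class
            or entails_subclass(subclass_axioms, cls, target_class)}

print(query("italian_restaurant"))  # {'marios'}, deduced rather than stored
```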

Info Agents

Info Agents are instances of an Inference Engine, each working with a domain-specific ontology. Two or more agents working with a shared ontology may collaborate to deduce answers to questions. Such collaborating agents may be based on differently designed Inference Engines and they would still be able to collaborate.
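A toy illustration of that claim, reusing the sketches above: the two ‘engines’ below are implemented differently (one recursive, one iterative), but because they process the same ontology they reach the same conclusion:

```python
class RecursiveAgent:
    """Info Agent wrapping the recursive theorem check sketched earlier."""
    def is_a(self, cls, sup):
        return cls == sup or entails_subclass(subclass_axioms, cls, sup)

class IterativeAgent:
    """An independently built Info Agent: same ontology, different engine."""
    def is_a(self, cls, sup):
        frontier, seen = {cls}, set()
        while frontier:
            c = frontier.pop()
            if c == sup:
                return True
            seen.add(c)
            frontier |= {b for a, b in subclass_axioms if a == c and b not in seen}
        return False

# Different software, same ontology, same deduction:
print(RecursiveAgent().is_a("pizza_restaurant", "restaurant"),
      IterativeAgent().is_a("pizza_restaurant", "restaurant"))  # True True
```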

Proofs and Answers

The interesting thing about Info Agents that I did not clarify in the original post is that they will be capable of not only deducing answers from existing information (i.e. generating new information [and gaining knowledge in the process, for those agents with a learning function]) but they will also be able to formally test propositions (represented in some query logic) that are made directly or implied by the user. For example, instead of the example I gave previously (in the Wikipedia 3.0 article) where the user asks “Where is the nearest restaurant that serves Italian cuisine” and the machine deduces that a pizza restaurant serves Italian cuisine, the user may ask “Is the moon blue?” or say that the “moon is blue” to get a true or false answer from the machine. In this case, a simple Info Agent may answer with “No” but a more sophisticated one may say “the moon is not blue but some humans are fond of saying ‘once in a blue moon’ which seems illogical to me.”

This test-of-truth feature assumes the use of an ontology language (as a formal logic system) and an ontology where all propositions (or formal statements) that can be made can be computed (i.e., proved true or false) and where all such computations are decidable in finite time. The language may be OWL-DL or any language that, together with the ontology in question, satisfies the completeness and decidability conditions.
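In the same toy setting, the test-of-truth idea looks like the sketch below. One simplification to flag: this check is closed-world (whatever cannot be derived is reported false), whereas OWL-DL reasoning is open-world; the point here is only that, over a suitable ontology and language, every such check terminates with a definite answer:

```python
def test_proposition(thing, claimed_class):
    """Decidable true/false test of the proposition 'thing is a claimed_class',
    reusing query() from the sketch above."""
    return thing in query(claimed_class)

print(test_proposition("marios", "italian_restaurant"))  # True  (provable)
print(test_proposition("moon", "italian_restaurant"))    # False (not derivable)
```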

P2P 3.0 vs Google

If you think of how many processes currently run on all the computers and devices connected to the Internet then that should give you an idea of how many Info Agents can be running at once (as of today), all reasoning collaboratively across the different domains of human knowledge, processing and reasoning about heaps of information, deducing answers and deciding truthfulness or falsehood of user-stated or system-generated propositions.

Web 3.0 will bring with it a shift from centralized search engines to P2P Semantic Web Inference Engines, which will collectively have vastly more deductive power, in both quality and quantity, than Google can ever have (this includes any future AI-enabled version of Google, which will not be able to keep up with the distributed P2P AI matrix enabled by millions of users running free P2P Semantic Web Inference Engine software on their home PCs).

Thus, P2P Semantic Web Inference Engines will pose a huge and escalating threat to Google and other search engines, and can be expected to do to them what P2P file sharing and BitTorrent did to FTP (central-server file transfer) and centralized file hosting in general (e.g., Amazon’s S3 use of BitTorrent).

In other words, the coming of P2P Semantic Web Inference Engines, as an integral part of the still-emerging Web 3.0, will threaten to wipe out Google and other existing search engines. It’s hard to imagine how any one company could compete with 2 billion Web users (and counting), all of whom are potential users of the disruptive P2P model described here.
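As a toy illustration of the decentralized model (peer names and data invented for this example), each peer runs its own Info Agent over its own slice of knowledge, and a query is answered by fanning it out to peers and merging the results, with no central index anywhere:

```python
# Each "peer" holds only a fragment of the instance data.
peers = {
    "peer_boston_food": {("marios", "pizza_restaurant")},
    "peer_nyc_food":    {("luigis", "italian_restaurant")},
}

def ask_peer(kb, target_class):
    """One peer's local inference over its own fragment,
    using the shared subclass axioms from the sketches above."""
    return {thing for thing, cls in kb
            if cls == target_class
            or entails_subclass(subclass_axioms, cls, target_class)}

def p2p_query(target_class):
    # A real system would broadcast over an overlay network and add
    # trust and ranking layers; here we just union every local answer.
    return set().union(*(ask_peer(kb, target_class) for kb in peers.values()))

print(p2p_query("italian_restaurant"))  # {'marios', 'luigis'}
```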

“The Future Has Arrived But It’s Not Evenly Distributed”

Currently, Semantic Web (aka Web 3.0) researchers are working out the technology and human resource issues, and folks like Tim Berners-Lee, the father of the Web, are battling critics and enlightening minds about the coming human-machine revolution.

The Semantic Web (aka Web 3.0) has already arrived, and Inference Engines are working with prototypical ontologies, but this effort is a massive one, which is why I was suggesting that its most likely enabler will be a social, collaborative movement such as Wikipedia, which has the human resources (in the form of the thousands of knowledgeable volunteers) to help create the ontologies (most likely as informal ontologies based on semantic annotations) that, when combined with inference rules for each domain of knowledge and the query structures for the particular schema, enable deductive reasoning at the machine level.

Addendum

On AI and Natural Language Processing

I believe that the first generation of AI that will be used by Web 3.0 (aka Semantic Web) will be based on relatively simple inference engines (employing both algorithmic and heuristic approaches) that will not attempt to perform natural language processing. However, they will still have the formal deductive reasoning capabilities described earlier in this article.

Related

  1. Wikipedia 3.0: The End of Google?
  2. Intelligence (Not Content) is King in Web 3.0
  3. Get Your DBin
  4. All About Web 3.0

Posted by Marc Fawzi


Read Full Post »

Evolving Trends

June 11, 2006

P2P Semantic Web Engines

No Comments »

Read Full Post »

  • Evolving Trends

    June 30, 2006

    Web 3.0: Basic Concepts

    /*(this post was last updated at 1:20pm EST, July 19, ‘06)

    You may also wish to see Wikipedia 3.0: The End of Google? (The original ‘Web 3.0/Semantic Web’ article) and P2P 3.0: The People’s Google (a more extensive version of this article showing the implication of P2P Semantic Web Engines to Google.)

    Web 3.0 Developers:

    Feb 5, ‘07: The following reference should provide some context regarding the use of rule-based inference engines and ontologies in implementing the Semantic Web + AI vision (aka Web 3.0) but there are better, simpler ways of doing it. 

    1. Description Logic Programs: Combining Logic Programs with Description Logic

    */

    (The remainder of this post, namely the Basic Web 3.0 Concepts sections on Knowledge Domains, Information vs Knowledge, Ontologies, Inference Engines, Info Agents, and Proofs and Answers, plus the “Future Has Arrived” note and the Addendum on AI and natural language processing, is identical, word for word, to the corresponding sections of the P2P 3.0: The People’s Google post above, so it is not repeated here.)

    Related

    1. Wikipedia 3.0: The End of Google?
    2. P2P 3.0: The People’s Google
    3. All About Web 3.0
    4. Semantic MediaWiki
    5. Get Your DBin

    Posted by Marc Fawzi


    Read Full Post »

    Evolving Trends

    July 12, 2006

    Semantic MediaWiki

    Filed under: Semantic MediaWiki, Semantic Web, SemanticWeb, Web 3.0, Wikipedia 3.0, ontology, ontoworld — evolvingtrends @ 6:01 am
    What is it? Semantic MediaWiki is an ongoing open source project to develop a Semantic Wiki Engine.

    In other words, it is one of the important early innovations leading up to the Wikipedia 3.0 (Web 3.0) vision.

    • The project and software are called “Semantic MediaWiki”
    • ontoworld.org is just one site using the technology
    • Wikipedia might become another site using the technology 
    • Some more sites using the technology are found here

    Related

    1. Wikipedia 3.0: The End of Google?
    2. Web 3.0: Basic Concepts
    3. P2P 3.0: The People’s Google
    4. Semantic MediaWiki project website

    Posted by Marc Fawzi


    Read Full Post »

    Evolving Trends

    July 12, 2006

    Wikipedia 3.0: El fin de Google (traducción)


    Translation kindly provided by Eric Rodriguez

    /*

    Developers: This is the new open source Semantic MediaWiki project.

    Bloggers: This post explains the curious story of how this article reached 33,000 readers in just the first 24 hours after its publication, via digg. This post explains what the problem is with digg and Web 2.0 and how to fix it.

    Related:

    1. All About Web 3.0
    2. P2P 3.0: The People’s Google
    3. Google Dont Like Web 3.0 [sic]
    4. For Great Justice, Take Off Every Digg
    5. Reality as a Service (RaaS): The Case for GWorld
    6. From Mediocre to Visionary

    */

    by Marc Fawzi of Evolving Trends

    Spanish version (by Eric Rodriguez of Toxicafunk)

    The Semantic Web (or Web 3.0) promises to “organize the world’s information” in a dramatically more logical way than Google could ever achieve with its current engine design. This is true from the standpoint of machine comprehension as opposed to human comprehension. The Semantic Web requires the use of a declarative ontological language, such as OWL, to produce domain-specific ontologies that machines can use to reason about information and thereby reach new conclusions, rather than simply searching for and matching keywords.

    However, the Semantic Web, which is still at a stage of development where researchers are trying to define which model is best and most usable, would require the participation of thousands of experts in different fields for an indefinite period of time in order to produce the domain-specific ontologies it needs to function.

    Machines (or rather machine-based reasoning, also known as AI software or ‘info agents’) could then use those painstakingly built (though not entirely manual) ontologies to construct a view (or formal model) of how the individual terms in a given body of information relate to each other. Such relationships can be thought of as axioms (basic premises), which, together with the rules governing the inference process, both enable and constrain the interpretation (and well-formed use) of those terms by the info agents, so that they can reason out new conclusions based on existing information, i.e. think. In other words, software could be used to generate theorems (formal propositions provable from the axioms and the rules of inference), thus allowing formal deductive reasoning at the machine level. And given that an ontology, as described here, is a statement of Logic Theory, two or more info agents processing the same domain-specific ontology will be able to collaborate and deduce the answer to a query (a search or database request), without being driven by the same software.

    In this way, and as has been established, in the Semantic Web machine-based agents (or a collaborating group of agents) will be able to understand and use information by translating concepts and deducing new information, rather than merely matching keywords.

    Once machines can understand and use information, using a standard ontology language, the world will never be the same. It will be possible to have an info agent (or several) among your AI-augmented virtual ‘workforce’, each with access to different domain-specific comprehension spaces, and all communicating with one another to form a collective consciousness.

    You will be able to ask your info agent(s) to find you the nearest restaurant serving Italian cuisine, even if the restaurant nearest you advertises itself as a pizza place rather than an Italian restaurant. But this is only a very simple example of the deductive reasoning machines will be able to perform on existing information.

    Far more staggering implications will be seen when you consider that every area of human knowledge will automatically be within the comprehension space of your info agents. That is because each agent can communicate with other info agents specialized in different domains of knowledge to produce a collective consciousness (to use the Borg metaphor) that spans all human knowledge. The collective “mind” of those Borg-like agents will constitute the Ultimate Answer Machine, easily displacing Google from that position, which it does not fully occupy.

    The problem with the Semantic Web, apart from the fact that researchers are still debating which ontology language model design and implementation (and associated technologies) is best and most usable, is that it would take thousands, or even many thousands, of knowledgeable people many years to translate human knowledge into domain-specific ontologies.

    However, if at some point we were to take the Wikipedia community and give it the proper tools and standards to work with (whether existing ones or ones to be developed in the future), so that reasonably capable individuals could distill human knowledge into domain-specific ontologies, then the time needed to do so would shrink to just a few years, possibly two.

    The emergence of a Wikipedia 3.0 (in reference to Web 3.0, the name given to the Semantic Web) based on the Semantic Web model would herald the end of Google as the Ultimate Answer Machine. It would be replaced by “WikiMind,” which would not be a mere search engine like Google but a true Global Brain: a powerful pan-domain inference engine, with a vast set of ontologies (a la Wikipedia 3.0) covering all domains of human knowledge, capable of reasoning and deducing answers instead of merely throwing raw information back at you through the outdated search-engine concept.

    Notes
    After writing the original post I discovered that the Wikipedia application, also known as MediaWiki (not to be confused with Wikipedia.org), has already been used to implement ontologies. The name they have chosen is Ontoworld. I think WikiMind or WikiBorg would have been a catchier name, but I like Ontoworld too, as in “and it descended onto the world,” (1) since it can be taken as a reference to the global mind that a Semantic-Web-enabled Ontoworld would give rise to.

    In just a few years the search engine technology that provides Google with almost all of its revenue/capital would become obsolete… unless they had an agreement with Ontoworld allowing them to connect to its database of ontologies, thereby adding inference-engine capability to Google searches.

    But the same holds for Ask.com and MSN and Yahoo.

    I would love to see more competition in this space, rather than seeing Google or any other company establish itself as the leader over the rest.

    The question, in Churchillian terms, is whether the combination of Wikipedia with the Semantic Web means the beginning of the end for Google or the end of the beginning. Obviously, with many billions of dollars of investors’ money at stake, I would say it is the latter. However, I would very much like to see someone overtake them (which in my opinion is possible).

    (1) Translator’s note: the author is playing on the prefix “Onto-” in ontology, which sounds like the English preposition “unto.” The original phrase is “and it descended onto the world.”

    Clarification
    Please note that Ontoworld, which currently implements the ontologies, is based on the “Wikipedia” application (also known as MediaWiki), which is not the same thing as Wikipedia.org.

    Likewise, my hope is that Wikipedia.org will use its volunteer workforce to distill the sum of human knowledge that has been entered into its database into domain-specific ontologies for the Semantic Web (Web 3.0), and hence “Wikipedia 3.0.”

    Response to Readers’ Comments
    My argument is that Wikipedia already has the volunteer resources to produce the ontologies for each of the knowledge domains it currently covers, which the Semantic Web needs so badly, whereas Google does not have such resources, and so it would depend on Wikipedia.

    The ontologies, together with all the information on the Web, will be accessible to Google and the rest, but it is Wikipedia that will be in charge of those ontologies, since Wikipedia already covers an enormous number of knowledge domains, and that is where I see the shift in power.

    Neither Google nor the other companies have the human resources (the thousands of volunteers that Wikipedia has) needed to create the ontologies for all the knowledge domains Wikipedia already covers. Wikipedia does have those resources, and it is positioned to do the job better and more effectively than anyone else. It is hard to conceive how Google could manage to create those ontologies (which grow constantly in both number and size) given the amount of work required. Wikipedia, by contrast, can move much faster thanks to its massive and dedicated force of expert volunteers.

    I believe the competitive advantage will go to whoever controls the creation of ontologies for the largest number of knowledge domains (i.e., Wikipedia), and not to whoever merely accesses them (i.e., Google).

    There are many knowledge domains that Wikipedia does not yet cover. There Google would have an opportunity, but only if the people and organizations producing the information also built their own ontologies, such that Google could access them through its future Semantic Web engine. My opinion is that this will happen in the future, but gradually, and that Wikipedia can have the ontologies ready for all the knowledge domains it already covers much sooner, with the enormous added advantage of being in charge of those ontologies (the base layer for enabling AI).

    It is still unclear, of course, whether the combination of Wikipedia with the Semantic Web heralds the end of Google or the end of the beginning. As I said in the original article, I believe it is the latter, and that the question posed in the title of this post is, in the present context, merely rhetorical. However, my judgment could be wrong, and Google may give way to Wikipedia as the world’s Ultimate Answer Machine.

    After all, Wikipedia has “us.” Google does not. Wikipedia derives its power from “us.” Google derives its power from its technology and its inflated market price. Whom would you count on to change the world?

    Response to Basic Questions Raised by Readers
    Reader divotdave asked a few questions that strike me as basic in nature (that is, important). I believe more people will be wondering about the same things, so I include them here with their respective answers.

    Question:
    How do you distinguish between good information and bad? How do you determine which parts of human knowledge to accept and which to reject?

    Answer:
    You do not need to distinguish between good information and bad (not to be confused with well-formed versus badly-formed) if you use a trusted source of information (with trusted ontologies associated with it). That is, if the information or knowledge being sought can be derived from Wikipedia 3.0, then the information is assumed to be trustworthy.

    However, when it comes to connecting the dots when retrieving information or deducing answers from the vast sea of information that lies beyond Wikipedia, the question becomes very relevant. How would you distinguish good information from bad so as to produce good knowledge (that is, comprehended information, or new information produced through deductive reasoning based on existing information)?

    Question:
    Who, or what, as the case may be, determines which information is irrelevant to me as the end user?

    Answer:
    That is a good question, and one that must be answered by the researchers working on the AI engines for Web 3.0.

    Certain assumptions will have to be made about what it is you are asking. Just as I had to assume certain things about what you were really asking when I read your question, so will the AI engines, based on a cognitive process very similar to ours, which is a subject for another post, but one that has been studied by many AI researchers.

    Question:
    Does this ultimately mean that an almighty* standard will emerge to which all of humanity will have to adhere (for lack of alternative information)?

    Answer:
    There is no need for a standard, except for the language in which the ontologies will be written (i.e., OWL, OWL-DL, OWL Full, etc.). Semantic Web researchers are trying to determine the best, and most usable, choice, taking into account human and machine performance in building and (in the latter case only) interpreting those ontologies.

    Two or more info agents working with the same domain-specific ontology but with different software (different AI engines) can collaborate with one another. The only standard needed is that of the ontology language and the associated production tools.

    Addendum

    On AI and Natural Language Processing

    I believe that the first generation of AI to be used by Web 3.0 (aka the Semantic Web) will be based on relatively simple inference engines (employing both algorithmic and heuristic approaches) that will not attempt any kind of natural language processing. They will, however, retain the formal deductive reasoning capabilities described in this article.

    On the Debate about the Nature and Definition of AI

    The introduction of AI into cyberspace will happen first by way of inference engines (using algorithms and heuristics) that collaborate in a P2P-like fashion and work with standardized ontologies. The parallel interaction of hundreds of millions of AI agents, running inside P2P AI engines on users’ PCs, will give rise to the complex behavior of the future global brain.

    2 Comments »

    1. […] Here is a direct excerpt from the translation of the original article. (I wasted a lot of time trying to understand it; does it show?) by Marc Fawzi of Evolving Trends […] Pingback by DxZone 2.0 (beta) – DxBlog » Blog Archive » Web 3.0? — August 7, 2006 @ 9:03 pm
    2. Very interesting. I think the Wikipedia article on Web 2.0 complements this work very well:

      One could well speak of Web 3.0 for the Semantic Web. But a fundamental difference between the two versions of the web (2.0 and 3.0) is the type of participant. The main protagonist of 2.0 is the human user who writes articles on their blog or collaborates on a wiki; the requirement is that, in addition to publishing in HTML, they emit part of their contributions in XML/RDF (RSS, ATOM, etc.). 3.0, however, is oriented toward the leading role of machine processors that understand description logic in OWL. 3.0 is conceived so that machines can do people’s work when it comes to processing the avalanche of information published on the Web.

      The key is here at the end: Web 3.0 will be driven by intelligent robots and ubiquitous devices. O’Reilly has already said something about this.

      I certainly agree with the author that the semantic Wikipedia will be the bomb, but I fear it will be a subset of the social or folksonomic one, because semantics has its limitations. I should explain this in an article some day. Perhaps I will do it on the pages of our Wikiesfera project, since a wiki is sexier than a blog for that. 😉

      Thanks for the translation.

      Comment by Joseba — November 30, 2006 @ 1:19 am


    Read Full Post »

    Evolving Trends

    July 29, 2006

    Search By Meaning

    I’ve been working on a pretty detailed technical scheme for a “search by meaning” search engine (as opposed to [dumb] Google-like search by keyword), and I have to say that, having conquered the workability challenge within my limited scope, I can see the huge problem facing Google and other Web search engines in transitioning to a “search by meaning” model.

    However, I also do see the solution!

    Related

    1. Wikipedia 3.0: The End of Google?
    2. P2P 3.0: The People’s Google
    3. Intelligence (Not Content) is King in Web 3.0
    4. Web 3.0 Blog Application
    5. Towards Intelligent Findability
    6. All About Web 3.0

    Beats

    42. Grey Cell Green

    Posted by Marc Fawzi

    Tags:

    Semantic Web, Web standards, Trends, OWL, innovation, Startup, Evolution, Google, GData, inference, inference engine, AI, ontology, Semanticweb, Web 2.0, Web 3.0, Google Base, artificial intelligence, Wikipedia, Wikipedia 3.0, collective consciousness, Ontoworld, Wikipedia AI, Info Agent, Semantic MediaWiki, DBin, P2P 3.0, P2P AI, AI Matrix, P2P Semantic Web Inference Engine, Global Brain, semantic blog, intelligent findability, search by meaning

    5 Comments »

    1. context is a kind of meaning, innit?

      Comment by qxcvz — July 30, 2006 @ 3:24 am

    2. You’re one piece short of Lego Land.

      I have to make the trek down to San Diego and see what it’s all about.

      How do you like that for context!? 🙂

Yesterday I got burnt real bad at Crane beach in Ipswich (not to be confused with Cisco's IP Switch.) The water was freezing. Anyway, on the way there I was told about the one time when the kids (my nieces) asked their dad (who is a Cisco engineer) why Ipswich is called Ipswich. He said he didn't know. They said "just make up a reason!!!!!!" (since they can't take "I don't know" for an answer) So he said they initially wanted to call it PI (pie) but decided to switch the letters, so it became IPSWICH. The kids loved that answer and kept asking him, whenever they had their friends along on a beach trip, to explain why Ipswich is called Ipswich. I don't get the humor. My logic circuits are not that sensitive. Somehow they see the illogic of it and they think it's hilarious.

      Engineers and scientists tend to approach the problem through the most complex path possible because that’s dictated by the context of their thinking, but genetic algorithms could do a better job at that, yet that’s absolutely not what I’m hinting is the answer.

      The answer is a lot more simple (but the way simple answers are derived is often thru deep thought that abstracts/hides all the complexity)

      I’ll stop one piece short cuz that will get people to take a shot at it and thereby create more discussion around the subject, in general, which will inevitably get more people to coalesce around the Web 3.0 idea.

      [badly sun burnt face] + ] … It’s time to experiment with a digi cam … i.e. towards a photo + audio + web 3.0 blog!

      An 8-mega pixel camera phone will do just fine! (see my post on tagging people in the real world.. it is another very simple idea but I like this one much much better.)

      Marc

      p.s. my neurons are still in perfectly good order but I can’t wear my socks!!!

      Comment by evolvingtrends — July 30, 2006 @ 10:19 am

    3. Hey there, Marc.
Have talked to people about the semantic web a bit more now, and will get my thoughts together on the subject before too long. The big issue, basically, is buy-in from the gazillions of content producers we have now. My impression is that big business will lead on the semantic web, because it's more useful to them right now, rather than you or I as 'opinion/journalist' types.

      Comment by Ian — August 7, 2006 @ 5:06 pm

    4. Luckily, I’m not an opinion journalist although I could easily pass for one.

      You’ll see a lot of ‘doing’ from us now that we’re talking less 🙂

BTW, just started as Chief Architect with a VC-funded Silicon Valley startup so that's keeping me busy, but I'm recruiting developers and orchestrating a P2P 3.0 / Web 3.0 / Semantic Web (AI-enabled) open source project consistent with the vision we've outlined.

      :] … dzzt.

      Marc

      Comment by evolvingtrends — August 7, 2006 @ 5:10 pm

    5. Congratulations on the job, Marc. I know you’re a big thinker and I’m delighted to hear about that.

      Hope we’ll still be able to do a little “fencing” around this subject!

      Comment by Ian — August 7, 2006 @ 7:01 pm

    RSS feed for comments on this post. TrackBack URI

    Read Full Post »

Evolving Trends

    July 20, 2006

    Google dont like Web 3.0 [sic]

    (this post was last updated at 9:50am EST, July 24, ‘06)

    Why am I not surprised?

    Google exec challenges Berners-Lee

    The idea is that the Semantic Web will allow people to run AI-enabled P2P Search Engines that will collectively be more powerful than Google can ever be, which will relegate Google to just another source of information, especially as Wikipedia [not Google] is positioned to lead the creation of domain-specific ontologies, which are the foundation for machine-reasoning [about information] in the Semantic Web.

Additionally, we could see content producers (including bloggers) creating informal ontologies on top of the information they produce, using a standard language like RDF. This would have the same effect on P2P AI search engines and on Google's anticipated slide into the commodity layer (unless of course they develop something like GWorld).
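As a sketch of what such an informal RDF annotation might look like (my example, using the rdflib Python library; the vocabulary here is invented for illustration, not a real published ontology):

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF

    EX = Namespace("http://example.org/vocab/")  # hypothetical vocabulary
    post = URIRef("http://example.org/posts/web-3-0")

    g = Graph()
    g.add((post, RDF.type, EX.BlogPost))
    g.add((post, EX.topic, EX.SemanticWeb))
    g.add((post, EX.title, Literal("Web 3.0")))

    # An info agent can now reason over these triples directly
    # instead of scraping and guessing at the HTML.
    print(g.serialize(format="turtle"))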

In summary, any attempt to arrive at widely adopted Semantic Web standards would significantly lower the value of Google's investment in the current non-semantic Web by commoditizing "findability" and by allowing intelligent info agents to be built that could collaborate with each other to find answers more effectively than the current version of Google (using "search by meaning" as opposed to "search by keyword") and more cost-efficiently than any future AI-enabled version of Google (using disruptive P2P AI technology).

    For more information, see the articles below.

    Related

    1. Wikipedia 3.0: The End of Google?
    2. Wikipedia 3.0: El fin de Google (traducción)
    3. All About Web 3.0
    4. Web 3.0: Basic Concepts
    5. P2P 3.0: The People’s Google
    6. Intelligence (Not Content) is King in Web 3.0
    7. Web 3.0 Blog Application
    8. Towards Intelligent Findability
    9. Why Net Neutrality is Good for Web 3.0
    10. Semantic MediaWiki
    11. Get Your DBin

    Somewhat Related

    1. Unwisdom of Crowds
    2. Reality as a Service (RaaS): The Case for GWorld
    3. Google 2.71828: Analysis of Google’s Web 2.0 Strategy
    4. Is Google a Monopoly?
    5. Self-Aware e-Society

    Beats

    1. In the Hearts of the Wildmen

    Posted by Marc Fawzi


    Tags:

Semantic Web, Web standards, Trends, OWL, innovation, Startup, Evolution, Google, GData, inference, inference engine, AI, ontology, Semanticweb, Web 2.0, Web 3.0, Google Base, artificial intelligence, Wikipedia, Wikipedia 3.0, collective consciousness, Ontoworld, Wikipedia AI, Info Agent, Semantic MediaWiki, DBin, P2P 3.0, P2P AI, AI Matrix, P2P Semantic Web inference Engine, semantic blog, intelligent findability, RDF

    Read Full Post »

    Evolving Trends

    June 28, 2006

    For Great Justice, Take Off Every Digg

    /*

    (this post was last updated 12:30am on July 6, ‘06)

    This article presents the case against the ‘wisdom of crowds’ and explains the background for how the Wikipedia 3.0: The End of Google? article reached 2 million people in 4 days [according to Alexa.com]

    */

This article explains and demonstrates a conceptual flaw in digg's service model that generates biased (or rigged) as well as lowest-common-denominator hype, dumbing down society (as a crowd).

The experimental evidence and logic supplied here apply equally to other Web 2.0 social bookmarking services such as del.icio.us and netscape beta.

Since digg is an open system where anyone can submit anything, user behavior has to be carefully monitored to make sure that people do not abuse the system. But given that the number of stories submitted each second is much larger than what Digg's own staff can monitor, digg has given the power to the users to decide what is good content and what is bad (e.g. spam, miscategorized content, lame stuff, etc.)

    This “wisdom of crowds” model, which forms the basis for digg, has a basic and major flaw at its foundation, not to mention at least one process and technology related issue in digg’s implementation of the model.

    Let’s look at the simple process and technology issue first before we explore the much bigger problem at the heart of the “wisdom of crowds” model.

If enough users report a post from a given site as spam, then that site's URL will be banned from digg, even if the site's owner had no idea someone was submitting links from his site to digg. The fact is that digg cannot tell for sure whether the person submitting the post is the site's owner or someone else, so their URL banning policy (or algorithm, if it's automated) must assume that the site's owner is the one submitting the post. But what if someone starts submitting posts from another person's blog and placing them under the wrong digg categories just to get that person's blog banned by digg?

This issue can be eliminated by improvements to the process and technology. [You may skip the rest of this paragraph if you can take my word for it.] For example, instead of banning a given site's URL right away upon receiving X number of spam reports for posts from that site, the digg admins would put the site's URL under a temporary ban, attempt to contact the site's owner, and possibly have the site owner click on a link in an email they'd send him/her to capture his/her IP address and compare it to the one used by the spammer. If the IP addresses don't match then they would ban the IP address of the spam submitter, and not the site's URL. This obviously assumes that digg is able to automatically ban all known public proxy addresses (including known Tor addresses, etc.) at any given time, to force users to use their actual IP addresses.
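Here is that proposed flow as a rough Python sketch (my paraphrase of the paragraph above, not digg's actual code; the threshold and the verification callback are invented):

    SPAM_THRESHOLD = 10  # stands in for "X number of spam reports"

    def handle_spam_reports(site_url, submitter_ip, report_count,
                            known_proxy_ips, capture_owner_ip):
        # capture_owner_ip stands in for the emailed-verification-link step.
        if report_count < SPAM_THRESHOLD:
            return "no action yet"
        if submitter_ip in known_proxy_ips:
            # Known public proxies (Tor etc.) are assumed blocked outright,
            # forcing submitters to use their actual IP addresses.
            return "submission rejected: known proxy address"
        # Temporary ban instead of an immediate permanent one.
        owner_ip = capture_owner_ip(site_url)  # owner clicks link in email
        if owner_ip == submitter_ip:
            return f"permanently ban {site_url}: owner submitted the spam"
        return f"ban submitter IP {submitter_ip}; lift the temporary ban on {site_url}"

    # Example: a third party tried to frame the site's owner.
    print(handle_spam_reports("blog.example.org", "203.0.113.7", 12,
                              set(), lambda url: "198.51.100.2"))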

    The bigger problem, however, and what I believe to be the deadliest flaw in the digg model is the concept of the wisdom of crowds.

    Crowds are not wise. Crowds are great as part of a statistical process to determine the perceived numerical value of something that can be quantified. A crowd, in other words, is a decent calculator of subjective quantity, but still just a calculator. You can show a crowd of 200 people a jar filled with jelly beans and ask each how many jelly beans are in the jar. Then you can take the average and that would be the closest value to the actual number of jelly beans.
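A quick simulation of that claim (mine, with arbitrary numbers): independent guesses at a measurable quantity average out close to the truth even though individual guesses are far off.

    import random
    import statistics

    random.seed(42)
    TRUE_COUNT = 847  # actual jelly beans in the jar

    # 200 people guess independently, each off by up to ~40%.
    guesses = [TRUE_COUNT * random.uniform(0.6, 1.4) for _ in range(200)]

    print(round(statistics.mean(guesses)))           # lands close to 847
    print(round(min(guesses)), round(max(guesses)))  # individuals miss badly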

However, if you were to ask a crowd of 200 million to evaluate taste or beauty or any other subjective quality, e.g. coolness, the averaging process that helps in the case of counting jelly beans (where members of the crowd use reasoning and don't let others affect their judgment) doesn't happen. What happens instead is that the crowd members (assuming they communicate with each other in ways that affect each other's qualitative judgment, or assuming they already share something in common) converge toward the lowest-common-denominator opinion. The reason is that whereas reasoning is used when estimating measurable values, psychology is what operates when judging quality.

    Thus, in the case of evaluating the subjective quality of a post submitted to digg, the crowd has no wisdom: it will always choose the lowest common denominator, whatever that happens to be.

    To understand a crowd’s lack of rationality and wisdom, as a phenomenon, consider the following.

I had written a post (see link at the end of this article) about the Semantic Web, domain-specific knowledge ontologies, and Google, as seen from a Google-centric view. I went on about how Google, using the Semantic Web and an AI-driven inference engine, would eventually develop into an omnipresent intelligence (a global mind) and how that would have far-reaching implications, etc. The post was titled "Reality as a Service (RaaS): The Case for GWorld." I submitted it to digg and I believe I got a few diggs and one good comment on it. That's all. I probably got 500 hits in total on that post, mostly because I used the word "GWorld" in the title.

    More than a week after that, I took the same post, the same idea of combining the Semantic Web, domain-specific knowledge ontologies and an AI-driven inference engine but this time I pitted Wikipedia (as the most likely developer of knowledge ontologies) against Google, and posted it with the sensational but quite plausible title “Wikipedia 3.0: The End of Google.” The crowd went wild.

    I got over 33,000 hits in the first 24 hours. And as of the latest count about 1600 diggs.

In fact, my blog on that day (yesterday) beat the #1 blog on WordPress, which is that of ex-Microsoft guy Scobleizer. And now I have an idea of how many hits he gets a day: more than 10,000 and less than 25,000. I know because in the first 16 hours of getting hit by massive traffic I managed to get ahead of him with a total of 25,000 hits, but in the last 8 hours of the first 24-hour cycle (for which I'm reporting the stats here) he beat me back to the #1 spot, as I only had 9,000 hits. I stayed at #2 though.

Figure 1: June 25 traffic, the first 16 hours of a 24-hour graph cycle. Traffic from digg ~ 25,000 hits.

Figure 2: June 26 traffic, the last 8 hours of a 24-hour graph cycle. Traffic from digg ~ 8,000 hits.

A crowd, not to be confused with the individuals in it (like myself or yourself), is, aside from being a decent calculator of subjective quantities (like counting jelly beans in a jar), no smarter than a bull when it comes to judging the intellectual, artistic or philosophical appeal of something. Wave something red in front of it or make a lot of noise and it may notice you. Talk to it or make subtle gestures and you'll fail to get its attention. Obviously you can have a tame bull or an angry one. An angry one is easier to upset.

    A crowd is no more than a decent calculator of subjective quantities. It is a tool in that sense and only in that sense.

In the context of judging quality, like musical taste or the coolness of something, a crowd is neither rational nor wise. It will only respond to the most basic and crude methods of attention grabbing. You can't grab its attention with subtlety or rationality. You have to use psychology, like you would with a bull.

    As you can see from the graphs of my blog traffic, I’ve proved it. I didn’t just understand it.

Social bookmarking systems, and tagging in general, amplify the intensity of the crowd-as-a-bull behavior by attaching the highest numerical values to the crudest, rawest, lowest-common-denominator content. All of a sudden, when a post gets 100 diggs it reaches escape velocity and goes into orbit. The numerical value attached to posts, a la "diggs," acts, when it grows fast, like a matador making audaciously big moves to attract the bull for the kill. People rush to see such posts, as they rushed in tens of thousands to see the "Wikipedia 3.0 vs Google" post. Yet it's basically the same post as the one I did on GWorld over a week earlier, which only got a few diggs.

    There is no comparison between the wisdom and rationality of an individual and that of a crowd. The individual is infinitely wiser and more rational than the crowd.

    So these social bookmarking systems need to be based on a more evolved model where individuals have as much say as the crowd.

Remember that many failed social ideologies were based on the idea of favoring the so-called "wisdom of crowds" over individualism. The reason they failed is that collectivist behavior is dumb behavior, and individual judgment is the only way forward.

    We need more individuality in society not less.

    Censored by digg

    This post was censored by digg’s rating system.

However, in a software-enabled rating system such as digg, reddit, del.icio.us, netscape, etc., there is no way to guarantee that manipulation of the system by its owner does not happen.

    Please see the Update section below for the explanation and the evidence (in the form of a telling list of censored posts) behind why digg itself, and not just some of its fanatic users, may have been behind the censoring of this post.

Note: a fellow WordPress blogger published a post called Digg's Ultimate Flow which links to this post. It has not been buried/censored yet (June 29, '06, 5:45pm EST). It's not to be confused with this post. The reason it hasn't been buried is that it presents no threat to digg. They can sense danger like an animal, and I guess I've scared them enough to bury/censor my post. The other me-too post that I've just mentioned does not smell as scary. It's really sad that digg and sites like it are feeding the crude, animal-like, instinctive, zero-clarity behavior that is the 'unwisdom' of crowds.

    The truth is that digg and other so-called “social” bookmarking sites do not give us power, they take it away from us.

    Always. Think. Innovate. Do not follow.

    But you may want to follow this link to share your view with other digg users for what it’s worth.

    Correction

    I’ve just noticed that this blog is ahead of Scobleizer again at #1. I’ve had 7,796 hits since 8:00pm EST, June 28, ‘06 (yesterday.) It’s 8:00pm EST now, on June 29, ‘06.

    Related

    1. Wikipedia 3.0: The End of Google?
    2. Unwisdom of Crowds
    3. Reality as a Service (RaaS): The Case for GWorld
    4. Digg This! 55,500 Hits in ~4 Days

    Beats

    1. Can U Digg It?

    Posted by Marc Fawzi


    Update

    The following is a snapshot of digg’s BURIED/CENSORED post section as of 4:00am EST, June 29th, ‘06. This post was originally titled “Digg’s Biggest Flaw Discovered.” Note that anything that is perceived as anti-digg, be it a bug report or a serious analysis of digg’s weaknesses, is being censored.

    Digg’s Biggest Flaw Discovered buried story
    submitted by evolvingtrends 21 hours 35 minutes ago (via http://evolvingtrends.wordpres…)

    An actual proof of a major flaw at the foundation of digg’s quality-of-service model

    category: Programming

    Now even CNET wants its stories endorsed by Digg community
    submitted by aj9702 1 day 17 hours ago (via http://news.com.com/Attack+cod…)

    Check it out.. CNET which is number 72 on Alexa rankings wants its stories endorsed by the Digg community. They have a digg this link now to their more popular stories. This story links to the news that exploit code is out there for the RRAS exploit announced earlier this month

    category: Tech Industry News

    Dvorak: Understanding Digg and Its Utopian Idealism buried story
    submitted by kevinmtu 1 day 18 hours ago (via http://www.pcmag.com/article2/…)

Dvorak's PC magazine article on the new version of Digg and its flaws, posing many interesting points. For example, "What would happen to the Digg site if the Bush-supporting minions in the red states, flocked to Digg and actively promoted stories, slammed things they didn't like, and in the process drove away the libertarian users?"

    category: Tech Industry News

    Pros and Cons of Digg v3
    submitted by jobobshishkabob 2 days ago (via http://thenerdnetworks.com/blo…)

    Well, Digg version 3 got released today. It is really nice and has many great features. But everything has its flaws…. heres a list of pros and cons of the new Digg.com

    category: Tech Industry News

    Easy Digg comment moderation fraud buried story
    submitted by Pooley 2 days ago (via http://www.davidmcmanus.com/st…)

    I’ve found a bug in digg.com. A flaw in the way I ‘digg’ a comment, by clicking the thumbs up icon, allows me to mark up a comment multiple times.

    category: Tech Industry News

    Are oil companies astroturfing digg by downmodding the unfavorable comments in global warming discussions? We can’t know for sure that they ARE. However, we can be sure that they CAN.

    Tags:

Semantic Web, Web standards, Trends, wisdom of crowds, tagging, Startup, mass psychology, Google, cult psychology, inference, inference engine, AI, ontology, Semanticweb, Web 2.0, Web 3.0, Google Base, artificial intelligence, Wikipedia, Wikipedia 3.0, collective consciousness, digg, censorship

    52 Comments »

    1. Pingback by Anonymous — June 28, 2006 @ 7:47 am
2. Wow, it's so refreshing to find an honest and intelligent post about such an abysmal web "service" these days. Marc, I must commend you on a job very well done here and thank you for giving me hope that there may be a few people left on the 'net that aren't completely full of shit (or just completely ignorant).

      I posted my thoughts on the flaws inherent in Digg a while back and though not as eloquent as yours I think we both arrived at similar conclusions. The model I suggest from the outset is one I believe you allude to by the end of your post. I hate to drop links (especially my own), but it would be a lot easier than re-hashing all my thoughts in a comment, so feel free to check out my take on it if you like. I’m still chuckling here over the fact that you actually have the graphs and posts as hard evidence to show just how ridiculous things have gotten.

      Cheers!

      Ja

      Comment by Ja — June 28, 2006 @ 9:25 am

    3. Finally, someone like me!

      Check out this comment from a Digg fan on the Digg comments:

      “digg doesn’t proport to be “wise” news, just news that is popular. i reported as lame. isn’t that ironic? i hope this was your blog.”

      That … explains everything. 🙂

      What we really need is a Coup d’état 2.0!

      Cheers,

      Marc

      Comment by evolvingtrends — June 28, 2006 @ 9:38 am

4. Good post and great food for thought. In his book, James Surowiecki distinguishes between Information Cascades – where everyone follows everyone else – and properly wise crowds – where everyone thinks independently but the "correct answer" is the median of all their responses. I think digg is arguably susceptible to information cascades: for many users, the only stories they see are already on the front page, and thus only news that has already been promoted gets promoted further. This can create some quite bizarre valuations for stories.

      Comment by Ian — June 28, 2006 @ 10:57 am

    5. I think you should copyright “Semantic WidiggWeb 3.0″, write a business plan, launch a beta and start looking for seed money (:-)

      Thanks for the excellent illustration of “we are the people” propaganda mechanisms

      Comment by Bebop — June 28, 2006 @ 11:16 am

    6. Ah! Finally people that get the plot!

      Where were you guys all along!?

      🙂

      Marc

      Comment by evolvingtrends — June 28, 2006 @ 11:50 am

    7. Great post – like you point out at the end, there are many parallels to the political world when you consider the love affair that some have with the “wisdom of the crowd”.

Also, you touch on this within the SPAM section, but another key flaw of these often-anonymous systems is that a deviant segment will always try to hijack the results. Often when I discuss these sites with people who fully buy into the "crowd will solve all" idea, they just can't grasp that there will always be an element that will use the system for their own ill will.

By the way, this post feels a lot like a mission statement for a new company OR a thesis for a research study.

      Comment by David — June 28, 2006 @ 2:49 pm

    8. You make good arguments. I can tell you think this through quite a bit. However, I believe you overlook some of the finer aspects of the concept, correct me if I’m wrong.

Digg is a best-effort website; it is not perfect by any means, obviously. The concept is that it is more democratic because the content is user-defined rather than editor-defined. However, even as of today, we are not yet sure that a democratic form of community is the end-all governing ideology we all believe it to be. It has been found to be the fairest form of community, but fair is far from effective or efficient.

Any concept delivered to a user from a discriminating outside source must be ranked for usefulness. This includes most of life: morality, entertainment, news, and laws. Therefore the quality of what the user sees is defined by what the source defines as useful.

At Digg, the average user does not digg something up to the front page; they merely look and click. The users that actually drive articles to the front page are often very interested in the topic for which they searched. I believe that if you allow a moderate-sized open group of interested individuals to deliver information to the masses, you get more effective information than if you have a closed pool of discriminators. The system that digg created can be "gamed," like almost any system on the web that allows interaction with users and has "political" clout (the ability to change people's minds). Digg has much in common with the volunteer-democratic information system that is WikiMedia.

      I believe that Wikipedia harnesses the knowledge of concerned and interested people, not necessarily the best informed or wisest. It only happens through the laws of nature that the most concerned and interested individuals also happen to be well informed and wise. It is the same dynamic that Digg et al utilize to drive link publication.

The end result of Digg is that you have a relatively small number of very interested individuals showing a selected amount of information to a mass of people. The mass of people can then determine on their own what is correct or not. It is herd mentality, but to a much lesser extent than a single individual telling everyone what to think (which is what you get in traditional media).

      Comment by John — June 28, 2006 @ 6:47 pm

    9. Perhaps a hybrid-digg-wiki tech news site should be established to grant individual agency while controlling red flag traffic trolling 🙂

      Comment by michael — June 28, 2006 @ 6:52 pm

10. "I believe that if you allow a moderate-sized open group of interested individuals to deliver information to the masses, you get more effective information than if you have a closed pool of discriminators."

I definitely agree with this. Allowing all casual internet news trollers to rate stories really does nothing but overpromote the stories that are already popular. An average user just sees the already-popular stories and then diggs them. The whole point of digg and sites like it is to streamline news and give you only what you want to see so you don't waste time reading things you don't care about, so most people just skip over the low-rated links.

It seems like it would be less flawed if there were a group of designated people who actually read all the stories and gave each link a rating, and then everyone else gave it a second rating. You'd have two ways of judging whether you want to read a link: what people who know what they're talking about think of it, and what the general public thinks.

      Comment by Aaron — June 28, 2006 @ 8:30 pm

    11. You know, I hadn’t even heard of digg until I read your blog. Very informative. Thank you.

      Comment by Panda — June 28, 2006 @ 9:03 pm

    12. Now that is a funny graph! From a flat line to 25,000. Interestingly, Scoble had a similar spike when news broke he was leaving Microsoft – from an average of 10,000 to 90,000. This was doubtless influenced by, although not as dependent on, Digg et al.

Any system has positives and negatives, and which is which will be viewed differently depending on who is looking. One point regarding Digg's weaknesses (which you highlight so eloquently) – what's the alternative? Getting your site mentioned in the New York Times? What are the chances of that? I think forced to choose between a world with Digg/Reddit/Slashdot/BoingBoing/Fark etc and a world without them, I would choose with. I prefer flawed mob mentality to wisdom being handed down from on high.

A second point: Digg is definitely flawed. Do something better. Digg came out of nowhere to gain the influence it has now. Their very existence proves they are not the end of the line. Reddit doesn't have Digg's influence (yet) but it is an alternative. If I was a betting man, I'd say 12 months from now there will be a site that does to Digg what Digg did to Slashdot. And the more of these content aggregator/social bookmarking/memedigger sites there are, the more egalitarian the results will be.

A third point: a one-off spike like the one you saw is almost meaningless if you can't capitalise on it. I have seen so many sites pop up on WordPress's listing then disappear without a trace. You have the insight and depth of writing to build a long-term audience with the help of that one-off spurt of attention. I hate seeing wasted opportunities so I'm glad it looks like you're capitalising on yours.

      Comment by Mr Angry — June 28, 2006 @ 10:48 pm

    13. And as an addendum, on today’s Digg front page is
      http://www.valleywag.com/tech/google/five-reasons-no-one-will-replace-google-183892.php
      Which is building off your post.

      Dude, you are officially a meme.

      Comment by Mr Angry — June 29, 2006 @ 12:05 am

    14. Very interesting – I just opened a digg account and was looking into what all the hype was about, this post answers my intrigue.

      Comment by Leon — June 29, 2006 @ 12:12 am

    15. Please see my latest Update at the bottom of the post. And click the link in there to the “censored area.”

Given that there is no way to know if digg is manipulating the system (which they can do very easily by creating fake users, allowing them to bury or censor stories they don't like with the click of a button), I contend that they must be engaged in unethical, undemocratic censorship (i.e. rigging the system).

So a Web 2.0 entrepreneur who is dictatorial at heart can easily disguise himself as a democratic Web 2.0 service! Wooohooo! Welcome to the new age of Web-enabled democratic dictatorship.

      It will suck if that is the case, and given that there is no way to tell if they’re rigging the system I can see a high probability for abuse of power (after all they’re human.)

Digg needs to be re-thought as something other than a bad idea wrapped in deception and glazed with fanaticism.

      If Google’s mantra is Do No Evil, Digg’s must be Do Be Evil.

      Marc

      Comment by evolvingtrends — June 29, 2006 @ 12:58 am

16. I agree with you totally. I actually hope that I never get a story "Dugg" by anyone. The reason being that I think it gives you a false representation of readership, and certainly of long-term watchers. I want people to come to my blog to read what I have written and I hope that it is written well. I do not want to be just "judged" by the crowd factor. In addition I don't think that hits mean, or rather denote, quality. As you have pointed out – change the name, say it differently, and it is judged differently. Crowd mentality. I have written a similar post along the ideas I have just mentioned. Have a look, I would be interested in your comments: http://roostersrail.wordpress.com/2006/06/22/obligatory-non-conformism/
Just because someone has had 100,000 hits does not mean that it was worth reading; it would seem that popularity rides over quality & content at times.

      Thanks for the analysis I enjoyed the read.

      Comment by The Rooster — June 29, 2006 @ 3:07 am

    17. Great post, I liked the way you dissected digg. I myself said that digg is no longer that relevant. I think users will still use digg, while others will use other aggregation systems.

      I used to go on digg more than once a day. Now, I think I go every few days, even then. There are a lot of boring things that make it to the front page and even if I wanted to bury them all, I just don’t care that much about digg to do them that service.

      Mob mentality is rarely a good thing. It has been proven that mob mentality actually lowers the effective reasoning levels. Same thing goes with digg.

      Comment by range — June 29, 2006 @ 4:34 am

18. You are absolutely right. But Digg would never admit that, because first of all it's a cash cow for its owners (have you seen that they now have FIFTEEN employees?), and the users are also fanatic.

I think Digg will see a bust at some point soon. The absolute amount of hype behind what they do and the continued rejection of evolving and improving the site (breaking the site into already-existing categories isn't a part of that) will drag them down. Not to mention the ridiculous amounts of money they throw around. I still can't believe that they have 15 employees. Reddit is run by a team of 2 out of their apartment.

      Comment by AL — June 29, 2006 @ 8:30 am

    19. You got called a troll by Valleywag, Marc. That’s huge.

      With all I see about Digg in the news lately, Al, I suspect that at least 10 of those employees are necessary to deflect attention from people who are actually working.

      Comment by farlane — June 29, 2006 @ 9:20 am

    20. Yeah I saw that…

      What really impressed me, Andy, is when Markus left a comment under the “From Mediocre to Visionary” post… He’s a cult hero!

      Markus is the guy from plentyoffish

      Marc

      Comment by evolvingtrends — June 29, 2006 @ 9:31 am

21. Great analysis. I rarely read the Digg comments. Digg is just a way to filter tons of sites down to a few that may have an interesting story to tell.
      If you wear size small t-shirts, then the phrase “Never Underestimate the Power of Stupid People in Large Groups” doesn’t fit. But “Digg” fits and conveys the same message.
      –Andy

      Comment by hamcoder — June 29, 2006 @ 9:58 am

    22. Suppose digg has a secret editorial bias. Depending on how obvious it is, sooner or later, readers will detect it and complain about it on Digg and elsewhere, and eventually go elsewhere. Unless competition were suppressed, the situation will tend to produce whatever readers select for, whether they get to vote with their fingers or their eyeballs.

Speaking of reader selection bias: maybe a digg should be weighted differently based on how many diggs there are already, when the digg occurred, and ultimately the reputation of the digger. It might be possible this way to reduce the importance of votes by readers with nothing better to do, or readers who were simply amplifying the established trends. Of course maybe Digg does something like that already….

      Comment by Pictographer — June 29, 2006 @ 11:13 am
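One possible reading of that weighting idea (my sketch, not Pictographer's specification; the factors and their constants are invented): discount diggs that merely ride an established trend, and scale each vote by the digger's reputation.

    def digg_weight(existing_diggs, hours_since_submission, reputation):
        # Later votes on an already-popular story count less (anti-bandwagon),
        # and a digger's reputation (0.0 to 1.0) scales the whole vote.
        bandwagon_discount = 1.0 / (1.0 + existing_diggs / 100.0)
        early_bonus = 1.0 if hours_since_submission < 6 else 0.5
        return reputation * bandwagon_discount * early_bonus

    # An early digg from a reputable user far outweighs digg #900
    # from a drive-by account riding the trend:
    print(digg_weight(existing_diggs=3, hours_since_submission=1, reputation=0.9))
    print(digg_weight(existing_diggs=900, hours_since_submission=30, reputation=0.2))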

    23. I agree with your comment on user bias and I was talking about that exact same idea in another post on Standardized Tagging when I mentioned “user-behavior-based weighing factors.”

As far as users being able to tell when the system is being rigged, I disagree.

It's possible to fool the users by having fake user accounts that bury/censor posts but don't necessarily make any comments…. How could you tell a machine from a human if users are allowed to bury without demonstrating that they're not machines?

      Marc

      Comment by evolvingtrends — June 29, 2006 @ 11:23 am

    24. I don’t see the big deal about digg. I’ve only stumbled across it a few times thanks to Google and never had any urge to stick around. You only need to read the comments and petty bickering on there to work out what it’s all about in about 3 minutes.

      The rest of us couldn’t care less what happens on there. The internet’s a big enough place to find plenty of other stuff to read.

      Comment by Rees — June 29, 2006 @ 11:33 am

    25. I claim the readers can detect bias, not its source (e.g. rigging).

The quality of the content trumps the selection process for most users. All publications or forums have biases. How could it ever be otherwise? The key question is, "Can readers find the good stuff?" Whether that good stuff be news, entertainment, information, opinions, etc.

      If the digg staff exercise secret editorial power, should we care?

      When we do care, we need something transparent, distributed, and at least slightly expensive. Unless there is a cost associated with expression that is greater than the expected return of spamming, the spammers will come. Somehow the cost must translate into a person’s time or money. It could be community service, like moderation or answering newbie questions; it could be resources like cpu cycles or network bandwidth, or it could be cash.

      Comment by Pictographer — June 29, 2006 @ 12:18 pm

26. As a newbie to both the computing world and the blogging world, but an old hand at reporting small-town politics, I really appreciated the work that went into writing this article, and also the confidence it gave me, because I am a digg subscriber and had already reached the conclusion that mob mentality leads to control in accord with the lowest common denominator. Allowing all casual internet news trollers such as myself to rate articles achieves nothing more than overpromoting the ones that are already popular and sidelining others that ought to have received attention.

      Comment by timethief — June 29, 2006 @ 12:51 pm

27. I wholeheartedly agree with this post, but congrats on your pseudo-fame 😀 I was happy when my blog reached 43 hits yesterday. Youtube has some of the same problems, though I like the service in general.

      Comment by Amanda — June 29, 2006 @ 4:05 pm

    28. Aw man, I got so busy yesterday I didn’t have time to partake in the Coup d’état 2.0, lol. I only managed to get in one snarky remark in the discussion before I had to bail. I checked later and nobody even bothered to flame me! What do I have to do these days to get some idiot coming after me like I just paid his momma in pennies? It used to be so easy to entrap and toy with the suckers. 😉 Hehehe. Nifty, I got +7 diggs for my comment though, whatever the hell that translates to in terms of ANYTHING.

Anyhow, I just checked and you've got 90 diggs man. Way to fight the system! Of course, in your category, "How to Make Your AJAX Applications Accessible – 40 Tutorials and Articles" has 1105 diggs, and all the page turns out to be is a sloppy link fest to all these different articles about AJAX accessibility problems, including very outdated ones, most of which just state that there needs to be accessibility solutions; aside from workarounds and such there's no clear solutions. Somehow I think it might not have gotten quite as many diggs had he titled it more accurately: "40+ links to outdated articles telling you what a stupid wanker you are for diving into the dhtml 2.0 fad with no parachute." Hrm.

      Speaking of which, did you stick your post in the programming section or otherwise? Digg’s Categories remain completely useless. Can someone please enlighten me as to what’s changed in the new Digg aside from moving things around a bit? Granted, I’m not a digg fanatic, but I didn’t see anything terribly innovative.

      I’m telling you right now as I’ve probably mentioned before, the key is DiggScrobbler! Perhaps minus the gay name.

I don't understand why all these community powered sites can't go the extra distance to make them personalized by taste/preference, powered by other users most statistically similar or marked as trusted… of course such a system needs to take into account dislikes as well as likes. I don't read movie reviews, for example, because what do I care about what some idiot I don't know thinks about the movie? There's no context, no relationship there. I'd rather get ratings/reviews from likeminded people that tend to like and dislike the same stuff that I like and dislike. Is it a hard concept to fathom? Instead we'd rather have hReviews so we can have the same crappy reviews everywhere. Superb!

      Bah, sorry for the rambling and the ranting Marc… it’s just when ya gotta go, you gotta go, ya know? 😉

      Oh and FWIW there have been some stories in the past with pretty compelling evidence that one of the guys behind Digg was fixing some of the digging, so it’s not unthinkable… but that story was a while ago and I don’t recall where I read it.

      Comment by Ja — June 29, 2006 @ 6:04 pm

    29. I’m sure the founder is rigging it to protect his prospects for a $100M+ exit valuation!

      I did digg it initially under programming.

I'm making t-shirts (don't ever believe me when I start a sentence with "I'm making t-shirts") that will say "For Great Justice, Remove Every Digg."

      Obviously, that would only help them.

      sux0rs!

      Marc

      Comment by evolvingtrends — June 29, 2006 @ 6:15 pm

30. Truly a really great post, and an interesting look at an issue with the digg concept. I myself have definitely noticed traces of this phenomenon happening on a smaller scale with my own blog. The web community really is something that's up one minute and down the next.

Great post, you have a great blog, I'll be sure to stop by here more often 🙂

      Comment by Justin — June 29, 2006 @ 7:22 pm

    31. Great follow-up discussion and comments. I find myself too occupied with writing and posting to consult the reddits and diggs as well as newsvine anymore. I like finding new content myself, sifting through technorati and WordPress top blogs. What is good about blogrolls is that most blogs have one and if you like a blog, you can find similar blogs or related blogs through their blogrolls.

      Anyways, I think that’s it on digg.

      Comment by range — June 29, 2006 @ 7:37 pm

    32. A reader on another blog asked:

Marc: Don't you think this is the case with any system of information dissemination?

Comment by dr. gonzo — June 29, 2006 @ 5:30 am

      My response was (fyi):

It's the case with any software-enabled rating system, including electronic voting machines, but I am not implying that the latter kind of system has been rigged. Those are operated with much stricter oversight.

As far as information dissemination systems go, you can point to Google's Chinese version as an example of rigged, biased content.

But digg and social bookmarking sites in general have leveraged, intentionally or inadvertently, the false notion of the "wisdom of crowds." That is what I'm attacking specifically: the idea that such systems/models lead to better judgment than that of an individual blogger or a newspaper/TV/radio editor. They lead to much worse judgment, mainly because they rest on the false notion of the "wisdom of crowds."

Not saying that CNN or Fox News are not rigged, but I'm saying that a New York Times editor, Lorelle, yourself, or myself can have opinions that are infinitely better in quality than the opinion of any random crowd. I should also make the distinction between a crowd of scientists and a random crowd. The crowd on digg is a random crowd. However, in general, even a crowd made of rocket scientists is bound to produce the lowest-common-denominator opinion for that crowd. Individual rocket scientists may have a far better opinion.

      Marc

      Comment by evolvingtrends — June 29, 2006 @ 10:09 pm

33. I agree with John that it is a question of the people most interested in a specific topic, rather than only a crowd, building an opinion. Digg probably attracted the information-overdosed net-denizens, discovering for the first time a source of power within this universe, power they may have felt lacking in all the ramifications of the web-information jungle, and an ability to bring more structure to this realm. I can well believe that there are people out there, like you and me, who take it very seriously. But inevitably the novelty of Digg may also have attracted the crowd, people looking for new stimuli and new trends and obsessively engaged in crowd phenomena; I suspect that will wear off in time, leaving us the "naturally" selected rest, and some of them, and here I endorse the opinion of Marc, may have vested interests.

      So what do you have at the end? Professional Journalism against Digg.

But professional journalists, editors of magazines and other opinion-builders from the media need not get scared. If I respect the opinions of a critical author or a journalist from Atlantic Monthly, I would certainly pay more attention to his views than to purely trendy phenomenological sensations that come and go like high-amplitude frequencies and die without really changing any basic structural modalities, as is evident to a great extent at Digg. Same goes for when I intend to purchase a gadget: I would seek an opinion from a dedicated news agency, an expert, and a more comparative community of expert opinions to make a decision.
The customer loyalty built by such opinion-builders is an asset accumulated from years of publication, one that Digg cannot jeopardise.

      And Wikipedia is a step ahead of DIGG. You have a button at the top for history. Very transparent, eh!

But the fear of a company or political party mixing in Digg, as well as the fear of the evolution of a new Big Brother, is real; but in which sphere of existence is it not? The only remedy is a more varied, less homogenous NET, and thank Heavens we still have so many mouths_____ here for example the Zdnet!

      Cheers for a more egalitarian world
      And don’t be afraid of the dictatorship of the crowd!

      Mushtaq

      Comment by mushou — June 30, 2006 @ 8:23 am

    34. A very interesting post – and some strange reactions from Digg.

      It reminds me of the old phenomenon of being slashdotted.

      I did learn something though – from now on I need to work on my headlines! I’ve rarely gotten more than a handful of Diggs for anything.

      Comment by analysis — June 30, 2006 @ 1:55 pm

35. It's politics. Web 2.0, if there is such a thing, is about social networking. Whereas the 'old' web was about content controlled by a single entity, the owner of the site, the 'new' web lets its users take control. It's despotism vs democracy.

It's an unsolved problem. Nobody wants dictatorship, whether they're running a country or a website. Even the so-called 'benevolent dictators' lose the trust of their communities at some point. The other extreme, true democracy, is just as inefficient and dumb as dictatorship is unfair. We all know how history 'solved' this problem. It's called bureaucracy and we know how well that works.

      So, I welcome initiatives like wikipedia, digg and del.icio.us, and I even participate in the latter 2, because I like them as experiments, if nothing else. The web is this wonderful primordial soup that brings us new things and like most things primordial, it’s mostly crap. You can’t deny there’s much worse stuff on the web than the flawed system of digg.

Moreover, these flawed systems will either evolve or be replaced by 'better' systems, possibly incorporating fixes like the one you outlined (although your proposed method of spam-tag verification sounds an awful lot like bureaucracy to me).

I'm really glad you pointed out these weaknesses in these systems, for articles like yours form the necessary complement to these 'experiments'.

      Maybe it’s time for a system with a little more structure and rules, based on the experience from the first wave of these social networks, but I wonder what the crowd at large would think of that..

      remcoder

      Comment by remcoder — June 30, 2006 @ 3:32 pm

    36. The ‘crowd at large’ CAN’T “think.”

      It’ll have to be individuals who come up with the better solution.

      Marc

      Comment by evolvingtrends — June 30, 2006 @ 9:09 pm

    37. My last sentence was only meant to be ironic, implying what you spelled out and further implying that any ’solution’ by an individual necessarily needs acceptance by the ‘crowd at large’ in order to work.

      Comment by remcoder — July 1, 2006 @ 6:44 am

    38. Digg is a generator of biased and lowest-common-denominator hype that uses the ‘unwisdom’ of crowds as the only editorial control.

      It is spamming our culture with biased/lowest-common-denominator hype.

      You’re talking about a different problem domain.

      Approval by the crowd normally happens through ‘taste makers’

      If you make the crowd the taste maker, which is what digg does, then you get biased and lowest-common-denominator opinions of acceptance or rejection.

      That is not how modern society works. Digg is taking us backwards.

      You may want to see my post on the Hunter Gatherer parallels with Web 2.0

      Cheers,
      Marc

      Comment by evolvingtrends — July 1, 2006 @ 6:47 am

    39. I find it ironic that people can’t understand the reflexive reactions of US govt insiders and neocons when they dismiss well laid out arguments against their policies and positions, yet we see the digg insiders and supporters doing the exact same thing when confronted with articles like yours and with individual digg comments/commenters who don’t follow the party line.

      The ignorance of crowds is happening in real life and in the Web 2.0 world.

      Comment by Bobo the monkey — July 2, 2006 @ 11:06 am

    40. My first time reading this blog. Excellent post and analysis. Who are your philosophical influences?

I taste a Nietzschean flavor. The concerns are similar. Nietzsche's writing is essentially a message to future individuals to beware of the (d)evolutionary averaging trend effected by the modern socialization of mass or popular culture. Yet his anti-collectivism was not merely conservative, but rather that of a futurist interested in making way (via a powerful critique of morality) for the expansion of creative, evolutionary possibilities open to individuals.

      Comment by Thomas McDonald — July 4, 2006 @ 11:30 pm

41. I haven't read Nietzsche but I think I have a vague idea of his philosophy.

      I do not really get into philosophy, morality, ideology or religion.

I try to limit myself to experimenting, producing evidence and drawing conclusions, or using conclusions that have been made within the confines of formal theories. Obviously, I also make use of existing evidence, so it's not all so rigorous and brain-wracking.

      Anything outside of the methods above I consider as philosophy, ideology or religion, and I don’t really get into that on this blog or anywhere.

      I believe that I’m agnostic with respect to everything except the logic I’ve outlined here.

I enjoy all kinds of ideas as long as they don't espouse purely ideological, philosophical or religious themes.

      Marc

      Comment by evolvingtrends — July 5, 2006 @ 12:06 am

    42. […] As hypothesized in 1 and established in 2, a ‘crowd’ (not to be confused with a hierarchical organization) is not characterized by wisdom. […]

      Pingback by Evolving Trends » The ‘Unwisdom’ of Crowds — July 5, 2006 @ 1:07 am

43. […] It's All About Poetry / Digg 4.0: The End of Digg 3.0? / The 'Unwisdom' of Crowds / On The Co-Evolution of Man and Machine (Independence Day Special) / The Geek VC Fund Project: 7/02 Update / Digg This! 55,500 hits in ~4 Days / Web 3.0 vs Google / For Great Justice, Take Off Every Digg / Wikipedia 3.0 and Google (Response to Comments) / Wikipedia 3.0: The End of Google? […]

      Pingback by Evolving Trends » Hierarchies, Crowds, Democracies and Dictatorships — July 6, 2006 @ 4:45 am

    44. […] For Great Justice, Take Off Every Digg […]

      Pingback by Evolving Trends » From Web 2.0 to Web 2.5 — July 9, 2006 @ 8:17 am

    45. Finally some people that get the plot!

      Comment by Jay — July 10, 2006 @ 1:08 am

46. […] On top of that, we don't get it. We all feel entitled to an opinion in this democratic age. Like the taxi driver, we do not realise that we cannot think. We all feel that our opinion on – for example – evolution is as valid as the next person's, even if the next person is a geneticist or an anthropologist or a palaeontologist and we aren't. In fairness, the failure is in our education system where people are encouraged to 'think for themselves' without being taught how to think critically or being given the basic tools of analysis. This is the downside of democracy. We dumb down to our lowest common denominator. I am not going to argue against democracy: as Amartya Sen points out, it is the only demonstrable safeguard against famine for a start. […]

      Pingback by Aphra Behn – danger of eclectic shock » Truth, stardust and comfort blankies — July 10, 2006 @ 1:58 am

47. […] Bloggers: This post explains the curious story of how this article reached 33,000 readers in just the first 24 hours after its publication, via digg. This post explains what the problem is with digg and Web 2.0 and how to fix the problem. — Related: […]

      Pingback by Evolving Trends » Wikipedia 3.0: El fin de Google (traducción) — July 12, 2006 @ 4:09 pm

    48. […] This is a label given to the power users of digg. Since it has been shown and proven that 60% of the front page topics and posts are controlled by 0.03% of users, the term was accurate. The thing that most people don’t realize, is that some of the power users are either […]

      Pingback by memoirs on a rainy day » Calacanis and the digg mafia or how monetizing the blogosphere is getting some reactions — July 20, 2006 @ 5:06 pm

49. Agree with the comment on these tagging sites. It just confirms once again that there are by far more followers than leaders and independent thinkers, and it is easy to fall into the trap of anonymously following what's defined as popular opinion expressed collectively by the crowd.

      Comment by Joe Buhler — July 25, 2006 @ 10:55 am

    50. […] This flawed approach leads to comparisons between Digg and New York Times. Does it mean that Digg is better than New York Times and we can really live without professional journalism? In the comparison there is no consideration for quality or origination of the content. Don’t get me wrong, I am a supporter of participation and I like sites like Slashdot and Digg. However they cannot do what New York Times does. Whereas the same number game also creates an air of doubt about the basic model of Digg. Digg is one of the best social bookmarking sites not because of the numbers but because it represents a unique idea and provides ease of operation and communication between users. Digg is a culture rather than a site, and hence it is so popular. […]

      Pingback by iface thoughts » Blog Archive » Why Is The Web Quantitative? — August 30, 2006 @ 2:01 pm

    51. Wow. What can I say? You’ve opened up a whole universe for me like a global mind and I must investigate digg and Wikipedia 3.0 concepts fully. Thank you.

      Comment by Geoff Dodd — February 24, 2007 @ 7:59 am

    52. […] For Great Justice, Take Off Every Digg […]

      Pingback by H y p e r l o g i c 1.0.0.1.0.1 « Evolving Trends — January 3, 2008 @ 3:07 pm

    RSS feed for comments on this post. TrackBack URI

    Read Full Post »

    Older Posts »
