December 17, 2008
Kevin, thank you for posting this beautifully crafted and thoughtful piece.
Here I offer one answer to your question
“what evidence is there that we are headed for level III – the internet/web as an autonomous, smart superorganism?”
For the sake of this presentation I adopt three conventions:
1) the INTERNET is the sum of all the hardware of the net (servers, routers, fiber-optic trunk lines, etc.): the brain, if you will.
2) the WEB is the sum of all the humanly readable content (blogs, articles, web pages): the mind, if you will.
3) the FLOW is the dynamic piece: the flow of packets and the flow of web pages.
An essential next step for the Web in its ascent into intelligence (awareness and consciousness are quite distinct matters) is the capacity to READ and UNDERSTAND its own processes.
This understanding of self will address each of the three items above: hardware, content, and flow.
In this post I focus only on 2) above: the humanly readable content of the web (essentially all of its current contents). This content will soon be processable by knowledge servers on the web. When machine understanding of that content is eventually achieved, it will be a giant leap toward a SMART WEB.
The goal for such knowledge servers is well expressed in this quote from Sir Tim Berners-Lee, the director of the World Wide Web Consortium.
I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines.
While the semantic web is at an early stage of realization, it is being advanced by thousands of computer scientists who are devising formal specifications for the concepts, terms, and relationships in which knowledge might be expressed. The Web Ontology Language (OWL) and the Resource Description Framework (RDF) are two such specifications.
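To make "formal specifications for concepts, terms, and relationships" a bit more concrete, here is a minimal Python sketch. It is not real RDF or OWL syntax; it only borrows the subject-predicate-object shape that RDF statements have. The predicate names (`is_a`, `subclass_of`), the facts, and the `match` helper are all made up for illustration.

```python
# Hypothetical mini knowledge base. Each fact is a (subject, predicate, object)
# triple, the same three-part shape an RDF statement has. All names are invented.
triples = {
    ("Wikipedia", "is_a", "Encyclopedia"),
    ("Encyclopedia", "subclass_of", "ReferenceWork"),
    ("WordNet", "is_a", "Lexicon"),
    ("Lexicon", "subclass_of", "ReferenceWork"),
}

def match(pattern, facts):
    """Return, in sorted order, every triple that fits the pattern.

    A None in the pattern acts as a wildcard for that position.
    """
    return sorted(
        t for t in facts
        if all(p is None or p == v for p, v in zip(pattern, t))
    )

# Ask: which concepts are declared subclasses of ReferenceWork?
subclasses = match((None, "subclass_of", "ReferenceWork"), triples)
print(subclasses)
```

Once content is stored in this form, a program can ask questions of it by pattern rather than by keyword, which is the point of the specifications mentioned above.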
Right now the various large repositories of knowledge on the web (like Wikipedia) are like books sitting on the shelves in a library. Soon, as MIT Professor Marvin Minsky suggested, no one will imagine a time when the books on the shelves did not talk with one another.
Here are some early multimillion-dollar examples that are paving the way.
WordNet - an on-line semantic lexicon of English that groups words into synonym sets and records the semantic relations between those sets. Begun in 1985 at Princeton by Professor George Miller, the project subserves several large AI projects that attempt automated text parsing and knowledge acquisition.
CYC - Douglas Lenat's famous 25-year-long project (now CYCORP) to imbue computers with common sense. It contains two million facts and rules pertaining to about 200,000 entities. The holy grail of the project is self-augmentation, in which the knowledge base could expand without human input by reading the web.
Powerset (recently acquired by Microsoft) parses Wikipedia to provide direct answers to queries. Powerset's aim was to unlock the meaning encoded in ordinary human language.
True Knowledge - like Powerset, another search engine start-up company that uses structured knowledge to make deductive inferences to assist human search. Their Answer Engine aims to provide an answer to your query rather than just a collection of websites. To do so, it uses natural language translation and a self-augmenting knowledge base.
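As a rough illustration of what "structured knowledge plus deductive inference" can mean, here is a toy Python sketch. It is not how Powerset, True Knowledge, or CYC actually work; the facts, the single hand-written rule, and the `infer` and `ask` helpers are assumptions chosen only to show the idea of answering a question whose answer was never stored directly.

```python
# Hypothetical stored facts, as (subject, predicate, object) triples.
facts = {
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
}

def infer(facts):
    """Apply one rule repeatedly until nothing new can be derived.

    Rule (invented for this example):
    if X capital_of Y and Y located_in Z, then X located_in Z.
    """
    known = set(facts)
    while True:
        new = {
            (x, "located_in", z)
            for (x, p1, y) in known if p1 == "capital_of"
            for (y2, p2, z) in known if p2 == "located_in" and y2 == y
        }
        if new <= known:  # nothing new was derived, so stop
            return known
        known |= new

def ask(subject, predicate, facts):
    """Answer a query from the stored facts plus everything derivable from them."""
    return sorted(o for (s, p, o) in infer(facts) if s == subject and p == predicate)

# "Where is Paris located?" is never stated directly, but it can be deduced.
print(ask("Paris", "located_in", facts))
```

The answer comes from combining two stored facts, which is the difference between returning an answer and returning a list of pages that might contain one.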
While I am modestly interested in efforts to improve search for humans, I am far more interested in imbuing machines with the autonomous ability to understand English and other human languages.
Why? I want them to be hard at work while you and I are sleeping, eating, vacationing, and otherwise off-line. Think what their unbridled creativity might accomplish while we are sleeping. (I do not deny that eventually they might also be up to mischief - like killing or enslaving humanity - but that's a subject for another day.)
My own work at Stanford in the 1970s and 80s focused on automated discovery of knowledge. My RX Project used statistical and AI methods to automatically discover medical knowledge. (It did, in fact, discover several important drug side effects. Most of these had been discovered previously, but RX itself had been given no hints about them.)
And now, for some fun, here's an example from a little story that I wrote about an automated design program called Ralph.
Ralph was developed by Intel in 2040 to design its next-generation chip, enabling it to leap far ahead of Moore's Law. Ralph's task is to design the chips that will serve as the basis for its own upgrade. And I quote:
At 0357 GMT it's trying to crack the loss of quantum coherence problem that plagued the Dodecium40 design team and limited the number of bits in its registers, allowing AMD to stay competitive. Combing the literature for solutions, it finds a promising reference in the Bulgarian Journal of Quantum Computing.
To understand this article, it brings itself up to date on all the precursor literature, and in so doing, sees the solution that was sought after but not actually achieved by the Bulgarian research team.
But will this theoretical insight actually work when the nanotubes hit the road? At Intel's automated research facilities at McMurdo Sound (easy cryo) and in GEO (no funky gravity probs), it robotically performs feasibility studies that are highly promising.
Now, returning to Kelly's article, will we have any indication that the web itself is getting smart? Yes, absolutely! (And in this post I've just singled out one path among many to an autonomous, intelligent web.) Just track the following trend...
dollars flowing into companies that create knowledge servers that autonomously search websites, reading their contents, synthesizing that information, and formulating new knowledge.
It won't be a secret effort, performed conspiratorially by the web itself (bent, perhaps, on creating SkyNet). It will be a very public effort undertaken by dozens of companies that will do it because it will be highly profitable.
Bob Blum — Dec 17 2008