René Nyffenegger's collection of things on the web

What's wrong with search engines

Currently, the major internet search engines search for documents that contain information rather than for the information itself. I wish search engines searched for information, not for documents.
To illustrate what I mean, let's say the internet has five documents (HTML pages) about the sun. Here is the content of those five documents:

The first document:
sun.html

Sun's diameter: 1,390,000 km.
Sun's mass: 1.989 × 10^30 kg.
Rotates every 25.4 days (at equator).


The second document:
our_sun.html

Sun's surface called photosphere.
Largest object in solar system.
Chromosphere above photosphere.


The third document:
facts_about_sun.html

Rotates every 25.4 days (at equator).
Magnetosphere extends beyond Pluto.
70% hydrogen, 28% helium, <2% metals


The fourth document:
facts_sun.html

Chromosphere at temperature of 5800 K.
Largest object in solar system.
Rotates every 25.4 days (at equator).


And lastly, the fifth document:
the_sun.html

Magnetosphere extends beyond Pluto.
Sun's diameter: 1,390,000 km.
70% hydrogen, 28% helium, <2% metals

No other document on the internet mentions the sun.
Now, the 'internet' lists the following seven facts about the sun:
  • Diameter
    found in two documents.
  • Mass
    found in one document.
  • Rotation speed
    found in three documents.
  • Largest object in solar system
    found in two documents.
  • Surface (chromosphere) at temperature of 5800 K
    deduced from two documents.
  • Composition of sun
    found in two documents.
  • Extension of magnetosphere
    found in two documents.
I want my search engine to return these seven facts when I enter sun. Today, however, search engines return the five documents. Each document lists three facts, so I have to read 5 × 3 = 15 statements in order to learn 7 facts. That is, I have to read roughly twice as much information as necessary.
I hope future search engines can do that work for me.
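
The bookkeeping in the sun example can be sketched in a few lines of Python. This is only a rough illustration under a strong assumption: every statement has already been labelled by hand with the fact it belongs to. The topic labels, the DOCUMENTS table and the function names are hypothetical and not taken from any real search engine. Recognising that differently worded sentences state the same fact is the hard part, and it is not attempted here.

```python
from collections import defaultdict

# Hypothetical, hand-labelled data: file name -> list of (topic, statement).
# The topic labels are an assumption made for this sketch; a real engine
# would have to work them out itself.
DOCUMENTS = {
    "sun.html": [
        ("diameter", "Sun's diameter: 1,390,000 km."),
        ("mass", "Sun's mass: 1.989 x 10^30 kg."),
        ("rotation", "Rotates every 25.4 days (at equator)."),
    ],
    "our_sun.html": [
        ("surface", "Sun's surface called photosphere."),
        ("largest_object", "Largest object in solar system."),
        ("surface", "Chromosphere above photosphere."),
    ],
    "facts_about_sun.html": [
        ("rotation", "Rotates every 25.4 days (at equator)."),
        ("magnetosphere", "Magnetosphere extends beyond Pluto."),
        ("composition", "70% hydrogen, 28% helium, <2% metals"),
    ],
    "facts_sun.html": [
        ("surface", "Chromosphere at temperature of 5800 K."),
        ("largest_object", "Largest object in solar system."),
        ("rotation", "Rotates every 25.4 days (at equator)."),
    ],
    "the_sun.html": [
        ("magnetosphere", "Magnetosphere extends beyond Pluto."),
        ("diameter", "Sun's diameter: 1,390,000 km."),
        ("composition", "70% hydrogen, 28% helium, <2% metals"),
    ],
}


def build_fact_index(documents):
    """Map each topic to the set of documents that mention it."""
    index = defaultdict(set)
    for name, statements in documents.items():
        for topic, _statement in statements:
            index[topic].add(name)
    return dict(index)


def statements_read(documents):
    """Count the statements a reader faces when given whole documents."""
    return sum(len(statements) for statements in documents.values())


index = build_fact_index(DOCUMENTS)
for topic, names in index.items():
    print(f"{topic}: found in {len(names)} document(s)")
print(f"{statements_read(DOCUMENTS)} statements read for {len(index)} facts")
```

With the five documents above, the sketch counts 15 statements for 7 facts, which is the redundancy described in the text.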