Physics
Course in Information Retrieval


Before you start searching
How do AltaVista and Infoseek work?
Interesting places

Before you start searching

The Internet today consists of a few hundred thousand servers located all over the world, which store thousands of gigabits of data. This sounds very promising, because all of this data is available to us, the computer users, and we can get this valuable information not only as text but also as photographs, graphs, sound recordings or animations. Its real value becomes apparent, however, only when we connect to the server where the information is stored and read it. When you start using search engines to explore the most popular Internet service, the World Wide Web (commonly called www), be aware that it can take some time to find the pages that interest you. Sometimes a search can be arduous and time-consuming (poor-quality connections can make it last longer), but you can be fairly sure of a successful result: negative answers are very rare. Among the showy and well-prepared sites you can also find sites that are less attractive but very interesting in their content.

Everybody who wants to use www sources effectively has to know how to control and use various search engines. We have to be aware of the Internet's properties: its dynamism and changeability. Every day thousands of new pages appear, and a similarly large number disappear. Publications that you found yesterday may no longer exist on the Internet the next day, or their addresses may have changed; keep this in mind when preparing your presentation.

Along with the growing mass of information on the Internet, the number of information retrieval tools is also increasing. As mentioned above, knowledge of Internet search engines or indexes of www resources is often the only way of obtaining useful information that is so widely scattered across the network.

There are hundreds of specialised servers on the Internet whose only aim is to catalogue and index network sources. Nowadays the most popular tools for searching www sources are search engines such as AltaVista, Infoseek, HotBot and Excite. They are based on special programs called robots or crawlers, which every night (or, e.g., once a week) dig through thousands of Internet sites, finding new links and copying new pages. The contents of these pages are indexed, so you can search for any word from the text. Search engines also support logical operators and usually distinguish between capital and small letters. In the case of AltaVista and Infoseek it is also possible to search by the date of a document's last update.
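The indexing idea described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the way AltaVista or Infoseek are actually implemented: the three sample pages and the helper names (`build_index`, `search_and`, `search_or`, `search_not`) are hypothetical. Each word is mapped to the set of pages containing it, logical operators become set operations, and words are stored as written, so capital and small letters are distinguished.

```python
from collections import defaultdict

# Hypothetical mini-collection standing in for pages copied by a crawler.
PAGES = {
    "page1": "Physics of sound recordings",
    "page2": "Sound and animation on the web",
    "page3": "physics course notes",
}


def build_index(pages):
    """Map every word to the set of page ids whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.split():
            # Words are stored exactly as written, so the index is case-sensitive.
            index[word].add(page_id)
    return index


def search_and(index, *words):
    """Pages that contain all of the given words (logical AND)."""
    sets = [index.get(word, set()) for word in words]
    return set.intersection(*sets) if sets else set()


def search_or(index, *words):
    """Pages that contain at least one of the given words (logical OR)."""
    return set().union(*(index.get(word, set()) for word in words))


def search_not(index, include, exclude):
    """Pages that contain `include` but not `exclude` (logical AND NOT)."""
    return index.get(include, set()) - index.get(exclude, set())


INDEX = build_index(PAGES)
```

For example, `search_and(INDEX, "sound", "recordings")` finds only `page1`, while `search_or(INDEX, "Physics", "physics")` is needed to find both `page1` and `page3`, because the two spellings are indexed as different words.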

After you key in your enquiry (each search engine has a special field where you should type it), the program searches its index for links (references to pages) whose contents should be relevant to the enquiry. In answer you get an ordered list of references to the documents that the program considers to match your search criteria; remember that everything is done mechanically. Each reference usually comes with a short description of the document. This is not a bibliographic description and does not look like one, because it consists only of the document title, the page title or the beginning of the text. Each search engine has its own system for structuring (arranging) the answers; they often take into account the frequency of the words used in your search in relation to all words indexed in the database.
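How word frequency can be used to arrange the answers may be sketched as follows. This is an assumption made for illustration, not the actual ranking method of any engine named here: the sketch uses a simple TF-IDF-style score, in which a page scores higher when a query word makes up a larger share of its text and when that word is rarer across the whole collection. The function name `rank` and the sample pages are hypothetical, and for brevity the sketch ignores the difference between capital and small letters.

```python
import math
from collections import Counter


def rank(pages, query):
    """Return ids of matching pages, best match first (TF-IDF-style score)."""
    # Count the words of every page once, ignoring case for simplicity.
    counts = {pid: Counter(text.lower().split()) for pid, text in pages.items()}
    n_pages = len(pages)
    scores = {}
    for pid, words in counts.items():
        total = sum(words.values())
        score = 0.0
        for term in query.lower().split():
            # Number of pages in the collection that contain the term.
            df = sum(1 for c in counts.values() if term in c)
            if df == 0 or total == 0:
                continue
            tf = words[term] / total           # share of the page taken by the term
            idf = math.log(1 + n_pages / df)   # rarer terms weigh more
            score += tf * idf
        if score > 0:
            scores[pid] = score
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical sample collection.
SAMPLE = {
    "a": "physics physics lecture",
    "b": "physics of sound and light in the web",
    "c": "sound archive",
}
```

With this sample, `rank(SAMPLE, "physics")` places page `a` before page `b`, because the word makes up two thirds of page `a` but only one eighth of page `b`; page `c` does not appear at all.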

In answer we usually get a very long list of references. To avoid arduously reviewing unnecessary sites, it is worth using the relevant commands to make your search more precise.

Update: March 1999