F1 Information Retrieval Techniques
     Jadranka Lasic-Lazic, Sanja Seljan, Hrvoje Stancic, Faculty of Philosophy, Zagreb


There is currently a huge amount of data on the Web and almost no classification information. The key problem is how to embed knowledge into information mining algorithms. The authors analyze information retrieval techniques and identify their strengths and weaknesses. Although most Web documents are text-oriented, many contain multimedia elements, which are not easily accessible through common search methods. Web information is dynamic, semi-structured, and interwoven with hyperlinks. Several advanced methods for Web information mining are analyzed: (1) syntax analysis (HTML tags), (2) knowledge annotation using conceptual graphs, (3) KPS: Keyword, Pattern, Sample search techniques, and (4) techniques for obtaining descriptions by fuzzification and back-propagation. The problem of choosing proper keywords is also stressed. The authors suggest using established standards for classification hierarchies, such as the Dewey Decimal Classification (DDC).
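
As an illustration of method (1), syntax analysis of HTML tags, the following minimal Python sketch collects the text found in a few structural tags and the contents of a keywords meta element. It is not the authors' implementation: the class name TagTextExtractor, the helper extract, and the choice of tags (title, h1 to h3) are assumptions made here for illustration only, and the sketch relies solely on the standard library module html.parser.

```python
from html.parser import HTMLParser


class TagTextExtractor(HTMLParser):
    """Collect text inside selected HTML tags, plus meta keywords.

    Hypothetical helper for illustration; not taken from the paper.
    """

    # Illustrative choice of tags that often carry descriptive text.
    TAGS = {"title", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.fields = {}    # tag name -> list of text fragments
        self.keywords = []  # entries from <meta name="keywords">
        self._open = []     # stack of currently open tags of interest

    def handle_starttag(self, tag, attrs):
        if tag in self.TAGS:
            self._open.append(tag)
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "keywords":
                content = attr.get("content") or ""
                # Keywords are conventionally comma-separated.
                self.keywords += [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        # Close the innermost tracked tag when its end tag arrives.
        if self._open and self._open[-1] == tag:
            self._open.pop()

    def handle_data(self, data):
        text = data.strip()
        # Keep text only while inside one of the tracked tags.
        if self._open and text:
            self.fields.setdefault(self._open[-1], []).append(text)


def extract(html):
    """Return (fields, keywords) extracted from an HTML string."""
    parser = TagTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.fields, parser.keywords
```

Calling extract on a page's HTML source returns a dictionary of tag texts and a list of declared keywords. Such output could serve as raw material for keyword selection or for assignment to a classification scheme, but how the extracted text is weighted or mapped to classes is left open here.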