Document and Passage Retrieval Based on Hidden Markov Models.
Elke Mittendorf, Peter Schäuble
We introduce a new approach to Information Retrieval based on Hidden Markov Models (HMMs).
HMMs are shown to provide a mathematically sound framework for retrieving both documents with predefined boundaries and entities of information of arbitrary length and format (passage retrieval).
Our retrieval model is shown to encompass promising capabilities: First, the position of occurrences of indexing features can be used for indexing.
Positional information is essential, for instance, when considering phrases, negation, and the proximity of features.
Second, optimal weights for arbitrary features can be derived automatically from training collections.
Third, a query dependent structure can be determined for every document by segmenting the documents into passages that are either relevant or irrelevant to the query.
The theoretical analysis of our retrieval model is complemented by the results of preliminary experiments.
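The query-dependent segmentation described above can be illustrated with a minimal sketch. It is not the authors' model: it assumes a two-state HMM (relevant "R", irrelevant "I") whose states emit only "query term" or "other term", and it decodes the most likely state sequence with the Viterbi algorithm. The function name `segment` and all probability values (`p_stay`, `p_query_rel`, `p_query_irr`) are hypothetical choices made for illustration, not values taken from the paper.

```python
import math


def segment(tokens, query_terms, p_stay=0.9, p_query_rel=0.5, p_query_irr=0.05):
    """Label each token 'R' (relevant) or 'I' (irrelevant) via Viterbi decoding.

    All probabilities are illustrative assumptions, not trained weights.
    """
    if not tokens:
        return []

    states = ("I", "R")
    # Probability that a state emits a query term (assumed values).
    p_query = {"I": p_query_irr, "R": p_query_rel}

    def log_emit(state, token):
        p = p_query[state] if token in query_terms else 1.0 - p_query[state]
        return math.log(p)

    def log_trans(prev, cur):
        return math.log(p_stay if prev == cur else 1.0 - p_stay)

    # score[s]: best log-probability of any state path ending in state s.
    score = {s: math.log(0.5) + log_emit(s, tokens[0]) for s in states}
    backpointers = []

    for token in tokens[1:]:
        new_score, pointers = {}, {}
        for cur in states:
            best_prev = max(states, key=lambda prev: score[prev] + log_trans(prev, cur))
            new_score[cur] = (
                score[best_prev] + log_trans(best_prev, cur) + log_emit(cur, token)
            )
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    text = "the weather was fine hidden markov models hidden markov then we left".split()
    labels = segment(text, {"hidden", "markov", "models"})
    print("".join(labels))
```

Because leaving a state is penalised by the transition probability, a dense run of query terms is labelled as one relevant passage, while a single isolated query term inside unrelated text tends to stay in the irrelevant state. This is one simple way in which positional information such as proximity can influence the result.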
Printed Edition
W. Bruce Croft, C. J. van Rijsbergen (Eds.):
Proceedings of the 17th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Dublin, Ireland, 3-6 July 1994 (Special Issue of the SIGIR Forum).
ACM/Springer 1994, ISBN 3-540-19889-X
