Natural Language Processing and Information Retrieval

Published

Author(s)

Ellen M. Voorhees

Abstract

Information retrieval addresses the problem of finding those documents whose content matches a user's request from among a large collection of documents. Currently, the most successful general-purpose retrieval methods are statistical methods that treat text as little more than a bag of words. However, attempts to improve retrieval performance through more sophisticated linguistic processing have been largely unsuccessful. Indeed, unless done carefully, such processing can degrade retrieval effectiveness. Several factors contribute to the difficulty of improving on a good statistical baseline, including: the forgiving nature but broad coverage of the typical retrieval task; the lack of good weighting schemes for compound index terms; and the implicit linguistic processing inherent in the statistical methods. Natural language processing techniques may be more important for related tasks such as question answering or document summarization.
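
As a rough illustration of what "a bag of words" means in practice, the following minimal sketch ranks documents against a query using term counts only. It is not taken from the paper: the tf-idf weighting, the cosine similarity, the whitespace tokenizer, and the function names `tokenize`, `scores`, and `rank` are all assumptions chosen as one common statistical scheme.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    # Word order is discarded once tokens are counted.
    return text.lower().split()


def scores(query, docs):
    """Return one tf-idf cosine similarity per document (illustrative only)."""
    bags = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for bag in bags for term in bag)
    # Smoothed inverse document frequency (an assumed variant).
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}

    def vec(bag):
        # Terms unseen in the collection are dropped from the vector.
        return {t: c * idf[t] for t, c in bag.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vec(Counter(tokenize(query)))
    return [cosine(q, vec(bag)) for bag in bags]


def rank(query, docs):
    """Return document indices ordered from best to worst match."""
    s = scores(query, docs)
    return sorted(range(len(docs)), key=lambda i: s[i], reverse=True)
```

Because only counts are kept, "dog bites man" and "man bites dog" receive identical scores for any query. That loss of structure is the kind of information linguistic processing would aim to recover, while the abstract notes that such processing has rarely improved on a good statistical baseline.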
Citation
Summer School on Information Extraction

Keywords

information retrieval, natural language processing, text retrieval

Citation

Voorhees, E. (1999), Natural Language Processing and Information Retrieval, Summer School on Information Extraction, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=151418 (Accessed April 19, 2024)
Created July 13, 1999, Updated February 17, 2017