Human-in-the-loop Technical Document Annotation: Developing and Validating a System to Provide Machine-Assistance for Domain-Specific Text Analysis

Published

Author(s)

Juan Fung, Zongxia Li, Daniel Stephens, Andrew Mao, Pranav Goel, Emily Walpole, Alden A. Dima, Jordan Boyd-Graber

Abstract

In this report, we address the following question: to what extent can machine learning assist a human with traditional text analysis, such as content analysis or grounded theory in the social sciences? In practice, such tasks require humans to review and categorize (e.g., by manually annotating the text with labels) a large sample of documents. We neither expect nor necessarily desire the machine to automate the tasks the human would otherwise perform; rather, we want to find ways to help the human perform those tasks more efficiently. We present a modular implementation of a system that incorporates supervised (active learning) and unsupervised (topic modeling) methods to assist humans with technical document annotation. The implemented system allows us to conduct user studies to evaluate the usefulness of machine assistance. We present results from two such user studies and highlight directions for future research.
Citation
Technical Note (NIST TN) - 2287
Report Number
2287

Keywords

Text analysis, text mining, natural language processing, machine learning, human-in-the-loop, social science, content analysis.

Citation

Fung, J., Li, Z., Stephens, D., Mao, A., Goel, P., Walpole, E., Dima, A. and Boyd-Graber, J. (2024), Human-in-the-loop Technical Document Annotation: Developing and Validating a System to Provide Machine-Assistance for Domain-Specific Text Analysis, Technical Note (NIST TN), National Institute of Standards and Technology, Gaithersburg, MD, [online], https://doi.org/10.6028/NIST.TN.2287, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=957665 (Accessed October 14, 2024)

Issues

If you have any questions about this publication or are having problems accessing it, please contact reflib@nist.gov.

Created May 14, 2024