Nestor: A Tool for Natural Language Annotation of Short Texts

Published: November 01, 2019

Author(s)

Thurston B. Sexton, Michael P. Brundage

Abstract

Nestor is a software tool that annotates natural language text stored in UTF-8 encoded CSV (comma-separated values) files, using a process called tagging [1]. The annotated output datasets (either a CSV or an .h5 file) can be used with different analysis techniques, such as failure prediction, problem hot spot identification, and maintenance technician expertise assessment, as shown in [2-7]. Currently, the majority of use cases involve maintenance in the engineering domain (manufacturing, mining, and heating, ventilation, and air conditioning (HVAC)); however, Nestor can take any natural language CSV file with UTF-8 encoding as input. The objective is to help analysts make their natural language data, which is often unstructured and filled with technical content, jargon, misspellings, and abbreviations, computable in order to improve analysis.
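To make the idea of tagging concrete, the following is a minimal sketch in Python of annotating short texts from a CSV with a hand-built vocabulary. It is not Nestor's implementation or API: the vocabulary entries, the tag types ("item", "problem", "solution"), the column name, the sample records, and the function names `tag_text` and `tag_csv` are all assumptions made for illustration.

```python
import csv
import io
import re

# Hypothetical vocabulary: raw token -> (canonical tag, tag type).
# In practice an analyst would build this mapping; these entries are invented.
VOCAB = {
    "hyd": ("hydraulic", "item"),
    "hydraulic": ("hydraulic", "item"),
    "hose": ("hose", "item"),
    "leak": ("leak", "problem"),
    "leaking": ("leak", "problem"),
    "replaced": ("replace", "solution"),
}


def tag_text(text, vocab=VOCAB):
    """Map each known token in a short text to its canonical tag, grouped by type."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tags = {}
    for token in tokens:
        if token in vocab:
            canonical, kind = vocab[token]
            bucket = tags.setdefault(kind, [])
            if canonical not in bucket:  # keep each tag once, in first-seen order
                bucket.append(canonical)
    return tags


def tag_csv(stream, text_column):
    """Read CSV rows from a text stream and tag the chosen free-text column."""
    reader = csv.DictReader(stream)
    return [tag_text(row[text_column]) for row in reader]


# Invented sample records standing in for a UTF-8 CSV file on disk.
SAMPLE = (
    "id,description\n"
    "1,Hyd leak at press; replaced hose\n"
    "2,hydraulic hose leaking\n"
)

tagged = tag_csv(io.StringIO(SAMPLE), "description")
for row in tagged:
    print(row)
```

For a file on disk, the same function could be given a handle opened with `open(path, newline="", encoding="utf-8")`. The point of the sketch is that abbreviations and variants ("hyd", "leaking") collapse to one canonical tag, which is what makes the free text countable in later analysis.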
Citation: Journal of Research of the National Institute of Standards and Technology
Pub Type: Journals

Keywords

manufacturing, natural language processing, maintenance, software, nestor
Created November 01, 2019, Updated November 01, 2019