Organizing Tagged Knowledge: Similarity Measures and Semantic Fluency in Structure Mining

Thurston B. Sexton; Mark Fuge

Published

January 3, 2020

Author(s)

Thurston B. Sexton, Mark Fuge

Abstract

Recovering a system's underlying structure from its historical records (also called structure mining) is essential to making valid inferences about that system's behavior. For example, making reliable predictions about system failures based on maintenance work-order data requires determining how concepts described within the work order are related. Obtaining such structural information is challenging, requiring system understanding, synthesis, and representation design, and it is often either too difficult or too time-consuming to produce. Consequently, a common approach to quickly eliciting tacit structural knowledge from experts is to gather uncontrolled keywords as record labels, i.e., "tags." One can then map those tags to concepts within the structure and quantitatively infer relationships between them. Existing models of tag similarity tend to depend either on correlation strength (e.g., overall co-occurrence frequencies) or on conditional strength (e.g., tag sequence probabilities). A key difficulty in applying either model is understanding under what conditions one is better than the other for overall structure recovery. In this paper, we investigate the core assumptions and implications of these two classes of similarity measures on structure recovery tasks. Then, using lessons from this characterization, we borrow from recent psychology literature on semantic fluency tasks to construct a tag similarity measure that emulates how humans recall tags from memory. We show through empirical testing that this method combines strengths of both common modeling paradigms. We also demonstrate its potential as a pre-processor for structure mining tasks via a case study in semi-supervised learning on real excavator maintenance work-orders.
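
To make the two classes of tag similarity named in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes a cosine score over record co-occurrence as one possible correlation-strength measure and first-order transition probabilities between adjacent tags as one possible conditional-strength measure. The tag lists, function names, and both specific formulas are hypothetical choices made for illustration only.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical work-order tag lists, kept in the order the tags were entered.
records = [
    ["hydraulic", "leak", "hose"],
    ["hydraulic", "hose", "replace"],
    ["engine", "leak", "oil"],
    ["hydraulic", "leak"],
]


def cooccurrence_cosine(records):
    """Correlation-style score (assumed form): cosine over record co-occurrence.

    Symmetric: keys are alphabetically sorted tag pairs.
    """
    occurs = Counter()  # number of records containing each tag
    both = Counter()    # number of records containing each unordered pair
    for tags in records:
        unique = sorted(set(tags))
        occurs.update(unique)
        both.update(combinations(unique, 2))
    return {
        pair: n / sqrt(occurs[pair[0]] * occurs[pair[1]])
        for pair, n in both.items()
    }


def transition_probability(records):
    """Conditional-style score (assumed form): P(next tag | current tag).

    Directed: keys are (current, next) pairs of adjacent tags.
    """
    follows = Counter()  # count of each adjacent (current, next) pair
    leaves = Counter()   # count of times a tag is followed by any tag
    for tags in records:
        for current, nxt in zip(tags, tags[1:]):
            follows[(current, nxt)] += 1
            leaves[current] += 1
    return {pair: n / leaves[pair[0]] for pair, n in follows.items()}


cosine_scores = cooccurrence_cosine(records)
transition_scores = transition_probability(records)
```

On this toy data the co-occurrence score treats "hydraulic" and "leak" symmetrically, while the transition score only has an entry for "hydraulic" followed by "leak" and none for the reverse order. That asymmetry is the kind of difference between the two paradigms that the abstract refers to.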

Citation

Journal of Mechanical Design

Pub Type

Journals

Keywords

network recovery, graph theory, maintenance, random walk

Manufacturing systems design and analysis, Information retrieval, Human language technology, Data and informatics, Complex systems

Citation

Sexton, T. and Fuge, M. (2020), Organizing Tagged Knowledge: Similarity Measures and Semantic Fluency in Structure Mining, Journal of Mechanical Design, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=928903 (Accessed July 30, 2026)

Created January 3, 2020, Updated March 6, 2020