Employing Word-Embedding for Schema Matching in Standard Lifecycle Management

Published

Author(s)

Hakju Oh, Boonserm Kulvatunyou, Albert T. Jones, Tim Finin

Abstract

Today, businesses rely on numerous information systems to achieve their production goals and improve their global competitiveness. Semantically integrating those systems is essential for businesses to achieve both. To do so, businesses must rely on standards, the most important of which are data exchange standards (DES). DES focus on the technical and business semantics that are needed to deliver quality and timely products and services. Consequently, the ability of businesses to quickly use and adapt DES to their innovations and processes is crucial. Traditionally, information standards are managed and used 1) in a platform-specific form and 2) usually with standalone, file-based applications. These traditional approaches no longer meet today's business and information agility needs. For example, businesses now must deal with companies and suppliers that use heterogeneous syntaxes for their information, each optimized for an individual, but different, objective. Moreover, file-based standards and the usage specifications derived from the standards cause inconsistencies, since there is neither a standard format for the usage specifications nor a single source of truth for all of them. As the number and types of information systems grow, developing, maintaining, reviewing, and approving standards and the derived usage specifications are becoming more difficult and time-consuming. Each file-based usage specification is typically based on a syntax different from that of the standard. As a result, each usage specification must be manually updated as the standard evolves; this can cause significant delays and costs in adopting new and better standard versions. The National Institute of Standards and Technology (NIST), in collaboration with the Open Application Groups Inc. (OAGi), has developed a web-based standard lifecycle management tool called SCORE to address these problems.
This paper discusses a particular functionality in the SCORE tool in which a word-embedding technique is employed along with other schema matching approaches. Together, they can assist standard users in updating usage specifications when a new version of a standard is released and in harmonizing multiple standards.
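The abstract does not describe how SCORE combines word embeddings with its other matchers, so the following is only a generic sketch of embedding-based schema matching, not the authors' implementation. It assumes element names are camel-cased, represents each name as the average of its word vectors, and ranks candidate matches by cosine similarity. The vectors in `TOY_VECTORS`, the element names, and all function names are hypothetical and invented for illustration; a real system would load vectors trained on a large corpus, for example with word2vec.

```python
import math
import re

# Hypothetical 5-dimensional word vectors, hand-made for illustration only.
# A real matcher would load vectors trained on a corpus (e.g. word2vec).
TOY_VECTORS = {
    "purchase":   [1.0, 0.0, 0.0, 0.0, 0.2],
    "order":      [0.9, 0.1, 0.0, 0.0, 0.0],
    "id":         [0.0, 1.0, 0.0, 0.0, 0.0],
    "identifier": [0.1, 0.9, 0.0, 0.0, 0.0],
    "ship":       [0.0, 0.0, 1.0, 0.1, 0.0],
    "delivery":   [0.1, 0.0, 0.9, 0.1, 0.0],
    "date":       [0.0, 0.0, 0.0, 1.0, 0.0],
    "buyer":      [0.2, 0.0, 0.0, 0.0, 1.0],
    "name":       [0.0, 0.3, 0.0, 0.0, 0.6],
}
DIM = 5


def tokenize(name):
    """Split a camel-case element name into lowercase word tokens."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", name)
    return [p.lower() for p in parts]


def embed(name):
    """Average the vectors of the tokens that appear in the vocabulary."""
    known = [TOY_VECTORS[t] for t in tokenize(name) if t in TOY_VECTORS]
    if not known:
        return [0.0] * DIM  # no known token: fall back to the zero vector
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(source, targets):
    """Return the (target name, score) pair most similar to the source name."""
    source_vec = embed(source)
    scored = [(t, cosine(source_vec, embed(t))) for t in targets]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    new_version = ["OrderIdentifier", "DeliveryDate", "BuyerName"]
    for old_name in ["PurchaseOrderID", "ShipDate"]:
        match, score = best_match(old_name, new_version)
        print(f"{old_name} -> {match} ({score:.2f})")
```

With these toy vectors, `PurchaseOrderID` is paired with `OrderIdentifier` and `ShipDate` with `DeliveryDate`, even though the names share few or no exact tokens. That is the kind of suggestion a purely string-based matcher can miss, which is why embedding scores are usually combined with other matching signals rather than used alone.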
Citation
Journal of Industrial Information Integration
Volume
38

Keywords

data exchange standard life cycle management, word embedding, AI, word2vec

Citation

Oh, H., Kulvatunyou, B., Jones, A. and Finin, T. (2023), Employing Word-Embedding for Schema Matching in Standard Lifecycle Management, Journal of Industrial Information Integration, [online], https://doi.org/10.1016/j.jii.2023.100547, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=936321 (Accessed April 20, 2024)
Created December 29, 2023, Updated February 26, 2024