A knowledge management system includes: a default knowledge system including: a knowledge system and a knowledge database in communication with the knowledge system; and a knowledge store in communication with the default knowledge system and including: a taxonomy amendment, an annotation amendment, a canonicalization amendment, an ecosystem amendment, a term amendment, and a phrase amendment.
This method is a general approach to generating sets of normalized terminology from a digital corpus of natural language documents in any given domain, addressing the need for flexible, intuitive, reusable, and normalized terminology. The terms that this approach generates are root- and rule-based: they are produced by a series of rules designed to be flexible, to evolve, and, perhaps most importantly, to minimize ambiguity and to reduce semantically similar but syntactically distinct phrases to a normal form. The approach combines several linguistic and computational methods that can be automated with the help of training sets to extract normalized terms quickly and consistently. It differs from traditional term-based approaches to document indexing in that it derives a normalized term structure from noun phrases extracted from the text. The rules-based approach was illustrated using a growing repository of materials science documents at NIST to create a normalized terminology for the domain. Use of the method results in a common, consistent, and evolving set of rules for creating or extending terminology as needed to describe a materials information base. The rules are intended to be simple and generalizable so that they can be applied to other document repositories, enabling cross-repository indexing of documents with the normalized terms. The rules generate terms that facilitate machine processing of the corpus, which enhances targeted information retrieval through search and supports development of a domain-based taxonomy of terms.
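To make the idea of reducing syntactically distinct phrases to a normal form more concrete, the following is a minimal sketch in Python. The source does not list its actual rules, so the three rules used here (lowercasing and splitting on punctuation, dropping determiners, and naively singularizing the head noun) are assumptions chosen only for illustration, and the names `normalize_phrase`, `singularize`, and `group_by_normal_form` are hypothetical.

```python
import re

# Hypothetical rule set: the source does not enumerate its real rules.
DETERMINERS = {"a", "an", "the", "this", "that", "these", "those"}


def singularize(word: str) -> str:
    """Naive plural stripping, standing in for a real root-finding step."""
    if len(word) > 4 and word.endswith("ies"):
        return word[:-3] + "y"
    if len(word) > 3 and word.endswith("s") and not word.endswith(("ss", "us", "is")):
        return word[:-1]
    return word


def normalize_phrase(phrase: str) -> str:
    """Reduce a noun phrase to an assumed normal form."""
    # Rule 1: lowercase and split on anything that is not a letter or digit,
    # so "grain-boundary" and "Grain Boundary" tokenize identically.
    tokens = re.findall(r"[a-z0-9]+", phrase.lower())
    # Rule 2: drop determiners.
    tokens = [t for t in tokens if t not in DETERMINERS]
    if not tokens:
        return ""
    # Rule 3: singularize the head noun (assumed to be the last token).
    tokens[-1] = singularize(tokens[-1])
    return " ".join(tokens)


def group_by_normal_form(phrases):
    """Group surface phrases under their shared normal form."""
    groups = {}
    for phrase in phrases:
        groups.setdefault(normalize_phrase(phrase), []).append(phrase)
    return groups
```

Under these assumed rules, "The Grain Boundaries", "grain-boundary", and "a grain boundary" all map to the single term "grain boundary". A real rule set would likely need part-of-speech information to locate the head noun reliably.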
Refer to Figure 1 below. The knowledge management system manages knowledge and includes default knowledge system 210. Default knowledge system 210 includes knowledge system 228 and knowledge database 232 in communication with knowledge system 228. Also, knowledge management system includes knowledge store 212 in communication with default knowledge system 210. Knowledge store 212 includes a taxonomy amendment store 250, an annotation amendment store 252, a canonicalization amendment store 254, an ecosystem amendment store 256, a term amendment store 258, and a phrase amendment store 259.
Knowledge management system further can include network 216 in communication with default knowledge system 210, input device 220 in communication with default knowledge system 210, output device 224 in communication with default knowledge system 210, or a combination thereof. It is contemplated that default knowledge system 210 can include operating system 236 in communication with knowledge system 228 and knowledge database 232.
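The composition described above can be sketched as a handful of Python dataclasses. This is only a structural illustration: the class and field names are hypothetical, each amendment store is modeled as a plain list of strings because the source does not specify its contents, and "in communication with" is modeled as a simple object reference. Network 216, input device 220, and output device 224 are omitted for brevity.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List, Optional


@dataclass
class KnowledgeDatabase:  # knowledge database 232
    entries: Dict[str, str] = field(default_factory=dict)


@dataclass
class KnowledgeSystem:  # knowledge system 228
    database: KnowledgeDatabase  # "in communication with" as a reference


@dataclass
class DefaultKnowledgeSystem:  # default knowledge system 210
    knowledge_system: KnowledgeSystem
    knowledge_database: KnowledgeDatabase
    operating_system: Optional[str] = None  # optional operating system 236


@dataclass
class KnowledgeStore:  # knowledge store 212
    taxonomy: List[str] = field(default_factory=list)          # store 250
    annotation: List[str] = field(default_factory=list)        # store 252
    canonicalization: List[str] = field(default_factory=list)  # store 254
    ecosystem: List[str] = field(default_factory=list)         # store 256
    term: List[str] = field(default_factory=list)              # store 258
    phrase: List[str] = field(default_factory=list)            # store 259

    def amendment_stores(self) -> Dict[str, List[str]]:
        """Return the six amendment stores keyed by field name."""
        return {f.name: getattr(self, f.name) for f in fields(self)}


@dataclass
class KnowledgeManagementSystem:
    default_system: DefaultKnowledgeSystem
    store: KnowledgeStore


def build_system() -> KnowledgeManagementSystem:
    """Wire the components together with empty contents."""
    db = KnowledgeDatabase()
    default = DefaultKnowledgeSystem(KnowledgeSystem(db), db)
    return KnowledgeManagementSystem(default, KnowledgeStore())
```

Keeping the amendment stores separate from the default knowledge system mirrors the description, in which knowledge store 212 sits alongside default knowledge system 210 and is not part of it.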
Current methods rely on term-based indexing and search, which limits the results of a search. The disclosed method uses automated, phrase-based indexing centered on normalized terminology, thereby providing a more semantically guided search. The approach is unique in that it presents the searched terms in their context of appearance to guide and disambiguate the search. No commercially available tool takes this approach to indexing and searching documents.
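One way to picture phrase-based indexing that returns terms in their context of appearance is a small phrase-in-context index. The Python sketch below is an assumption-laden illustration, not the disclosed implementation: the function name `build_phrase_index` is hypothetical, sentences are split naively on periods, and phrases are matched by lowercase substring search where the method would use its normalization rules.

```python
from collections import defaultdict


def build_phrase_index(docs, phrases):
    """Map each phrase to (doc_id, sentence) pairs in which it appears.

    docs:    dict of document id -> document text
    phrases: iterable of already-normalized, lowercase phrases
    """
    index = defaultdict(list)
    for doc_id, text in docs.items():
        # Simplification: treat every period as a sentence boundary.
        for sentence in text.split("."):
            stripped = sentence.strip()
            lowered = stripped.lower()
            for phrase in phrases:
                # Simplification: substring match instead of rule-based matching.
                if phrase in lowered:
                    index[phrase].append((doc_id, stripped))
    return index


docs = {
    "d1": "Grain boundary diffusion was measured. The sample was annealed.",
    "d2": "We model the grain boundary energy.",
}
index = build_phrase_index(docs, ["grain boundary"])
for doc_id, context in index["grain boundary"]:
    print(doc_id, "|", context)
```

Because each hit carries the sentence it came from, a searcher can see at a glance whether a document discusses grain boundary diffusion or grain boundary energy, which is the kind of disambiguation the paragraph above describes.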