ATLAS: A Flexible and Extensible Architecture for Linguistic Annotation
Published
Author(s)
S Bird, D Day, John S. Garofolo, J Henderson, Christophe Laprun, M Liberman
Abstract
We describe a formal model for annotating linguistic artifacts, from which we derive an application programming interface (API) to a suite of tools for manipulating these annotations. The abstract logical model provides for a range of storage formats and promotes the reuse of tools that interact through this API. We focus first on Annotation Graphs, a graph model for annotations on linear signals (such as text and speech) indexed by intervals, for which efficient database storage and querying techniques are applicable. We note how a wide range of existing annotated corpora can be mapped to this annotation graph model. This model is then generalized to encompass a wider variety of linguistic signals including both naturally occurring phenomena (as recorded images, video, multi-modal interactions, etc.) as well as the derived resources that are increasingly important to the engineering of natural language processing systems (such as word lists, dictionaries, aligned bilingual corpora, etc.). We conclude with a review of the current efforts towards implementing key pieces of this architecture.
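The annotation graph idea in the abstract can be illustrated with a minimal sketch. It is not the ATLAS API: the names `Node`, `Arc`, `AnnotationGraph`, `annotate`, and `overlapping` are hypothetical, and the sketch assumes only what the abstract states, namely that annotations are labeled arcs over intervals of a linear signal and that interval indexing supports querying. The linear scan stands in for the database-backed querying the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical anchor into a linear signal (seconds for speech, characters for text)."""
    ident: str
    offset: Optional[float] = None  # None models a node with no known offset


@dataclass(frozen=True)
class Arc:
    """Hypothetical annotation: a typed, labeled arc spanning two nodes."""
    start: Node
    end: Node
    kind: str   # annotation type, e.g. "word" or "speaker"
    label: str  # annotation content


@dataclass
class AnnotationGraph:
    """Illustrative container; an assumption for this sketch, not the paper's implementation."""
    arcs: List[Arc] = field(default_factory=list)

    def annotate(self, start: Node, end: Node, kind: str, label: str) -> Arc:
        arc = Arc(start, end, kind, label)
        self.arcs.append(arc)
        return arc

    def overlapping(self, lo: float, hi: float, kind: Optional[str] = None) -> List[Arc]:
        """Return arcs whose anchored interval overlaps [lo, hi), in insertion order."""
        hits = []
        for arc in self.arcs:
            if arc.start.offset is None or arc.end.offset is None:
                continue  # unanchored arcs cannot be matched by offset
            if kind is not None and arc.kind != kind:
                continue
            if arc.start.offset < hi and arc.end.offset > lo:
                hits.append(arc)
        return hits


# Two word annotations and one speaker annotation over the same stretch of signal.
n0, n1, n2 = Node("n0", 0.00), Node("n1", 0.42), Node("n2", 0.95)
graph = AnnotationGraph()
graph.annotate(n0, n1, "word", "hello")
graph.annotate(n1, n2, "word", "world")
graph.annotate(n0, n2, "speaker", "A")

words = [arc.label for arc in graph.overlapping(0.40, 0.50, kind="word")]
later = [arc.label for arc in graph.overlapping(0.50, 0.60)]
```

Because both the word arcs and the speaker arc share the same nodes, annotations of different types stay aligned to the same signal without any one layer depending on another.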
Proceedings Title
International Conference on Language Resources and Evaluation | 2nd | | European Language Resources and Evaluation
Volume
3
Conference Dates
May 1, 2000
Conference Location
Athens, 1, GR
Conference Title
International Conference on Language Resources and Evaluation
Bird, S., Day, D., Garofolo, J., Henderson, J., Laprun, C. and Liberman, M. (2000), ATLAS: A Flexible and Extensible Architecture for Linguistic Annotation, International Conference on Language Resources and Evaluation | 2nd | | European Language Resources and Evaluation, Athens, 1, GR, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=151499 (Accessed October 4, 2025)