TAC 2015 Cold Start KBP Track

Overview

The Cold Start KBP track builds a knowledge base from scratch using a given document collection and a predefined schema for the entities and relations that will compose the KB. The Cold Start schema consists of three entity types (PER, ORG, GPE) and binary relation types corresponding to KBP slots (and their inverse slots).

In addition to end-to-end KB construction, the Cold Start track offers two diagnostic tasks: Entity Discovery (ED) and Slot Filling (SF). The Cold Start entity discovery task is to create a KB node for each PER, ORG, and GPE entity mentioned by name in the Cold Start collection of text, and to link all named mentions of the entity to its KB node. The Cold Start slot filling task is to search the same document collection to fill in values for specific slots for specific entities.

Tasks

For all three Cold Start tasks, systems will operate on the same set of approximately 50K English documents consisting of newswire and discussion forum posts. The Cold Start KBP track evaluates systems along two dimensions: slot filling and entity discovery. Cold Start KB Construction systems and Cold Start SF systems are evaluated on the slot filling dimension using the same set of slot filling queries. Cold Start KB Construction systems and Cold Start ED systems are evaluated on the entity discovery dimension using the entity mentions and links in the same set of documents.

The full 2015 Cold Start KB Construction task is identical to the KB variant of the Cold Start 2014 task. Given a collection of documents, the Cold Start KB Construction system must find all PER, ORG, and GPE entities mentioned by name in the collection, create a KB node for each entity, and link each name mention to the correct entity node in the KB. All Cold Start relations that are attested in the document collection must be included in the KB, along with a confidence value and provenance consisting of a document and offsets for the subject, predicate, and object of the relation. For relations whose objects are themselves Cold Start entities (such as per:siblings or org:subsidiaries), systems must link to the KB node representing the correct entity; relations whose objects are strings (such as per:title or org:website) will use strings to represent the object. If a KB includes multiple instances of the same subject-predicate-object triple (but with different provenance information), the triple with the highest confidence value will be evaluated, and additional triples with lower confidence values may be evaluated as resources permit.
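The duplicate-triple rule above can be illustrated with a minimal Python sketch. The `Assertion` record, the `:e1`-style node IDs, and the `doc:start-end` provenance strings are assumptions made for illustration only; they are not the official Cold Start submission format, which is defined in the track guidelines.

```python
from collections import defaultdict
from typing import NamedTuple

class Assertion(NamedTuple):
    subject: str       # KB node ID (":e1" is a made-up ID format)
    predicate: str     # Cold Start relation, e.g. "per:siblings"
    obj: str           # KB node ID for entity-valued slots, plain string otherwise
    confidence: float
    provenance: str    # placeholder "doc:start-end" string, not the official syntax

def rank_by_triple(assertions):
    """Group assertions by (subject, predicate, object), highest confidence first."""
    groups = defaultdict(list)
    for a in assertions:
        groups[(a.subject, a.predicate, a.obj)].append(a)
    return {
        triple: sorted(group, key=lambda a: a.confidence, reverse=True)
        for triple, group in groups.items()
    }

# Toy KB: the same sibling relation is asserted twice with different provenance.
kb = [
    Assertion(":e1", "per:siblings", ":e2", 0.6, "doc42:10-45"),
    Assertion(":e1", "per:siblings", ":e2", 0.9, "doc07:3-30"),
    Assertion(":e1", "per:title", "student", 0.8, "doc42:50-70"),
]
ranked = rank_by_triple(kb)

# The first assertion in each group is the one that would be evaluated first;
# the rest are the lower-confidence instances that may be evaluated later.
primary = {triple: group[0] for triple, group in ranked.items()}
print(primary[(":e1", "per:siblings", ":e2")].provenance)
```

Keeping the full ranked list, rather than discarding the lower-confidence instances, mirrors the statement that additional triples may still be evaluated as resources permit.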

Systems participating in the Cold Start KB Construction task are evaluated on two dimensions: slot filling and entity discovery.

  • Entity Discovery Evaluation: Several hundred documents from the 50K input documents will be selected, and their entity mentions and entity nodes will be evaluated using the clustering metric of the KBP 2014 Entity Discovery and Linking Track.
  • Slot Filling Evaluation: An evaluation query consists of a named entity mention in the document collection (an "entry point"), and a sequence of one or more relation types. The entry point selects a single corresponding entity node in the KB, and the sequence of relation types is followed to arrive at a set of terminal slot fillers at the end of the sequence. The terminal slot fillers are then assessed and scored as in the English slot filling task. For example, a typical query may ask "What are the ages of the siblings of the Bart Simpson mentioned in Document 42?" or "What schools are attended by the children of the Marge Simpson mentioned in Document 9?" Such "two-hop" queries will verify that the knowledge base is well-formed in a way that goes beyond basic entity linking and slot filling. Each evaluation query may have multiple entry points (i.e., multiple mentions of the same entity), in order to mitigate cascaded errors caused by submitted KBs that are not able to link every name mention to a KB entity node.
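The way a multi-hop query is resolved against a submitted KB can be sketched roughly as follows. This is an illustration under stated assumptions, not the official scorer: the dictionary layout, the `answer_query` name, the node IDs, and the policy of using the first entry point that the KB has linked are all hypothetical.

```python
def answer_query(mention_to_node, edges, entry_points, hops):
    """Follow a sequence of relation types from an entry point to terminal fillers.

    mention_to_node: {(doc_id, start, end): node_id} for linked name mentions
    edges:           {(node_id, relation): set of objects (node IDs or strings)}
    entry_points:    mentions of the query entity, tried in order
    hops:            relation types to follow, e.g. ["per:siblings", "per:age"]
    """
    # Assumption: use the first entry point that the KB managed to link.
    start = next(
        (mention_to_node[m] for m in entry_points if m in mention_to_node), None
    )
    if start is None:
        return set()  # no entry point is linked, so the query finds nothing
    frontier = {start}
    for relation in hops:
        # Replace the frontier with every object reachable via this relation.
        frontier = {
            obj for node in frontier for obj in edges.get((node, relation), set())
        }
    return frontier

# Toy KB for "What are the ages of the siblings of the Bart Simpson in Document 42?"
mention_to_node = {("doc42", 100, 111): ":bart"}
edges = {
    (":bart", "per:siblings"): {":lisa", ":maggie"},
    (":lisa", "per:age"): {"8"},
    (":maggie", "per:age"): {"1"},
}
# The first entry point is not linked in this KB; the second one is.
entry_points = [("doc42", 5, 16), ("doc42", 100, 111)]
fillers = answer_query(mention_to_node, edges, entry_points, ["per:siblings", "per:age"])
print(sorted(fillers))
```

The unlinked first entry point shows why queries carry several mentions of the same entity: a KB that missed one mention can still be reached through another.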

The 2015 Cold Start Slot Filling (SF) task removes the requirement that an entire text collection must be processed. Instead, Cold Start SF participants will receive the evaluation queries, and need only produce those entities and relations that would be found by the queries. A TAC slot filling system can easily be applied to this task by running initially from each evaluation query entry point, then recursively applying the slot filler to the identified entities. The 2015 Cold Start Slot Filling task is essentially the same as the Slot Filling variant of Cold Start 2014, except that systems must now explicitly return the entity type of the slot filler (PER, ORG, or GPE).
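The recursive use of a slot filler described above might look like the sketch below. The `slot_filler` callable, its `(filler, entity_type)` return shape, the response tuple layout, and the toy answers (including the school name) are invented for illustration; a real system would search the document collection and follow the response format in the track guidelines.

```python
def run_hops(slot_filler, entry_entity, hops):
    """Apply a slot filler at the entry point, then again to each filler found.

    slot_filler(entity, slot) is assumed to return (filler, entity_type) pairs.
    Returns one tuple per response: (hop index, queried entity, slot, filler, type).
    """
    frontier = [entry_entity]
    responses = []
    for hop, slot in enumerate(hops):
        next_frontier = []
        for entity in frontier:
            for filler, filler_type in slot_filler(entity, slot):
                responses.append((hop, entity, slot, filler, filler_type))
                if filler not in next_frontier:
                    next_frontier.append(filler)  # queried at the next hop
        frontier = next_frontier
    return responses

# Toy stand-in for a real slot filling system (hard-coded, hypothetical answers).
TOY_ANSWERS = {
    ("Marge Simpson", "per:children"): [("Bart Simpson", "PER"), ("Lisa Simpson", "PER")],
    ("Bart Simpson", "per:schools_attended"): [("Springfield Elementary School", "ORG")],
    ("Lisa Simpson", "per:schools_attended"): [("Springfield Elementary School", "ORG")],
}

def toy_slot_filler(entity, slot):
    return TOY_ANSWERS.get((entity, slot), [])

responses = run_hops(
    toy_slot_filler, "Marge Simpson", ["per:children", "per:schools_attended"]
)
for response in responses:
    print(response)
```

Only the entities reachable from the query are ever processed, which is the sense in which the SF task avoids building a KB for the whole collection.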

The 2015 Cold Start Entity Discovery (ED) task is to create a KB node for each PER, ORG, and GPE entity mentioned by name in the Cold Start collection of text, and to link all named mentions of the entity to its KB node. Cold Start ED is essentially the same as the TAC KBP 2014 English Entity Discovery and Linking task, except that all entity mentions are clustered rather than linked to an existing reference KB. Also, a larger number of documents must be processed (~50K), though scoring will consider only several hundred of those documents.
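To make the expected output shape concrete, here is a deliberately naive sketch that clusters name mentions by entity type and normalized name string. Exact string matching is an assumption chosen for brevity and is not how competitive systems perform cross-document coreference; the mention tuple layout and the `:e0`-style node IDs are also hypothetical.

```python
from collections import defaultdict

def cluster_mentions(mentions):
    """Cluster mentions into KB nodes by (entity type, normalized name).

    mentions: iterable of (doc_id, start, end, name, entity_type) tuples.
    Returns {node_id: [(doc_id, start, end), ...]}.
    """
    clusters = defaultdict(list)
    for doc_id, start, end, name, entity_type in mentions:
        # Normalize case and whitespace so trivially different spellings merge.
        key = (entity_type, " ".join(name.lower().split()))
        clusters[key].append((doc_id, start, end))
    # Assign made-up node IDs in a deterministic (sorted-key) order.
    return {f":e{i}": spans for i, (_, spans) in enumerate(sorted(clusters.items()))}

mentions = [
    ("doc1", 0, 12, "Bart Simpson", "PER"),
    ("doc1", 30, 41, "Springfield", "GPE"),
    ("doc2", 40, 52, "bart  simpson", "PER"),
]
nodes = cluster_mentions(mentions)
for node_id, spans in nodes.items():
    print(node_id, spans)
```

Because there is no reference KB to link against, the node IDs carry no meaning of their own; only the grouping of mentions matters.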

Preliminary Schedule

    April 15: Track registration opens
    Mid April: Track guidelines posted
    June 30: Deadline for registration for track participation
    August 3-31: Cold Start KBP evaluation window
    By mid October: Release of individual evaluated results to participants
    October 20: Deadline for short system descriptions
    October 20: Deadline for workshop presentation proposals
    October 25: Notification of acceptance of presentation proposals
    November 8: Deadline for system reports (workshop notebook version)
    November 16-17: TAC 2015 workshop in Gaithersburg, Maryland, USA
    March 5, 2016: Deadline for system reports (final proceedings version)

Track Coordinators

James Mayfield (Johns Hopkins University, jamesmayfield@gmail.com)
Ralph Grishman (New York University, grishman@cs.nyu.edu)


Last updated: Tuesday, 28-Mar-2017 11:22:18 EDT
Comments to: tac-web@nist.gov