Guided Data Capture

Guided Data Capture Software

Current Version

Current Version: GDC 5.0 (September 2007). Users with a previous version should uninstall the old version of GDC before installing the current version.

ThermoML Support: GDC allows export of captured data directly into ThermoML format. This feature is accessed through the File Menu of the main program screen. Users providing output data files to NIST for newly accepted journal articles should continue to provide these in the text output format (*.txt).

GDC 5.0: The new GDC 5.0 provides data-capture capabilities for selected biothermodynamic properties, including properties of enzyme-catalyzed reactions, reaction properties determined with titration calorimetry, properties determined with differential scanning calorimetry (DSC), and solubilities in complex media. Background information is provided here. Specific examples for biothermodynamic properties are included in the GDC Help Files.

Contact Information: Please report any problems with installation or use of the GDC 5.0 software to Rob Chirico at NIST/TRC (chirico [at]


What is the goal of this program?

The goal of the Guided Data Capture (GDC) program is to serve as a bridge between two communities: (1) scientists and engineers involved directly in property measurements, and (2) scientists, engineers, and process designers involved in property utilization in a variety of applications. It is planned to make the submitted data files available to scientists and engineers worldwide in a standardized XML format directly suitable for a variety of applications.

What does the program do?

The program provides software tools to ease capture ("compilation") of "experimental" thermodynamic and transport-property data from the literature for pure compounds, binary and ternary mixtures, and chemical reactions (including change-of-state and equilibrium) under strict data-quality assurance (DQA) guidelines. Reactions with as many as eight participants can be accommodated. The program ensures full specification of properties with traceability of numerical values to the original document.


(This information can also be found in the HELP menu of the main screen of GDC.)

What is the program output?

Data are captured in the form of a "batch data file," which is a coded text document containing all captured numerical values and metadata for a given journal article. The file is named automatically by the program based upon the year, the first three letters of each of the first two author names, and a "sort number" (e.g., file name = 2002xxxyyy0.txt). Knowledge of the formats and codes of the batch data file is not required, nor is knowledge related to the structure of a particular data repository.
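The naming pattern described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the GDC implementation: the function name `batch_file_name`, the use of surnames, the lowercase letters, and the handling of short names or single-author articles are all assumptions.

```python
def batch_file_name(year: int, author_surnames: list[str], sort_number: int = 0) -> str:
    """Build a name of the form <year><xxx><yyy><sort number>.txt.

    Hypothetical helper: GDC names the file automatically, and its exact
    rules (letter case, short names, single authors) are not documented here.
    """
    # First three letters of each of the first two author names (assumed lowercase).
    prefixes = [name[:3].lower() for name in author_surnames[:2]]
    return f"{year}{''.join(prefixes)}{sort_number}.txt"


print(batch_file_name(2002, ["Smith", "Jones"], 0))  # 2002smijon0.txt
```

The sort number presumably distinguishes files that would otherwise share the same year and author prefixes.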

How much time will this take?

The tools provided should limit the time for complete data entry to not more than 1 hour for most journal articles.

What information is necessary to run the program?

Only the content of a journal article is needed. No additional information is necessary.

How does the program work?

In a series of guided steps you will be asked for the items listed below. Words in ALL CAPS correspond to toolbar labels on the main screen. This list is provided to show the general types of information captured.

Details of specific operations are given in short HELP screens on the relevant forms. First-time users are encouraged to make use of these HELP screens.

  1. REFERENCE information (title, authors, keywords, abstract, etc.)
    • An integrated database of author names is included to ease selection
    • Search and selection mechanisms are described on the appropriate form
  2. COMPOUND identifications (name or empirical formula)
    • An integrated database of >100,000 compounds is included to ease selection
    • Search and selection mechanisms are described on the appropriate form
  3. SAMPLE information for each compound (source, purity, purification method, analytical method)
    • Methods may be selected from pre-defined lists or entered directly
  4. MIXTURE component identification (if applicable)
    • "Components" are selected from previously identified "compounds"
  5. PROPERTY specification (specification of phases, variables, constraints, etc.)
    • All property, phase, variable, and constraint selections are made from pre-defined lists
    • Brief experimental-method descriptions are entered through pre-defined lists or direct typing
    • Uncertainty estimates for independent variable, constraint, and property values are entered as percentages or absolute values
    • VLE and LLE data are entered with a special DATA TABLES form
  6. Numerical property values
    • See "How do I enter numerical data values?" below
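Step 5 notes that uncertainty estimates may be entered either as percentages or as absolute values. A minimal sketch of how the two forms relate is shown below. The function name `to_absolute_uncertainty` and its parameters are hypothetical and are not part of GDC.

```python
def to_absolute_uncertainty(value: float, uncertainty: float, is_percent: bool) -> float:
    """Return the uncertainty expressed in the same units as `value`.

    Hypothetical helper for illustration only.
    """
    if uncertainty < 0:
        raise ValueError("uncertainty must be non-negative")
    if is_percent:
        # A percentage uncertainty scales with the magnitude of the value.
        return abs(value) * uncertainty / 100.0
    # An absolute uncertainty is already in the units of the value.
    return uncertainty


print(to_absolute_uncertainty(200.0, 0.5, is_percent=True))    # 1.0
print(to_absolute_uncertainty(200.0, 0.02, is_percent=False))  # 0.02
```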

How do I enter numerical data values?

Numerical data can be entered by direct typing or, preferably, by direct "copy-and-paste" operations from pre-existing tables in electronic files (HTML, PDF, ASCII, EXCEL, WORD, etc.). The GDC software is designed to minimize manual input (i.e., typing), which not only saves time but also avoids common typing errors. Extensive tools are provided for this direct incorporation of numerical data.
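To illustrate why copy-and-paste avoids retyping, the sketch below turns a block of pasted table text into rows of numbers. It is not GDC's own parser: the function name `parse_pasted_table`, the whitespace-delimited layout, and the rule of skipping any non-numeric line (such as a header) are assumptions made for the example.

```python
def parse_pasted_table(text: str) -> list[list[float]]:
    """Convert pasted, whitespace- or tab-delimited text into rows of floats.

    Hypothetical helper: lines containing any non-numeric cell are skipped.
    """
    rows: list[list[float]] = []
    for line in text.splitlines():
        cells = line.split()  # splits on tabs and runs of spaces
        if not cells:
            continue  # ignore blank lines
        try:
            rows.append([float(cell) for cell in cells])
        except ValueError:
            continue  # skip header or other non-numeric lines
    return rows


pasted = "T/K\tp/kPa\n298.15\t3.17\n303.15\t4.25\n"
print(parse_pasted_table(pasted))  # [[298.15, 3.17], [303.15, 4.25]]
```

The values in `pasted` are made-up placeholders rather than data from any article.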

What should I do with the generated "batch data file"?

Submission of the batch data file should be done in accordance with the instructions provided by the editorial board of the journal.

HELP files

We have developed a collection of PDF files that provide detailed examples of how to complete the various tasks involved in the data-capture process. All of these tutorials include "screen shots" to familiarize you with the program. They cover such operations as downloading and installing the software, capturing bibliographic information, and specifying compounds and properties to be captured. Other help files illustrate data capture for specific properties. This collection of HELP files will be expanded over time to meet the needs of the data community.


(This information can also be found in the HELP menu of the main screen of GDC.)

What chemicals and chemical systems are included?

The primary focus is molecular organic compounds. Some common inorganic compounds are included, such as H2O, O2, N2, CO2, CO, NH3, H2S, SO2, Cl2, Br2, I2, F2, HF, HCl, HBr, and HI, as well as "ionic liquids," organometallic compounds, and molecular biochemicals. Mixtures involving other inorganics with organic compounds may be included.

What chemicals and chemical systems are NOT included?

Polymers, radicals, metals, inter-metallics, salts, metal oxides, etc., and all ionic systems including salt and acid solutions, are not captured.

What properties are captured?

The Complete List of Properties shows all of the properties that can be captured. This information is also provided in the HELP Menu on the main screen of the GDC program. The list includes all common thermochemical, thermophysical, and transport properties.

What types of property values are NOT captured?

The focus of the GDC is properties determined by direct experimental measurement; therefore, most derived properties are not captured. (See below for the derived properties that should be captured.) The following property types should NOT be captured:

  • Estimations & correlations (group contributions, corresponding states, etc.)
  • Theoretical calculations (ab initio, semi-empirical, statistical, etc.)
  • Equations & Equation parameters (Antoine constants, interaction parameters, etc.)
  • Spectroscopic properties (vibrational assignments, structures, barriers, etc.)
  • Chemical kinetic data (rate constants, activation energies, etc.)
  • Rheological properties other than viscosity
  • Surface properties other than surface and interfacial tension of liquids
  • Decomposition temperatures, flammability limits, auto-ignition temperatures, octane numbers

Derived property data that SHOULD be captured:

Although the focus of the GDC is properties determined by direct experimental measurement, authors are encouraged to include key derived property values. These are:

  • Azeotropic properties
  • Henry's Law Constants
  • Virial Coefficients (for pure compounds & mixtures)
  • Activities & Activity Coefficients
  • Fugacities & Fugacity Coefficients
  • Standard properties derived from high-precision adiabatic heat-capacity calorimetry {S(T), H(T)-H(0), etc.}

GDC ReadMe

Here is a link to an HTML version of the README file that comes with the GDC software. This file addresses various problems that may arise during installation.

DOWNLOAD the GDC software:


Created March 26, 2012, Updated September 21, 2016