An Introduction to the Common Data Format Project

READ-ONLY SITE MATERIALS: Historical voting TWiki site (2015-2020) ARCHIVED from https://collaborate.nist.gov/voting/bin/view/Voting

This page presents an introduction to the common data format (CDF) project and answers the question, why is a CDF needed? It starts with a discussion of the proprietary nature of current election equipment data formats and the resultant problems this creates, and then discusses the advantages to using a common data format and how this improves efficiency, reduces cost, and increases the overall transparency of election equipment and, ultimately, elections. It concludes with an overview of the NIST/Interoperability Working Group's plan for building a set of CDF specifications.

Why is a CDF Needed for Election Equipment?

Advantages to Using a CDF

Plan for Building CDF Interoperability

Why the Need for a CDF for Election Equipment

The need for a common data format is analogous to the use of a common language for people and economies to share the best of ideas, products, and services. A language used exclusively by a few isolates people from the rest of what the world has to offer. As the demand for and use of technology in elections increase, election officials are using a variety of new products that must be able to “talk” with each other (i.e., share data) or talk with a common host in order to be integrated into the entire election administration process. Since the "data language" used by these products tends to be proprietary and doesn’t communicate with products from another manufacturer, election officials can be limited to the voting system product line available through the manufacturer they already have a relationship with.

In the elections marketplace, this has several disadvantages:

  • Election officials can be "locked" into a single manufacturer’s product line by decisions made years ago in their jurisdiction when the needs were different. The cost of converting the jurisdiction’s entire product line to another manufacturer and the top to bottom change in procedures often required could be prohibitive.
  • Election officials may not have an opportunity to shop for the more appropriate product to meet their needs if they are limited to using only the products offered by their current manufacturer.
  • Smaller companies that might focus on a single product can be locked out because of the lack of interoperability with manufacturer product lines.
  • The addition of newer devices such as online blank ballot distribution or tablet technology is complicated because each project has to either define its own data format or use a format that is proprietary to an existing manufacturer.

Without a CDF, duplication of effort, greater risk of getting things wrong, and re-invention of the wheel can occur. Several examples suffice:

Example 1:
A jurisdiction’s voter registration system and candidate filing system may both contain the information needed by its ballot layout system when it comes time to create a new election. These are actually "families" of equipment, and due to a lack of a CDF between the two families, jurisdictions that import/export information between them have commonly had to expend their own resources to create a sort of "translation service" between the two that is unique to that jurisdiction’s situation. Or, some jurisdictions may have to duplicate their data entry by re-creating the information in the ballot layout system more or less manually. Obviously, a CDF between these families of equipment would largely eliminate this duplication of effort and reduce the potential for mistakes.
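To make Example 1 concrete, here is a minimal sketch of what replacing a jurisdiction-specific "translation service" with a shared record might look like. Everything in it is an assumption made for illustration: the `CandidateRecord` fields, the filing system's column names (`office_cd`, `first_nm`, `last_nm`, `pty`), and the JSON encoding are hypothetical and are not taken from any published CDF specification.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, asdict


# Hypothetical common record. The field names are illustrative only.
@dataclass(frozen=True)
class CandidateRecord:
    contest_id: str
    candidate_name: str
    party: str


def from_filing_system(row: dict) -> CandidateRecord:
    """Adapter for an imagined candidate filing export with made-up column names."""
    return CandidateRecord(
        contest_id=row["office_cd"],
        candidate_name=f'{row["first_nm"]} {row["last_nm"]}',
        party=row["pty"],
    )


def export_common(records: list[CandidateRecord]) -> str:
    """Serialize records into the shared (hypothetical) interchange format."""
    return json.dumps([asdict(r) for r in records], sort_keys=True)


def import_into_ballot_layout(payload: str) -> dict[str, list[str]]:
    """Imagined ballot layout importer: groups candidate names by contest."""
    contests: dict[str, list[str]] = {}
    for item in json.loads(payload):
        record = CandidateRecord(**item)
        contests.setdefault(record.contest_id, []).append(record.candidate_name)
    return contests


filing_rows = [
    {"office_cd": "MAYOR", "first_nm": "Ada", "last_nm": "Reyes", "pty": "IND"},
    {"office_cd": "MAYOR", "first_nm": "Sam", "last_nm": "Okoye", "pty": "IND"},
]
payload = export_common([from_filing_system(r) for r in filing_rows])
layout = import_into_ballot_layout(payload)
```

In this sketch the ballot layout importer only ever reads the shared record, so each exporting system needs a single adapter rather than a custom translator for every pair of systems.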

Example 2:
During an election, vote-capture devices such as DREs, POS, and CCOS (Direct-Recording Electronic, Precinct Optical Scanner, and Central-Count Optical Scanner, respectively) communicate the votes they gather to a common EMS (Election Management System), where the votes are tabulated and reported. Prior to the election, the EMS communicates a complex set of ballot configurations to these devices that define the ballot styles for the votes each device will be used to gather. Typically the EMS that creates the ballot styles and communicates them to the devices is the same system that receives the votes gathered by the devices. Without a common data format, and given that the communication protocol, data structure, and data elements are unique to each manufacturer, the DRE, POS, and CCOS in a jurisdiction will generally be products developed, marketed, and/or integrated by the same manufacturer that created the EMS. If a jurisdiction wants to use a different DRE, POS, or CCOS than the one provided by its manufacturer, integrating the product may be very costly. The same issue occurs with EMSs: a state must either use the same manufacturer throughout the state or import/export tabulated results to intermediate formats.
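The device-to-EMS flow in Example 2 can be sketched as follows, assuming devices from different manufacturers all export their vote records in one shared JSON shape. The record layout (`device_id`, `selections`) and the device names are hypothetical and chosen only for illustration; a real CDF specification would define its own structure.

```python
import json
from collections import Counter


def tabulate(exports):
    """Count selections per (contest, choice) across any number of device exports.

    Each export is a JSON array of records in the same hypothetical shape,
    so the tabulator does not need to know which manufacturer produced it.
    """
    totals = Counter()
    for export in exports:
        for record in json.loads(export):
            for contest_id, choice in record["selections"].items():
                totals[(contest_id, choice)] += 1
    return totals


# Imagined exports from a DRE and a precinct scanner built by different vendors.
dre_export = json.dumps([
    {"device_id": "dre-01", "selections": {"MAYOR": "Ada Reyes"}},
    {"device_id": "dre-01", "selections": {"MAYOR": "Sam Okoye"}},
])
scanner_export = json.dumps([
    {"device_id": "scan-07", "selections": {"MAYOR": "Ada Reyes"}},
])

totals = tabulate([dre_export, scanner_export])
```

Because both exports share one shape, adding a device from a third manufacturer would not require any change to `tabulate` in this sketch.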

Example 3:
A number of states, motivated by the MOVE (Military and Overseas Voter Empowerment) Act's requirements and the need to deliver blank ballots to voters located overseas in ample time for the election, are now fielding blank ballot distribution systems (BBDS) to allow voters to download their respective ballots; some states are also experimenting with using tablets (e.g., the iPad) to assist in blank ballot delivery. These technologies each need to "speak" to other parts of the voting system. For example, the BBDS needs to access ballot information stored in the EMS and voter/precinct information stored in the VRDB (Voter Registration Database). However, without a common data format, the data format either needs to be invented for that particular project or will be the proprietary format of one of the manufacturer partners, e.g., the data format used by the EMS. The latter typically requires a business arrangement between the project developer and the owner of the proprietary format.

Advantages to Using a CDF

There are numerous advantages to using a common data format in election equipment and associated software and systems. Perhaps the most important is that it makes the devices easier to use, deploy, and administer. The resulting reduction in complexity may in turn lead to greater trust in the voting devices. Thus, a CDF is foundational in a number of ways for improving current voting systems and for making it possible to develop new voting technologies in an efficient, orderly way. It follows from the previous discussion that use of a CDF brings the following advantages:

  • Anyone can build or sell a device; no manufacturer gets locked out of the market.
    Support of a CDF in manufacturer equipment results in interoperability of data format and permits new manufacturers to sell equipment to states or jurisdictions where they were formerly locked out. It allows small manufacturers to build one-off devices as opposed to having to build a complete suite of products.
  • Election officials are empowered to buy whatever devices best suit their needs.
    When there is interoperability, election officials can then shop for the devices that best suit the needs of the voters, regardless of manufacturer. If, for example, a particular accessible voting device is deemed better for a particular state, election officials can now use this system regardless of who has manufactured it.
  • Software and new system developers can write applications that make use of the CDF.
    The CDF specifications are freely available, and developers/integrators of new equipment and software can use them to interface to other manufacturer equipment. This prevents the continual "re-invention of the wheel" that occurs when new systems must develop their own format.
  • Elections can be audited and analyzed and archived more easily.
    Voting devices store a number of data elements that are important and useful for election audit and analysis, but these items are sometimes not easily accessible due to their proprietary formats, e.g., event logs. When a CDF export format is provided for a class of voting devices, manufacturers can then build in the export capability for these elements. In other words, "Build it and they will come."
  • Device certification is possible.
    The EAC certifies voting systems, that is, complete systems of devices to run an election. Certifications can be as expensive as $2 to $4 million and may take several years. If a state wishes to use a new device in a certified voting system, it may "break" the certification because of the resultant changes that would need to occur in order to make the new device operate with the other devices in the system. With interoperability, however, a device itself could be certified and used in an existing voting system without breaking the certification.
  • Voting equipment testing is easier with common formats and imports/exports.
    When devices have a common import/export format capability, tests can be made more uniform and devices can more easily be tested against common collections of data. Outputs from devices can be analyzed with more consistency.
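As a rough illustration of the testing advantage, the sketch below runs one uniform check against exports from any device. The required field names and types (`device_id`, `ballot_style_id`, `selections`) are assumptions invented for this example and do not come from a CDF specification.

```python
import json

# Hypothetical required fields and their expected JSON types.
REQUIRED_FIELDS = {"device_id": str, "ballot_style_id": str, "selections": dict}


def find_problems(payload: str) -> list:
    """Return human-readable problems found in a common-format export."""
    problems = []
    for index, record in enumerate(json.loads(payload)):
        for field, expected_type in REQUIRED_FIELDS.items():
            if field not in record:
                problems.append(f"record {index}: missing {field}")
            elif not isinstance(record[field], expected_type):
                problems.append(
                    f"record {index}: {field} is not {expected_type.__name__}"
                )
    return problems


# The same check applies to every device's export, regardless of manufacturer.
good_export = json.dumps(
    [{"device_id": "scan-07", "ballot_style_id": "BS-1", "selections": {}}]
)
bad_export = json.dumps([{"device_id": 7, "selections": {}}])

good_problems = find_problems(good_export)
bad_problems = find_problems(bad_export)
```

A test lab could keep one shared collection of such exports and expected results, then apply the same checks to each new device instead of writing device-specific parsers.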

Plan for Building CDF Interoperability

Achieving the goal of election device-to-device interoperability, using a CDF in part as the means, is complicated and requires the involvement and cooperation of many parties. Given this, NIST and the Interoperability Working Group have developed a project plan that consists of a series of CDF specifications that are modeled after typical functions in elections and where interoperability between the functions and devices involved would be advantageous. This strategy involves addressing the “lower hanging fruit” functions that are at voting system boundaries to external systems, e.g., for imports from voter registration systems or for reporting election results, and then subsequently working towards those functions and devices that are within the voting system and that would achieve more device-to-device interoperability, e.g., cast vote records exported from scanners and EMS and imported into tabulators and audit devices. Addressing these standards in parallel as opposed to serially allows more flexibility and capability to take advantage of external assistance or collaboration with other interested parties or coalitions, most notably the Pew/Google VIP project and other projects underway with states and counties. NIST's plan is to issue each specification as a NIST Special Publication 1500 series document.

An Introduction to the Common Data Format Project image 01

The figure above shows a diagram of common election subsystems and how they are typically related, as well as the primary classes of data processed by the subsystems. The common data format work generally maps to the subsystems in this diagram, but may possibly overlap one or more of the subsystems or may target only a subset of the data processed by a subsystem.
 


Voting TWiki Archive (2015-2020): read-only, archived wiki site, National Institute of Standards and Technology (NIST)


ARCHIVE SITE DESCRIPTION AND DISCLAIMER

This page, and related pages, represent archived materials (pages, documents, links, and content) that were produced and/or provided by members of public working groups engaged in collaborative activities to support the development of the Voluntary Voting System Guidelines (VVSG) 2.0. These TWiki activities began in 2015 and continued until early 2020. During that time period, this content was hosted on a Voting TWiki site. That TWiki site was decommissioned in 2020 due to technology migration needs. The TWiki activities that generated this content ceased to operate actively through the TWiki at the time the draft VVSG 2.0 was released, in February of 2020. The historical pages and documents produced there have been archived now in read-only, static form.

  • The archived materials of this TWiki (including pages, documents, links, content) are provided for historical purposes only.
  • They are not actively maintained.
  • They are provided "as is" as a public service.
  • They represent the "work in progress" efforts of a community of volunteer members of public working groups collaborating from late 2015 to February of 2020.
  • These archived materials do not necessarily represent official or peer-reviewed NIST documents nor do they necessarily represent official views or statements of NIST.
  • Unless otherwise stated these materials should be treated as historical, pre-decisional, artifacts of public working group activities only.
  • NIST MAKES NO WARRANTY OF ANY KIND, EXPRESS, IMPLIED OR STATUTORY, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT AND DATA ACCURACY.
  • NIST does not warrant or make any representations regarding the correctness, accuracy, reliability or usefulness of the archived materials.

ARCHIVED VOTING TWIKI SITE MATERIALS

This wiki was a collaborative website. NIST does not necessarily endorse the views expressed, or concur with the facts presented on these archived TWiki materials. Further, NIST does not endorse any commercial products that may be mentioned in these materials. Archived material on this TWiki site is made available to interested parties for informational and research purposes. Materials were contributed by Participants with the understanding that all contributed material would be publicly available. Contributions were made by Participants with the understanding that no copyright or patent right shall be deemed to have been waived by such contribution or disclosure. Any data or information provided is for illustrative purposes only, and does not imply a validation of results by NIST. By selecting external links, users of these materials will be leaving NIST webspace. Links to other websites were provided because they may have information that would be of interest to readers of this TWiki. No inferences should be drawn on account of other sites being referenced, or not referenced, from this page or these materials. There may be other websites or references that are more appropriate for a particular reader's purpose.

 

Created August 28, 2020, Updated February 5, 2021