Human Factors Test Suite Version 2.0-1 for the Usability, Accessibility, and Privacy Requirements of the VVSG-NI, Version 2.0

Introduction

The purpose of this document is to describe specific test methods for all the Usability and Accessibility requirements within the VVSG (Part 1, Chapter 3). For each such requirement, there are instructions for how a test lab (or any other testing agent) should go about determining whether or not the voting system under test (VSUT) meets that requirement. (Of course, as with all conformance testing, one cannot be certain that a given system meets the requirements in all circumstances, only that the VSUT is successful under the particular conditions actually tested. However, a failed test does constitute proof that the VSUT does not meet the requirement.)

Intro.1 Background of VVSG Testing

By authorization of the 2002 Help America Vote Act (HAVA), NIST is assisting the Election Assistance Commission (EAC) with the implementation of Voluntary Voting System Guidelines (VVSG) for states and local governments conducting Federal elections. The EAC's Technical Guidelines Development Committee (TGDC) in collaboration with NIST researchers has developed a draft of the next iteration of the VVSG. The draft document is a set of detailed technical requirements addressing core requirements, human factors, privacy, security, and transparency of the next generation of voting systems. The EAC plans to issue the next VVSG after receiving and reviewing public comments.

NIST is developing a set of uniform public test suites to be used as part of the EAC's Testing and Certification Program. Test Labs will be able to use these freely available test suites to help determine that VVSG requirements are met by voting systems. The test suites address human factors, security and core functionality requirements for voting systems as specified in the VVSG. Use of the public test suites will produce consistent results and promote transparency of the testing process. The test suites can also assist manufacturers in the development of conforming products by providing precise test specifications. Also, they will help reduce the cost of testing since each test lab would no longer need to develop its own test suites. Finally, a uniform set of public test suites can increase election officials' and voters' confidence that voting systems conform to VVSG requirements.

Intro.2 Structure of this Document

Following this introductory section, Part 1 lists each requirement number and title, followed by either a Single Requirement Test Method (SRTM) or a Combined Requirement Test Method (CRTM). An SRTM directly describes the way in which its requirement is to be tested.

However, it is often much easier to test a group of requirements together in a single test scenario. In those cases, the requirements within the group all have a link pointing to the CRTM that they share. Part 2 of this document describes those CRTMs. Each CRTM description includes a list of the requirements it covers. In a few cases, a requirement may be tested in more than one CRTM; if so, it will have a list of links to point to all the relevant CRTMs. Thus, there is a many-to-many relation between the requirements and CRTMs.

Throughout this document, we use "test method" to mean either an SRTM or a CRTM.

If you are using the interactive HTML version of this document, here are the navigation rules:

A link that is a requirement number (e.g. "3.2.2-A") will jump to the entry in Part 1 for that requirement.
A link that is a requirement title (e.g. "Ballot Editing per Contest") will jump to the entry for that requirement in the VVSG itself, either in a separate browser tab or a separate window (depending on your browser settings).
A link that is a CRTM title (e.g. "Editable Ballot Session") will jump to the entry in Part 2 for that CRTM.
Also, notice that there is a table of contents located at the beginning of both Part 1 and Part 2.

Intro.3 Tester Qualifications

All the tests require general familiarity with voting systems and procedures, with conformance testing, with the requirements of the VVSG, and with usability and human factors. Certain tests also require special expertise (such as the operation of technical equipment). When such expertise is called for, it will be noted explicitly within the test method.

Furthermore, since many of the tests involve the tester acting as a voter, the tester should have no serious perceptual or cognitive disabilities, and must be fully literate in English. In particular, the tester's corrected vision must be no worse than 20/40.

Intro.4 Measurement Equipment Needed

Various tests require certain kinds of technical equipment. Among these are:

Small ruler
15x magnifier
Tape measure
Stopwatch
Level
Oscilloscope
Photometer
A broadband instrument for measuring sound volume, using an A-weighting filter (as per IEEE 269)
Force gauge

Intro.5 General Rules and Background Assumptions for Testing

Intro.5.1 Rules for All VVSG Testing

The following principles apply to all of the VVSG tests.

Read the VVSG: The full wording of requirements and accompanying discussions is not repeated herein. It is assumed that as the test lab proceeds through the test procedures, it is consulting the official VVSG text of the requirements being addressed. The test procedures cannot be correctly understood in isolation from the underlying VVSG.
Use of Judgment: Although the purpose of this document is to lay out defined and repeatable procedures for testing a voting system against the VVSG, the task of determining conformance is not one that can always be done "mechanically". The tester may need to apply reasoned judgment when performing the testing, taking into account the general meaning and purpose of the requirement under test.
Significant Difficulty: Many of the VVSG requirements stipulate that voters must be able to perform certain functions. This does not always provide an unambiguous, "bright-line" test. Herein, we adopt the notion of "significant difficulty". The features provided by the system need not provide an effortless experience - e.g. the voter may have to figure out how to write in a candidate or change a selection - but the feature provided must not be excessively clumsy or complex for the voter.
Use of "Applies to" Clause: The "Applies to" clause of each requirement also governs which test methods are to be executed. E.g. if a requirement applies only to VEBD-A systems, then the corresponding test shall be executed if and only if the VSUT is a VEBD-A system (i.e. has an editable ballot and audio interface).
Serendipitous Detection of Failure: Although each test method is designed for specific requirements, it may also reveal violations of other requirements. These violations are to be noted by the tester and are counted as failures, just as if they had been the explicit purpose of the test.
Abandoning a Test Method: Some of the test methods have later tests that are dependent on earlier parts of the sequence. In general, the test lab should proceed through as much of the test method as is practical so as to check the system thoroughly. But if a failure early in the test method renders the rest of the scenario meaningless, then it may be abandoned, as long as the reasons are documented.
Order of Tests: Although the tests as presented in this document generally follow the order of requirements in the VVSG, they may be performed in any order.
Requirements and Recommendations: Test methods are specified for both mandatory ("shall") and optional ("should") requirements. The test method defines the conditions under which the system fails the requirement, but of course failure to implement an optional requirement does not prevent a system from conforming.
Documentation of Failure Conditions: When the test lab determines that the system fails a given requirement, it shall document the precise conditions under which failure was detected.
System Deployed as Intended: Unless otherwise stated, the test lab examines and operates the system as deployed according to the instructions of the manufacturer.
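
As an illustration only, the rules above on serendipitous detection, on mandatory versus optional requirements, and on documenting failure conditions could be recorded along the following lines. This is a minimal Python sketch: the `Finding` record, its fields, the `mandatory_failures` helper, and the requirement identifiers are all hypothetical and are not defined by the VVSG or by this test suite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One failure observed by the test lab (hypothetical record layout)."""
    requirement: str   # identifier of the requirement that failed
    mandatory: bool    # True for a "shall" requirement, False for a "should"
    detected_in: str   # test method being run when the failure was observed
    conditions: str    # precise conditions under which the failure was detected


def mandatory_failures(findings):
    """Return the findings that prevent conformance.

    Only failures of mandatory ("shall") requirements count here; failures of
    optional ("should") requirements stay on record but are filtered out.
    """
    return [f for f in findings if f.mandatory]


# Placeholder requirement identifiers, not real VVSG numbers.
findings = [
    # Noticed while running a test method aimed at a different requirement;
    # it is recorded exactly like any other failure.
    Finding("REQ-SHALL-1", True, "test method for REQ-SHALL-2", "placeholder conditions"),
    Finding("REQ-SHOULD-1", False, "test method for REQ-SHOULD-1", "placeholder conditions"),
]

blocking = mandatory_failures(findings)
print([f.requirement for f in blocking])
```

An empty result from `mandatory_failures` means only that no mandatory failure was observed under the conditions tested; it does not prove conformance in all circumstances.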

Intro.5.2 Rules Specifically for Usability Testing

The following principles apply to all the Usability tests.

Pass/Fail Criteria: Each test method (with a few exceptions) contains one or more pass/fail criteria. These are explicit statements about the conditions under which the system being tested passes or fails. They are marked with a => icon throughout this document. This icon is preceded by a "P" in the case of "pass" conditions, an "F" for "fail" conditions, and "PF" for "pass/fail" conditions.
Since each SRTM applies only to the one requirement under which it is listed, it is implicit that the system is passing or failing that requirement. CRTMs, on the other hand, are used to test several requirements, and so, within a CRTM, the pass/fail criterion will also identify the requirement being passed or failed.
Implicit Passing: Many test methods include a number of steps for each of which the system must perform correctly, or it fails. In general, it is easier to confirm that a system has not met a requirement, than that it has. If the test method is completed successfully without any failures, then the system passes.
Adequacy of Messages to the Voter: There are many requirements in which a "warning" or "notification" or "indication" must be issued to the voter. In general, these do not prescribe when the information is issued (e.g. as a particular vote is attempted, or during a final review) nor the precise format (visual or audio) and content of the warning. Note especially that in the case of manually marked paper ballots, some voter information may be posted within the voting booth, rather than on the ballot itself. The test lab must determine whether the behavior of the system constitutes a conspicuous, specific, and informative message, such as would be adequate for the voter.
Review by Two Experts: When the test method involves expert review of the VSUT, the review is to be carried out by two experts. This is to improve both the thoroughness and objectivity of the test.
Access to CVR: In order to perform some tests, the test lab must have access to the electronic Cast Vote Record (CVR). The VVSG requires that voting systems retain records of individual ballots (see Part 1, section 4.3.2). The test lab must determine (either from system documentation or from the manufacturer) how to gain such access.
Test Method Dependence on System Class: There are some requirements that, while applying to all voting systems, may be met in various ways, depending on the type of system. In particular, within a single requirement there may be a different test method for VEBD systems than for non-VEBD systems. When this occurs, the scope of each test method will be described explicitly.
Audio Interface: Some tests have to be performed twice, once using the visual interface, and then again using the audio interface (if available). Note that the accessible voting station (class Acc-VS) is a subset of editable systems with audio (class VEBD-A). Therefore any test that applies to VEBD-A systems also applies to Acc-VS systems.
Degree of Parallelism: Many of the CRTMs call for the test lab to enact a voting session, and, during the session, to check certain features of the system for conformance. The features to be checked in parallel normally form a closely related group (e.g. font characteristics or use of color). The idea is to allow the tester to concentrate on one topic at a time. In theory, some of these sessions could be combined, thereby saving testing time. The test lab is free to adopt this approach if desired; but the testers should be aware that they then have to be careful to check all the relevant system characteristics during that one session.
Use of Standard Test Ballot: Unless otherwise stated, the test lab examines and operates the system using a ballot design that implements the NIST standard test ballot specification. The manufacturer is responsible for implementing the specification on the VSUT.
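
The way system classes determine which test methods are run, and on which interfaces, can be sketched as follows. This is a hedged Python illustration, not part of the test suite: the function names and the `SUBSET_OF` table are hypothetical. The relation "Acc-VS is a subset of VEBD-A" is taken from the Audio Interface rule above; treating VEBD-A as a subset of VEBD, and inferring an audio interface from VEBD-A membership, are assumptions of this sketch.

```python
# Subset relations between system classes. Acc-VS within VEBD-A is stated in
# this document; VEBD-A within VEBD is this sketch's reading of the class names.
SUBSET_OF = {"Acc-VS": "VEBD-A", "VEBD-A": "VEBD"}


def effective_classes(declared):
    """Expand the declared classes with every class that contains them."""
    result = set()
    for cls in declared:
        while cls is not None:
            result.add(cls)
            cls = SUBSET_OF.get(cls)
    return result


def method_applies(applies_to, declared):
    """True if a test method scoped to `applies_to` must be run on this system."""
    return applies_to in effective_classes(declared)


def interfaces_to_run(declared):
    """Visual interface first, then audio if the system class implies one."""
    has_audio = "VEBD-A" in effective_classes(declared)
    return ["visual", "audio"] if has_audio else ["visual"]


print(method_applies("VEBD-A", {"Acc-VS"}))
print(interfaces_to_run({"VEBD"}))
```

Under these assumptions, a test scoped to VEBD-A is selected for an Acc-VS system but not for a VEBD system that lacks audio.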

Default Ballot Choices: Unless otherwise stated, when the test involves going through a voting session and filling out a ballot, the tester shall make the choices described in the following table. Note that these choices represent a completely filled-out ballot (no undervoting).

Contest	Choice (Candidate / Party)
Contest #0: Straight Party Vote	Option #0.2: Yellow
Contest #1: President and Vice-President of the United States	Candidate #1.3: Daniel Court and Amy Blumhardt / Purple
Contest #2: US Senate	Candidate #2.2: Lloyd Garriss / Yellow
Contest #3: US Representative	Candidate #3.1: Brad Plunkard / Blue
Contest #4: Governor	Candidate #4.30: David Davis / Independent
Contest #5: Lieutenant-Governor	Candidate #5.6: Burt Zirkle / Gold
Contest #6: Registrar of Deeds	Candidate #6.1: Laila Shamsi / Yellow
Contest #7: State Senator	Candidate #7.2: Marty Talarico / Yellow
Contest #8: State Assemblyman	Candidate #8.1: Andrea Solis / Blue
Contest #9: County Commissioners	Candidate #9.2: Chloe Witherspoon / Blue; Candidate #9.3: Clayton Bainbridge / Blue; Candidate #9.4: Amanda Marracini / Yellow; Candidate #9.7: Sheila Moskowitz / Purple; Write in "Camille Volpe" as the 5th choice
Contest #10: Court of Appeals Judge	Candidate #10.1: Michael Marchesani
Contest #11: Water Commissioners	Candidate #11.1: Orville White / Blue; Candidate #11.2: Gregory Seldon / Yellow
Contest #12: City Council	Candidate #12.2: Randall Rupp / Blue; Candidate #12.3: Carroll Sry / Blue; Candidate #12.4: Beverly Barker / Yellow; Candidate #12.7: Reid Feister / Yellow
Retention Question #1:	Yes
Retention Question #2:	No
Referendum #1: PROPOSED CONSTITUTIONAL AMENDMENT C	No
Referendum #2: PROPOSED CONSTITUTIONAL AMENDMENT D	Yes
Referendum #3: PROPOSED CONSTITUTIONAL AMENDMENT H	Yes
Referendum #4: PROPOSED CONSTITUTIONAL AMENDMENT K	No
Referendum #5: BALLOT MEASURE 101: Open Primaries	No
Referendum #6: BALLOT MEASURE 106: Limits on Private Enforcement of Unfair Business Competition Laws	No
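
For a test lab that scripts parts of its sessions, the default choices in the table above could be held in a machine-readable form. The following Python sketch is one possible encoding; the dictionary keys, the use of candidate numbers as strings, and the `selection_counts` helper are hypothetical, and only the selections themselves come from the table.

```python
# Default ballot choices keyed by contest. The encoding is hypothetical;
# the selections are copied from the default ballot choices table.
DEFAULT_CHOICES = {
    "Contest #0": ["0.2"],
    "Contest #1": ["1.3"],
    "Contest #2": ["2.2"],
    "Contest #3": ["3.1"],
    "Contest #4": ["4.30"],
    "Contest #5": ["5.6"],
    "Contest #6": ["6.1"],
    "Contest #7": ["7.2"],
    "Contest #8": ["8.1"],
    "Contest #9": ["9.2", "9.3", "9.4", "9.7", "write-in: Camille Volpe"],
    "Contest #10": ["10.1"],
    "Contest #11": ["11.1", "11.2"],
    "Contest #12": ["12.2", "12.3", "12.4", "12.7"],
    "Retention Question #1": ["Yes"],
    "Retention Question #2": ["No"],
    "Referendum #1": ["No"],
    "Referendum #2": ["Yes"],
    "Referendum #3": ["Yes"],
    "Referendum #4": ["No"],
    "Referendum #5": ["No"],
    "Referendum #6": ["No"],
}


def selection_counts(choices):
    """Number of selections made in each contest."""
    return {contest: len(selected) for contest, selected in choices.items()}


counts = selection_counts(DEFAULT_CHOICES)
print(counts["Contest #9"], counts["Contest #11"], counts["Contest #12"])
```

Keeping the selections as lists makes it straightforward to derive the variant ballots that individual test methods call for (for example, removing selections to create an undervote).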

Part 1: Usability and Accessibility Requirements

3 Usability, Accessibility, and Privacy Requirements
3.1 Overview
3.1.1 Purpose
3.1.2 Special Terminology
3.1.3 Interaction of Usability and Accessibility Requirements
3.2 General Usability Requirements
3.2.1 Performance Requirements
3.2.1.1 Overall Performance Metrics
3.2.1.1-A : Total Completion Performance
3.2.1.1-B : Perfect Ballot Performance
3.2.1.1-C : Voter Inclusion Performance
3.2.1.1-D : Usability metrics from the Voting Performance Protocol
3.2.1.1-D.1 : Effectiveness metrics for usability
3.2.1.1-D.2 : Voting session time
3.2.1.1-D.3 : Average voter confidence
3.2.1.2 Manufacturer Testing
3.2.1.2-A : Usability Testing by Manufacturer for General Population
3.2.2 Functional Capabilities
3.2.2-A : Notification of Effect of Overvoting
3.2.2-B : Undervoting to be Permitted
3.2.2-C : Correction of Ballot
3.2.2-D : Notification of Ballot Casting
3.2.2.1 Editable Interfaces
3.2.2.1-A : Prevention of Overvotes
3.2.2.1-B : Warning of Undervotes
3.2.2.1-C : Independent Correction of Ballot
3.2.2.1-D : Ballot Editing per Contest
3.2.2.1-E : Contest Navigation
3.2.2.1-F : Notification of ballot casting failure (DRE)
3.2.2.2 Non-Editable Interfaces
3.2.2.2-A : Notification of Overvoting
3.2.2.2-B : Notification of Undervoting
3.2.2.2-C : Notification of Blank Ballots
3.2.2.2-D : Ballot Correction or Submission Following Notification
3.2.2.2-E : Handling of Marginal Marks
3.2.2.2-F : Notification of ballot casting failure (PCOS)
3.2.3 Privacy
3.2.3.1 Privacy at the Polls
3.2.3.1-A : System Support of Privacy
3.2.3.1-A.1 : Visual Privacy
3.2.3.1-A.2 : Auditory Privacy
3.2.3.1-A.3 : Privacy of Warnings
3.2.3.1-A.4 : No Receipts
3.2.3.2 No Recording of Alternative Format Usage
3.2.3.2-A : No Recording of Alternative Languages
3.2.3.2-B : No Recording of Accessibility Features
3.2.4 Cognitive Issues
3.2.4-A : Completeness of Instructions
3.2.4-B : Availability of Assistance from the System
3.2.4-C : Plain Language
3.2.4-C.1 : Clarity of Warnings
3.2.4-C.2 : Context before Action
3.2.4-C.3 : Simple Vocabulary
3.2.4-C.4 : Start Each Instruction on a New Line
3.2.4-C.5 : Use of Positive
3.2.4-C.6 : Use of Imperative Voice
3.2.4-C.7 : Gender-based Pronouns
3.2.4-D : No Bias among Choices
3.2.4-E : Ballot Design
3.2.4-E.1 : Contests Split among Pages or Columns
3.2.4-E.2 : Indicate Maximum Number of Candidates
3.2.4-E.3 : Consistent Representation of Candidate Selection
3.2.4-E.4 : Placement of Instructions
3.2.4-F : Conventional Use of Color
3.2.4-G : Icons and Language
3.2.5 Perceptual Issues
3.2.5-A : Screen Flicker
3.2.5-B : Resetting of Adjustable Aspects at End of Session
3.2.5-C : Ability to Reset to Default Values
3.2.5-D : Minimum Font Size
3.2.5-E : Available Font Sizes
3.2.5-F : Use of Sans Serif Font
3.2.5-G : Legibility of Paper Ballots and Verification Records
3.2.5-G.1 : Legibility via Font Size
3.2.5-G.2 : Legibility via Magnification
3.2.5-H : Contrast Ratio
3.2.5-I : High Contrast for Electronic Displays
3.2.5-J : Accommodation for Color Blindness
3.2.5-K : No Reliance Solely on Color
3.2.6 Interaction Issues
3.2.6-A : No Page Scrolling
3.2.6-B : Unambiguous Feedback for Voter's Selection
3.2.6-C : Accidental Activation
3.2.6-C.1 : Size and Separation of Touch Areas
3.2.6-C.2 : No Repeating Keys
3.2.6.1 Timing Issues
3.2.6.1-A : Maximum Initial System Response Time
3.2.6.1-B : Maximum Completed System Response Time for Vote Confirmation
3.2.6.1-C : Maximum Completed System Response Time for All Operations
3.2.6.1-D : System Response Indicator
3.2.6.1-E : Voter Inactivity Time
3.2.6.1-F : Alert Time
3.2.7 Alternative Languages
3.2.7-A : General Support for Alternative Languages
3.2.7-A.1 : Voter Control of Language
3.2.7-A.2 : Complete Information in Alternative Language
3.2.7-A.3 : Auditability of Records for English Readers
3.2.7-A.4 : Usability Testing by Manufacturer for Alternative Languages
3.2.8 Usability for Poll Workers
3.2.8-A : Clarity of System Messages for Poll Workers
3.2.8.1 Operation
3.2.8.1-A : Ease of Normal Operation
3.2.8.1-B : Usability Testing by Manufacturer for Poll Workers
3.2.8.1-C : Documentation usability
3.2.8.1-C.1 : Poll Workers as target audience
3.2.8.1-C.2 : Usability at the polling place
3.2.8.1-C.3 : Enabling verification of correct operation
3.2.8.2 Safety
3.2.8.2-A : Safety Certification
3.3 Accessibility Requirements
3.3.1 General
3.3.1-A : Accessibility throughout the Voting Session
3.3.1-A.1 : Documentation of Accessibility Procedures
3.3.1-B : Complete Information in Alternative Formats
3.3.1-C : No Dependence on Personal Assistive Technology
3.3.1-D : Secondary Means of Voter Identification
3.3.1-E : Accessibility of Paper-based Vote Verification
3.3.1-E.1 : Audio Readback for paper-based Vote Verification
3.3.2 Low Vision
3.3.2-A : Usability Testing by Manufacturer for Voters with Low Vision
3.3.2-B : Adjustable Saturation for Color Displays
3.3.2-C : Distinctive Buttons and Controls
3.3.2-D : Synchronized Audio and Video
3.3.3 Blindness
3.3.3-A : Usability Testing by Manufacturer for Blind Voters
3.3.3-B : Audio-Tactile Interface
3.3.3-B.1 : Equivalent Functionality of ATI
3.3.3-B.2 : ATI Supports Repetition
3.3.3-B.3 : ATI Supports Pause and Resume
3.3.3-B.4 : ATI Supports Transition to Next or Previous Contest
3.3.3-B.5 : ATI Can Skip Referendum Wording
3.3.3-C : Audio Features and Characteristics
3.3.3-C.1 : Standard Connector
3.3.3-C.2 : T-coil Coupling
3.3.3-C.3 : Sanitized Headphone or Handset
3.3.3-C.4 : Initial Volume
3.3.3-C.5 : Range of Volume
3.3.3-C.6 : Range of Frequency
3.3.3-C.7 : Intelligible Audio
3.3.3-C.8 : Control of Speed
3.3.3-D : Ballot Activation
3.3.3-E : Ballot Submission and Vote Verification
3.3.3-F : Tactile Discernability of Controls
3.3.3-G : Discernability of Key Status
3.3.4 Dexterity
3.3.4-A : Usability Testing by Manufacturer for Voters with Dexterity Disabilities
3.3.4-B : Support for Non-Manual Input
3.3.4-C : Ballot Submission and Vote Verification
3.3.4-D : Manipulability of Controls
3.3.4-E : No Dependence on Direct Bodily Contact
3.3.5 Mobility
3.3.5-A : Clear Floor Space
3.3.5-B : Allowance for Assistant
3.3.5-C : Visibility of Displays and Controls
3.3.5.1 Controls within Reach
3.3.5.1-A : Forward Approach, No Obstruction
3.3.5.1-B : Forward Approach, with Obstruction
3.3.5.1-B.1 : Maximum Size of Obstruction
3.3.5.1-B.2 : Maximum High Reach over Obstruction
3.3.5.1-B.3 : Toe Clearance under Obstruction
3.3.5.1-B.4 : Knee Clearance under Obstruction
3.3.5.1-C : Parallel Approach, No Obstruction
3.3.5.1-D : Parallel Approach, with Obstruction
3.3.5.1-D.1 : Maximum Size of Obstruction
3.3.5.1-D.2 : Maximum High Reach over Obstruction
3.3.6 Hearing
3.3.6-A : Reference to Audio Requirements
3.3.6-B : Visual Redundancy for Sound Cues
3.3.6-C : No Electromagnetic Interference with Hearing Devices
3.3.7 Cognition
3.3.7-A : General Support for Cognitive Disabilities
3.3.8 English Proficiency
3.3.8-A : Use of ATI
3.3.9 Speech
3.3.9-A : Speech not to be Required by Equipment

3.2 General Usability Requirements

3.2.1 Performance Requirements

3.2.1.1 Overall Performance Metrics

3.2.1.2 Manufacturer Testing

3.2.1.2-A Usability Testing by Manufacturer for General Population

Test Method: Usability Testing by Manufacturer

3.2.2 Functional Capabilities

3.2.2-A Notification of Effect of Overvoting

Test Method: If the system is a VEBD type, this requirement is covered under 3.2.2.1-A Prevention of Overvotes. If the system is a PCOS type, this requirement is covered under 3.2.2.2-A Notification of Overvoting.

If the system is one with a MMPB and no immediate feedback to the voter (such as with central count systems), the tester shall inspect the system and verify that notification is readily available to the voter. For example, this may be achieved by posting the notification within a voting booth or stall, or by including the notification directly on the paper ballot.

For types of systems other than those mentioned above, the tester shall verify that notification is given in a way that is appropriate for the system.

PF => If adequate notification on the effect of overvoting is readily available to the voter, then the system passes, otherwise it fails.

3.2.2-B Undervoting to be Permitted

Test Method: The tester shall fill out the ballot using the default ballot choices, except that 1) no party is chosen in race #0 (straight party vote), 2) no candidate is chosen in race #5 (Lieutenant-Governor), and 3) in race #9 (county commissioners), only Candidate #9.3 (Clayton Bainbridge / Blue) and Candidate #9.4 (Amanda Marracini / Yellow) are chosen. The tester shall then attempt to cast the ballot. The system must allow this to be submitted as a valid ballot (whether or not a warning is issued).

PF => If the system accepts the undervoted ballot, then the system passes, otherwise it fails.
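
The three departures from the default ballot described in this test method can be written down explicitly. The Python sketch below is illustrative only: the dictionary encoding and the names `DEFAULT_SUBSET`, `undervoted`, and `verdict` are hypothetical, and only the three contests changed by this test are shown.

```python
import copy

# Default selections for the three contests this test modifies
# (hypothetical encoding: candidate numbers as strings).
DEFAULT_SUBSET = {
    "Contest #0": ["0.2"],
    "Contest #5": ["5.6"],
    "Contest #9": ["9.2", "9.3", "9.4", "9.7", "write-in: Camille Volpe"],
}


def undervoted(default):
    """Return a copy of the default choices with the three undervotes applied."""
    ballot = copy.deepcopy(default)
    ballot["Contest #0"] = []               # 1) no party chosen
    ballot["Contest #5"] = []               # 2) no candidate chosen
    ballot["Contest #9"] = ["9.3", "9.4"]   # 3) only two of five choices used
    return ballot


def verdict(ballot_accepted, warning_issued=False):
    """Pass if the ballot was accepted; a warning does not change the result."""
    return "pass" if ballot_accepted else "fail"


print(undervoted(DEFAULT_SUBSET))
print(verdict(ballot_accepted=True, warning_issued=True))
```

The `warning_issued` argument is deliberately unused in the verdict, mirroring the rule that the ballot must be accepted whether or not a warning is issued.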

3.2.2-C Correction of Ballot

Test Method: For VEBD systems, use the test method described below under 3.2.2.1-C "Independent Correction of Ballot".

For non-VEBD systems, the tester shall verify that instructions on how to correct a ballot are readily available to the voter. For example, this may be achieved by posting the instructions within a voting booth or stall, or by including the instructions directly on the ballot.

PF => If instructions for correcting the ballot are readily available to the voter, then the system passes, otherwise it fails.

Since the actual correction of a non-VEBD ballot typically depends on procedures extraneous to the actual equipment (such as getting a new paper ballot from a poll worker), the correction process itself is not tested.

3.2.2-D Notification of Ballot Casting

Test Method: Editable Ballot Session, Non-Editable Ballot Session

3.2.2.1 Editable Interfaces

3.2.2.1-A Prevention of Overvotes

Test Method: Editable Ballot Session

3.2.2.1-B Warning of Undervotes

Test Method: Editable Ballot Session

3.2.2.1-C Independent Correction of Ballot

Test Method: Editable Ballot Session

3.2.2.1-D Ballot Editing per Contest

Test Method: Editable Ballot Session

3.2.2.1-E Contest Navigation

Test Method: Editable Ballot Session

3.2.2.1-F Notification of ballot casting failure (DRE)

Test Method: Since this requirement takes effect only in the case of equipment failure, it cannot be tested by deliberately setting up the precondition, as with other requirements. Rather the requirement is to be tested opportunistically: if, during any of the other testing procedures (whether or not usability-related), equipment failure for ballot casting is detected, the tester must determine a) whether or not the current ballot was recorded and b) whether an adequate notification to the voter was issued.

PF => If a correct and adequate notification is issued, then the system passes, otherwise it fails.

3.2.2.2 Non-Editable Interfaces

3.2.2.2-A Notification of Overvoting

Test Method: Non-Editable Ballot Session

3.2.2.2-B Notification of Undervoting

Test Method: Non-Editable Ballot Session

3.2.2.2-C Notification of Blank Ballots

Test Method: The tester shall first enable the system's warning for blank ballots. The tester shall then submit paper ballots with the following characteristics (if the system does not accept two-sided ballots, then skip those cases):

Ballot	Correct Result
Two-sided ballot, completely blank	Warning
Two-sided ballot, with a single vote on each side	No Warning
Two-sided ballot, blank front side, single vote on back	Warning
Two-sided ballot, single vote on front, blank back side	Warning
One-sided ballot, blank	Warning
One-sided ballot, with a single vote	No Warning

F => If the result in any of these cases is incorrect, then the system fails.

Next, the tester shall disable the system's warning for blank ballots. The tester shall then re-submit the test ballots, as above.

PF => If the system accepts all the ballots without warning, then the system passes, otherwise it fails.
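
The expected results in the table above, together with the second pass with warnings disabled, can be summarized in a small table-driven check. This is a hedged Python sketch: `warning_expected` and `CASES` are hypothetical names, and the rule "a warning is expected whenever any side of the ballot carries no votes" is an inference from the six listed cases rather than wording taken from the VVSG.

```python
def warning_expected(votes_per_side, warning_enabled):
    """Expected behavior for one submitted ballot.

    votes_per_side holds the number of votes marked on each side of the
    ballot, e.g. (0, 1) for a blank front and a single vote on the back.
    """
    if not warning_enabled:
        return False  # with the warning disabled, every ballot is accepted silently
    return any(votes == 0 for votes in votes_per_side)


# (description, votes per side, expected warning when the feature is enabled)
CASES = [
    ("Two-sided ballot, completely blank", (0, 0), True),
    ("Two-sided ballot, with a single vote on each side", (1, 1), False),
    ("Two-sided ballot, blank front side, single vote on back", (0, 1), True),
    ("Two-sided ballot, single vote on front, blank back side", (1, 0), True),
    ("One-sided ballot, blank", (0,), True),
    ("One-sided ballot, with a single vote", (1,), False),
]

for description, votes, expected in CASES:
    assert warning_expected(votes, warning_enabled=True) == expected, description
    assert warning_expected(votes, warning_enabled=False) is False, description
```

A tester would compare the system's observed behavior for each case against `warning_expected`, skipping the two-sided cases if the system does not accept two-sided ballots.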

3.2.2.2-D Ballot Correction or Submission Following Notification

Test Method: Non-Editable Ballot Session

3.2.2.2-E Handling of Marginal Marks

Test Method: The tester shall fill out the ballot using the default ballot choices, except that in contest #1, vote for ticket #1.4 (Boone and Lian) for President/VP and also make a marginal mark (as per manufacturer specifications) for ticket #1.7 (Harp and Gray). In contest #2, vote for none, but make a marginal mark for candidate #2.4 (Hewetson). When the ballot is submitted, the system must detect, identify, and warn about both marginal marks.

F => If either marginal mark is not detected, identified and warned about, then the system fails.

There may also be a warning about overvoting in contest #1 or undervoting in contest #2, but this is not mandatory.

The tester should then fix the marginal mark in contest #2 so as to make it a valid vote, but leave contest #1 as is, and then re-submit the ballot. Again, the system must detect, identify, and warn about the remaining marginal mark in contest #1.

F => If the remaining marginal mark is not detected, identified and warned about, then the system fails.

F => If the system warns about any marginal mark other than as specified above, then the system fails.
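
The three fail criteria above amount to comparing the set of marginal marks the system warns about with the set the tester actually made, on each submission. A minimal Python sketch follows, assuming the tester records warnings by candidate or ticket number; the function name and the string encoding are hypothetical.

```python
def marginal_mark_failures(expected, reported):
    """List the failures for one submission of the ballot.

    expected: marks the tester made marginally (e.g. {"1.7", "2.4"})
    reported: marks the system detected, identified, and warned about
    """
    expected, reported = set(expected), set(reported)
    failures = []
    for mark in sorted(expected - reported):
        failures.append(f"marginal mark for {mark} not detected, identified and warned about")
    for mark in sorted(reported - expected):
        failures.append(f"unexpected marginal-mark warning for {mark}")
    return failures


# Marginal marks present at each submission, per the test method.
FIRST_SUBMISSION = {"1.7", "2.4"}
SECOND_SUBMISSION = {"1.7"}   # the mark in contest #2 has been fixed

# Example: the system misses the contest #2 mark on the first submission.
print(marginal_mark_failures(FIRST_SUBMISSION, {"1.7"}))
# Example: the system still warns about contest #2 after the fix.
print(marginal_mark_failures(SECOND_SUBMISSION, {"1.7", "2.4"}))
```

An empty list for both submissions means none of the fail criteria was triggered. Overvote or undervote warnings are not marginal-mark warnings and would not be passed to this check.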

3.2.2.2-F Notification of ballot casting failure (PCOS)

Test Method: Since this requirement takes effect only in the case of equipment failure, it cannot be tested by deliberately setting up the precondition, as with other requirements. Rather the requirement is to be tested opportunistically: if, during any of the other testing procedures (whether or not usability-related), equipment failure for ballot casting is detected (including failure to read the ballot or to transport it into the ballot box), the tester must determine a) whether or not the current ballot was recorded and b) whether an adequate notification to the voter was issued.

PF => If a correct and adequate notification is issued, then the system passes, otherwise it fails.

3.2.3 Privacy

3.2.3.1 Privacy at the Polls

3.2.3.1-A System Support of Privacy

Test Method: Privacy of Voting Session

3.2.3.1-A.1 Visual Privacy

Test Method: Privacy of Voting Session

3.2.3.1-A.2 Auditory Privacy

Test Method: Privacy of Voting Session

3.2.3.1-A.3 Privacy of Warnings

Test Method: Privacy of Voting Session

3.2.3.1-A.4 No Receipts

Test Method: Privacy of Voting Session

3.2.3.2 No Recording of Alternative Format Usage

3.2.3.2-A No Recording of Alternative Languages

Test Method: Privacy of Cast Vote Record (CVR)

3.2.3.2-B No Recording of Accessibility Features

Test Method: Privacy of Cast Vote Record (CVR)

3.2.4 Cognitive Issues

3.2.4-A Completeness of Instructions

Test Method: The tester shall proceed through an entire voting session, fill out the ballot using the default ballot choices, and check for the presence of instructions for all the functions supported by the system, especially including:

System activation and/or session initiation (e.g. use of an activation card).
Adjustment of visual display characteristics (e.g. font size, color, contrast)
Adjustment of audio characteristics (e.g. volume, speed)
Use of other auxiliary devices, such as a magnifier for paper records
Mechanism for non-manual input
Navigating back and forth through multiple pages
Changing a vote
Writing in a candidate for office
Review of the ballot
Final casting of the ballot

Not all the above functions are mandatory - but if present, the system must explain how they are to be used. The tester should attempt to discover and exercise any and all such functions provided by the system. Note that the system must provide instructions for all its operations, even if some of those are beyond what is mandated by the VVSG.

PF => If adequate instructions are available for all voter operations, then the system passes, otherwise it fails.

3.2.4-B Availability of Assistance from the System

Test Method: For VEBD systems, the tester shall proceed through an entire voting session, using the editable ballot session. Confirm that help is available from the system at these points within the session:

prior to voting for any of the candidates
immediately after voting for Governor
when viewing the session review screen (if any)
just before final casting of the ballot

If the system under test is an Acc-VS, the above test method must be enacted for both the visual and audio interface.

PF => If assistance is available at all the points designated above, then the system passes, otherwise it fails.

For non-VEBD systems, the tester need not enact a voting session. Rather, confirm that written instructions or some other built-in mechanism would be readily available to the voter throughout the voting session. Possibilities for presenting assistance include a poster, information on the ballot itself, or an independent electronic "help" system.

PF => If assistance is readily available to the voter at any time during the voting session, then the system passes, otherwise it fails.

3.2.4-C Plain Language

Test Method: Language Clarity

3.2.4-C.1 Clarity of Warnings

Test Method: Language Clarity

3.2.4-C.2 Context before Action

Test Method: Language Clarity

3.2.4-C.3 Simple Vocabulary

Test Method: Language Clarity

3.2.4-C.4 Start Each Instruction on a New Line

Test Method: Language Clarity

3.2.4-C.5 Use of Positive

Test Method: Language Clarity

3.2.4-C.6 Use of Imperative Voice

Test Method: Language Clarity

3.2.4-C.7 Gender-based Pronouns

Test Method: Language Clarity

3.2.4-D No Bias among Choices

Test Method: The tester shall inspect the ballot and confirm that all candidates and other ballot choices are presented in a fair and equivalent manner. Characteristics such as font size or voice volume and speed must be the same for all choices.

For VEBD systems, use the editable ballot session below. If the system under test is an Acc-VS, the test method must be enacted for both the visual and audio interface.

For non-VEBD systems, the tester need not enact a voting session. Rather, confirm that all choices are presented in an equivalent manner with respect to visual appearance, font size, layout, and the like.

PF => If all choices are presented without bias, then the system passes, otherwise it fails.

3.2.4-E Ballot Design

Test Method: Ballot Design

3.2.4-E.1 Contests Split among Pages or Columns

Test Method: Ballot Design

3.2.4-E.2 Indicate Maximum Number of Candidates

Test Method: Ballot Design

3.2.4-E.3 Consistent Representation of Candidate Selection

Test Method: Ballot Design

3.2.4-E.4 Placement of Instructions

Test Method: Ballot Design

3.2.4-F Conventional Use of Color

Test Method: Ballot Design

3.2.4-G Icons and Language

Test Method: Ballot Design

3.2.5 Perceptual Issues

3.2.5-A Screen Flicker

Test Method: The tester shall proceed through the voting session until the first contest (straight party) is displayed. If there is a blinking visual element on this page, proceed through the session until a page without a blinking element is displayed. The measurement is to be taken in a dark room environment.

The tester shall use a photometer with an oscilloscope attached to the photometer's output. The flicker rate is then measured from the oscilloscope's waveform display. Equipment shall have an accuracy of at least ± 1 cd/m2 for light measurements and at least 1kHz bandwidth with a 1 second sweep range for time base measurements.

Since the flicker rate is expected to be constant, meters capable of frequency and duty cycle measurements can also be used for this test. If so, the equipment shall have an accuracy of at least ± 0.1 Hz for frequency measurements, and at least ± 2% for duty cycle measurements.

FP => If the measured flicker rate is within the 2-55 Hz range, then the system fails, otherwise it passes.
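The pass/fail rule above can be written as a small check. This is a minimal sketch, not part of the test suite: the function names are hypothetical, it assumes the tester reads the period of one flicker cycle from the oscilloscope trace, and it treats the 2-55 Hz range as inclusive at both ends, which the text does not state explicitly.

```python
def flicker_rate_hz(period_s: float) -> float:
    """Convert the period of one flicker cycle (seconds) into a rate in Hz."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_s


def flicker_fails(rate_hz: float, low_hz: float = 2.0, high_hz: float = 55.0) -> bool:
    """Return True if the rate lies in the prohibited range (assumed inclusive)."""
    return low_hz <= rate_hz <= high_hz


# A 20 ms period corresponds to about 50 Hz, which is inside the range and fails.
print(flicker_fails(flicker_rate_hz(0.020)))
# A 60 Hz reading is outside the range and passes.
print(flicker_fails(60.0))
```

A frequency meter reading, as permitted by the preceding paragraph, can be passed directly to the second function.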

3.2.5-B Resetting of Adjustable Aspects at End of Session

Test Method: Default Characteristics

3.2.5-C Ability to Reset to Default Values

Test Method: Default Characteristics

3.2.5-D Minimum Font Size

Test Method: Font Characteristics

3.2.5-E Available Font Sizes

Test Method: The tester shall select a font size between 3.0mm and 4.0mm and then proceed through the voting session, using the default ballot choices.

F => If no such font size is available for selection, then the system fails.

On each page, the tester shall measure (using a 15x magnifier) the height of capital letters in the smallest text intended for the voter.

F => If any of these letters has a height less than 3.0mm or greater than 4.0mm, then the system fails.

After voting for US Representative (Contest #3), the tester shall then select a larger font size between 6.3 and 9.0mm.

F => If the font size cannot be changed at this point, then the system fails.

F => If the larger font size is not available for selection, then the system fails.

The tester shall proceed through contest #6, again measuring the height of capital letters in the smallest text intended for the voter.

F => If any of these letters has a height less than 6.3mm or greater than 9.0mm, then the system fails.

The tester shall then navigate back to the first three contests and verify that they are being displayed with the larger font size and that the original ballot choices were preserved.

F => If the first three contests are not shown in the larger font size, then the system fails.

F => If the original ballot choices were not preserved, then the system fails.

After voting for Registrar of Deeds (Contest #6), the tester shall then re-select a font size between 3.0mm and 4.0mm by means of the universal reset mechanism specified in 3.2.5-C XREF. The tester shall vote through contest #9, and repeat the above process, verifying the text is of the appropriate size and that earlier ballot choices have been preserved.

F => If the font size cannot be changed at this point, then the system fails.

F => If the original font size is not available for selection, then the system fails.

F => If any of the capital letters intended for the voter has a height less than 3.0mm or greater than 4.0mm, then the system fails.

F => If the original ballot choices were not preserved, then the system fails.

Following contest #9, if there are available font sizes between the two guaranteed by the VVSG, the tester shall select one of these and verify that the display agrees with the selected size. This intermediate size is optional, and so this part of the test is enacted only if such a size is available.

F => If the ballot pages are not consistently displayed in the intermediate font size, then the system fails.

F => If the original ballot choices were not preserved, then the system fails.
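The repeated height checks in this test method all have the same shape: every measured capital-letter height must lie within the selected range. The sketch below illustrates that check; the function name and the sample measurements are hypothetical, and the range limits are treated as inclusive, matching the "less than ... or greater than" wording of the failure criteria.

```python
def heights_out_of_range(heights_mm, low_mm, high_mm):
    """Return the measured heights that fall outside [low_mm, high_mm]."""
    return [h for h in heights_mm if h < low_mm or h > high_mm]


# Hypothetical measurements of capital-letter heights, in millimetres.
small_font = [3.2, 3.5, 3.9]
large_font = [6.5, 7.0, 9.4]

print(heights_out_of_range(small_font, 3.0, 4.0))  # empty list: no failure
print(heights_out_of_range(large_font, 6.3, 9.0))  # lists the offending height
```

Any non-empty result corresponds to a failure for the font size being tested.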

3.2.5-F Use of Sans Serif Font

Test Method: Font Characteristics

3.2.5-G Legibility of Paper Ballots and Verification Records

Test Method: Note that this requirement and its sub-requirements apply to all the various types of paper records that would normally be available to the voter. This includes the ballot itself, as well as a verification record, as in the case of VVPAT systems.

If the system attempts to achieve legibility via font size or magnification, use the test method for the corresponding sub-requirement below.

If the system uses some other means to achieve legibility, then a tester with expertise in visual usability shall proceed through an entire voting session, fill out the ballot using the default ballot choices, examine the paper records used by the system, and determine whether the system incorporates features such that a voter with poor reading vision (20/70 farsighted vision) would be able to vote successfully. 20/70 farsighted is defined as the ability to read characters subtending an arc of 17.5 minutes at a distance of 40 cm. Such characters have a height of at least 2mm.

PF => If a poor-vision voter would be able to read the paper records successfully, then the system passes, otherwise it fails.
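The 2mm figure follows from the stated visual angle and viewing distance. The sketch below reproduces that arithmetic using the standard relation height = 2 x distance x tan(angle / 2); the function name is hypothetical and the code is offered only as a check of the numbers in the text.

```python
import math


def character_height_mm(arc_minutes: float, distance_mm: float) -> float:
    """Height of a character subtending the given visual angle at the given distance."""
    angle_rad = math.radians(arc_minutes / 60.0)
    return 2.0 * distance_mm * math.tan(angle_rad / 2.0)


height = character_height_mm(17.5, 400.0)  # 40 cm = 400 mm
print(round(height, 2))
```

The result is slightly above 2 mm, which agrees with the "at least 2mm" statement.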

3.2.5-G.1 Legibility via Font Size

Test Method: This test applies if the system has chosen to meet the legibility requirement via font size. The tester shall proceed through an entire voting session, fill out the ballot using the default ballot choices so as to cause the system to present all the various types of paper records that would normally be available to the voter.

The tester shall select a font size between 3.0 and 4.0 mm for paper records.

F => If no such font size is available for selection, then the system fails.

On each page of all the paper records, the tester shall measure (using a 15x magnifier) the height of capital letters in the smallest text intended for the voter.

F => If any of these letters has a height less than 3.0mm or greater than 4.0mm, then the system fails.

The tester shall then repeat the above process, except for selecting a font size between 6.3mm and 9.0mm. This may require a separate voting session, as there is no requirement to allow voters to switch font size for paper within a session.

F => If no such font size is available for selection, then the system fails.

On each page of all the paper records, the tester shall measure (using a 15x magnifier) the height of capital letters in the smallest text intended for the voter.

F => If any of these letters has a height less than 6.3mm or greater than 9.0mm, then the system fails.

3.2.5-G.2 Legibility via Magnification

Test Method: This test applies if the system has chosen to meet the legibility requirement via magnification. The tester shall proceed through an entire voting session, fill out the ballot using the default ballot choices, and, as instructed by the system, use the magnification mechanism to view all the paper records.

F => If there are paper records presented to the voter for which the magnifier is not available, then the system fails.

The tester shall view the paper records as magnified, and determine whether the records would be readily legible to a voter with 20/70 farsighted vision. 20/70 farsighted is defined as the ability to read characters subtending an arc of 17.5 minutes at a distance of 40 cm. Such characters have a height of at least 2mm.

PF => If a voter with 20/70 vision would be able to read the paper records successfully, then the system passes, otherwise it fails.

3.2.5-H Contrast Ratio

Test Method: First, the tester must select samples for contrast testing. It is impractical to measure contrast ratios for all of the visual material intended for voters and poll workers.

Material intended for voters includes:

Instructions (built-in or external) on the use of the system for voting
The actual ballot or ballot interface
Verification records

Material intended for poll workers includes:

Instructions on the operation of the system
Any labels or instructions affixed to the system itself

The tester should select at least one example from each available type of material. Since the purpose of the test is to assure adequate contrast, the tester should look for examples of potentially low contrast, such as light-colored icons or text on a white background, or dark icons or text on a deeply colored background.

It is very difficult, with current technology, to measure the luminance, and hence contrast, of small areas (1-2 pixels wide). Therefore, in the examples chosen for inspection, both the lighter and darker area must be at least 1/2 inch in height and width.

Note that the content may be presented on an electronic screen or on a "passive" medium that is to be viewed via ambient light. After selecting examples to be measured, the tester shall use the appropriate procedure described below, depending on the medium of presentation.

If the medium is passive (such as paper or plastic labels), the tester shall measure the luminance of the foreground item and of the adjacent background, using a spot photometer. The sensitivity of the photometer shall be set so as to simulate an environment with a diffuse ambient light level of 500 lx.

F => If the higher luminance of these two measurements is less than three times the lower luminance, then the system fails.

If the medium is an electronic screen, the procedure for measuring the ambient contrast ratio is described in Section 308-2 of the VESA Flat Panel Display Measurements standard (FPDM) Version 2. The referenced standard specifies the required test equipment, test setup, and test procedures. The diffuse ambient light level for this test shall be 500 lx.

F => If the measured contrast ratio is less than 3:1, then the system fails.
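For the passive-medium case, the contrast ratio is the higher of the two luminance readings divided by the lower. The sketch below shows that calculation against the 3:1 minimum; the function names and sample readings are hypothetical. The `minimum` parameter is included only so the same check can be reused with a different threshold, such as the 6:1 ratio of 3.2.5-I.

```python
def contrast_ratio(luminance_a: float, luminance_b: float) -> float:
    """Ratio of the higher luminance to the lower one (both in cd/m2)."""
    high, low = max(luminance_a, luminance_b), min(luminance_a, luminance_b)
    if low <= 0:
        raise ValueError("luminance values must be positive")
    return high / low


def contrast_fails(luminance_a: float, luminance_b: float, minimum: float = 3.0) -> bool:
    """Return True if the ratio is below the required minimum (3:1 by default)."""
    return contrast_ratio(luminance_a, luminance_b) < minimum


# Hypothetical readings: 120 cd/m2 background, 50 cd/m2 foreground.
print(contrast_ratio(120.0, 50.0))   # 2.4
print(contrast_fails(120.0, 50.0))   # True: below 3:1
```

For electronic screens the ratio itself is obtained by the referenced VESA procedure; only the final comparison is the same.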

3.2.5-I High Contrast for Electronic Displays

Test Method:

The tester shall proceed through a voting session using the default ballot choices. Within the session, the tester shall select the high contrast option, either explicitly or by default. Next, the tester must select samples for contrast testing. It is impractical to measure contrast ratios for all of the screens intended for voters. Since the purpose of the test is to assure adequate contrast, the tester should look for examples of potentially low contrast, such as light-colored icons or text on a white background, or dark icons or text on a deeply colored background.

The procedure for measuring the ambient contrast ratio is described in Section 308-2 of the VESA Flat Panel Display Measurements standard (FPDM) Version 2. The referenced standard specifies the required test equipment, test setup, and test procedures. The diffuse ambient light level for this test shall be 500 lx.

PF => If the measured contrast ratio is at least 6:1, then the system passes, otherwise it fails.

3.2.5-J Accommodation for Color Blindness

Test Method: Use of Color

3.2.5-K No Reliance Solely on Color

Test Method: Use of Color

3.2.6 Interaction Issues

3.2.6-A No Page Scrolling

Test Method: Scrolling and Feedback

3.2.6-B Unambiguous Feedback for Voter's Selection

Test Method: Scrolling and Feedback

3.2.6-C Accidental Activation

Test Method: Accidental Activation

3.2.6-C.1 Size and Separation of Touch Areas

Test Method: Accidental Activation

3.2.6-C.2 No Repeating Keys

Test Method: Accidental Activation

3.2.6.1 Timing Issues

3.2.6.1-A Maximum Initial System Response Time

Test Method: Response Time

3.2.6.1-B Maximum Completed System Response Time for Vote Confirmation

Test Method: Response Time

3.2.6.1-C Maximum Completed System Response Time for All Operations

Test Method: Response Time

3.2.6.1-D System Response Indicator

Test Method: Response Time

3.2.6.1-E Voter Inactivity Time

Test Method: Inactivity Time

3.2.6.1-F Alert Time

Test Method: Inactivity Time

3.2.7 Alternative Languages

3.2.7-A General Support for Alternative Languages

Test Method: Alternative Languages

3.2.7-A.1 Voter Control of Language

Test Method: Alternative Languages

3.2.7-A.2 Complete Information in Alternative Language

Test Method: Alternative Languages

3.2.7-A.3 Auditability of Records for English Readers

Test Method: Alternative Languages

3.2.7-A.4 Usability Testing by Manufacturer for Alternative Languages

Test Method: Usability Testing by Manufacturer

3.2.8 Usability for Poll Workers

3.2.8-A Clarity of System Messages for Poll Workers

Test Method: Operational Usability for Poll Workers

3.2.8.1 Operation

3.2.8.1-A Ease of Normal Operation

Test Method: Operational Usability for Poll Workers

3.2.8.1-B Usability Testing by Manufacturer for Poll Workers

Test Method: Usability Testing by Manufacturer

3.2.8.1-C Documentation usability

Test Method: Operational Usability for Poll Workers

3.2.8.1-C.1 Poll Workers as target audience

Test Method: Operational Usability for Poll Workers

3.2.8.1-C.2 Usability at the polling place

Test Method: Operational Usability for Poll Workers

3.2.8.1-C.3 Enabling verification of correct operation

Test Method: Operational Usability for Poll Workers

3.2.8.2 Safety

3.2.8.2-A Safety Certification

Test Method: The tester shall verify that the system has been certified in accordance with the requirements of UL 60950, Safety of Information Technology Equipment, by a duly authorized safety testing laboratory.

FP => If such certification cannot be verified, then the system fails, otherwise it passes.

Note that the tester is not expected to perform the safety checks directly, but rather to verify that the system has been certified by a safety lab.

3.3 Accessibility Requirements

3.3.1 General

3.3.1-A Accessibility throughout the Voting Session

Test Method: End-to-end Accessibility

3.3.1-A.1 Documentation of Accessibility Procedures

Test Method: End-to-end Accessibility

3.3.1-B Complete Information in Alternative Formats

Test Method: This general requirement is tested specifically under sec. 3.3.3-B XREF, "Audio-Tactile Interface".

3.3.1-C No Dependence on Personal Assistive Technology

Test Method: End-to-end Accessibility

3.3.1-D Secondary Means of Voter Identification

Test Method: The tester shall first determine whether the system uses biometric characteristics for voter identification or authentication, such as an electronic poll book that uses fingerprints.

P => If biometric measures are not used for voter identification, then the system passes.

If biometric measures are used, the tester shall review the documentation of the system to verify that an alternative means is available (such as presentation of identity documentation or another biometric mode).

PF => If an alternative means of identification is available, then the system passes, otherwise it fails.

3.3.1-E Accessibility of Paper-based Vote Verification

Test Method: Accessible Ballot Verification and Submission

3.3.1-E.1 Audio Readback for paper-based Vote Verification

Test Method: Accessible Ballot Verification and Submission

3.3.2 Low Vision

3.3.2-A Usability Testing by Manufacturer for Voters with Low Vision

Test Method: Usability Testing by Manufacturer

3.3.2-B Adjustable Saturation for Color Displays

Test Method: Partial Vision

3.3.2-C Distinctive Buttons and Controls

Test Method: Partial Vision

3.3.2-D Synchronized Audio and Video

Test Method: The tester shall proceed through the voting session using the default ballot choices, except as noted below.

The tester shall first select video-only mode (no audio), vote in contests #1 and #2 and proceed to contest #3. In that contest he/she shall vote for a write-in candidate, "Vicki Video".

F => If video-only mode is unavailable, then the system fails.

F => If video-only mode does not present the ballot visually, while suppressing audio output, then the system fails.

The tester shall then switch to audio-only mode (no video) and proceed through contests #4 and #5 to contest #6, in which he/she shall vote for a write-in candidate, "Andy Audio".

F => If audio-only mode is unavailable, then the system fails.

F => If audio-only mode does not present the ballot aurally, while suppressing visual output, then the system fails.

The tester shall then switch to full audio-visual mode, and verify that the ballot choices for the first six contests have been preserved.

F => If audio-visual mode is unavailable, then the system fails.

F => If audio-visual mode does not present the ballot both visually and aurally, then the system fails.

F => If switching among modes has caused the ballot choices to be lost or altered, then the system fails.

The tester shall then fill out the remainder of the ballot. Throughout the session, there must be a reasonable correspondence between the visual and auditory presentation of the ballot. In particular, when there is a detectable voter action (such as selecting a candidate, advancing to the next page, or typing in a write-in choice) both visual and auditory presentations must respond accordingly. The tester must allow for the fact that a large amount of visual information can be presented "all at once" on a page, whereas auditory information is necessarily presented in a temporal sequence.

FP => If there is a significant lack of correspondence between the visual and auditory information presented, then the system fails, otherwise it passes.

3.3.3 Blindness

3.3.3-A Usability Testing by Manufacturer for Blind Voters

Test Method: Usability Testing by Manufacturer

3.3.3-B Audio-Tactile Interface

Test Method: Audio-Tactile Interface

3.3.3-B.1 Equivalent Functionality of ATI

Test Method: Audio-Tactile Interface

3.3.3-B.2 ATI Supports Repetition

Test Method: Audio-Tactile Interface

3.3.3-B.3 ATI Supports Pause and Resume

Test Method: Audio-Tactile Interface

3.3.3-B.4 ATI Supports Transition to Next or Previous Contest

Test Method: Audio-Tactile Interface

3.3.3-B.5 ATI Can Skip Referendum Wording

Test Method: Audio-Tactile Interface

3.3.3-C Audio Features and Characteristics

Test Method: This general requirement is tested specifically under its sub-requirements.

3.3.3-C.1 Standard Connector

Test Method: The tester shall connect a headphone that has a 3.5mm stereo plug and verify that the audio presentation of the ballot is clearly audible through the headphones.

FP => If no such jack is available or if there is not a clear audio signal through the headphones, then the system fails, otherwise it passes.

3.3.3-C.2 T-coil Coupling

Test Method: The test methods to be used are fully documented in the ANSI standard as cited in the requirement.

F => If a wireless T-Coil coupling is not provided, then the system fails.

PF => If the wireless T-Coil coupling meets the test criteria for category T4, then the system passes, otherwise it fails.

3.3.3-C.3 Sanitized Headphone or Handset

Test Method: The tester shall inspect the method used by the system to provide a headphone or handset to the voter. Sanitation can be achieved in various ways, including the use of "throwaway" headphones, or of sanitary coverings.

F => If no audio device is provided, then the system fails.

PF => If there are adequate provisions for sanitization, then the system passes, otherwise it fails.

3.3.3-C.4 Initial Volume

Test Method: Audio Volume

3.3.3-C.5 Range of Volume

Test Method: Audio Volume

3.3.3-C.6 Range of Frequency

Test Method: For this test, the tester needs to control the input signal to the audio equipment, rather than using the normal audio signal generated by the test ballot.

Frequency range is measured in one of two ways, depending on whether the audio information is presented through open air or through headphones or a handset. For both modes:

The input test signal shall be a pink noise with a flatness of at least ± 0.5 dB for all third octave bands from 100Hz to 10KHz. The output level should be 80 dB SPL ± 5 dB as measured with a broadband instrument using an A-weighting filter. This output level should ensure that the audio circuit's peak capacity isn't reached and therefore won't influence the frequencies of interest.

The frequency spectrum shall be measured as described in IEEE 269 for all third octave bands from 100Hz through 10KHz and the measured spectrum shall comply within the tolerances of the floating mask requirement.

If the audio output falls below XXdB SPL for any frequency between 315Hz and 10KHz, "Range of Frequency" fails.
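The input-signal condition above refers to the third octave bands between 100Hz and 10KHz. The sketch below enumerates those bands and checks a set of per-band levels for flatness. It rests on two assumptions that are not stated in the text: band centres are computed with the common base-10 formula 1000 x 10^(n/10), and "flatness" is interpreted as every band level lying within the tolerance of the mean band level (pink noise carries equal power in each third octave band). The function names are hypothetical, and the floating mask test of IEEE 269 is not reproduced here.

```python
def third_octave_centres_hz(low_hz: float = 100.0, high_hz: float = 10000.0):
    """Exact base-10 third-octave band centres, 1000 * 10**(n/10), within the range."""
    centres = []
    for n in range(-20, 21):
        f = 1000.0 * 10.0 ** (n / 10.0)
        # Small tolerance so the end points (100 Hz and 10 kHz) are included.
        if low_hz * 0.999 <= f <= high_hz * 1.001:
            centres.append(f)
    return centres


def is_flat(band_levels_db, tolerance_db: float = 0.5) -> bool:
    """True if every band level is within tolerance_db of the mean band level."""
    mean = sum(band_levels_db) / len(band_levels_db)
    return all(abs(level - mean) <= tolerance_db for level in band_levels_db)


centres = third_octave_centres_hz()
print(len(centres))                     # number of bands from 100 Hz to 10 kHz
print([round(f) for f in centres[:4]])  # exact centres; nominal labels are 100, 125, 160, 200
```

In practice the per-band levels would come from the analyzer used for the IEEE 269 measurement; the list passed to `is_flat` would hold one level per band.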

Measurement of Open Air Frequency

The test for measuring the audio frequency spectrum is described in IEEE 269. Open air sound levels should be measured in anechoic conditions to prevent reflections from affecting the measurement accuracy.

If the voting system is designed for operation when both sitting and standing, then measurements shall be taken for both operating positions and for the 5th percentile female and 95th percentile male.

Measurement of Headphone/Handset Frequency

The test is described in IEEE 269. The referenced standard specifies the required test equipment, test setup, and test procedures.

Follow the test methodology that is relevant to receiving audio through a private audio output device applicable to the voting system under test. "Headphones" equates to the term "headsets" used in Clause 9 of the referenced standard.

For a HATS (Head and Torso Simulator), Type 3.3 ears shall be used as defined in Clause 5 of the referenced standard.

If the ERP (Ear Reference Point) is not specified by the manufacturer of the private audio output device, then the defaults in the referenced standard shall be used.

3.3.3-C.7 Intelligible Audio

Test Method: The tester shall proceed through the entire voting session using the default ballot choices, and evaluate the intelligibility of the audio information presented, including the pronunciation of candidate names, instructions, and warnings, the use of normal intonation, appropriate rate of speech, and acceptably low background noise.

FP => If the tester judges that significant information would be unintelligible to the voter, then the system fails, otherwise it passes.

The loss of small amounts of non-critical information should be noted, but is not by itself a basis for failure.

3.3.3-C.8 Control of Speed

Test Method: The tester shall proceed through the voting session to contest #5 (Lt-Governor) and measure the amount of time taken to announce all the candidates, using the default speech rate. The tester shall set the speech rate to its minimum.

F => If there is no mechanism for adjusting speech rate, then the system fails.

The tester shall then skip back to the beginning of contest #5 and again measure the time taken to announce all the candidates.

F => If this second time is less than 4/3 of the original time, then the system fails.

The tester shall then set the speech rate to its maximum and then skip back to the beginning of contest #5 and again measure the time taken to announce all the candidates.

F => If this third time is greater than 1/2 of the original time, then the system fails.
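The two timing criteria above compare the announcement time at the minimum and maximum speech rates with the time at the default rate. Taking at least 4/3 of the default time is arithmetically the same as speaking at no more than 3/4 of the default rate, and taking at most 1/2 of the time is the same as speaking at no less than twice the default rate. The sketch below applies both comparisons; the function name and stopwatch readings are hypothetical.

```python
def speech_rate_failures(default_s: float, slowest_s: float, fastest_s: float):
    """Return a list of failure reasons; an empty list means no failure."""
    failures = []
    # At the minimum rate the announcement must take at least 4/3 of the default time.
    if slowest_s < default_s * 4.0 / 3.0:
        failures.append("minimum rate is not slow enough")
    # At the maximum rate the announcement must take at most 1/2 of the default time.
    if fastest_s > default_s / 2.0:
        failures.append("maximum rate is not fast enough")
    return failures


# Hypothetical stopwatch readings in seconds for contest #5.
print(speech_rate_failures(default_s=30.0, slowest_s=41.0, fastest_s=14.5))
print(speech_rate_failures(default_s=30.0, slowest_s=36.0, fastest_s=18.0))
```

The first call reports no failures; the second reports both, since 36 s is less than 40 s and 18 s is more than 15 s.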

3.3.3-D Ballot Activation

Test Method: The tester shall determine if the system supports ballot activation for non-blind voters.

P => If ballot activation by the voter is not supported, then the system passes.

If the voting station does support ballot activation for non-blind voters, the tester shall proceed through the process of ballot activation, using the features provided for blind voters and verify that these features constitute a viable mechanism for such voters.

FP => If blind voters would encounter significant difficulty in activating the ballot, then the system fails, otherwise it passes.

3.3.3-E Ballot Submission and Vote Verification

Test Method: Accessible Ballot Verification and Submission

3.3.3-F Tactile Discernability of Controls

Test Method: The tester shall proceed through the voting session using the default ballot choices. During the session, the tester shall examine all of the system's buttons, controls, and keys intended for use by the voter and verify that they are distinguishable by shape or texture. Note that not every individual key within a keypad or keyboard need be distinguishable by touch alone (see requirement 3.3.2-C XREF); it is sufficient if certain "home" keys (such as the "5" in the middle of the keypad) are tactilely distinctive. This allows the user to navigate to nearby keys via their position.

F => If there are controls or keys which cannot be distinguished by shape or texture, then the system fails.

The tester shall further verify that the controls and keys are sufficiently insensitive that one can easily touch them lightly so as to determine the distinguishing characteristic and yet not activate the control.

FP => If the tester has significant difficulty in tactilely discerning a key or control without also activating it, then the system fails, otherwise it passes.

3.3.3-G Discernability of Key Status

Test Method: The tester shall proceed through the voting session using the default ballot choices. During the session, the tester shall examine the system for the presence of locking or toggle controls or keys intended for use by the voter. These are keys (such as "caps lock") that modify the effect of all subsequent input until explicitly reversed.

P => If there are no such controls or keys, then the system passes.

If there are such controls or keys, the tester shall verify that the status of each is visually discernible. For instance, on many keyboards, there is a small LED, either directly on the "caps lock" key or elsewhere on the keyboard, that is lit if and only if "caps lock" is activated.

F => If the status of any such control or key is not visually discernible, then the system fails.

The tester shall then activate the locking or toggle function of each such key and verify that the state of the key is discernible either through touch (e.g. a key in a depressed or raised position, or a toggle switch positioned to the left or right) or through some audible feedback (e.g. verbal feedback such as "shift"/"unshift" or via a distinctive tone).

FP => If the state of the key is discernible through neither touch nor sound, then the system fails, otherwise it passes.

3.3.4 Dexterity

3.3.4-A Usability Testing by Manufacturer for Voters with Dexterity Disabilities

Test Method: Usability Testing by Manufacturer

3.3.4-B Support for Non-Manual Input

Test Method: Non-Manual Operation

3.3.4-C Ballot Submission and Vote Verification

Test Method: Non-Manual Operation

3.3.4-D Manipulability of Controls

F => If any operation in the session requires tight grasping, pinching, or twisting of the wrist, then the system fails.

The tester shall also measure the activation force required by the controls. The test for measuring force is to use a linear force gauge with a peak indicator (manual or electronic) on the actual controls. This tool can be used for measuring push and/or pull forces. The force gauge shall have an accuracy of at least ± 0.1N (0.02 lbs) and range from zero to at least 27 N (6 lbs).

FP => If the activation force for any control exceeds 5 lbs, then the system fails, otherwise it passes.
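The gauge is specified in newtons while the failure limit is given in pounds, so a unit conversion is needed before comparing. The sketch below performs that conversion using the standard value of one pound-force in newtons; the function names and sample readings are hypothetical.

```python
NEWTONS_PER_LBF = 4.4482216152605  # one pound-force expressed in newtons


def lbf_to_newtons(force_lbf: float) -> float:
    """Convert a force in pounds-force to newtons."""
    return force_lbf * NEWTONS_PER_LBF


def activation_force_fails(peak_force_n: float, limit_lbf: float = 5.0) -> bool:
    """Return True if a peak reading in newtons exceeds the limit given in pounds."""
    return peak_force_n > lbf_to_newtons(limit_lbf)


print(round(lbf_to_newtons(5.0), 2))   # the 5 lb limit in newtons
print(activation_force_fails(18.0))    # hypothetical reading below the limit
print(activation_force_fails(25.0))    # hypothetical reading above the limit
```

The 5 lb limit is a little over 22 N, which lies inside the 27 N gauge range required above.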

3.3.4-E No Dependence on Direct Bodily Contact

Test Method: The tester shall proceed through an entire voting session, using the conventional manual controls. The tester need not complete a vote for every contest, but must at least proceed through ballot initiation, vote in at least one conventional contest, vote for at least one write-in candidate, and perform final vote verification and casting. Throughout the session, the tester shall avoid direct bodily contact with the system. This can be done by use of a non-conductive probe and/or non-conductive gloves when manipulating the controls.

PF => If all of the controls respond properly, then the system passes, otherwise it fails.

3.3.5 Mobility

3.3.5-A Clear Floor Space

Test Method: As always, the tester shall ensure that the system has been set up according to the documentation supplied by the manufacturer. The tester shall then measure (using a conventional tape measure) the floor area intended for occupation by the voter. The area to be measured must be clear of all obstructions and overhanging elements.

F => If the measured area cannot contain a 30x48 inch rectangle, then the system fails.

The tester shall then determine whether the floor area is an integral part of the voting system, or whether this area is assumed by the system to be supplied as part of the polling place infrastructure.

If the area is integral to the system, the tester shall measure the slope as follows. Use a 24 inch level and a block of material exactly half an inch thick. Place the level on the floor and rotate it around the center of the area to determine the direction of slope if any. If there is a significant slope, place the block at the lowest point approximately 12 inches from the center of the area. Then place one end of the level on the block and the other end across the center from it (so that the level is along the diameter of a centered circle).

F => If the level is sloped downwards towards the block, then the system fails.

If the floor area is assumed by the system to be supplied as part of the polling place infrastructure, the tester shall examine the installation documentation to verify that it calls for a slope of no greater than 1:48.

F => If the installation documentation does not specify a floor area slope of 1:48 maximum, then the system fails.

3.3.5-B Allowance for Assistant

Test Method: As always, the tester shall ensure that the system has been set up according to the documentation supplied by the manufacturer and then approach the station in a wheelchair, oriented and located as intended by the manufacturer. An assistant shall attempt to accompany the tester in the voting area. Note that this area may be open or may comprise the inside of a shielded voting booth. The assistant may stand or be seated, as appropriate for the system.

F => If there is inadequate room for the assistant to enter or leave the area, then the system fails.

F => If there is inadequate room for the assistant to accompany the tester, then the system fails.

The tester and assistant then proceed through the entire voting session (including ballot initiation, verification, and submission, as appropriate) using the default ballot choices.

Through contest #9, the voting is accomplished by having the tester view the ballot and then give oral instructions to the assistant who carries them out (as if the tester were sighted but had dexterity disabilities). In the remaining contests, the voting is accomplished by having the assistant view the ballot and then give oral instructions to the tester who carries them out (as if the tester had vision disabilities, but not dexterity disabilities).

F => If either the tester or assistant has significant difficulty viewing the ballot or other relevant material, then the system fails.

The tester shall assess, based on the voting session, whether or not there are significant difficulties in executing the ballot (e.g. because needed controls are hard to reach or to manipulate). Execution includes ballot initiation, verification, and submission, as well as selection of candidates.

F => If either the tester or assistant has significant difficulty executing the ballot, then the system fails.

3.3.5-C Visibility of Displays and Controls

Test Method: The tester shall deploy the voting station according to the instructions of the manufacturer (including lighting) and approach the station in a wheelchair, oriented and located as intended by the manufacturer. The tester shall have vision no worse than 20/40 corrected. The tester shall determine if there is significant difficulty in seeing any of the controls, keys, or audio jacks, or in reading any of the labels, displays, or other elements of the voting station intended for the voter. Potential problems include inadequate font size or excessive glare.

FP => If there is significant difficulty in the visibility of elements intended for the voter, then the system fails, otherwise it passes.

3.3.5.1 Controls within Reach

3.3.5.1-A Forward Approach, No Obstruction

Test Method: The tester shall measure the high and low reach points of the voting station using a conventional tape measure. See Figure 3-1 for guidance.

PF => If both reach points meet the specifications, then the system passes, otherwise it fails.

3.3.5.1-B Forward Approach, with Obstruction

Test Method: This general requirement is tested specifically under its sub-requirements.

3.3.5.1-B.1 Maximum Size of Obstruction

Test Method: The tester shall measure the depth and height of the forward obstruction of the voting station using a conventional tape measure. See Figure 3-2 for guidance.

PF => If the depth and height meet the specifications, then the system passes, otherwise it fails.

3.3.5.1-B.2 Maximum High Reach over Obstruction

Test Method: The tester shall measure the high reach point of the voting station using a conventional tape measure. See Figure 3-2 for guidance.

PF => If the high reach point meets the specifications (based on obstruction depth, measured previously), then the system passes, otherwise it fails.

3.3.5.1-B.3 Toe Clearance under Obstruction

Test Method: The tester shall measure the toe clearance depth and width of the voting station using a conventional tape measure. See Figure 3-2 for guidance.

PF => If the toe clearance depth and width meet the specifications, then the system passes, otherwise it fails.

3.3.5.1-B.4 Knee Clearance under Obstruction

Test Method: The tester shall measure the knee clearance depth and width of the voting station using a conventional tape measure. See Figure 3-2 for guidance. The depth shall be measured at heights of 9, 18, and 27 inches.

PF => If all of the knee clearance measurements meet the specifications, then the system passes, otherwise it fails.

3.3.5.1-C Parallel Approach, No Obstruction

Test Method: The tester shall measure the high and low reach points of the voting station using a conventional tape measure. See Figure 3-3 for guidance.

PF => If both reach points meet the specifications, then the system passes, otherwise it fails.

3.3.5.1-D Parallel Approach, with Obstruction

Test Method: This general requirement is tested specifically under its sub-requirements.

3.3.5.1-D.1 Maximum Size of Obstruction

Test Method: The tester shall measure the depth and height of the side obstruction of the voting station using a conventional tape measure. See Figure 3-4 for guidance.

PF => If the depth and height meet the specifications, then the system passes, otherwise it fails.

3.3.5.1-D.2 Maximum High Reach over Obstruction

Test Method: The tester shall measure the high reach point of the voting station using a conventional tape measure. See Figure 3-4 for guidance.

PF => If the reach point meets the specifications (based on obstruction depth, measured previously), then the system passes, otherwise it fails.

3.3.6 Hearing

3.3.6-A Reference to Audio Requirements

Test Method: See tests for 3.3.3-C "Audio Features and Characteristics".

3.3.6-B Visual Redundancy for Sound Cues

Test Method: The tester shall proceed through an entire voting session, using the editable ballot session. The voting station shall be in full synchronized audio/visual mode. While voting for contest #2, the tester shall refrain from activity so as to cause the system to issue an inactivity alert (see requirement 3.2.6.1-E). If at any time an aural cue is used as a warning or alert (e.g. for inactivity or for attempted overvoting), there must also be a corresponding visual cue.

PF => If all aural cues are accompanied by visual cues, then the system passes, otherwise it fails.

3.3.6-C No Electromagnetic Interference with Hearing Devices

Test Method: The test methods to be used are fully documented in the ANSI standard as cited in the requirement.

PF => If the system meets the test criteria for category T4, then the system passes, otherwise it fails.

3.3.7 Cognition

3.3.7-A General Support for Cognitive Disabilities

Test Method: The features mentioned in the Discussion entry are tested as described in the cited sections.

3.3.8 English Proficiency

3.3.8-A Use of ATI

Test Method: See tests for 3.3.3-B "Audio-Tactile Interface".

3.3.9 Speech

3.3.9-A Speech not to be Required by Equipment

Test Method: The tester shall proceed through an entire voting session. The tester need not complete a vote for every contest, but must at least proceed through ballot initiation, vote at least one conventional contest, vote for at least one write-in candidate, and perform final vote casting. The tester shall verify that speech is never required to perform any of the functions of the system.

FP => If speech is required to perform any voting function, then the system fails, otherwise it passes.

Part 2: Combined-Requirement Test Methods (CRTMs) in Support of Usability and Accessibility Test Methods

Voting Performance Protocol (VPP)
Usability Testing by Manufacturer
Editable Ballot Session
Non-Editable Ballot Session
Privacy of Voting Session
Privacy of Cast Vote Record (CVR)
Language Clarity
Ballot Design
Default Characteristics
Font Characteristics
Use of Color
Scrolling and Feedback
Accidental Activation
Response Time
Inactivity Time
Alternative Languages
Operational Usability for Poll Workers
End-to-end Accessibility
Accessible Ballot Verification and Submission
Partial Vision
Audio-Tactile Interface
Audio Volume
Non-Manual Operation

Test Method: Voting Performance Protocol (VPP)

Covers requirements:

3.2.1.1-A Total Completion Performance
3.2.1.1-B Perfect Ballot Performance
3.2.1.1-C Voter Inclusion Performance
3.2.1.1-D Usability metrics from the Voting Performance Protocol
3.2.1.1-D.1 Effectiveness metrics for usability
3.2.1.1-D.2 Voting session time
3.2.1.1-D.3 Average voter confidence

This section describes the full Voting Performance Protocol (VPP) for the VVSG tests (those addressing section 3.2.1.1). A white paper by NIST on Usability Performance Benchmarks for the VVSG discusses the rationale behind many of the design decisions for the VPP.

The VPP is by far the most complex test within the Usability and Accessibility section. The general idea is to run a "mock" election under controlled conditions, and then derive metrics for the effectiveness, efficiency and satisfaction exhibited.

VPP Overview 1. Acronyms

The following acronyms are used throughout the VPP:

CI: Confidence interval - a statistical construct, expressing the degree of confidence in a specified accuracy of the result.
MW: Mann-Whitney test - used to detect whether there is a statistically significant difference between the current and nominal distribution of raw scores for the calibration system.
NIB: Number of invalid ballots - determined by the number of participants who responded on the post-test questionnaire that they did NOT try to follow instructions.
PBI: Perfect Ballot Index - effectiveness metric, as tested here.
TCS: Total Completion Score - effectiveness metric, as tested here.
VII: Voter Inclusion Index - effectiveness metric, as tested here.
VPP: Voting Performance Protocol - test method for usability performance.
VSUT: Voting System under test - the system for which conformance to the VVSG is being evaluated.
VVSG: Voluntary Voting System Guidelines - the set of requirements against which one tests conformance by a VSUT.

VPP Overview 2. Test Method Calibration System (TMCS)

In any test of this type there exists a possibility that a particular run of the test is invalidated by a problem somewhere in its preparation (including participant recruitment), administration, or results analysis. In order to guard against such measurement errors, two systems are tested in parallel: the actual voting system under test (VSUT), and a so-called test method calibration system (henceforth referred to simply as the calibration system). The nominal (i.e. expected) effectiveness results for the calibration system have been previously established. Therefore, if the current results from the calibration system match these nominal results closely enough, then the testing is presumed to be valid; otherwise the test itself is rejected.

As an analogy, one might use a standard kilogram artifact to ensure that an instrument for measuring mass is operating correctly. Note that the purpose of calibration is to ensure consistency among tests. It is not assumed that, within a single test, the calibration system and the VSUT are similar, either in their architecture or their effectiveness. Further details on calibration may be found within the test description below.

As of September 2008, no system has been repeatedly measured so as to establish its effectiveness characteristics for the purpose of calibration. In order to illustrate how the testing will work when such results are known, this test method refers throughout to two fictitious calibration systems, named "System X" and "System Y".

System X (fictitious)
System Y (fictitious)

VPP Overview 3. Statistical Techniques and Software Support

Here are some references that discuss this test's various statistical techniques. The Perl code is included not only to allow computation but also to provide a detailed description of the algorithms being used. Many of the complex VPP procedures described below are supported by the end-to-end Perl scripts.

Mann-Whitney:
- Mann-Whitney U (Wikipedia) article
- Perl source code.
Adjusted Wald Method:
- Estimating Completion Rates from Small Samples Using Binomial Confidence Intervals: Comparisons and Recommendations
- On-line calculator
- Perl source code.
Capability index:
- What is Process Capability?
- Perl source code

VPP Overview 4. Key Variables

As we proceed through the description, we shall refer to certain named quantities that must be observed or computed. These quantities are per-system, not for the two systems (VSUT and calibration) combined. Here is a summary:

Name	Meaning	Value Range
NPART	Number of participants who attempt to vote on the system	At least 100
NCAST	Number of participants who successfully cast a ballot on the system	100 - NPART
NPERFECT	Number of participants who successfully cast a perfectly correct ballot on the system.	0 - NCAST
NCORRECT-i	Number of voting opportunities successfully taken by i-th participant	0-28
PCORRECT-i	Proportion of voting opportunities successfully taken by i-th participant	0-1.00
TASKTIME-i	Number of seconds taken by the i-th participant to complete the voting task.	Typical value in the hundreds
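As an illustration only, the value ranges in this table could be checked mechanically before analysis. The Python sketch below is not part of the VPP software support. The function name and the message strings are hypothetical, and the limits are simply those listed in the table.

```python
def check_key_variables(npart, ncast, nperfect, ncorrect):
    """Return a list of range violations for one system's results.

    ncorrect is a list of NCORRECT-i raw scores, one per scored ballot.
    """
    problems = []
    if npart < 100:
        problems.append("NPART must be at least 100")
    if not 100 <= ncast <= npart:
        problems.append("NCAST must lie between 100 and NPART")
    if not 0 <= nperfect <= ncast:
        problems.append("NPERFECT must lie between 0 and NCAST")
    if any(not 0 <= score <= 28 for score in ncorrect):
        problems.append("each NCORRECT-i must lie between 0 and 28")
    return problems
```

An empty list means that every quantity falls within its stated range for that system. The check is run once for the VSUT and once for the calibration system, since the quantities are per-system.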

VPP Overview 5. Role of Manufacturer

The system manufacturer may or may not observe the test, according to the practices of the test lab. However, no manufacturer representative may have any contact with participants before, during, or after the test.

VPP Overview 6. Protocol Steps

Here are the major steps of the Voting Performance Protocol:

1. Recruit and schedule participants
2. Set up environment
3. Set up voting systems
4. Prepare participants
5. Conduct the voting
6. Debrief participants
7. Data collection
8. Check calibration results
9. Analyze data
10. Report system results

VPP Step 1. Recruit and schedule participants

The test lab must be sure that it has met all Federal and state legal requirements for the use of human subjects.

For a valid test, there must be at least 100 participants who succeed in casting the ballot for each of the two systems. The test lab will typically have to "over-recruit" to allow for subjects who do not show up, who are ineligible for various reasons, who fail to cast a ballot, or who do not follow instructions. This is a between-subjects test - each participant uses either the VSUT or calibration system, not both. The participant population is limited to individuals who:

are US citizens eligible to vote
are literate in English
have no disabilities
have no significant connection to any manufacturer of voting systems - e.g. no close relative as an employee or owner
do not have a background in political science or computer science

Both pools of participants should be balanced according to certain demographic criteria, with the target distribution as follows:

Gender -- female: 55%; male: 45%;
Race -- African-American: 10%; Non-Hispanic White: 80%; Hispanic: 10%;
Education -- High school graduate: 25%; Some college: 35%; College Graduate: 30%; Post-Graduate degree: 10%;
Age -- 18-24 yrs: 10%; 25-34 yrs: 20%; 35-44 yrs: 25%; 45-54 yrs: 25%; 55-64 yrs: 20%

Whoever performs the recruiting (either the test lab itself, or a recruitment company) should try to achieve the target percentages presented above. However, even if the actual test population varies from these targets, the test is still to be considered valid as long as the results from the calibration system are satisfactory.

Here is an example of a screening questionnaire that may be useful for recruitment.

The participants should be scheduled for staggered arrival at the testing site, so as to avoid excessive waiting. Since the voting session itself can easily take 10 minutes, it would be reasonable to separate participant arrivals by about 15 minutes, but the optimum interval depends strongly on the system being tested. See the page on "Performance Timing" in the benchmark data gathered earlier to get a sense of the range of voting times.

VPP Step 2. Set up environment

The goal, as far as possible, is to simulate a high quality polling place. Thus, any errors detected will not be traceable to extraneous environmental factors. There must be sufficient room in which to carry out the mock voting, using at least two voting stations. The voting area should have the following characteristics:

Size: minimum 12' by 15' by 8' high.
Ambient lighting should be in the range of 400-600 lx. If possible, use indirect lighting rather than overhead fixtures or direct sunlight so as to reduce glare.
Ambient noise levels should be below 40dB
Ventilation should be such as to avoid either a "stuffy" or "drafty" feeling.
Temperature should be between 68 and 76 degrees Fahrenheit
Relative Humidity should be between 20% and 60%

See this OSHA guideline for more detailed recommendations. This University of Wisconsin webpage is also useful.

VPP Step 3. Set up voting systems

The VPP, as well as many other usability and accessibility tests, requires the manufacturer to set up the voting system with a ballot based on the NIST standard ballot specification. The manufacturer is responsible for the actual ballot design (fonts, layout, etc). Once this ballot has been loaded on both the systems (the VSUT and calibration system), the test lab must set these up as described in their documentation and prepare them to receive votes. You may wish to consolidate this step with the test for poll worker usability.

VPP Step 4. Prepare participants

The test facilitators are responsible for preparing the participants for the test procedure. Follow these steps for each participant:

Greet incoming participants and verify that they are here for the appropriate purpose.
Administer a consent and release form, as appropriate. Here is an example of the form NIST used during development of the test, but this should be customized to suit the test lab. Usually, you would witness the participant's signature and then sign the form yourself as a witness.
Hand each participant a copy of the voting instructions and have him or her read it over.
Do not coach the participant on strategies for voting or on how to use the voting system. The goal is to minimize, if not eliminate entirely, any "facilitator effect".
Finally, escort the participant to the system (either the VSUT or the calibration system) to begin voting.

VPP Step 5. Conduct the voting

The two systems (the VSUT and calibration system) are to be tested in parallel. The test participants are not informed which system is being considered for certification.

Depending on the type of system, the facilitator may be required to start the voting process by enacting the role of the poll worker.

The facilitator will need to balance the dual goals of

enacting the poll worker role in a limited form as it pertains to the running of real elections and
refraining from otherwise interacting with the participant during their voting session so as to adhere to the testing protocol.

Thus, as a mock poll worker, a facilitator should interact with participants as is appropriate for the system (e.g., perhaps handing out paper ballots individually for opscan systems). In such a scenario, it is assumed that it is completely valid (with respect to the VPP) for a participant to ask for a new paper ballot if he/she spoils the current one (just as would occur in real life).

Aside from performing the limited role of a mock poll worker, the facilitators cannot interact with the participant during the actual voting session. Any interaction between the participant and the facilitator (outside the specified roles) once testing has started would invalidate the result from that participant. It is acknowledged that this is not the usual practice in real elections, where poll workers are available to assist voters if they have problems. However, this limitation is necessary to ensure valid and reliable data, since we need to eliminate the helpfulness of the facilitators as a factor. If the facilitator is asked for assistance, this should be the reply:

"I'm sorry. I can't provide you with any help. Please do the best you can. If you are stuck and cannot continue, you can stop."

For each system, there should be two "observers". They are responsible for determining two items of data via observation of each participant: 1) whether or not the participant successfully casts a ballot (regardless of whether the ballot choices are correct) and 2) the time taken to vote.

VPP Step 5.1 Successful Casting

If the participant fails to successfully cast a ballot, this should be noted, as this class of error will be counted separately from accuracy, timing, and satisfaction data. The participant counts as part of NPART but not of NCAST.

Examples of such failure include simply abandoning the system, or leaving the system under the mistaken impression that the ballot has been cast. If the observer needs to cast the ballot in order to clear the system for the next participant, this should be done as soon as it is obvious that the participant has abandoned the session, but note that this ballot does not count when computing accuracy metrics (perfect ballot index and voter inclusion index). One way to achieve this is for the observer to mark the ballot by writing in a vote for "DONOTUSE" as Governor, so that the ballot can be identified later on. Non-cast ballots are also excluded from timing and satisfaction data.

VPP Step 5.2 Collecting timing data during testing

The observers are responsible for timing. If the facilitator is required to initiate the voting session as a poll worker, timing will begin when the facilitator has completed initiation. If the participant starts the voting session, timing will begin when the participant reaches the system. Note that there may be a delay before the participant actually commences the first step (i.e., enters the activation card, receives the paper ballot, etc.); this delay is considered to be part of the session to be timed, since the time taken to understand how to begin is significant.

Once the participant has completed the voting task, the observers will record the elapsed time. What constitutes completion of the voting session depends on the type of system. For simple DREs, walking away from the system signals end of session. Other systems may involve the verification of a paper record or submittal of a paper ballot to an opscan device. Participants should use the system as it is intended to be used in normal practice. The recorded time is TASKTIME-i.

VPP Step 5.3 Sufficient number of ballots

As soon as 105 ballots have successfully been cast for the VSUT and for the calibration system, the test may be terminated. This ensures at least 100 valid ballots for the analysis even if there are as many as five "invalid ballots". The following procedures, however, do not assume that NCAST equals exactly 100, although that is the most typical case. Note that the statistical reliability of the test depends upon there being at least 100 validly cast ballots for each system.

VPP Step 6. Debrief participants

Once the participant has completed the voting session, a facilitator administers the post-test questionnaire, to be filled out by the participant him/herself. The only questions are (1) if they tried to follow the instructions telling them how to vote, (2) a question on confidence in their performance, and (3) a question on how well they liked the system.

VPP Step 6.1 "Bad-faith" Participants and the Validity of the VPP

If some participants answer "No" to the first question, it means that some of the collected data is invalid (since the test is based on the assumption that participants are actually attempting to accomplish the voting task). The resulting set of cast ballots may well reflect lower effectiveness scores than would be the case if all participants had made a good faith effort. The problem is to be handled as follows:

If all participants answered "Yes" (that they did try to follow instructions), then proceed with Data collection below.
If more than five participants answered "No" (that they did not make a good-faith effort to follow instructions) for either the VSUT or calibration system, then this run of the VPP test method must be abandoned, since the excessive number of "bad-faith" participants indicates a serious problem with execution of the test.
If there are between one and five "bad-faith" participants for either or both the VSUT and calibration system, and you can reliably identify their ballots (because, for instance, you are keeping track of which paper ballots are associated with which participants), remove those ballots from further consideration and then proceed with Data collection below.
If there are one to five invalid ballots within the VSUT or calibration set of cast ballots, but it cannot be determined which ones they are, then these invalid ballots potentially affect:
- The Mann-Whitney analysis of the calibration system
- The PBI of the VSUT
- The VII of the VSUT
For such a set, you may need to evaluate certain metrics two ways: 1) using the original set of ballots (including the unknown invalid ballots) and 2) using a best-case set of ballots, with the NIB worst-scoring ballots removed, where NIB = the number of invalid ballots. The NIB "worst" ballots are removed as a proxy for removing the NIB invalid ballots. The resulting adjusted set represents an upper bound (best-case) for the effect of the invalid ballots. This is to ensure against "false failure" of a VSUT.
For the calibration system: run the Mann-Whitney analysis using both the original and best-case set of ballots. If both analyses yield a z-score outside the normal 95% CI, then this run of the VPP test method is invalid and must be abandoned.
For the VSUT:
- Evaluate the PBI and VII metrics, using the original ballot set. If the system passes both benchmarks, then no further compensation for invalid ballots is needed. The point is that if the system passes these tests even when "held back" by the invalid ballots, it may be assumed to have "really" passed.
- If the system fails either metric using the original ballot set, then re-run the evaluation using the best-case ballot set. If the system still does not meet the associated benchmark, then it fails.
- If the system meets a benchmark (either PBI or VII) using the best-case set, but does not meet it using the original set, then the result is indeterminate, and the VPP test method must be abandoned.
If the VPP test method is abandoned as invalid, it must be repeated later with a new set of participants.

Here is an abstract "pseudo-code" summary of the logic for handling invalid ballots:

Let:
NIB  = #invalid ballots
OSB  = original set of ballots
BCSB = "best-case" set of ballots: NIB worst removed

if NIB = 0
   use OSB for all purposes;
   proceed with VPP;
elsif NIB > 5 (for either VSUT or calibration system)
   abandon VPP - too many invalid ballots;
else (0 < NIB < 6)
   if you can identify the NIB invalid ballots 
      remove the NIB invalid ballots from set;
      use cleaned-up set for all purposes;
      proceed with VPP;
   else (you cannot identify the NIB invalid ballots)
      if ballot set is for calibration system
         evaluate MW using both OSB and BCSB;
         if neither matches nominal results
            calibration system indicates invalid test;
            abandon VPP;
         else
            calibration system indicates valid test;
            proceed with VPP;
         endif
      else (ballot set is for VSUT)
         compute PBI and VII result using OSB;
         if benchmarks met
            system passes;
         else [result of OSB is failure]
            recompute PBI and/or VII result (whichever failed) using BCSB;
            if new result is still failure
               system fails that benchmark;
            else (new result is pass)
               situation indeterminate: abandon VPP;
            endif
         endif
      endif
   endif
endif

The end-to-end Perl scripts (as described here) can perform this procedure.
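For readers who prefer something executable, the VSUT branch of the logic above might be sketched in Python as follows. This is a minimal illustration, not the Perl implementation referred to above. The names `best_case_set`, `vsut_outcome`, and `meets_benchmark` are hypothetical, and the benchmark test itself (PBI or VII) is left to a caller-supplied function because its definition lies outside this step.

```python
def best_case_set(scores, nib):
    """Drop the nib lowest-scoring ballots, as a proxy for the invalid
    ballots that could not be identified."""
    return sorted(scores)[nib:]


def vsut_outcome(scores, nib, meets_benchmark):
    """Return 'pass', 'fail', or 'abandon' for one benchmark (PBI or VII).

    scores:          per-ballot scores for the VSUT (original set)
    nib:             number of invalid ballots that could not be identified
    meets_benchmark: caller-supplied function taking a list of scores and
                     returning True if the benchmark is met (an assumption)
    """
    if nib > 5:
        return "abandon"   # too many invalid ballots
    if meets_benchmark(scores):
        return "pass"      # passes even when held back by invalid ballots
    if nib == 0:
        return "fail"      # nothing to compensate for
    if meets_benchmark(best_case_set(scores, nib)):
        return "abandon"   # passes only in the best case: indeterminate
    return "fail"          # fails even in the best case
```

If the invalid ballots can be identified, they are removed from `scores` first and the function is called with `nib` set to 0.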

VPP Step 6.2 Stipend Paid to Participants

Finally, a facilitator provides the participant with his/her compensation ($50.00) and thanks the participant for his/her time.

VPP Step 7. Data collection

For each participant/ballot, the following basic data needs to be collected. Note that it might not be possible to associate each participant with his/her ballot.

Which system did he/she exercise (VSUT/calibration)
Successful ballot casting? (yes/no)
If ballot was cast, number of voting opportunities successfully taken (0-28);
Time on task, in seconds.
Response to post-test question on "effort" (yes/no)
Response to post-test question on confidence (1-5)
Response to post-test question on likability (1-5)

These are all straightforward, except for counting the number of voting opportunities (NCORRECT-i) successfully taken. The counting procedure is described below.

VPP Step 7.1 Scoring Ballots

First, you must use the ballot as recorded, not as marked. That is, the result of the test is the internal electronic record of the ballot (whether resulting from a DRE or from optical scanning of paper or the like), not the DRE screen or the paper ballot as such. Note that an overvoted contest on a paper ballot usually results in no votes recorded for that contest. Also, the interpretation of a straight party vote together with votes for individual candidates is not always obvious.

For each ballot, you must count up the number of voting opportunities correctly executed, with 28 being a perfect score. Seventeen contests (not including the straight party contest) are vote-for-1, accounting for 17 points. In addition, there is a vote-for-5, a vote-for-2, and a vote-for-4 contest, accounting for the other 11 points.

Please note that a selection in the straight party contest does not contribute directly to the point total. Rather it is only the effect of that contest in selecting actual candidates (as reflected in the ballot-as-recorded) that is considered.

For all write-in choices, if the name is spelled exactly as given in the instructions, it is a correct vote, otherwise not. However, you should ignore extra spaces and any upper/lower case distinction when checking spelling.
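One possible reading of this rule is sketched below in Python. It assumes that "extra spaces" means leading spaces, trailing spaces, and repeated spaces between words, so a write-in with the space between names removed entirely would still count as misspelled. The function names and the example name used in the usage note are hypothetical.

```python
def normalize_write_in(name):
    """Trim the ends, collapse runs of whitespace, and ignore case."""
    return " ".join(name.split()).casefold()


def write_in_is_correct(recorded, instructed):
    """True if the recorded write-in matches the instructed name exactly,
    apart from extra spaces and upper/lower case."""
    return normalize_write_in(recorded) == normalize_write_in(instructed)
```

Under this reading, a recorded write-in of "  pat   EXAMPLE " matches an instructed name of "Pat Example", while "Pat Exampel" does not.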

Most of the contests on the ballot are vote-for-1, and for these the counting is simple: If the participant voted for the correct choice (and no one else), count it as 1, otherwise 0. If the instructions were to not vote (undervote) that contest, then, if the contest was unvoted, count it as 1, otherwise 0.

There are three vote-for-N contests. The County Commissioners contest is vote-for-5, so start with a score of 5. For each of the 5 instructed commissioners not voted for, subtract 1. For each un-instructed commissioner who was voted for, subtract 1. If the result is less than 0, count it as 0.

Likewise, the Water Commissioners contest is vote-for-2, so start with a score of 2. For each of the 2 instructed commissioners not voted for, subtract 1. For each un-instructed commissioner who was voted for, subtract 1. If the result is less than 0, count it as 0.

Finally, the City Council contest is vote-for-4, so start with a score of 4. For each of the 3 instructed candidates (the instructions deliberately call for undervoting) not voted for, subtract 1. For each un-instructed candidate who was voted for, subtract 1. If the result is less than 0, count it as 0.

Thus, the total raw score for each ballot is a number from 0 to 28 (which we will call NCORRECT-i). From this we immediately derive a scaled score for each ballot:

   PCORRECT-i = NCORRECT-i / 28
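The scoring rules above can be sketched in a few lines of Python. This is only an illustration, not the Perl scripts referred to in this document: the function names and the candidate labels in the usage lines are hypothetical. It relies on the observation that the vote-for-1 rule is the special case of the vote-for-N rule with a starting score of 1.

```python
def normalize(name):
    """Ignore upper/lower case and extra spaces, as allowed for write-ins."""
    return " ".join(name.split()).lower()


def score_contest(instructed, recorded, max_votes):
    """Score one contest from the ballot as recorded.

    Start at max_votes, subtract 1 for each instructed choice that was
    not voted for and 1 for each un-instructed choice that was voted for,
    and never go below 0.
    """
    want = {normalize(n) for n in instructed}
    got = {normalize(n) for n in recorded}
    missed = len(want - got)   # instructed but not voted for
    extra = len(got - want)    # voted for but not instructed
    return max(0, max_votes - missed - extra)


def scaled_score(n_correct, perfect=28):
    """PCORRECT-i = NCORRECT-i / 28."""
    return n_correct / perfect


# Hypothetical usage: a vote-for-4 contest with 3 instructed choices.
full = score_contest(["A", "B", "C"], ["a", "B ", "C"], max_votes=4)
partial = score_contest(["A", "B", "C"], ["A", "X"], max_votes=4)
```

Here `full` keeps the starting score of 4, while `partial` loses two points for the missed choices and one for the un-instructed choice.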

VPP Step 7.2 Software Support

The end-to-end Perl scripts (as described here) can perform this procedure.

VPP Step 7.3 Retention of Records

All records pertaining to the test data (whether created by the voting system or by the test facilitator) should be stored safely and privately for future reference. The purpose is twofold: first to protect participant privacy, and second to allow any questions about the test results to be resolved based on direct evidence.

VPP Step 8. Check calibration results

In order to ensure the validity of the testing procedure, the current results from the calibration system are compared to its nominal results. The nominal results have been previously validated as truly representative of the performance of the calibration system. Therefore, if the current results differ significantly from the nominal results, the entire test is rejected as invalid. In such a case, no conclusions can be drawn about whether the VSUT does or does not meet the requirements in section 3.2.1.1 XREF, i.e. the VSUT neither passes nor fails these requirements.

VPP Step 8.1 Total Completion Score (TCS)

It has been determined that the following Total Completion Scores represent typical performance of the established calibration systems.

System X: nominal TCS = 100/102 = 0.9804
System Y: nominal TCS = 100/105 = 0.9524

If the 95% confidence interval (CI) for the current results from the calibration system does not contain the appropriate nominal value, then the results of this execution of the VPP are invalid and must be ignored.

For example, suppose you are using System X (nominal TCS = 0.9804) as the calibration system and the current results are 100/105 (i.e. 100 successes in 105 attempts). Using the Adjusted Wald Method (click for online calculator), we find the 95% CI for this score to be [0.8906, 0.9823]. Since this CI contains the target value of 0.9804, there is no strong reason to assume that the current procedure is aberrant. However, a current TCS of 100/106 yields a CI of [0.8795, 0.9763], which would indicate lack of validity because it does not contain the target value.

As another example, if you were using System Y (nominal TCS = 0.9524) as the calibration system, and your current results were 100/100 (no failures to cast ballot), the resulting CI of [0.9683, 1.0000] would also indicate an invalid test.
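As an illustration of the calibration check, the following Python sketch computes a two-sided Adjusted Wald interval and tests whether it contains the nominal TCS. It is an assumption-laden sketch rather than the referenced Perl script or online calculator: the function names are hypothetical, and z = 1.96 is used throughout.

```python
import math


def adjusted_wald_ci(successes, trials, z=1.96):
    """Two-sided Adjusted Wald interval for a proportion.

    Adds z*z/2 successes and z*z/2 failures before applying the
    ordinary Wald formula, then clips the result to [0, 1].
    """
    n_adj = trials + z * z
    p_adj = (successes + z * z / 2) / n_adj
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


def calibration_is_valid(successes, trials, nominal_tcs):
    """True if the CI for the current results contains the nominal TCS."""
    low, high = adjusted_wald_ci(successes, trials)
    return low <= nominal_tcs <= high


# System X examples from the text (nominal TCS = 100/102).
print(adjusted_wald_ci(100, 105), calibration_is_valid(100, 105, 100 / 102))
print(adjusted_wald_ci(100, 106), calibration_is_valid(100, 106, 100 / 102))
```

For the 100/100 case this plain two-sided formula gives a lower bound of about 0.956 rather than the 0.9683 quoted above, so the referenced calculator apparently treats that boundary case differently. Either way the nominal value of 0.9524 for System Y falls outside the interval.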

There are Perl scripts to support the TCS calculation.

VPP Step 8.2 Mann-Whitney Analysis of Raw Scores

The Mann-Whitney test compares the distribution of nominal raw scores against the distribution of current scores. If these are sufficiently similar, the test procedure is assumed to be valid.

The test involves computing a so-called "U" score as a result of comparing the distributions. Take all pairs of scores (one from the current set and one from the nominal set). For each pair in which the current score is less than the nominal score, add 1 to U. For each pair in which the scores are equal, add 1/2 to U. The mean and standard deviation for U can be used to derive a z-score:

   Let ns = number of nominal scores 
   Let cs = number of current scores 
   
   U_mean = ns * cs / 2;

                       ns * cs * (ns + cs + 1) 
   U_std_dev = sqrt ( ------------------------- )
                                12

   z-score = (U - U_mean) / U_std_dev

If the z-score is outside the normal 95% CI (that is, not between -1.96 and +1.96), the two distributions are different enough to indicate an invalid test. For instance, suppose that we have exactly 100 values for both distributions and that we compute U = 5739.5, as described above. The mean for U is 5000, and the standard deviation is 409.27, yielding a z-score of 1.8069. Since this lies within the normal CI of [-1.96, 1.96], the test may be assumed to be valid.
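The U and z-score computation described above can be sketched in Python as follows. The function names are hypothetical and the sketch follows the formulas exactly as given here (in particular, it applies no tie correction to the standard deviation); it is not the referenced Perl script.

```python
import math


def u_statistic(nominal, current):
    """Count pairs as described: +1 when the current score is less than
    the nominal score, +1/2 when the two scores are equal."""
    u = 0.0
    for c in current:
        for n in nominal:
            if c < n:
                u += 1.0
            elif c == n:
                u += 0.5
    return u


def z_score(u, ns, cs):
    """Convert U to a z-score using U_mean and U_std_dev."""
    u_mean = ns * cs / 2
    u_std_dev = math.sqrt(ns * cs * (ns + cs + 1) / 12)
    return (u - u_mean) / u_std_dev


def calibration_is_valid(nominal, current, limit=1.96):
    """True if the z-score lies within [-limit, +limit]."""
    z = z_score(u_statistic(nominal, current), len(nominal), len(current))
    return -limit <= z <= limit


# Worked example from the text: 100 scores in each set and U = 5739.5.
print(round(z_score(5739.5, 100, 100), 4))
```

The nested loop is quadratic in the number of scores, which is not a concern for sets of about 100 ballots each.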

There are Perl scripts to support the Mann-Whitney calculation.

VPP Step 9. Analyze data

After ensuring that the results from the calibration system do not indicate invalid test results, we now analyze the results from the VSUT and compare them against the benchmarks set out in section 3.2.1.1 XREF of the VVSG.

VPP Step 9.1 Effectiveness: Total Completion Score (TCS)

The TCS is calculated simply as the ratio of the number of participants who successfully cast a ballot to the number of participants who attempted to vote on the system, i.e. TCS = NCAST / NPART. For instance, if 106 subjects attempted to vote, and 4 failed to cast the ballot, the TCS = 102/106 = 0.9623, and the associated 95% CI = [0.9039, 0.9883]. This CI is derived using the Adjusted Wald formula, which you can compute by entering the numerator and denominator into this online calculator or by using this Perl script. Of course, the Perl script also allows you to inspect the details of the computation.

FP => If the high end of the TCS CI is less than the benchmark value of 98%, then, for requirement "Total completion performance", the system fails, otherwise it passes.

Note that because this test (and each of those that follow) is "one-sided" (failure occurs only if the benchmark lies above the entire CI), it is even more conservative than the figure of 95% for the CI implies: the probability of a "false" failure is at most 2.5%. And of course, the farther the true value is above the benchmark, the lower that probability.

The end-to-end Perl scripts (as described here) perform the TCS calculation.

VPP Step 9.2 Effectiveness: Perfect Ballot Index (PBI)

The PBI is the ratio of the number of cast ballots containing no erroneous votes (i.e. raw score = 28) to the number of cast ballots containing one or more errors (raw score < 28), i.e. PBI = NPERFECT / (NCAST - NPERFECT). In the following example, let us assume there are 60 perfect ballots and 40 imperfect ballots - a measured PBI of 1.5 (60 / 40).

Apply the Adjusted Wald formula to the number of perfect ballots (successes = 60) and the number of all cast ballots (100). Let H = the high end of the resulting 95% CI (0.6907).
Therefore the high end of the CI is equivalent to a PBI of H / (1 - H) = 0.6907 / 0.3093 = 2.233. Since this is below the benchmark of 2.33, the system in this example would fail.

FP => If the high end of the PBI CI is less than the benchmark value of 2.33, then, for requirement "Perfect ballot performance", the system fails, otherwise it passes.
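The PBI check can be sketched in Python as follows. This is a hedged illustration, not the referenced Perl script: the function names are hypothetical, the Adjusted Wald upper bound uses z = 1.96, and the handling of the all-perfect edge case (returning infinity) is an assumption of this sketch.

```python
import math


def adjusted_wald_high(successes, trials, z=1.96):
    """High end of the two-sided Adjusted Wald interval, clipped to 1."""
    n_adj = trials + z * z
    p_adj = (successes + z * z / 2) / n_adj
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return min(1.0, p_adj + half_width)


def pbi_check(n_perfect, n_cast, benchmark=2.33):
    """Return (measured PBI, high end of the PBI CI, passes)."""
    imperfect = n_cast - n_perfect
    measured = n_perfect / imperfect if imperfect > 0 else math.inf
    h = adjusted_wald_high(n_perfect, n_cast)
    # Convert the proportion H into a ratio of perfect to imperfect ballots.
    pbi_high = h / (1 - h) if h < 1.0 else math.inf
    # The system fails only if the high end is below the benchmark.
    return measured, pbi_high, pbi_high >= benchmark


# Example from the text: 60 perfect ballots out of 100 cast.
measured, pbi_high, passes = pbi_check(60, 100)
print(measured, round(pbi_high, 3), passes)
```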

The end-to-end Perl scripts (as described here) perform the PBI calculation.

VPP Step 9.3 Effectiveness: Voter Inclusion Index (VII)

The VII is based on the set of accuracy scores (PCORRECT-i) for all the participants who cast their ballots.

First compute the mean (VII_M) and standard deviation (VII_SD) for all the PCORRECT-i scores.
The VII is calculated as a capability index (see "What is Process Capability?"). Set VII = (VII_M - 0.85) / (3 * VII_SD).

We calculate the high end of the 95% CI by adding to VII:

                  1          VII ** 2
1.96 * sqrt ( --------- + ------------- )
              9 * NCAST   2 * (NCAST-1)

For example, suppose we had 100 cast ballots with an average accuracy of 0.93 and a standard deviation of 0.11. The measured VII would then be 0.242, with a 95% CI of [0.169, 0.316]. Since the entire CI is below the benchmark of 0.35, the system in this example would fail.

FP => If the high end of the VII CI is less than the benchmark value of 0.35, then, for requirement "Voter inclusion performance", the system fails, otherwise it passes.
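A minimal Python sketch of the VII computation follows. The function name is hypothetical. The mean and standard deviation of the PCORRECT-i scores are taken as inputs so that the sketch makes no assumption about how the standard deviation is computed, and the low end of the CI is obtained by subtracting the same quantity that is added for the high end.

```python
import math


def vii_with_ci(vii_m, vii_sd, n_cast, lower_limit=0.85, z=1.96):
    """Return (VII, low end of CI, high end of CI)."""
    vii = (vii_m - lower_limit) / (3 * vii_sd)
    half_width = z * math.sqrt(
        1 / (9 * n_cast) + vii ** 2 / (2 * (n_cast - 1))
    )
    return vii, vii - half_width, vii + half_width


def vii_passes(vii_m, vii_sd, n_cast, benchmark=0.35):
    """The system fails only if the high end of the CI is below the benchmark."""
    _, _, high = vii_with_ci(vii_m, vii_sd, n_cast)
    return high >= benchmark


# Example from the text: 100 cast ballots, mean 0.93, standard deviation 0.11.
vii, low, high = vii_with_ci(0.93, 0.11, 100)
print(round(vii, 3), round(low, 3), round(high, 3), vii_passes(0.93, 0.11, 100))
```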

The end-to-end Perl scripts (as described here) perform the VII calculation.

VPP Step 9.4 Efficiency: Average Time on Task

We consider only those participants who successfully cast their ballots. The average time on task is calculated simply as the sum of times taken (TASKTIME-i) divided by their number (NCAST).

VPP Step 9.5 Satisfaction: Confidence and Likability

There are two satisfaction metrics, measured using a Likert scale of 1-5. These are both calculated as simple means: the sum of scores for all those who cast ballots, divided by NCAST.

VPP Step 10. Report system results

"Usability metrics from the Voting Performance Protocol" (3.2.1.1-D XREF) and its sub-requirements apply to the test lab, not to the VSUT. Therefore, it is not a pass/fail requirement. Rather, it directs the test lab to report all the metrics described above for effectiveness, efficiency, and satisfaction to the EAC as part of the test report. Report items should include:

Calibration system data - current results
- Identification (make and model) of the calibration system
- NPART and NCAST
- Measured TCS = NCAST / NPART
- 95% CI for TCS
- Distribution of raw scores
- Z-score resulting from Mann-Whitney comparison of current and nominal distributions
Effectiveness Metrics for the VSUT
- NPART and NCAST
- Measured TCS = NCAST / NPART
- 95% CI for TCS
- Distribution of raw scores
- NPERFECT
- Measured PBI = NPERFECT / (NCAST - NPERFECT)
- 95% CI for PBI
- Mean and standard deviation for the distribution of scaled scores (PCORRECT-i)
- Measured VII
- 95% CI for VII
Efficiency Metrics for the VSUT
- Mean for distribution of TASKTIME-i
- Standard deviation for distribution of TASKTIME-i (optional)
Satisfaction Metrics for the VSUT
- Mean for distribution of confidence responses
- Mean for distribution of likability responses

Test Method: Usability Testing by Manufacturer

Covers requirements:

3.2.1.2-A Usability Testing by Manufacturer for General Population
3.2.7-A.4 Usability Testing by Manufacturer for Alternative Languages
3.2.8.1-B Usability Testing by Manufacturer for Poll Workers
3.3.2-A Usability Testing by Manufacturer for Voters with Low Vision
3.3.3-A Usability Testing by Manufacturer for Blind Voters
3.3.4-A Usability Testing by Manufacturer for Voters with Dexterity Disabilities

A usability expert who is familiar with the Common Industry Format (CIF) shall examine the TDP to ensure the existence and adequacy of the test report submitted by the manufacturer. The expert shall verify that the report conforms to the formatting and content requirements of the CIF. The expert shall verify that the demographic characteristics of the subject pool meet the specifications of the particular requirement. Note that there are no requirements pertaining to the quantitative results of the test.

Most of the usability tests are oriented towards voters, and accordingly, the tasks within the test must encompass some voting activity. The usability tests for poll workers must encompass setup, operation, and shutdown of the system.

Unlike other CRTMs, this is not a test method that gets executed once in order to cover several requirements, but rather a test method that gets executed once per requirement.

F => If the formatting or content does not conform to the CIF, then, for requirement "Usability Testing by Manufacturer", the system fails.

F => If the subject pool does not conform to the required demographic characteristics, then, for requirement "Usability Testing by Manufacturer", the system fails.

F => If the tasks are not relevant (for voters or poll workers, as appropriate), then, for requirement "Usability Testing by Manufacturer", the system fails.

Test Method: Editable Ballot Session

Covers requirements:

3.2.2-D Notification of Ballot Casting
3.2.2.1-A Prevention of Overvotes
3.2.2.1-B Warning of Undervotes
3.2.2.1-C Independent Correction of Ballot
3.2.2.1-D Ballot Editing per Contest
3.2.2.1-E Contest Navigation

If the VSUT has an audio interface (i.e. within class VEBD-A), this test method must be enacted for both the visual and audio interfaces.

The tester shall fill out the ballot using the default ballot choices, except as follows:

While voting contest #2 (US Senate), the tester shall first indicate a vote for Dennis Weiford and then change it to Lloyd Garriss. This change must be possible before advancing to the next contest.
Just before voting in contest #8 (State Assemblyman), navigate sequentially backward to contest #5 (Lieutenant-Governor), and then forward to contest #8 again. It should be possible to see and modify the votes cast in contests #5, #6, and #7 (Lieutenant-Governor, Registrar of Deeds, and State Senator).
No votes are to be indicated in contest #11 (Water Commissioners).

PF => If the editing within contest #2 can be performed, then, for requirement "Ballot Editing per Contest", the system passes, otherwise it fails.

PF => If the navigation among contests #5, 6, 7, and 8 can be performed, then, for requirement "Contest Navigation", the system passes, otherwise it fails.

After initial completion of the ballot, the tester shall attempt to add a vote for John Hewetson in contest #2 (US Senate). This must be done without "clearing" the prior vote for Lloyd Garriss. The system may either refuse to accept the new vote or may change the selection from Garriss to Hewetson, but may not indicate a vote for both. Then the tester shall attempt to add a vote for Harvey Eagle in contest #12 (City Council). Again, the system may either refuse the new selection or change an old one, but it may not indicate the addition of a 5th vote.

FP => If the system at any point indicates more votes within a contest than allowed, then, for requirement "Prevention of Overvotes", the system fails, otherwise it passes.

The tester shall then attempt to add a vote for candidate Orville White in contest #11 (Water Commissioner) and then proceed to a point just prior to final casting of the ballot.

F => If by this time no warning has been given about undervoting in contest #11, then, for requirement "Warning of undervotes", the system fails.

If there has been a warning, return to contest #11 and add a vote for Gregory Seldon, so that the contest is no longer undervoted. Then withdraw the vote for Sheila Moskowitz in contest #9 (County Commissioner) and again proceed to the point just prior to final casting of the ballot.

F => If by this time no warning has been given about undervoting in contest #9, then, for requirement "Warning of undervotes", the system fails.

The tester shall then attempt to change the vote in contest #7 (State Senator) from Marty Talirico to Edward Shiplett.

F => If this change (or any of the previous changes) cannot be done autonomously, then, for requirement "Independent Correction of Ballot", the system fails.

Finally, the tester shall follow the system's instructions so as to cast the ballot (including final review and/or verification, as available). Upon doing so, the system must notify the voter that the ballot has been cast successfully.

PF => If the system notifies the voter that the ballot was cast, then, for requirement "Notification of Ballot Casting", the system passes, otherwise it fails.

Test Method: Non-Editable Ballot Session

Covers requirements:

3.2.2-D Notification of Ballot Casting
3.2.2.2-A Notification of Overvoting
3.2.2.2-B Notification of Undervoting
3.2.2.2-D Ballot Correction or Submission Following Notification

There are three sub-tests to be carried out in succession. The tester shall set up the VSUT to be in each of these states:

Warn about all overvoting and all undervoting
Warn about all overvoting and undervoting only for City Council (contest #12)
Warn about all overvoting but not undervoting

F => If the system cannot be configured to these three states, then, for requirement "Notification of Undervoting", the system fails.

For each of these conditions, the tester shall fill out the ballot in the standard way, except as follows:

When voting contest #2 (US Senate), the tester shall indicate a vote for both Dennis Weiford and Lloyd Garriss (overvote).
When voting contest #8 (State Assemblyman), the tester shall not indicate a vote for any candidate (undervote).
When voting contest #11 (Water Commissioner), the tester shall write in a vote for Bob Johnson, as well as voting for both Orville White and Gregory Seldon (overvote).
When voting contest #12 (City Council), the tester shall vote only for Donald Davis, Hugh Smith, and Reid Feister (undervote).
When voting Retention Question #1 (Retain Robert Demergue as Chief Justice) the tester shall mark neither the "yes" nor "no" boxes (undervote).
When voting Referendum #2 (PROPOSED CONSTITUTIONAL AMENDMENT D) the tester shall mark both the "yes" and "no" boxes (overvote).
When voting Referendum #4 (PROPOSED CONSTITUTIONAL AMENDMENT K) the tester shall mark neither the "yes" nor "no" boxes (undervote).

For each of the three sub-tests, the system must issue the appropriate warnings. It must always warn about overvoting in exactly these contests:

Contest #2 (US Senate),
Contest #11 (Water Commissioner),
Referendum #2 (PROPOSED CONSTITUTIONAL AMENDMENT D)

F => If the system does not consistently warn about all three of these contests being overvoted, then, for requirement "Notification of Overvoting", the system fails.

For sub-test #1 (all undervote warnings are enabled) it must warn about undervoting in exactly these contests:

Contest #8 (State Assemblyman),
Contest #12 (City Council),
Retention Question #1 (Retain Robert Demergue as Chief Justice)
Referendum #4 (PROPOSED CONSTITUTIONAL AMENDMENT K)

F => If the system does not issue undervote warnings for exactly these four contests, then, for requirement "Notification of Undervoting", the system fails.

For sub-test #2 (undervote warning for contest #12 only) it must warn about undervoting for just that contest.

F => If the system does not issue an undervote warning for exactly that contest, then, for requirement "Notification of Undervoting", the system fails.

For sub-test #3 (all undervote warnings are disabled) it must not issue any undervote warnings.

F => If the system issues any undervote warning for sub-test #3, then, for requirement "Notification of Undervoting", the system fails.
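The expected warnings for the three sub-tests can be summarized as data. The Python sketch below is only an aid for a tester recording observations: the function names and the short contest labels are hypothetical, and the sketch covers only the first submission of the ballot (before any correction).

```python
# Contests deliberately overvoted and undervoted on the test ballot.
OVERVOTED = {"Contest #2", "Contest #11", "Referendum #2"}
UNDERVOTED = {
    "Contest #8",
    "Contest #12",
    "Retention Question #1",
    "Referendum #4",
}


def expected_warnings(sub_test):
    """Return (expected overvote warnings, expected undervote warnings)."""
    if sub_test == 1:        # all undervote warnings enabled
        under = set(UNDERVOTED)
    elif sub_test == 2:      # undervote warning for contest #12 only
        under = {"Contest #12"}
    elif sub_test == 3:      # undervote warnings disabled
        under = set()
    else:
        raise ValueError("sub_test must be 1, 2, or 3")
    return set(OVERVOTED), under


def check_sub_test(sub_test, observed_over, observed_under):
    """Compare observed warnings against the expected sets."""
    exp_over, exp_under = expected_warnings(sub_test)
    return {
        "Notification of Overvoting": set(observed_over) == exp_over,
        "Notification of Undervoting": set(observed_under) == exp_under,
    }


# Hypothetical observation: sub-test #3, but one undervote warning appeared.
print(check_sub_test(3, OVERVOTED, {"Contest #12"}))
```

A value of False for a requirement in any sub-test corresponds to a failure of that requirement.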

At the conclusion of each sub-test, the tester shall attempt final casting of the ballot. The system must then give the tester the opportunity to correct his/her ballot. Typically, an optical scanner would return the paper ballot for correction, although other mechanisms may be possible.

F => If no such opportunity for ballot correction is given, then, for requirement "Ballot Correction or Submission Following Notification", the system fails.

If allowed to correct, the tester shall mark the "yes" box for Referendum #4 (PROPOSED CONSTITUTIONAL AMENDMENT K), so as to correct that one undervote, and then re-submit the ballot. For each sub-test, the system must again warn about the uncorrected overvotes as above.

F => If the system does not consistently warn about the three overvoted contests, then, for requirement "Notification of Overvoting", the system fails.

For the first two sub-tests, the system must again warn about the uncorrected undervotes as above. That is, in sub-test #1, it must warn about Contest #8 (State Assemblyman), Contest #12 (City Council), and Retention Question #1; and in sub-test #2 it must warn about Contest #12 (City Council) only.

F => If the system does not warn about the uncorrected undervotes in sub-test #1 and #2, then, for requirement "Notification of Undervoting", the system fails.

F => If the system issues any undervote warning for sub-test #3, then, for requirement "Notification of Undervoting", the system fails.

The tester shall then attempt to submit his/her ballot without further correction (i.e. for all sub-tests, there is only one attempt to correct).

F => If the system refuses to accept final casting of the ballot, then, for requirement "Ballot Correction or Submission Following Notification", the system fails.

PF => If the system notifies the voter that the ballot has been cast successfully, then, for requirement "Notification of Ballot Casting", the system passes, otherwise it fails.

Test Method: Privacy of Voting Session

Covers requirements:

3.2.3.1-A System Support of Privacy
3.2.3.1-A.1 Visual Privacy
3.2.3.1-A.2 Auditory Privacy
3.2.3.1-A.3 Privacy of Warnings
3.2.3.1-A.4 No Receipts

The system shall be set up using a layout compatible with the manufacturer's instructions. The layout includes the position and orientation of the equipment in relation to other polling place activity, such as a check-in desk, location of poll workers and judges, and of waiting voters.

This test requires two testers, a "voter" and a "bystander". The "voter" shall proceed through an entire voting session. For editable interfaces, use the editable ballot session above; for non-editable interfaces, use the non-editable ballot session (with warnings enabled for overvote and undervote). Note that the latter potentially includes submitting a ballot to a scanner and then correcting it and re-submitting. The voter should follow the instructions for voting as given by the system, including e.g. procedures for changing a ballot or for the use of a privacy sleeve. The point is to see whether privacy is violated even if the voter acts conscientiously.

In the case of an Acc-VS, the session must be enacted three times, using:

the conventional visual-tactile interface
the audio interface
the synchronized audio/visual interface with wheelchair access and the non-manual controls provided for voters with dexterity disabilities.

The "bystander" shall approach the voting station as closely as would typically be allowed in a polling place environment, if the bystander were an election official or another voter. A bystander would typically not be allowed to stand right next to the voter. The bystander attempts to determine any of the "voter's" choices through either visual or auditory cues. This attempt continues throughout the entire voting session, including ballot verification (e.g. as with a VVPAT system) and casting (not just when the voter is at the voting station).

FP => If the "bystander" can discover any voter choices via visual cues, then, for requirement "Visual Privacy", the system fails, otherwise it passes.

FP => If the "bystander" can discover any voter choices via auditory cues, then, for requirement "Auditory Privacy", the system fails, otherwise it passes.

FP => If the "bystander" can discover any voter choices via warnings, then, for requirement "Privacy of Warnings", the system fails, otherwise it passes.

FP => If the "bystander" can discover any voter choices by any other plausible means, then, for requirement "System Support of Privacy", the system fails, otherwise it passes.

FP => If the system issues a receipt whereby a voter could prove to another party how he or she voted, then, for requirement "No Receipts", the system fails, otherwise it passes.

Test Method: Privacy of Cast Vote Record (CVR)

Covers requirements:

3.2.3.2-A No Recording of Alternative Languages
3.2.3.2-B No Recording of Accessibility Features

This test should be run after the voting sessions that test for alternative languages (XREF Section 3.2.7) and for access by blind voters (XREF section 3.3.3). Also, if there is no electronic CVR for the system, then this test does not apply.

The tester shall examine the TDP for the system and determine the format of the electronic Cast Vote Record (CVR) to ensure that no accessibility data or alternative language data is part of the CVR design.

F => If the format of the CVR includes information on the language used by the voter, then, for requirement "No Recording of Alternative Languages", the system fails.

F => If the format of the CVR includes information on the accessibility features used by the voter, then, for requirement "No Recording of Accessibility Features", the system fails.

The tester shall examine a representation of the CVR generated by other voting sessions that tested for alternative languages and for access by blind voters, in which such data was potentially generated, and verify that such data was not recorded.

FP => If alternative language data was preserved in the CVR, then, for requirement "No Recording of Alternative Languages", the system fails, otherwise it passes.

FP => If accessibility data was preserved in the CVR, then, for requirement "No Recording of Accessibility Features", the system fails, otherwise it passes.

Test Method: Language Clarity

Covers requirements:

3.2.4-C Plain Language
3.2.4-C.1 Clarity of Warnings
3.2.4-C.2 Context before Action
3.2.4-C.3 Simple Vocabulary
3.2.4-C.4 Start Each Instruction on a New Line
3.2.4-C.5 Use of Positive
3.2.4-C.6 Use of Imperative Voice
3.2.4-C.7 Gender-based Pronouns

This section describes the assessment of language usage in voting system documentation for the VVSG tests. The Guidelines for Writing Clear Instructions and Messages for Voters and Poll Workers provide a basis for determining whether a given system's documentation is written at a professionally-recognized level of quality.

Two experts in the use of plain language shall proceed through an entire voting session and check the clarity of instructions and warnings intended for the voter. General principles and best practices known to the experts shall be used as criteria, as well as the sub-requirements of 3.2.4-C XREF. Note that these sub-requirements are all recommendations ("should") and are not to be treated as absolutes. E.g. one instance of the use of passive voice does not necessarily ensure failure of the (mandatory) main requirement.

For editable interfaces, use the editable ballot session above; for non-editable interfaces, use the non-editable ballot session (with warnings enabled for overvote and undervote).

If the VSUT has an audio interface (i.e. within class VEBD-A), the editable ballot session must be enacted for both the visual and audio interface.

In a real election, some of the messages intended for the voter originate with the voting system and some are mandated by election law. The VVSG requirements apply only to the former. However, in this testing situation, the system implements the NIST standard test ballot specification, which does not place constraints on the wording to be used for instructions and warnings and so these are all subject to scrutiny.

FP => If the system instructions are unclear enough that voters would have significant difficulty understanding warnings, notices, or instructions, then, for requirement "Plain Language", the system fails, otherwise it passes.

As the language review is taking place, the experts should also note violations of the detailed sub-requirements, even though these are not mandatory.

Warnings and alerts issued by the voting system should state: a. the nature of the problem; b. whether the voter has made a mistake or whether the voting system itself has malfunctioned; and c. the set of responses available to the voter.

PF => If all warnings or alerts clearly address these three aspects, then, for requirement "Clarity of Warnings", the system passes, otherwise it fails.

PF => If system instructions first state the condition, and then the action to be taken, then, for requirement "Context before Action", the system passes, otherwise it fails.

PF => If system instructions use familiar words and avoid technical or specialized words, then, for requirement "Simple Vocabulary", the system passes, otherwise it fails.

PF => If each logically distinct instruction starts on a new line, then, for requirement "Start Each Instruction on a New Line", the system passes, otherwise it fails.

PF => If system instructions generally state what to do, rather than what to avoid, then, for requirement "Use of Positive", the system passes, otherwise it fails.

PF => If system instructions directly address the voter, then, for requirement "Use of Imperative Voice", the system passes, otherwise it fails.

PF => If system instructions avoid the use of gender-specific pronouns, then, for requirement "Gender-based Pronouns", the system passes, otherwise it fails.

Test Method: Ballot Design

Covers requirements:

3.2.4-E Ballot Design
3.2.4-E.1 Contests Split among Pages or Columns
3.2.4-E.2 Indicate Maximum Number of Candidates
3.2.4-E.3 Consistent Representation of Candidate Selection
3.2.4-E.4 Placement of Instructions
3.2.4-F Conventional Use of Color
3.2.4-G Icons and Language

Two experts in ballot design shall proceed through an entire voting session and check the ballot design. General principles and best practices known to the experts shall be used as criteria, as well as the sub-requirements of 3.2.4-E XREF and requirements 3.2.4-F XREF and 3.2.4-G XREF. Note that some of these sub-requirements of 3.2.4-E XREF are mandatory and some are recommendations. For editable interfaces, use the editable ballot session above; for non-editable interfaces, use the non-editable ballot session (with warnings enabled for overvote and undervote).

If the VSUT has an audio interface (i.e. within class VEBD-A), the editable ballot session must be enacted for both the visual and audio interface. Note that certain sub-requirements (e.g. "Contests split among pages or columns") apply only to a visual interface.

In a real election, some aspects of ballot design originate with the voting system and some are mandated by election law. The VVSG requirements apply only to the former. However, in this testing situation, the system implements the NIST standard test ballot specification, which does not place constraints on ballot design, and so the full design is subject to scrutiny.

After proceeding through the voting session, the experts determine the kind and severity of ballot design problems exhibited by the system.

F => If there are ballot design problems serious enough that voters would have significant difficulty understanding and executing the ballot, then, for requirement "Ballot Design", the system fails.

F => If any of the mandatory sub-requirements of 3.2.4-E XREF are not met, then, for requirement "Ballot Design", the system fails.

It is expected that all the contests in the NIST standard test ballot specification (except contest #4 for Governor) will fit on a single page or screen.

PF => If all the contests except for Governor are presented on a single page or screen, then, for requirement "Contests Split among Pages or Columns", the system passes, otherwise it fails.

PF => If every contest clearly indicates the maximum number of choices for which one can vote, then, for requirement "Indicate Maximum Number of Candidates", the system passes, otherwise it fails.

PF => If all contests maintain the same relationship between the name of a candidate and the mechanism used to vote for that candidate, then, for requirement "Consistent Representation of Candidate Selection", the system passes, otherwise it fails.

PF => If ballot instructions are placed near to where they are needed by the voter, then, for requirement "Placement of Instructions", the system passes, otherwise it fails.

PF => If all uses of color within the ballot conform to common conventions, then, for requirement "Conventional Use of Color", the system passes, otherwise it fails.

PF => If every ballot icon is accompanied by a corresponding linguistic label, then, for requirement "Icons and Language", the system passes, otherwise it fails.

Test Method: Default Characteristics

Covers requirements:

3.2.5-B Resetting of Adjustable Aspects at End of Session
3.2.5-C Ability to Reset to Default Values

This test involves as many as six display characteristics.

Characteristic	System Class
Font size	VEBD-V
Contrast	VEBD-V
Audio volume	VEBD-A
Rate of speech	VEBD-A
Color saturation	Acc-VS
Synch Audio/Video Mode	Acc-VS

The tester shall proceed through a voting session using the default ballot choices, vote in the first three contests, and note the initial appearance (audio as well as visual) of each of the applicable characteristics listed above. Set the system to full audio/video mode if available. Before voting in the fourth contest, the tester shall change the font size, audio volume, and color saturation, as available. The tester shall then vote in the 4th contest and move on to the 5th. The tester then activates the mechanism provided to reset all the adjustable characteristics, and finally inspects the current appearance of all the applicable characteristics.

F => If the current appearance of any characteristic does not match its initial appearance, then, for requirement "Ability to Reset to Default Values", the system fails.

After voting the 5th and 6th contests, the tester shall change the contrast and rate of speech, as available. The tester shall then vote in the 7th contest. The tester then activates the mechanism provided to reset all the adjustable characteristics, and again inspects the current appearance of all the applicable characteristics.

F => If the current appearance of any characteristic does not match its initial appearance, then, for requirement "Ability to Reset to Default Values", the system fails.

Finally, if applicable, the tester shall change the synchronized audio/visual mode (e.g. if the default mode is full audio/visual, change to visual-only) and vote the 8th contest. The tester then activates the mechanism provided to reset all the adjustable characteristics, and again inspects the current appearance of all the applicable characteristics.

F => If the current appearance of any characteristic does not match its initial appearance, then, for requirement "Ability to Reset to Default Values", the system fails.

The tester votes the 10th contest and then changes the font size, audio volume, color saturation, and synchronized audio/visual mode, as available. The tester then completes the voting session, leaving these characteristics in their non-default state.

The tester then initiates a second voting session and proceeds until the first contest (straight party) is displayed. The tester inspects the current appearance of all the applicable characteristics.

FP => If the current appearance of any characteristic does not match its initial appearance as noted in the first session, then, for requirement "Resetting of Adjustable Aspects at End of Session", the system fails, otherwise it passes.

Test Method: Font Characteristics

Covers requirements:

3.2.5-D Minimum Font Size
3.2.5-F Use of Sans Serif Font

The tester shall proceed through an entire voting session (whether the system uses an electronic interface or an MMPB), using the default ballot choices. On each page, the tester shall measure (using a 15x magnifier) the height of capital letters in the smallest text intended for the voter. This includes any voter information, even if not part of the ballot, e.g. a page of system instructions to be posted in the voting booth.

F => If any such capital letter has a height less than 3.0mm, then, for requirement "Minimum Font Size", the system fails.

Also, on each page, the tester shall examine the font used for any text intended for the voter.

PF => If all the text intended for the voter is presented in a sans serif font, then, for requirement "Use of Sans Serif Font", the system passes, otherwise it fails.

The tester must identify any text intended for use by poll workers. This includes such items as setup and operation manuals, quick setup or troubleshooting sheets, and labels and instructions affixed to the equipment. For each identified item, the tester shall measure (using a 15x magnifier) the height of capital letters in the smallest text intended for poll workers.

F => If any such capital letter has a height less than 3.0mm, then, for requirement "Minimum Font Size", the system fails.

Test Method: Use of Color

Covers requirements:

3.2.5-J Accommodation for Color Blindness
3.2.5-K No Reliance Solely on Color

This section describes the assessment of color usage for the VVSG tests. The Guidelines for Using Color in Voting Systems provide a basis for determining whether a given system's color usage is at a professionally-recognized level of quality.

The review is performed by two experts in color vision. The experts shall proceed through a voting session using the default ballot choices, except that no vote is to be cast for Governor, so as to cause an undervote warning. The experts note any use of color beyond a simple monochrome presentation. This review applies to electronic displays and also to paper presentations, including paper ballots. The review also includes controls (such as knobs or buttons), instructions, and warnings, as well as ballot contents.

The experts shall look for examples of information presentation that might be confusing to voters with common types of colorblindness, especially protanopia and deuteranopia.

PF => If all presentations are judged to be readily comprehensible to colorblind voters, then, for requirement "Accommodation for Color Blindness", the system passes, otherwise it fails.

The experts shall also look for examples of presentations in which color is the exclusive means of conveying information. Use of multiple colors for text is acceptable, since the text itself conveys information. Likewise, colored icons are acceptable as long as they are otherwise distinguishable by shape or accompanying text. Examples of violation of this requirement would include icons or controls distinguishable only by color, such as the use of simple green and red buttons.

PF => If no presentations are found that rely solely on color, then, for requirement "No Reliance Solely on Color", the system passes, otherwise it fails.

Test Method: Scrolling and Feedback

Covers requirements:

3.2.6-A No Page Scrolling
3.2.6-B Unambiguous Feedback for Voter's Selection

The tester shall proceed through an entire voting session using the default ballot choices. For VEBD-V systems, the tester shall observe whether page scrolling is available. Page scrolling means that there are "off-screen" contents that can be made visible through the use of scroll bars or other mechanisms. Page scrolling is an operation on a single page and is not to be confused with simply advancing through several pages of information.

FP => If the system uses page scrolling, then, for requirement "No Page Scrolling", the system fails, otherwise it passes.

The tester shall also observe whether the selection of candidates and choices is conspicuously and unmistakably indicated by the system. Examples of acceptable feedback for a visual system would be an "X" or checkmark next to the chosen option or the use of highlighting around the chosen option.

F => If the visual feedback mechanism does not clearly indicate voter choices, then, for requirement "Unambiguous Feedback for Voter's Selection", the system fails.

For VEBD-A systems, the tester must re-enact the voting session using the audio interface, and again observe whether the selection of candidates and choices is conspicuously and unmistakably indicated by the system. For example, a spoken confirmation such as "You have selected John Smith" would be acceptable.

F => If the audio feedback mechanism does not clearly indicate voter choices, then, for requirement "Unambiguous Feedback for Voter's Selection", the system fails.

Test Method: Accidental Activation

Covers requirements:

3.2.6-C Accidental Activation
3.2.6-C.1 Size and Separation of Touch Areas
3.2.6-C.2 No Repeating Keys

If the VSUT is an Acc-VS, this test method must be enacted for both the ordinary and accessible controls (those designed for voters with dexterity disabilities).

A tester with usability expertise shall proceed through an entire voting session using the default ballot choices. The tester shall note whether any controls or touch areas on the screen are unusually sensitive or are located so as to be susceptible to unintentional contact (e.g. some voters tend to grip a screen at its lower corners). The tester judges whether the voter has a significant chance of accidentally activating one of the controls.

F => If the system presents significant vulnerabilities for accidental activation, then, for requirement "Accidental Activation", the system fails.

For touchscreen systems, the tester shall examine the touch areas for at least contests #4 (Governor) and #9 (County Commissioners). Using a ruler to measure distance and a stylus to perform the touching, the tester shall determine first that the touch areas used to vote for at least the first t candidates in each contest are separated as required.

F => If any vertical distance between centers of adjacent touch areas for voting is less than 0.6 inches, then, for requirement "Size and Separation of Touch Areas", the system fails.

F => If any horizontal distance between centers of adjacent touch areas for voting is less than 0.8 inches, then, for requirement "Size and Separation of Touch Areas", the system fails.

Then the tester shall determine that the size of the sensitive touch area for each of these candidates is at least of the size required.

F => If the size of any touch area for voting is less than 0.5 inches high or 0.7 inches wide, then, for requirement "Size and Separation of Touch Areas", the system fails.

The tester shall attempt a write-in vote for county commissioner. If the write-in mechanism is an on-screen keyboard, then the tester shall determine that the letter keys also meet the size and separation requirements.

F => If any vertical distance between centers of adjacent touch areas for write-ins is less than 0.6 inches, then, for requirement "Size and Separation of Touch Areas", the system fails.

F => If any horizontal distance between centers of adjacent touch areas for write-ins is less than 0.8 inches, then, for requirement "Size and Separation of Touch Areas", the system fails.

F => If the size of any touch area for write-ins is less than 0.5 inches high or 0.7 inches wide, then, for requirement "Size and Separation of Touch Areas", the system fails.
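
As a rough illustration of the size and separation checks above, the thresholds can be collected in a short Python sketch. The class and function names are hypothetical and are not part of the test suite. The sketch assumes the tester has already recorded each touch area's center position and dimensions in inches and has decided which areas count as adjacent.

```python
from dataclasses import dataclass

# Thresholds taken from the fail conditions above (all in inches).
MIN_VERTICAL_CENTER_DISTANCE = 0.6
MIN_HORIZONTAL_CENTER_DISTANCE = 0.8
MIN_HEIGHT = 0.5
MIN_WIDTH = 0.7


@dataclass
class TouchArea:
    """One measured touch area (hypothetical record layout)."""
    label: str
    center_x: float  # horizontal position of the center
    center_y: float  # vertical position of the center
    width: float
    height: float


def size_ok(area):
    """True if the area is at least 0.5 in. high and 0.7 in. wide."""
    return area.height >= MIN_HEIGHT and area.width >= MIN_WIDTH


def vertical_separation_ok(a, b):
    """True if two vertically adjacent areas have centers at least 0.6 in. apart."""
    return abs(a.center_y - b.center_y) >= MIN_VERTICAL_CENTER_DISTANCE


def horizontal_separation_ok(a, b):
    """True if two horizontally adjacent areas have centers at least 0.8 in. apart."""
    return abs(a.center_x - b.center_x) >= MIN_HORIZONTAL_CENTER_DISTANCE


# Illustrative measurements only; these are not data from any real system.
cand_1 = TouchArea("Candidate 1", center_x=2.0, center_y=1.0, width=3.0, height=0.5)
cand_2 = TouchArea("Candidate 2", center_x=2.0, center_y=1.75, width=3.0, height=0.5)
key_q = TouchArea("Q", center_x=1.0, center_y=4.0, width=0.5, height=0.5)
key_w = TouchArea("W", center_x=1.5, center_y=4.0, width=0.5, height=0.5)
```

With these sample values, the two candidate areas satisfy both the size and the vertical separation limits, while the two write-in keys would fail on width and on horizontal separation. Measurements that fall very close to a threshold should be judged from the ruler reading itself rather than from a floating-point comparison.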

The tester shall proceed through the voting session and note the effect of holding any manual control in place, including letter keys on a keyboard, "next page" or "previous page" icons on a touch screen, control buttons, or joysticks, and verify that none of them has a repetitive effect (e.g. holding down a "next page" control should not cause the system to advance through several pages). Particular attention should be paid to controls on an Acc-VS intended for use by voters with dexterity disabilities.

FP => If a control with a repetitive effect is found, then, for requirement "No Repeating Keys", the system fails, otherwise it passes.

Test Method: Response Time

Covers requirements:

3.2.6.1-A Maximum Initial System Response Time
3.2.6.1-B Maximum Completed System Response Time for Vote Confirmation
3.2.6.1-C Maximum Completed System Response Time for All Operations
3.2.6.1-D System Response Indicator

This test requires the use of a video system with an accurate on-screen timer to record the voting session. The timer must have a precision of at least 0.1 seconds.

If the VSUT is of type VEBD-A (audio interface), this test method must be enacted for both the visual and audio interface.

The tester shall proceed through the voting session using the editable ballot session described above, as the interaction with the system is recorded. The recording should capture screen events and also capture audio. Initial and completed response times, and the timing of a system activity indicator, shall be measured for at least the following events:

Initial activation of the ballot
Selecting a candidate
Changing a candidate selection
Transition to the next page
Transition to a previous page
Typing in the letters for a write-in candidate
Completion of typing in a write-in candidate
Final casting of the ballot

The tester must make some allowance for the sensitivity of controls. For instance, a touch area on a screen, or other controls may not respond until a certain amount of pressure has been exerted.

FP => If the system's initial response time for any of the events is greater than 0.5 seconds, then, for requirement "Maximum Initial System Response Time", the system fails, otherwise it passes.

F => If the system's visual completed response time for selection of a candidate exceeds 1.0 seconds, then, for requirement "Maximum Completed System Response Time for Vote Confirmation", the system fails.

F => If the system's audio completed response time for selection of a candidate exceeds 5.0 seconds, then, for requirement "Maximum Completed System Response Time for Vote Confirmation", the system fails.

FP => If the system's visual completed response time for any event exceeds 10.0 seconds, then, for requirement "Maximum Completed System Response Time for All Operations", the system fails, otherwise it passes.

F => If the system's completed visual response time for any event is greater than 1.0 second, but no system activity indicator appears within 0.5 seconds, then, for requirement "System Response Indicator", the system fails.
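
The response-time rules above can be summarized in a small Python sketch that maps recorded timings to the requirements they would fail. The record layout and names are hypothetical. The sketch assumes that timings have already been read off the video recording in seconds, that "selection of a candidate" refers only to the event "Selecting a candidate", and that a missing activity indicator is recorded as None.

```python
from dataclasses import dataclass
from typing import Optional

SELECT_EVENT = "Selecting a candidate"  # assumption: the only vote-confirmation event


@dataclass
class Measurement:
    """Timings for one event, read from the recording (hypothetical layout)."""
    event: str
    interface: str                       # "visual" or "audio"
    initial_s: float                     # time to initial system response
    completed_s: float                   # time to completed system response
    indicator_s: Optional[float] = None  # time until activity indicator, None if none seen


def failed_requirements(measurements):
    """Return the set of requirement names failed by the given measurements."""
    failed = set()
    for m in measurements:
        if m.initial_s > 0.5:
            failed.add("Maximum Initial System Response Time")
        if m.event == SELECT_EVENT:
            limit = 1.0 if m.interface == "visual" else 5.0
            if m.completed_s > limit:
                failed.add("Maximum Completed System Response Time for Vote Confirmation")
        if m.interface == "visual":
            if m.completed_s > 10.0:
                failed.add("Maximum Completed System Response Time for All Operations")
            # An indicator is needed only when the completed response exceeds 1.0 s.
            if m.completed_s > 1.0 and (m.indicator_s is None or m.indicator_s > 0.5):
                failed.add("System Response Indicator")
    return failed


# Illustrative timings only; these are not data from any real system.
page_turn = Measurement("Transition to the next page", "visual", 0.2, 0.8)
slow_cast = Measurement("Final casting of the ballot", "visual", 0.3, 4.0)
audio_pick = Measurement(SELECT_EVENT, "audio", 0.4, 5.5)
```

For these samples, the page transition fails nothing, the 4.0-second ballot cast with no indicator fails only "System Response Indicator", and the 5.5-second audio confirmation fails only the vote confirmation requirement.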

Test Method: Inactivity Time

Covers requirements:

3.2.6.1-E Voter Inactivity Time
3.2.6.1-F Alert Time

If the VSUT is of type VEBD-A (audio interface), this test method must be enacted for both the visual and audio interface.

The tester shall determine the voter inactivity time from the system documentation.

F => If the voter inactivity time is not documented, then, for requirement "Voter Inactivity Time", the system fails.

F => If the voter inactivity time is documented as less than two minutes or greater than five minutes, then, for requirement "Voter Inactivity Time", the system fails.

If a valid voter inactivity time is not documented, the test method may be terminated. Otherwise, the tester proceeds through the voting session up to contest #3 (US Representative). At that point, the tester ceases interaction with the system and begins timing the duration until the system issues an inactivity alert.

F => If the measured inactivity time is not within 5% of the documented inactivity time, then, for requirement "Voter Inactivity Time", the system fails.

Within five seconds after the alert, the tester shall cast a vote in contest #3 and verify that the system is now active again, without the need for intervention by a poll worker.

F => If the system cannot be re-started by the voter, then, for requirement "Alert Time", the system fails.

After proceeding to contest #5 (Lieutenant-Governor), the tester shall again cease interaction with the system and again verify that the inactivity alert is given after the appropriate interval.

F => If the measured inactivity time is not within 5% of the documented inactivity time, then, for requirement "Voter Inactivity Time", the system fails.

The tester shall continue inactivity and measure the "alert" time from when the inactivity alert was given until the system goes into an inactive state (i.e. does not respond to normal voter interaction).

F => If the alert time is less than 20 seconds or greater than 45 seconds, then, for requirement "Alert Time", the system fails.
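
The timing limits in this test method reduce to simple arithmetic. For instance, a documented inactivity time of 180 seconds gives an acceptable measured range of 171 to 189 seconds (180 plus or minus 5%). The Python sketch below restates the checks. The function names are hypothetical, all times are assumed to be in seconds, and the 5% tolerance is assumed to be inclusive.

```python
def documented_time_valid(documented_s):
    """Documented inactivity time must be between two and five minutes."""
    return 120 <= documented_s <= 300


def measured_matches_documented(measured_s, documented_s, tolerance=0.05):
    """Measured inactivity time must be within 5% of the documented time."""
    return abs(measured_s - documented_s) <= tolerance * documented_s


def alert_time_valid(alert_s):
    """Alert time must be between 20 and 45 seconds."""
    return 20 <= alert_s <= 45
```

For example, with a documented time of 120 seconds, a measured value of 124 seconds is acceptable and 127 seconds is not.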

Test Method: Alternative Languages

Covers requirements:

3.2.7-A General Support for Alternative Languages
3.2.7-A.1 Voter Control of Language
3.2.7-A.2 Complete Information in Alternative Language
3.2.7-A.3 Auditability of Records for English Readers

The tester shall examine the TDP of the system and determine the set of alternative languages for which the manufacturer claims support. For each such language, if the primary tester is not fluent in that language, there shall be an adjunct tester who is fluent. This test requires two systems, one to serve as the "base" English system (A), and the other to serve as the alternative language system (B).

For all systems (other than audio-only, such as a vote-by-phone system), the test method must be enacted in visual mode for each written alternative language. In addition, if the VSUT is of type VEBD-A, the above test method must be enacted in audio mode once for each alternative language, written or unwritten. Note that the Acc-VS supports both visual and audio modes.

The tester(s) shall begin the voting session, using the English interface on system A and the alternative language interface on system B. Systems A and B are to be run "in parallel" to allow for comparison of the English and alternative presentation. For editable interfaces, use the editable ballot session above; for non-editable interfaces, use the non-editable ballot session (with warnings enabled for overvote and undervote). If system B is being tested in audio mode, system A should also be set to audio mode.

For VEBD systems only:

After completing the selection of a candidate for Governor (contest #4), the tester shall switch system B back to English.

F => If system B cannot be switched to English, then, for requirement "Voter Control of Language", the system fails.

Review all the ballot choices made in the first four contests on system B.

F => If any of the choices already made have been altered, then, for requirement "Voter Control of Language", the system fails.

After completing the selection of a candidate for State Senator (contest #7), the tester shall again switch languages, changing system B back to the alternative language.

F => If system B cannot be switched back to the alternative language, then, for requirement "Voter Control of Language", the system fails.

The tester shall review the first seven contests to verify that ballot choices have been preserved.

F => If any of the choices already made have been altered, then, for requirement "Voter Control of Language", the system fails.

End of procedure for VEBD systems.

Throughout the session, the tester(s) shall verify that no knowledge of English is necessary to successfully operate system B when it is in non-English mode. This includes ballot activation, selection of choices, review, verification, and ballot casting. Candidate names, however, may be presented in conventional Roman fonts.

F => If any operation of system B requires knowledge of English, then, for requirement "General Support for Alternative Languages", the system fails.

The tester(s) shall verify that all instructions, warnings, VVPAT material, and other text intended for the voter produced by the English system A are also produced correctly by the alternative language system B. Examples include:

Instructions and feedback on initial activation of the ballot (such as insertion of a smart card)
Instructions and feedback to the voter on how to operate the voting station, including settings and options (e.g. font size, volume control)
Instructions and feedback for navigation of the ballot
Instructions and feedback for contest choices, including maximum number to vote for and how to write-in candidates
Instructions and feedback on confirming and changing ballot choices
Instructions and feedback on final verification and casting of the ballot

PF => If system B provides all the information in the alternative language that system A provides in English, then, for requirement "Complete Information in Alternative Language", the system passes, otherwise it fails.

After completion of the session, the tester shall examine records intended for use in an audit, including paper and electronic, as appropriate. This may require "opening up" the system and going through poll closing procedures so as to gain access to the audit records. Verify that these records are intelligible to English-only readers/auditors. In particular, paper verification records must present information in both the alternative language (so as to be accessible to the voter) and in English (so as to be accessible to the auditors).

PF => If all the audit records are accessible to English-only readers, then, for requirement "Auditability of Records for English Readers", the system passes, otherwise it fails.

Test Method: Operational Usability for Poll Workers

Covers requirements:

3.2.8-A Clarity of System Messages for Poll Workers
3.2.8.1-A Ease of Normal Operation
3.2.8.1-C Documentation usability
3.2.8.1-C.1 Poll Workers as target audience
3.2.8.1-C.2 Usability at the polling place
3.2.8.1-C.3 Enabling verification of correct operation

This section describes the assessment of Operational Usability for Poll Workers for the VVSG tests. The Style Guide for Voting System Documentation as well as the Guidelines for Writing Clear Instructions and Messages for Voters and Poll Workers provide a basis for determining whether a given system's instructions are at a professionally-recognized level of usability.

Two experts in usability shall play the role of poll workers who must operate the voting system, based on the system documentation. Start with the system as typically delivered to the polling place. You may assume that ballot definitions have already been loaded, but the system may be packaged as if delivered from a central warehouse. Accompanying system documentation is also delivered as it usually would be. System documentation may include instructions for complex operations and troubleshooting, but this will not be used in the test. The documentation may consist of paper manuals, quick setup guides, and electronic media, such as DVDs. The overall documentation strategy is up to the manufacturer.

The experts must first find the instructions for normal setup, operation and maintenance, and shutdown. It should be reasonably easy to isolate this "poll worker" material from the documentation of more complex procedures (e.g. ballot definition, equipment repair, or diagnostic testing). The poll worker documentation is to be reviewed for clarity, organization, appropriate level of writing, internal consistency, completeness, and other attributes of good documentation usability.

PF => If the poll worker documentation is written at a level readily understandable by non-experts, then, for requirement "Poll Workers as Target Audience", the system passes, otherwise it fails.

PF => If the poll worker documentation is organized for easy use in a polling place situation, then, for requirement "Usability at the Polling Place", the system passes, otherwise it fails.

PF => If the poll worker documentation clearly explains how to verify that the system is in a correct state for setup, operation, and shutdown, then, for requirement "Enabling Verification of Correct Operation", the system passes, otherwise it fails.

F => If the expert review of the poll worker documentation reveals any other serious problems for poll worker usability, then, for requirement "Documentation Usability", the system fails.

Based on the documentation, the experts shall go through an entire setup / operation (including the casting of at least t full ballots) / shutdown cycle. The purpose is to review 1) the accuracy of the documentation, and 2) the degree of difficulty of the procedures themselves. It is recognized that the setup, operation, and shutdown procedures involve a certain inherent degree of complexity; the expert review is intended to detect situations presenting special difficulties (physical or cognitive) to poll workers.

F => If the documentation contains significant inaccuracies or omissions with respect to the actual procedures, then, for requirement "Documentation Usability", the system fails.

FP => If the procedures are judged to be excessively difficult, complex, or error-prone, then, for requirement "Ease of Normal Operation", the system fails, otherwise it passes.

During this exercise, the experts shall review all messages and warnings generated by the system. Each message shall be reviewed for:

Accuracy - Does the message accurately reflect the state of the system?
Completeness - Does the message tell the poll worker what steps need to be taken?
Clarity - Does the message adhere to the guidance of requirement 3.2.4-C "Plain Language"?

PF => If all the messages encountered are deemed clear and usable, then, for requirement "Clarity of System Messages for Poll Workers", the system passes, otherwise it fails.

Test Method: End-to-end Accessibility

Covers requirements:

3.3.1-A Accessibility throughout the Voting Session
3.3.1-A.1 Documentation of Accessibility Procedures
3.3.1-C No Dependence on Personal Assistive Technology

This test is a general review intended to ensure accessibility throughout the voting session. The two testers shall have expertise in accessibility for the disabled.

The testers shall review the documentation provided by the manufacturer and confirm that it describes voting procedures covering at least those voters who have vision (partial vision or blind), dexterity, mobility, hearing, cognitive, language, or speech disabilities. It must be understandable by poll workers, who may have to explain to voters how to use the system.

The documentation should explain any special factors for operating the system in a polling place environment (e.g. placement of system, necessary clearances, etc.). Session startup procedures (such as plugging in a personal headphone or initiating use of non-manual input) should be described. In particular, the documentation should specify whether the voter or poll worker is expected to perform a given startup procedure.

PF => If the documentation clearly and accurately describes voting procedures for accessibility, then, for requirement "Documentation of Accessibility Procedures", the system passes, otherwise it fails.

The testers shall then attempt to follow the documented procedures throughout a voting session in as much detail as is necessary to evaluate their usability for voters with disabilities. Use the editable ballot session described above. The session must be enacted three times, using:

the conventional visual-tactile interface
the audio interface
the synchronized audio/visual interface with wheelchair access and the non-manual controls provided for voters with dexterity disabilities.

The voting session includes not only making ballot choices, but also session startup, ballot initiation, navigation, review, verification, and casting.

FP => If any of the procedures is judged to present significant difficulty for disabled voters, then, for requirement "Accessibility throughout the Voting Session", the system fails, otherwise it passes.

FP => If any step of a procedure requires the voter's use of personal assistive technology, then, for requirement "No Dependence on Personal Assistive Technology", the system fails, otherwise it passes.

Test Method: Accessible Ballot Verification and Submission

Covers requirements:

3.3.1-E Accessibility of Paper-based Vote Verification
3.3.1-E.1 Audio Readback for paper-based Vote Verification
3.3.3-E Ballot Submission and Vote Verification

Two testers with accessibility expertise shall proceed through an entire voting session using the default ballot choices.

P => If the system does not use a paper-based record for vote verification, then, for requirement "Accessibility of Paper-based Vote Verification", the system passes.

P => If the system does not use a paper-based record for vote verification, then, for requirement "Audio Readback for paper-based Vote Verification", the system passes.

If the system is one that generates a paper record (or some other durable, human-readable record) for ballot verification, then the testers shall verify that a mechanism is provided that can read that record and generate an audio representation of its contents.

PF => If the system provides audio readback for paper verification records, then, for requirement "Audio Readback for paper-based Vote Verification", the system passes, otherwise it fails.

Furthermore, the paper verification record must be accessible to voters with dexterity, mobility, and other disabilities. For example, the record must be positioned so as to be easily visible by a voter in a wheelchair.

PF => If the system's paper verification records are fully accessible, then, for requirement "Accessibility of Paper-based Vote Verification", the system passes, otherwise it fails.

If the voting station supports ballot submission for sighted voters, the testers shall proceed through the process of ballot submission, using the features provided for blind voters, and shall verify that these features constitute a viable mechanism for such voters.

P => If the system does not support ballot submission for sighted voters, then, for requirement "Ballot Submission and Vote Verification", the system passes.

PF => If the system allows blind voters to submit the ballot without significant difficulty, then, for requirement "Ballot Submission and Vote Verification", the system passes, otherwise it fails.

Test Method: Partial Vision

Covers requirements:

3.3.2-B Adjustable Saturation for Color Displays
3.3.2-C Distinctive Buttons and Controls

The tester shall directly examine any hardware buttons and controls intended for use by the voter and verify that no two have an identical shape, nor do any two have identical colors. This requirement does not apply to sizeable groups of keys, such as a conventional 4x3 telephone keypad or a full alphabetic keyboard.

F => If any pair of hardware buttons and controls have the same color or same shape, then, for requirement "Distinctive Buttons and Controls", the system fails.

Throughout the following test method, the tester shall make a similar check for the distinctiveness of any buttons and controls that are presented on-screen at the same time.

The tester shall select a low saturation color presentation and then proceed through the voting session. The tester shall note the displayed level of saturation. After voting for US Representative (Contest #3), the tester shall then select the highly saturated color option, and vote through contest #6, verifying that the new color is distinctively more saturated than the original.

F => If a higher saturation color cannot be obtained, then, for requirement "Adjustable Saturation for Color Displays", the system fails.

The tester shall navigate back to the first three contests and verify that they are being displayed with high saturation and that the original ballot choices were preserved.

F => If all previous contests are not now displayed in high saturation, then, for requirement "Adjustable Saturation for Color Displays", the system fails.

F => If all original ballot choices are not preserved, then, for requirement "Adjustable Saturation for Color Displays", the system fails.

After voting for Registrar of Deeds (Contest #6), the tester shall then re-select a low saturation color and vote through contest #9, and repeat the above process, verifying that the presentation is now of low saturation and that earlier ballot choices have been preserved.

F => If all previous contests are not now displayed in low saturation, then, for requirement "Adjustable Saturation for Color Displays", the system fails.

F => If all original ballot choices are not preserved, then, for requirement "Adjustable Saturation for Color Displays", the system fails.

F => If, throughout the session, any two on-screen buttons and controls have the same shape or same color, then, for requirement "Distinctive Buttons and Controls", the system fails.

Test Method: Audio-Tactile Interface

Covers requirements:

3.3.3-B Audio-Tactile Interface
3.3.3-B.1 Equivalent Functionality of ATI
3.3.3-B.2 ATI Supports Repetition
3.3.3-B.3 ATI Supports Pause and Resume
3.3.3-B.4 ATI Supports Transition to Next or Previous Contest
3.3.3-B.5 ATI Can Skip Referendum Wording

This test requires two systems. The tester shall proceed through an entire voting session, using the conventional visual interface on system A and the audio-tactile interface (ATI) on system B in parallel. Use the default ballot choices, except as noted below.

Check for the presence of full instructions and feedback in the ATI, including at least the items described in the discussion of requirement 3.3.3-B "Audio-Tactile Interface" and requirement 3.2.4-A "Completeness of Instructions".

F => If the ATI does not provide full instructions and feedback as described, then, for requirement "Audio-Tactile Interface", the system fails.

Check for the equivalence of functionality between system A (visual interface) and system B (ATI) throughout the voting session.

F => If system A provides any functionality that is absent in system B, then, for requirement "Equivalent Functionality of ATI", the system fails.

In contest #5, attempt to cause the ATI to repeat the candidates' names for Lt-Governor.

F => If the system cannot be made to provide this repetition, then, for requirement "ATI Supports Repetition", the system fails.

In contest #9, attempt to cause the ATI to pause and then resume as it announces the name of the 2nd candidate for county commissioner, and again for the 4th candidate.

F => If the system cannot be made to provide this pause and resume, then, for requirement "ATI Supports Pause and Resume", the system fails.

In contest #11, as the first candidate for water commissioner is being announced, skip ahead immediately to the next contest, for city council. As the second candidate for city council is being announced, return to contest #11, and then return to the already-voted contest #10 for Court of Appeals Judge. Finally, as contest #10 is being re-announced, skip ahead to contest #11.

F => If these operations cannot be performed, then, for requirement "ATI Supports Transition to Next or Previous Contest", the system fails.

As Referendum #2: PROPOSED CONSTITUTIONAL AMENDMENT D is being read, skip the reading of the full text of the amendment and go directly to the choice of voting yes or no.

F => If the system cannot be made to skip immediately to the voting choice, then, for requirement "ATI Can Skip Referendum Wording", the system fails.

Test Method: Audio Volume

Covers requirements:

3.3.3-C.4 Initial Volume
3.3.3-C.5 Range of Volume

The tester shall initiate the voting session using the ATI and the default ballot choices. Do not adjust the sound volume, so as to accept the default volume provided by the system. The volume produced during the announcement of candidates in contest #1 (President and Vice-President) shall be measured (see below) as the initial volume.

PF => If the initial volume is measured to be between 40 and 50 dB SPL, then, for requirement "Initial Volume", the system passes, otherwise it fails.

Next, the tester shall adjust the volume to the minimum allowed and measure the announcement of candidates in contest #2 (US Senate) as the minimum volume.

F => If the minimum volume is not approximately 20 dB SPL (± 10%), then, for requirement "Range of Volume", the system fails.

The tester shall then increase the volume gradually, up to the maximum allowed, and measure the announcement of successive candidates in the contests presented. If the volume control has discrete increments, the tester shall increase the volume by one increment for each step. If the volume control has a continuous adjustment, the tester shall attempt to increase the volume by an amount no greater than 10 dB SPL for each step. The ATI's "pause and resume" feature may be useful in performing these steps.

F => If the measured difference in volume between any two successive steps is greater than 10 dB SPL, then, for requirement "Range of Volume", the system fails.

F => If the final (maximum) volume is not approximately 100 dB SPL (± 10%), then, for requirement "Range of Volume", the system fails.
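The pass/fail criteria of this test method can be collected into a short script once the measurements are recorded. The following Python sketch is illustrative only and is not part of the test suite: the function names are hypothetical, and it assumes that "± 10%" means ten percent of the stated dB SPL figure (that is, 18 to 22 dB SPL for the minimum and 90 to 110 dB SPL for the maximum).

```python
from typing import Dict, List

TOLERANCE = 0.10  # assumed reading of "± 10%": a fraction of the target dB SPL value


def approx(measured: float, target: float, tol: float = TOLERANCE) -> bool:
    """Return True if measured lies within ± tol (as a fraction) of target."""
    return abs(measured - target) <= tol * target


def check_volume(initial_db: float, steps_db: List[float]) -> Dict[str, bool]:
    """Evaluate the recorded measurements against the criteria above.

    initial_db: level measured at the default setting (contest #1).
    steps_db:   levels measured at each step, ordered from the minimum
                setting (first element) to the maximum setting (last element).
    """
    initial_ok = 40.0 <= initial_db <= 50.0
    min_ok = approx(steps_db[0], 20.0)
    max_ok = approx(steps_db[-1], 100.0)
    # No two successive steps may differ by more than 10 dB SPL.
    increments_ok = all(
        abs(later - earlier) <= 10.0
        for earlier, later in zip(steps_db, steps_db[1:])
    )
    return {
        "Initial Volume": initial_ok,
        "Range of Volume": min_ok and max_ok and increments_ok,
    }
```

A True value means that no failure condition was triggered for that requirement under the measurements supplied; the tester's recorded observations remain the authoritative result.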

Measuring Sound Volume

Volume is measured in one of two ways, depending on whether the audio information is presented through open air or through headphones or a handset. For both modes:

Since speech is used as the test signal, it shall be continuous for the entire measurement period (at least 15 seconds) and averaged over that period.

The measuring equipment shall have an accuracy of at least ± 0.5 dB SPL, an A-weighting filter, and a range from 15 dB SPL to 120 dB SPL.
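The text above does not say how the speech level is to be averaged over the measurement period. One common convention, assumed here purely for illustration, is an energy average (an equivalent continuous level), in which each reading is converted from decibels to a power ratio before averaging; many integrating sound level meters report such a figure directly. The Python sketch below shows that calculation for a series of equally spaced A-weighted readings. The function name is hypothetical.

```python
import math
from typing import Sequence


def average_level_db(samples_db: Sequence[float]) -> float:
    """Energy-average equally spaced level readings (in dB) and return dB.

    Assumption: energy averaging is used; the test method itself only
    says the level is "averaged" over the measurement period.
    """
    if not samples_db:
        raise ValueError("at least one sample is required")
    # Convert each dB reading to a relative power, average, convert back.
    mean_power = sum(10.0 ** (s / 10.0) for s in samples_db) / len(samples_db)
    return 10.0 * math.log10(mean_power)
```

Note that an energy average weights louder passages more heavily than an arithmetic mean of the dB values would: readings of 50 and 60 dB average to roughly 57.4 dB rather than 55 dB.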

Open Air Volume

General setup and test methods are as described in IEEE 1329. The volume is measured as the dB SPL level of the audio information with a sound meter at the conventional head position(s) of a voter operating the voting system. Open air sound levels are to be measured in anechoic conditions to prevent reflections from affecting the measurement accuracy.

If the voting system is designed for operation when both sitting and standing, then measurements shall be taken for both operating positions.

Headphone/Handset Volume

The test is described in IEEE 269. The referenced standard specifies the required test equipment, test setup, and test procedures.

Follow the test methodology that is relevant to receiving audio through a private audio output device applicable to the VSUT. "Headphones" equates to the term "headsets" used in Clause 9 of the referenced standard.

For a HATS (Head and Torso Simulator), Type 3.3 ears shall be used as defined in Clause 5 of the referenced standard.

If the ERP (Ear Reference Point) is not specified by the manufacturer of the private audio output device, then the defaults in the referenced standard shall be used.

Test Method: Non-Manual Operation

Covers requirements:

3.3.4-B Support for Non-Manual Input
3.3.4-C Ballot Submission and Vote Verification

A tester with accessibility expertise shall proceed through the editable ballot session, using the visual/non-manual interface of the system. Do not at any time make use of your hands to operate the system. As you proceed through the voting session, verify the following.

The tester shall assess the basic usability of the mechanism provided for non-manual operation of the system. This includes such operations as selecting candidates, changing a vote, writing in a candidate, and navigating the ballot. It is not required that this operation be "just as easy" as manual operation, but it should be reasonably accessible. If the system provides several such mechanisms, they must each be evaluated.

F => If the mechanism for non-manual operation of the system causes significant difficulty, then, for requirement "Support for Non-Manual Input", the system fails.

The tester must also check that the same functions are supported for non-manual use as for manual. In particular, the functions exercised by the editable ballot session, such as changing votes, navigating back and forth, and writing in a candidate must all be supported.

PF => If non-manual use of the system is functionally equivalent to manual use, then, for requirement "Support for Non-Manual Input", the system passes, otherwise it fails.

If the voting station supports ballot verification and/or submission for non-disabled voters, the tester shall proceed through these processes using the features provided for voters with dexterity disabilities and shall verify that these features constitute a viable mechanism even for voters who have no use of their hands.

FP => If the mechanism for non-manual ballot verification and/or submission causes significant difficulty, then, for requirement "Ballot Submission and Vote Verification", the system fails, otherwise it passes.

End Usability and Accessibility Test Methods

3.2.1.1-A	Total Completion Performance
3.2.1.1-B	Perfect Ballot Performance
3.2.1.1-C	Voter Inclusion Performance
3.2.1.1-D	Usability metrics from the Voting Performance Protocol
3.2.1.1-D.1	Effectiveness metrics for usability
3.2.1.1-D.2	Voting session time
3.2.1.1-D.3	Average voter confidence

3.2.1.2-A	Usability Testing by Manufacturer for General Population
3.2.7-A.4	Usability Testing by Manufacturer for Alternative Languages
3.2.8.1-B	Usability Testing by Manufacturer for Poll Workers
3.3.2-A	Usability Testing by Manufacturer for Voters with Low Vision
3.3.3-A	Usability Testing by Manufacturer for Blind Voters
3.3.4-A	Usability Testing by Manufacturer for Voters with Dexterity Disabilities

3.2.2-D	Notification of Ballot Casting
3.2.2.1-A	Prevention of Overvotes
3.2.2.1-B	Warning of Undervotes
3.2.2.1-C	Independent Correction of Ballot
3.2.2.1-D	Ballot Editing per Contest
3.2.2.1-E	Contest Navigation

3.2.2-D	Notification of Ballot Casting
3.2.2.2-A	Notification of Overvoting
3.2.2.2-B	Notification of Undervoting
3.2.2.2-D	Ballot Correction or Submission Following Notification

3.2.3.1-A	System Support of Privacy
3.2.3.1-A.1	Visual Privacy
3.2.3.1-A.2	Auditory Privacy
3.2.3.1-A.3	Privacy of Warnings
3.2.3.1-A.4	No Receipts

3.2.3.2-A	No Recording of Alternative Languages
3.2.3.2-B	No Recording of Accessibility Features

3.2.4-C	Plain Language
3.2.4-C.1	Clarity of Warnings
3.2.4-C.2	Context before Action
3.2.4-C.3	Simple Vocabulary
3.2.4-C.4	Start Each Instruction on a New Line
3.2.4-C.5	Use of Positive
3.2.4-C.6	Use of Imperative Voice
3.2.4-C.7	Gender-based Pronouns

3.2.4-E	Ballot Design
3.2.4-E.1	Contests Split among Pages or Columns
3.2.4-E.2	Indicate Maximum Number of Candidates
3.2.4-E.3	Consistent Representation of Candidate Selection
3.2.4-E.4	Placement of Instructions
3.2.4-F	Conventional Use of Color
3.2.4-G	Icons and Language

3.2.5-B	Resetting of Adjustable Aspects at End of Session
3.2.5-C	Ability to Reset to Default Values

3.2.5-J	Accommodation for Color Blindness
3.2.5-K	No Reliance Solely on Color

3.2.6-A	No Page Scrolling
3.2.6-B	Unambiguous Feedback for Voter's Selection

3.2.6-C	Accidental Activation
3.2.6-C.1	Size and Separation of Touch Areas
3.2.6-C.2	No Repeating Keys

3.2.6.1-A	Maximum Initial System Response Time
3.2.6.1-B	Maximum Completed System Response Time for Vote Confirmation
3.2.6.1-C	Maximum Completed System Response Time for All Operations
3.2.6.1-D	System Response Indicator

3.2.7-A	General Support for Alternative Languages
3.2.7-A.1	Voter Control of Language
3.2.7-A.2	Complete Information in Alternative Language
3.2.7-A.3	Auditability of Records for English Readers

3.2.8-A	Clarity of System Messages for Poll Workers
3.2.8.1-A	Ease of Normal Operation
3.2.8.1-C	Documentation usability
3.2.8.1-C.1	Poll Workers as target audience
3.2.8.1-C.2	Usability at the polling place
3.2.8.1-C.3	Enabling verification of correct operation

3.3.1-A	Accessibility throughout the Voting Session
3.3.1-A.1	Documentation of Accessibility Procedures
3.3.1-C	No Dependence on Personal Assistive Technology

3.3.1-E	Accessibility of Paper-based Vote Verification
3.3.1-E.1	Audio Readback for paper-based Vote Verification
3.3.3-E	Ballot Submission and Vote Verification

3.3.2-B	Adjustable Saturation for Color Displays
3.3.2-C	Distinctive Buttons and Controls

3.3.3-B	Audio-Tactile Interface
3.3.3-B.1	Equivalent Functionality of ATI
3.3.3-B.2	ATI Supports Repetition
3.3.3-B.3	ATI Supports Pause and Resume
3.3.3-B.4	ATI Supports Transition to Next or Previous Contest
3.3.3-B.5	ATI Can Skip Referendum Wording

3.3.4-B	Support for Non-Manual Input
3.3.4-C	Ballot Submission and Vote Verification
