CRT Teleconference
Thursday, October 12, 2006

Participants: Alan Goldfine, Allan Eustis, David Flater, Max E., Nelson Hastings, Paul Miller, Philip Pearce, Sharon Laskowski, Wendy Havens

Draft Agenda:

1) Administrative updates (Allan E.)

2) Rescheduling the November 2 CRT phone meeting to October 26.

3) Discussion of revised "On Accuracy Benchmarks, Metrics, and Test Methods" (David F.). Please read:

4) Discussion of "Voting Machines: Reliability Requirements, Metrics, and Certification" (Max E.). Please read:
NOTE: Max's PowerPoint presentation is attached to this email.

5) Discussion of the remaining issues from "Issues List" (David F.). Please read:

6) Any other items.

Administrative Updates

  • Allan welcomed Philip Pearce as an official member of the TGDC. Paul Miller also became an official member as of today, October 12, 2006.

  • New members will be getting an orientation package very soon. As soon as the fourth new member joins (hopefully within the next couple of weeks) we will have an orientation teleconference with EAC Commissioner Davidson.

  • Alan Goldfine discussed rescheduling CRT meetings: the Nov. 2 meeting moves to Oct. 26 at 11:00 a.m., and the Nov. 16 and Nov. 30 meetings also move to 11:00 a.m.

On Accuracy Benchmarks, Metrics, and Test Methods - David Flater

This came about because of an issue highlighted in the draft: do we want to use a single high-level, end-to-end error rate for the system, or do we retain the individual error rates that were specified for each low-level operation in previous editions of the standard?

In accuracy assessment there is no value in having low-level error rates; other issues were also found:

  • Low-level versus single end-to-end error rate - when the full analysis is done, we get predictions of the end-to-end error rate. The context of doing this in a test lab is too narrow, so there is no value: the low-level error rates are not observable in a system-level test.

  • Previous versions used a probability ratio sequential test for the design of the accuracy assessment; this assumes you're doing a single test. It is more valuable to collect data through the entire test campaign.

  • Fixed-length versus sequential test plan - once enough evidence has been collected to verify that the system doesn't meet the accuracy benchmark, testing can be terminated. However, we may want to run the entire test campaign for other reasons.

  • Validity as a system test - The accuracy testing specified in VVSG'05 allowed the test lab to bypass portions of the system that would be exercised during an actual election. There are also issues such as those reported in CA volume testing, and the cost issue. David's position is that we have to do end-to-end testing.

Definition of the accuracy metric.

  • The accuracy metric in the 2002 VSS and VVSG'05 is ambiguous. Need to clarify.

  • Definition of "volume", e.g., a filled-in oval. The 1990 VSS: votes versus ballots. Define volume as votes, not as detection of marks on a ballot.

  • The Bernoulli process assumed by the 2002 VSS and VVSG'05 is an invalid model of tabulation. The system can do worse than miscount: it can manufacture votes. A Poisson process is a more valid model, allowing for the possibility of more than one error per unit of volume.
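The distinction above can be sketched numerically. This is an illustrative assumption, not part of the meeting: a Bernoulli trial caps the count at one error per unit of volume by construction, while a Poisson model assigns nonzero probability to multiple errors per unit (e.g., manufactured votes). The error rate used here is exaggerated purely for illustration.

```python
import math

# Illustrative per-vote error rate (assumption, exaggerated for clarity).
rate = 0.5

# Bernoulli model: a vote is either counted correctly or not, so the
# probability of MORE than one error on a single vote is zero by construction.
p_bernoulli_multi = 0.0

# Poisson model with mean `rate`: P(k >= 2) = 1 - P(0) - P(1),
# where P(k) = exp(-rate) * rate**k / k!
p_poisson_multi = 1.0 - math.exp(-rate) * (1.0 + rate)

print(p_bernoulli_multi)           # 0.0
print(round(p_poisson_multi, 4))   # 0.0902
```

Under the Poisson model roughly 9% of units would carry two or more errors at this (exaggerated) rate, a scenario the Bernoulli model cannot represent at all.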

  • In the determination of error, it is unclear how inaccuracies in ballot counts and totals of under votes and over votes factor in.

  • All of these changes have ramifications for what kind of benchmark makes sense. The old standard specified 1 error in a volume of 10M; in testing, only 1 error in 500K was demonstrated.

  • We may want to modify the benchmark to what can be practically demonstrated in testing.

Draft Requirements

  • Is the new approach OK overall?

  • Is the 1 in 10M benchmark still appropriate? If not, what should it be?

  • Is the 90% confidence level appropriate? If not, what should it be?

  • Should the test plan be fixed length, or should we stop as soon as there is sufficient evidence that the accuracy benchmark is not satisfied?
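The benchmark and confidence questions above can be made concrete with a back-of-the-envelope calculation. This is a hypothetical illustration, not from the meeting: under the Poisson model, demonstrating that the error rate is below a benchmark at a given confidence, with zero errors observed, requires a test volume n such that exp(-rate * n) falls below 1 - confidence.

```python
import math

def zero_error_volume(benchmark_rate: float, confidence: float) -> int:
    """Smallest volume n of error-free votes needed so that a system whose
    true error rate equals the benchmark would be rejected at the given
    confidence level (Poisson model: P(0 errors in n) = exp(-rate * n))."""
    return math.ceil(-math.log(1.0 - confidence) / benchmark_rate)

# The 1-in-10M benchmark at 90% confidence needs about 23 million
# error-free votes in a fixed-length test:
print(zero_error_volume(1e-7, 0.90))  # 23025851
```

This also shows why the fixed-length versus sequential question matters: a sequential plan could stop early once errors make the benchmark unattainable, while the fixed-length plan always runs the full ~23M-vote campaign.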


  • Allan - Is this dependent on the acceptance by TGDC of the volume test program?

  • David F - Some changes move us closer to a "California" test method - an end-to-end system test. The only requirement is sufficient volume to mitigate the risks.

  • Alan G - One disadvantage of the California approach is that it decreases the reproducibility of our (volume) tests.

  • Paul Miller - Saw the volume test in San Diego - concerns about user interaction with the interface, screen freezes, and jams with the VVPAT systems. Are these considered errors, and how can they be fixed? (David: these are operational (reliability) errors.)

Voting Machines: Reliability Requirements, Metrics and Certification - Max Etschmaier

  • Voting system functions: look at the whole system, but then look at the individual details.

  • In my last presentation I laid out my general philosophy; hopefully you've looked at the report. The section is limited in its scope but documents a lot of analysis. This is the first phase of the reliability requirements, and it extends into other subjects.

  • Example of a voting system:
      - Options generation: sets up the voting machine for an election; separate from the machine, but inputs data, so there are security issues.
      - Control program: the core of the voting machine; the model is invariant and does not change.
      - I/O device: a self-contained device; if it fails, there is no critical effect on the other systems.
      - Verification unit: verifies that the machine is working properly before, during, and after the election.
      - Machine core: the physical structure; holds all the other components together.
      - Alternate record: whether it is needed depends on legal requirements.

  • Discussed critical failures and usage pattern at last meeting.

  • With all we have, we can do a functional failure analysis.

  • Procedure: For every critical function - identify design requirements to avoid it, if none found, set limit on failure probability. For every non-critical function - determine failure probability, set limit if possible.
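The per-function procedure above can be sketched as a decision rule. Everything here is a hypothetical illustration (function names and structure are assumptions, not from Max's report):

```python
def disposition(is_critical: bool, has_avoiding_design: bool) -> str:
    """Apply the failure-analysis procedure to one voting-system function:
    critical functions get a design requirement that avoids the failure,
    or, failing that, a limit on failure probability; non-critical
    functions get their failure probability determined and limited
    where possible."""
    if is_critical:
        if has_avoiding_design:
            return "design requirement avoids failure"
        return "set limit on failure probability"
    return "determine failure probability; set limit if possible"

# Hypothetical examples of applying the rule:
print(disposition(is_critical=True, has_avoiding_design=False))
# set limit on failure probability
print(disposition(is_critical=False, has_avoiding_design=False))
# determine failure probability; set limit if possible
```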

  • Design requirements from the analysis. Do we look at the machine as a whole, or do we break it down and understand the pieces, allowing more meaningful statements? The architecture needs to be transparent, which requires separation of code and data. Components need to be separated between input and output devices. Fail-safe architecture where possible. No modification after certification. [Discussion: Alan G - you're explicitly forbidding the scenario where any changes can be made by the vendor before the machine is used. David F - the EAC posted a firm statement that if changes to machines have not been reviewed, the machine will not be certified; legally it is the responsibility of the state to make sure machines are certified. Paul M - any modifications that do not affect machine workings will be considered certified - a vendor and county (now state) decision.]

  • Reliability requirements. Critical and non-critical components.

  • The result is a voting machine that provides a secure repository of the original vote, feeds into precinct and regional systems, makes a recount (through the verification unit) available at any time, and has very few failures - any precinct would be able to make do with one spare voting machine. With this rate of failure, strategy #2 discussed before is not necessary.

  • A prototype should be produced.

  • Two parallel paths increase failure probability and decrease reliability.
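The claim above follows from simple reliability arithmetic. As an illustrative assumption (the numbers are not from the meeting): if every vote must pass through two record paths and a failure in either path is a system failure, the paths act in series for reliability purposes, so their failure probabilities roughly add.

```python
def series_failure_prob(p1: float, p2: float) -> float:
    """Probability that at least one of two independent paths fails,
    given each path's individual failure probability."""
    return 1.0 - (1.0 - p1) * (1.0 - p2)

# Two paths, each with a 1% failure probability (illustrative numbers),
# yield a combined failure probability of almost 2%:
print(round(series_failure_prob(0.01, 0.01), 4))  # 0.0199
```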

  • Certification requires very careful analysis. The vendor is responsible for compliance. Volume testing serves as validation and as the start of in-service ongoing performance monitoring. No modification after certification.

  • Submission for certification. Allan - What's different from the current EAC requirements? David F - the requirements specifying certification of component reliability and an enforceable assurance of technical and financial fitness.

  • Next steps. Find a path to implementation: 1) Define requirements for quality and configuration management, 2) transfer results to other parts of the VVSG, 3) examine transition for jurisdictions, and 4) examine transition for vendors. Formulate reliability requirements for VVSG: 1) specify format and 2) set limits on probabilities. [Decisions for TGDC as a hold]

  • Max is currently working on quality assurance and configuration management, which were handled completely differently in VVSG'05 and previous versions. He will deliver a report on this by the December meeting.

  • This is a framework. Decision needs to be made on how we want to go.

  • Discussion will continue at the next CRT meeting.

Next CRT meeting will be October 26 at 11:00 a.m.




Last updated: July 25, 2007