Friday, September 8, 2006 at 2:00 p.m. ET
0. TGDC and EAC updates from Allan, John Wack
1. Status of usability testing research, Sharon
2. Discussion of VVSG version 2 draft:
- general structure
- Beginning of document through vendor testing
3. Other progress and items, Sharon
Attendees: Allan Eustis, John Cugini, Philip Pearce, David Baquis, Alexis Scott-Morrison,
Whitney Quesenbery, John Gale
- John Wack on leave
- AE: A number of us will be participating in the MD and DC primary elections as judges. We've done some outreach, visiting and observing an election in WY, and we're going to Seattle on the 19th to see their primaries, including the L&A (logic and accuracy) testing beforehand as well as the post-election canvassing, to learn about other parts of the election process.
- AE: We are continuing to communicate with the EAC to speed up the process of getting Philip and Tricia on board, as well as nominees from ANSI and the National Association of State Election Directors.
- All pre-testing approvals for the usability research are in place, and Bill Killam is starting usability testing. The contract with Jenny Redish has been approved. We've identified issues that would slow down our process if not addressed, and we've identified machines the testers can use. On the machines we have, Allan has identified the ballot language that can be changed.
- We have a draft human subjects study and Paperwork Reduction Act questionnaire package for submission. Sharon will review the draft this weekend.
- Whitney: Is there a target date for the first preliminary report (on the research Bill is doing)? It will take a couple of weeks to run the testing, with initial results within 4 to 6 weeks. It would be nice to have something to start the discussion process by the first of December.
- Sharon: Contacted by the Commission on Law and Aging of the American Bar Association in DC. They're running a symposium in March on "Facilitating Voting as People Age - Implications of Cognitive Impairment." While they're looking at a lot of legal and policy issues in this workshop, they needed some input about the technology, so Sharon's 15-minute overview was useful for them. They are commissioning some papers on different policy issues, and Sharon sent the paper she wrote with Jenny. Sharon will be participating in the symposium.
Review of Report
- Whitney: General comments. Most of the time the order was quite good within sections. When it doesn't affect the flow in other ways, put the "shalls" in front of the "shoulds". Asked whether someone reading this for the first time would get a good idea of what was expected; my answer was yes. Read for ambiguity and simplicity. It looks like there are more comments than there actually are.
- John: A little
bit of a structuring issue. Now there are 3 subsections. Sharon suggests
keeping annotations on paper in case the EAC wants to know why it was
structured this way. 3.1 is pure prose, no hard requirements, just explaining
what we mean by usability and accessibility and general principles.
- Whitney: The paragraph about familiarity - what we hear is all about making sure people know when change happens. John put it in because voting is not something we do every day. We might want to reword it to say "to gain deeper expertise." Maybe wording such as "It needs to be self-teaching and walk you through the process."
- Whitney: Stumbling over the language regarding EVID. She doesn't have a good suggestion for rewording it. The issue is with the terms used - "electronic" and "editable". Alexis has the same issues. She is working on changing it, because it's not intuitive. Perhaps use "manual" versus "editable".
- John: Section
3.3. Usability requirements and accessible voting stations and how they
interact. Do we need to say more or less? Works for Whitney and Alexis.
- John: 3.2 covers general usability requirements, as opposed to accessibility, so they apply to general machines. This is a general overview, carried over from VVSG 05, quoting HAVA as requiring us to do this. Sharon says that people are reading the HAVA quote as the first requirement. We don't want people to think it's one of our requirements; something has to be done to point this out - a clarifying note or a figure. Our requirements are voluntary. We put a statement in there that says "All of our requirements are essentially an attempt to provide detailed requirements to meet the mandate of HAVA." EAC will weigh in on our text. We can't contradict HAVA.
- John: Performance Requirements. A paragraph explains that we're applying general usability principles to voting, defined as usability satisfaction. Not much sense in wordsmithing it at this point. Then it goes into the performance requirements themselves. Maybe we shouldn't look at this until we get Killam's research. Maybe some things, but anything with a rating of XXX we should hold off on. Question about the level of abstraction with which the requirement refers to the test. Another way of doing this is saying that it must meet the NIST test protocol, whatever it is, or we could put more of the testing information in here. Are you comfortable with the level of abstraction?
There's a huge debate on whether we should be looking at these tasks at all, and discussion about how much we can break it down. Maybe it should say you need to submit a report. Question about how much should be in the requirement: are these the flavor of what we're looking for? Whitney thinks we're not going to get statistical validity on individual task analysis; we're looking for an overall assessment across the board. The detailed tasks are there so we can make sure the test participants have actually exercised the system and that the overall performance of the system meets some baseline. This stuff should not go in the requirements. We should weight things so more frequent tasks count more heavily in the final score. The vendor may want a more detailed report. Breaking it down to a lower level - not that we wouldn't get the data, but that's not really the question; the question is, over the population, how well on average will voters do. We might want to break out some very specific things like write-ins or straight-party voting, so a jurisdiction could see whether a system handles the features it uses. The public report is not where this should go.
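As a discussion aid, here is a minimal sketch (in Python) of the frequency-weighted scoring idea above; every task name, frequency weight, and success rate in it is hypothetical, not data from the draft or the pilot test.

    # Sketch: frequency-weighted usability score. All task names, frequency
    # weights, and success rates below are hypothetical.
    def weighted_score(results):
        """Average per-task success rates, weighting frequent tasks more heavily."""
        total = sum(freq for freq, _ in results.values())
        return sum(freq * rate for freq, rate in results.values()) / total

    # task -> (frequency weight, observed success rate)
    results = {
        "vote a single contest":  (0.80, 0.95),
        "review and cast ballot": (0.13, 0.99),
        "straight-party vote":    (0.05, 0.98),
        "cast a write-in":        (0.02, 0.70),
    }
    print(round(weighted_score(results), 3))  # 0.952: rare write-in errors barely move it

Under this kind of weighting, a task performed by only 2% of voters cannot dominate the overall score, which matches the concern raised above.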
Is there a passing score in every sub-test? Each section must receive a certain score, with the total equaling the combined scores. Can you be really bad at one test and make up for it by doing really well on another? NIST should be able to take the data out of the pilot test and run it a couple of different ways. Are we in a situation where we're all over the map, and a slight change in the metric would push the possibility of passing one way or the other? Or are there machines that are clearly good and those that are not? One of the questions we must ask ourselves is "are we trying to say this is good enough, or are we trying to set a gold standard that machines should aspire to?" Conformance testing sets a low threshold. We're going to set the norm based on where we are. Having a usability test will make sure we don't have any horrible systems and give the machines that are trying to be better something to strive for.
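One way to frame the "make it up on another test" question is as a choice between a single combined threshold and per-subtest floors. The sketch below contrasts the two; both threshold values are invented purely for illustration.

    # Sketch: combined-total vs. per-subtest pass criteria. The thresholds
    # (0.85 overall, 0.60 per subtest) are invented for illustration.
    OVERALL_MIN = 0.85
    SUBTEST_MIN = 0.60  # floor so one bad subtest can't be averaged away

    def passes(subscores):
        overall = sum(subscores.values()) / len(subscores)
        return overall >= OVERALL_MIN and all(s >= SUBTEST_MIN for s in subscores.values())

    # Strong overall but very weak on write-ins: the combined score alone
    # would pass (0.853 >= 0.85), while the per-subtest floor catches it.
    print(passes({"general voting": 0.99, "write-ins": 0.58, "review": 0.99}))  # False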
- John Gale: In judging usability performance and thinking of write-ins, I'm not sure what percentage of write-ins you can average from state to state, but it's a low percentage, and it seems that usability should rank just as high for the 99% of people who have committed to a candidate.
Whitney: In NJ, write-ins are a critical factor. They chose their systems to make sure write-ins were handled correctly. We need to look at the metrics for anything used infrequently and make sure that a low score on it can't tip the balance. Specific tasks will be evaluated; whether or not each is reported as pass or fail, it will get reported.
- Whitney: Security.
The paper that John circulated on Security (Software Independence) was
quite interesting. It would be good for HFP to read. (Attached Below)
AE: Ron Rivest is going to forward it to the TGDC for comments.
- John: Vendor Testing section. Substantively unchanged. Vendors must conduct usability tests and report them; we are not too specific on what those tests are. We should reference the CIF and its ISO version. We do need more specifics for the vendors on what they need to include. Whitney likes the idea of customizing. What do you mean by general population? How much should they report? Should this be an addendum? It would be a template oriented toward the voting application. The vendor should say how they recruit people. We want to ensure consistency. Susan Roth will be looking at this, per Sharon L.
- When should we send this to the TGDC for vetting before the December meeting? The first weeks of October? Maybe we should start looking at other subcommittees' work. At least get some highlights from other subcommittees for HFP to review for general comments.
Next meeting: September 29, 2006, 11:00 a.m.
Taxonomy of Voting System Records Production Approaches
Prepared for the STS Telecon
September 7, 2006
This is a brief, high-level paper on voting system approaches for the purposes of ballot records auditing. It presents a way of categorizing these approaches in the VVSG 2007 using the class structure. It is meant for discussion purposes only.
We group different
approaches to voting system design into two broad categories: software-independent
and software-dependent approaches. Software-dependent approaches are best
exemplified by today's DRE systems: the accuracy of the captured votes
depends to a large extent on the accuracy of the software used to record
the votes. DREs do not produce other records that can be used to positively
verify the accuracy of the captured votes.
Software-independent approaches, on the other hand, produce voting records in such a way that
their accuracy can be verified even if the voting system software contains
errors or deliberate fraud. Such approaches should be, in theory, less
expensive to test than software-dependent approaches. While VVPAT is one
example of this approach, some end-to-end cryptographic approaches are also
software-independent. The category Independent Dual Verification (IDV)
consists of a variety of different voting system approaches, including
current VVPAT and Op Scan (combined with Electronic Ballot Marking devices),
and the more theoretical Witness approaches.
While some of these
designs, e.g., VVPAT, are purely software-independent, other designs such
as Witness are somewhat software-dependent. This bears more explanation.
In VVPAT, for example,
the voter's indirect verification of the DRE's electronic record is backed
up by the voter's direct verification of the paper record. Furthermore,
the paper record cannot be changed by the voting system after the voter
has verified it, thus it can be used in useful comparisons with the electronic
record(s). Of course, some software is still involved and paper can be
mishandled at later stages, so further security measures are still required.
But, the two records can be compared for accuracy and errors/fraud in
the voting system software can be detected.
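To make the comparison step concrete, here is a minimal sketch of checking an electronic record set against independently verified records; the ballot IDs and contents are hypothetical.

    # Sketch: compare a DRE's electronic records against independently
    # verified records (paper or photo). IDs and contents are hypothetical.
    electronic = {"b001": ("Smith", "Yes"), "b002": ("Jones", "No")}
    paper      = {"b001": ("Smith", "Yes"), "b002": ("Smith", "No")}

    mismatches = sorted(
        bid for bid in electronic.keys() | paper.keys()
        if electronic.get(bid) != paper.get(bid)
    )
    # Any mismatch flags a software error or fraud somewhere in the chain.
    print(mismatches)  # ['b002']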
In the Witness design (Witness is a theoretical approach that no vendors admit to pursuing, but it is useful for illustrative purposes), a camera takes a picture of the DRE's summary screen immediately after a voter finalizes his or her ballot, and the voter does not verify that the picture was taken. One can imagine that the voting
system must somehow signal that the summary screen is being displayed
so that the photo can be taken - or some timing protocol must be in effect
so that the events are synchronized. For this to occur, software (or software
resident in hardware) must be trusted to work correctly. Even if a voter
is able to monitor the recording of the photo, software is involved.
Thus, one indirect verification takes place - and if the camera displays the photo it has taken, two indirect verifications are possible. But the camera-related software
involved is hopefully relatively small and thus more easily verified for
correctness than, say, the DRE itself. Two or more records are produced,
and the DRE's electronic records can be compared against the digital photos.
This approach would
be preferred over the pure DRE approach. Consequently, some software-dependent
approaches are preferred over others. More testing of these approaches
is warranted, with some sort of a sliding scale going from IDV approaches
(less testing) to DRE approaches (more testing).
The high-level taxonomy
of software-independent and -dependent approaches, then, would be as follows:
1. Software-Independent Approaches
   a. End-to-End Cryptographic Voting Protocols
   b. Paper-Based Systems: VVPAT, MMPB/Op Scan (?)
2. Software-Dependent Approaches
   a. Witness approaches (e.g., VoteGuard, http://www.democracysystems.com/)
   b. Other schemes using 2 indirect verifications
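Since the paper frames the categorization as "using the class structure," here is one hypothetical way that hierarchy might be expressed; the class names are invented for illustration and are not the actual VVSG 2007 class definitions.

    # Sketch: the taxonomy as a class hierarchy. Names are illustrative only.
    class VotingSystem: pass

    class SoftwareIndependent(VotingSystem): pass
    class EndToEndCrypto(SoftwareIndependent): pass   # cryptographic protocols
    class PaperBased(SoftwareIndependent): pass       # VVPAT, MMPB/Op Scan

    class SoftwareDependent(VotingSystem): pass
    class DRE(SoftwareDependent): pass                # no independently verifiable record
    class Witness(SoftwareDependent): pass            # camera-based dual verification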