2010-05-14 14:00 EDT
Marginal Marks Working Group Minutes
rev. 2010-05-26

- * -

Pursuant to the Help America Vote Act of 2002, the Technical Guidelines Development Committee (TGDC) is charged with directing NIST to conduct voting systems research so that the TGDC can fulfill its role of recommending technical standards for voting equipment to the Election Assistance Commission (EAC). This teleconference discussion is for the purposes of the Auditability Working Group of the TGDC to direct NIST staff and coordinate voting-related research relevant to the development of the Voluntary Voting System Guidelines (VVSG). Discussions on this teleconference, including referenced discussion papers, are preliminary and pre-decisional, and do not necessarily reflect the views of the National Institute of Standards and Technology or the TGDC.

- * -

Agenda:

The plan that I had formulated for NIST research on this item, involving formal data collection and analysis from actual ballots, is currently in process in Acquisitions, which limits my freedom to discuss details outside of the process before the solicitation is approved and published. This nevertheless leaves us able to share experiences with optical scan ballots and informally go about making a list of the kinds of marks that ought to appear in a test set.

Attendees:

David Flater
Nelson Hastings
Doug Jones
Sharon Laskowski
Ben Long
Russ Ragsdale
David Wagner

-- Preface

"Marginal Marks Working Group" is a misnomer if terminology from the 2007 draft of VVSG 2.0 is used. VVSG 2.0 defines marginal marks as ambiguous marks appearing *within* a voting target. The scope of the working group includes marks that are outside of voting targets--those that VVSG 2.0 defines as "extraneous marks." This first meeting of the Marginal Marks Working Group was used to review the background and motivations of the planned work.

-- Voter behaviors

Doug had circulated a draft paper, "Ballot Marks from the Humboldt County 2008 General Election" (Mascher, Mascher & Jones), that described classes of nonconforming marks found in the Humboldt County November 2008 data. Russ had forwarded this to the Colorado Secretary of State for information, as they are working on a revision of their rules for opscan counting.

David Flater noted that, in a review of opscan ballots left over from previous human factors work, the write-in issues and the frequency with which they occurred were consistent with Doug's findings.

It was noteworthy from Doug's paper that, even with focused voter education by the Rodini campaign, 4.92% of write-ins for Rodini were dismissed, usually because the target was not filled. For other write-in candidates, the dismissal rate was 8.23%.

-- Processing of nonconforming marks

David Wagner described in general terms an approach to post-processing ballots to detect and outstack those ballots with incorrectly completed write-ins or numerous extraneous marks. Doing this after the fact from scanned images eliminates the real-time requirement and seems feasible.

Online resolution from ballot images is an alternative to outstacking. Doug noted that this has been done in practice with Hart systems, but it necessitates an audit model to reconstruct the link between the digital image and the original ballot.

Doug said that ES&S considered detecting questionable ballots in real time in the M100, but it was too CPU-constrained. With modernized equipment it should be feasible to do it in real time. For precinct count, real time is a requirement.
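A minimal sketch of the kind of outstacking check discussed here is given below, in Python. The data model, thresholds, and function names are illustrative assumptions only; they are not taken from any deployed scanner or from the specific approach David Wagner described.

# Illustrative sketch only: flag a scanned ballot for outstacking when a
# write-in line appears to contain writing but its voting target was not
# filled, or when the ballot carries many extraneous marks.  The data
# model and thresholds below are assumptions, not from any real system.

from dataclasses import dataclass
from typing import List

FILL_THRESHOLD = 0.35  # assumed darkened fraction that counts as a filled target
INK_THRESHOLD = 0.05   # assumed darkened fraction on the write-in line that suggests writing

@dataclass
class WriteInRegion:
    target_density: float  # darkened fraction measured inside the voting target
    line_density: float    # darkened fraction measured along the write-in line

def should_outstack(write_ins: List[WriteInRegion],
                    extraneous_mark_count: int,
                    max_extraneous: int = 3) -> bool:
    """Return True if the ballot should be set aside for human review."""
    for w in write_ins:
        wrote_name = w.line_density > INK_THRESHOLD
        filled_target = w.target_density > FILL_THRESHOLD
        if wrote_name and not filled_target:
            return True  # write-in attempted, but target left empty
    return extraneous_mark_count > max_extraneous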
David Flater observed that the approach of outstacking (central count) or rejecting back to the voter (precinct count) questionable ballots is an easier problem than trying to automate interpretation of the nonconforming marks--but the deliverable for the working group is the same either way. We wanted a test set of markings to evaluate the behavior of optical scanners when they are confronted by nonconforming marks. Whether that behavior is to outstack/reject or to try to interpret the marks makes no difference for what test ballots we would need to input.

Russ Ragsdale noted that in a central count environment it is possible to have a manual process to pull out poorly marked ballots and route them to a duplication board. In precinct count, there can be no such process, so any special handling of poorly marked ballots requires that the scanner be capable of detecting them.

Russ and Doug agreed that there is currently a wide range of practices and no consensus on the handling of poorly marked ballots, and that the current de facto standard behavior for scanners is not to support a distinct behavior for marginal marks.

-- What makes a ballot questionable

The following would not be interesting:

- Anything on a write-in line with a filled target
- Anything within a target [*]
- "Reasonable" excursions from the target
- X and check marks that exit the target by a "reasonable" distance
- Smudges that don't extend to other voting targets

[*] DF postscript: Except for marginal marks as defined in the VVSG 2.0 draft; in precinct count these should be returned for clarification.

Other significant markings appearing anywhere on the ballot would be interesting. Marks that are "sufficiently near" the target but have long tails that extend away from the target (e.g., a check mark that extends across to the other column) become questionable at some point. This tolerance is worth testing.

-- Feasibility of automatically detecting questionable ballots while the voter is waiting

David Flater thought this should be easily achievable.

David Wagner described previous work that began with having scans for all of the ballots. Questionable ballots can be identified by their deviation from the norm calculated over the entire collection. However, the problem is not obviously easy if that collection is unavailable. Each ballot must be compared with something, such as a canonical blank ballot.

Doug elaborated that the problem would be significantly complicated by the delivery of blank ballots by Internet, where the vagaries of PC printers and bad choices in the Print dialog box would vastly increase the variability of ballots compared with professionally printed ballots.

In support of feasibility, Doug cited Adi et al., Demonstration of "Open Counting": A Paper-Assisted Voting System with Public OMR-At-A-Distance Counting, available at http://www.vocomp.org/proceedings.php.html. Ballots were interpreted from still images selected from a low-budget video feed in 1-2 s per ballot. The image processing challenge came not from nonconforming marks but from the limited quality of the input. However, the ballots in this demo were designed to facilitate the objective and were not typical of what we are dealing with. (Also, the selection of still images was manual.)

David Wagner asked NIST to do further research on the feasibility question with respect to existing implementations, impact on system cost, and constraints on design.
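For concreteness, the following minimal sketch (Python/NumPy) shows one way a scanned ballot could be compared against a canonical blank ballot. The registration of the two images is assumed to have already happened, and the target mask and thresholds are invented for illustration rather than drawn from any existing implementation.

# Illustrative sketch only: flag a ballot whose marks deviate from a canonical
# blank ballot outside the known voting targets.  Images are assumed to be
# grayscale arrays in [0, 1], already registered to the blank ballot; the mask
# and thresholds are invented for illustration.

import numpy as np

def questionable(ballot: np.ndarray, blank: np.ndarray,
                 target_mask: np.ndarray, tolerance: float = 0.002) -> bool:
    """target_mask is True inside voting targets and write-in lines."""
    new_ink = (blank - ballot) > 0.3        # pixels markedly darker than the blank
    stray = new_ink & ~target_mask          # ink that falls outside every target
    return float(stray.mean()) > tolerance  # flag if stray ink covers too much of the page

Any check of this kind run while the voter waits would be bounded by the scanner's processing budget, which bears on the cost and design-constraint questions above.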
-- The "machine model" of what is a valid vote

One school of thought for defining what is a vote for counting purposes is to define it in terms of what the tabulator was designed to count, which hopefully is also the mark the voters were instructed to make. This is in opposition to the "voter intent" model that appears in many [other] jurisdictions' election laws as well as the principles of user-centric design.

David Flater expressed sympathy for the machine model since it leads more easily to objective and testable requirements on voting systems and simpler arbitration of questionable ballots, but he also acknowledged that, if studies are reporting that this policy has disparate effects on different subgroups of voters, it is a dead end.

Doug cited reports on the work of John V. McMillin and Robert J. Urosevich in the 1970s, which always assumed that questionable ballots should be outstacked for human review, as evidence that the machine model grew out of legal convenience, not engineering or testing needs. References available at http://www.lib.uiowa.edu/spec-coll/archives/guides/RG99.0023.htm, esp. the unpublished "WLC Ballot Scanning Days."

Doug mentioned an EAC Recounts and Contests Study that is in draft form and currently under review by the Board of Advisors: "My concern about this draft is that it seems to enshrine the machine model as a best practice, recommending that states adopt it. It makes no mention of the ballot resolution process, and if states generally follow this model, it leaves the designers of mark-sense scanners with no guidance about what their scanners should do."

-- Technology churn

David Flater noted factors suggesting that the design of optical scan ballots might be destined for upheaval in the near future. Using write-in votes as an example, the high rate of voter errors (not filling the bubble) leads to the evolution of scanners that detect the nonconforming write-ins, which leads to the bubble itself becoming a redundant artifact, which leads to a redesign that eliminates the bubble. This creates an obsolete-on-delivery risk for the test set.

Doug acknowledged the risk but noted on the other hand that the design of optical scan ballots (1) has been stable for a long time and (2) has been written into election law. David Wagner acknowledged the risk but accepted that any test set would need maintenance to track evolving needs.

-- Future meetings

The working group agreed to defer scheduling further meetings until there was progress to discuss.

Adjourned at 3:20 PM EDT.