NIST logo

Publication Citation: A New Analysis of the False-Positive Rate of a Bloom Filter

NIST Authors in Bold

Author(s): Ken Christensen; Allen L. Roginsky; Miguel Jimeno;
Title: A New Analysis of the False-Positive Rate of a Bloom Filter
Published: October 15, 2010
Abstract: A Bloom filter is a space-efficient data structure used for probabilistic set membership testing. When testing an object for set membership, a Bloom filter may give a false positive. The analysis of the false positive rate is key to understanding the Bloom filter and applications that use it. We show experimentally that the classic analysis for false positive rate is wrong. We formally derive a correct formula using a balls-and-bins model and show how to numerically compute the new, correct formula in a stable manner. We also prove that the new formula always results in a predicted greater false positive rate than the classic formula. This correct formula is numerically compared to the classic formula for relative error for a small Bloom filter the relative error is shown to be significant.
Citation: Information Processing Letters
Volume: 110
Issue: 21
Pages: pp. 944 - 949
Keywords: data structures; analysis of algorithms; Bloom filters
Research Areas: Data and Informatics, Statistics, Statistics, Information Technology, Math
DOI: http://dx.doi.org/10.1016/j.ipl.2010.07.024  (Note: May link to a non-U.S. Government webpage)
PDF version: PDF Document Click here to retrieve PDF version of paper (380KB)