The Challenge

Are you a mathematician or data scientist interested in a new challenge? Then join this exciting data privacy competition with up to $150,000 in prizes, where participants will create new or improved differentially private synthetic data generation tools.
When a dataset has important public value but contains sensitive personal information and can't be shared directly with the public, privacy-preserving synthetic data tools can solve the problem. For common analytics tasks such as clustering, classification, and regression, a synthetic (or artificial) dataset can serve as a practical replacement for the original sensitive data. By mathematically proving that a synthetic data generator satisfies the rigorous Differential Privacy guarantee, we can be confident that the synthetic data it produces reveals only a strictly bounded amount of information about any specific individual in the original data.
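To make the idea concrete, here is a minimal sketch of one of the simplest differentially private synthetic data generators: add Laplace noise to a histogram of the sensitive records, then sample artificial records from the noisy histogram. It is an illustration only, not a method prescribed by the challenge. The function names, the toy incident categories, and the choice of epsilon are all hypothetical, and the sketch assumes that the set of possible values (`domain`) is public and that two datasets are neighbors when they differ by adding or removing one record.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_synthetic_records(records, domain, epsilon, n_synthetic, seed=None):
    """Sample synthetic records from a Laplace-noised histogram (illustrative).

    Assumptions: `domain` is a public list of possible values (it must not be
    derived from the sensitive data), and `n_synthetic` is chosen independently
    of the data.
    """
    rng = random.Random(seed)
    counts = Counter(records)

    # Adding or removing one record changes exactly one count by 1, so the
    # histogram has L1 sensitivity 1 and Laplace noise of scale 1/epsilon
    # gives an epsilon-differentially-private histogram.
    noisy = [counts[v] + laplace_noise(1.0 / epsilon, rng) for v in domain]

    # Clamping and sampling are post-processing of the noisy histogram,
    # so they do not weaken the privacy guarantee.
    weights = [max(0.0, c) for c in noisy]
    if sum(weights) == 0:
        weights = [1.0] * len(domain)  # fall back to uniform sampling

    return rng.choices(domain, weights=weights, k=n_synthetic)


if __name__ == "__main__":
    # Hypothetical toy data: categories of incident reports.
    sensitive = ["fire"] * 50 + ["medical"] * 30 + ["traffic"] * 5
    public_domain = ["fire", "medical", "traffic", "other"]
    synthetic = dp_synthetic_records(sensitive, public_domain, epsilon=1.0,
                                     n_synthetic=20, seed=0)
    print(Counter(synthetic))
```

A histogram over a single column is far simpler than what a competitive entry would need, since real datasets have many correlated columns. Floating-point Laplace sampling like this is also known to be unsuitable for production use without further care.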
How important is this?
No protection can absolutely guarantee that data will not be misused. Even a dataset that protects individual identities may, if it gets into the wrong hands, be used for ill purposes. Weaknesses in the security of the original data can threaten the privacy of individuals. This challenge focuses on proactively protecting individual privacy while allowing public safety data to be used by researchers for positive purposes and outcomes. NIST PSCR has strong commitments to both public safety research and the preservation of security and privacy, including the use of de-identification. Privacy in data release is an important concern for the Federal Government (which has an Open Data Policy), state governments, the public safety sector, and many commercial non-governmental organizations. Developments coming out of this competition could drive major advances in the practical application of differential privacy for these organizations.
Congratulations to the Winners of Match #1!
Team pfr
1st Place $10,000
Team RMcKenna
2nd Place $7,000
Team DPSyn
3rd Place $5,000
Team UCLANESL
4th Place $2,000
Team PrivBayes
5th Place $1,000

Team RMcKenna, Team pfr, Team DP-D, and Team Epsilon-delta
Progressive Prize*: Winners Earned $1,000 Each
*Awarded based on placement on the provisional leaderboard halfway through the match.
Challenge Details
The Differential Privacy Synthetic Data Challenge invites competitors to participate in any of three marathon matches run on the Topcoder platform. Contestants design and implement their own synthetic data generation algorithms, mathematically prove that their algorithms satisfy differential privacy, and compete against other contestants' algorithms.
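For reference, the standard definition that such a proof targets can be written as follows. This is the general textbook statement of differential privacy. The specific variant and the parameter values required in each match are not given here and should be taken from the official match descriptions.

```latex
A randomized algorithm $\mathcal{M}$ satisfies $(\varepsilon, \delta)$-differential
privacy if, for every pair of neighboring datasets $D$ and $D'$ (datasets that
differ in the data of a single individual) and every set $S$ of possible outputs,
\[
  \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta .
\]
Setting $\delta = 0$ gives pure $\varepsilon$-differential privacy.
```

Smaller values of ε and δ mean that the output distribution changes less when any one person's data is added, removed, or altered, which corresponds to a stronger privacy guarantee.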
Join us for Match #2, which begins on January 11th. Register now!
Match #3 will begin on March 10, 2019.
All submissions must be made at the Topcoder website.
All contest details, contest timeline, and marathon match descriptions can be found on challenge.gov and the Topcoder website.