If a classifier can easily distinguish privatized data from ground truth data, the two datasets are fundamentally different, and the privatized data should not be used for downstream analysis. Conversely, if a classifier cannot distinguish them, we can be comfortable using the privatized data going forward. In the latter case, we prove that any classifier from the same function family will have essentially the same loss on the privatized and ground truth data.
We define separability as a normalized version of the maximum difference in loss over that function family, and provide an algorithm for computing it empirically.
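The idea of testing whether a classifier can tell the two datasets apart can be sketched as follows. This is only an illustration, not the team's algorithm: the function names (`train_logistic`, `separability_proxy`), the choice of logistic regression as the function family, and the normalization `2 * accuracy - 1` are all assumptions made for this example, since the summary does not state the exact definition of separability.

```python
import math
import random


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _score(w, b, x):
    # Predicted probability that row x comes from the privatized data.
    return _sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, epochs=500, lr=0.1):
    """Fit logistic regression with plain batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = _score(w, b, xi) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def separability_proxy(truth, privatized, seed=0):
    """Hypothetical proxy: held-out accuracy of a truth-vs-privatized
    classifier, rescaled so 0 means chance level and 1 means perfect."""
    rng = random.Random(seed)
    # Label ground truth rows 0 and privatized rows 1, then shuffle.
    rows = [(x, 0) for x in truth] + [(x, 1) for x in privatized]
    rng.shuffle(rows)
    split = len(rows) // 2
    train, test = rows[:split], rows[split:]
    w, b = train_logistic([x for x, _ in train], [y for _, y in train])
    correct = sum((_score(w, b, x) >= 0.5) == (y == 1) for x, y in test)
    accuracy = correct / len(test)
    # Assumed normalization (not from the source): clip below-chance to 0.
    return max(0.0, 2.0 * accuracy - 1.0)


# Synthetic data for illustration only.
rng = random.Random(1)
truth = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
similar = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
shifted = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(200)]

print("similar:", separability_proxy(truth, similar))
print("shifted:", separability_proxy(truth, shifted))
```

In this sketch, a privatized dataset drawn from the same distribution as the ground truth should score near 0, while a heavily distorted one should score near 1. Evaluating on a held-out half keeps an overfit classifier from inflating the score.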
Prize amount: $2,000
Team members: Leo Hentschker, Kevin Lee