
Team UCLANESL

The Differential Privacy Synthetic Data Challenge
5th Place - $3,000 Prize

*An additional $4,000 was awarded for posting their full code solution in an open source repository.


About the Team


The team UCLANESL has members with different levels of experience and academic backgrounds. All team members share a strong interest in privacy and machine learning. Members: Moustafa Alzantot (UCLA), Supriyo Chakraborty (IBM Research), Nathaniel Snyder (UCLA), and Mani Srivastava (UCLA).

Team UCLANESL used the NIST Collaboration Space as their open source repository, where their full solution can be accessed. Note that other contestants' source code may also be found on that site.

Mani Srivastava is on the faculty at UCLA where he is associated with the ECE Department with a joint appointment in the CS Department. His research group, the UCLA Networked and Embedded Systems Laboratory, conducts research in the broad area of embedded and cyber-physical systems for applications in IoT, ubiquitous and mobile computing, and pervasive sensing and control. His interests are in problems related to making these systems learning-enabled, secure, privacy-aware, human-coupled, wirelessly-networked, and energy-efficient in the context of systems and applications for mobile health and sustainable buildings. He is a fellow of both the ACM and the IEEE. For more details about Mani, please see http://www.seas.ucla.edu/~mbs.

Moustafa Alzantot is a fifth-year Ph.D. student in the Department of Computer Science, UCLA. His research interests are adversarial machine learning and the privacy and security of machine learning models. He is also broadly interested in novel applications of machine learning. He has published many research papers in machine learning and mobile sensing conferences and journals, and he is an inventor of two U.S. patents. He is the recipient of several awards, including the COMESA 2013 innovation award and the UCLA graduate division award. He has interned at several companies and research labs, including Google, Facebook, and Bell Labs.
Find more about Moustafa at his personal webpage: http://web.cs.ucla.edu/~malzantot/

Supriyo Chakraborty is a researcher in the Distributed Artificial Intelligence group at the IBM T.J. Watson Research Lab, U.S. Prior to this, he obtained his Ph.D. in Electrical Engineering from the University of California, Los Angeles. He is the recipient of several awards, including the Dissertation Year Fellowship award at UCLA, the Qualcomm Innovation Fellowship award, and the IBM Outstanding Technical Achievement award. His research interests are in information-theoretic privacy, differential privacy, applied cryptography, and adversarial machine learning. He has published more than 40 journal and conference papers in top-tier venues and authored more than 20 patents. For more details about Supriyo, please visit: https://researcher.watson.ibm.com/researcher/view.php?person=us-supriyo

Nathaniel Snyder is a second-year Systems and Control Ph.D. student at UCLA in the Department of Mechanical and Aerospace Engineering. He works cross-departmentally in the Networked and Embedded Systems Laboratory (NESL) under Dr. Mani Srivastava. His research focuses on developing machine learning safety and sensing algorithms for autonomous systems. His academic interests are mechatronics, machine learning, and control/estimation. For more details about Nathaniel, please visit: http://nesl.ee.ucla.edu/people/494

The Solution

Differentially Private Synthetic Data Generation using Wasserstein GANs

Team UCLANESL's solution combines a generative adversarial network (GAN), a state-of-the-art technique for training generative models, with the moments accountant, a state-of-the-art approach for tracking differential privacy loss. GAN training simultaneously trains two competing neural network models. The first model, the generator, learns to produce synthetic records by mapping its input, a randomly sampled noise vector, to samples drawn from the distribution it learns over the training data. The second model, the discriminator, learns to distinguish between samples drawn from the training dataset and samples produced by the generator. The two models are updated alternately: the team updates the discriminator by giving it labeled examples from both the generator's outputs and the training data, and updates the generator using feedback from the discriminator's output on the samples it produces. Eventually, the generator learns to produce synthetic examples that closely mimic samples from the training dataset, so at the end of training the team generates synthetic records by feeding random noise vectors into the generator. To address the well-known instability of GAN training, Team UCLANESL uses the Wasserstein distance function as the GAN training objective.
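The alternating objectives described above can be illustrated with the standard Wasserstein GAN losses. The sketch below is a minimal numpy illustration of those loss functions only (not the team's actual model or training code); the score arrays are hypothetical critic outputs made up for the example.

```python
import numpy as np

def critic_loss(d_real, d_fake):
    # The WGAN critic (discriminator) is trained to maximize
    # mean(D(real)) - mean(D(fake)), i.e. to minimize the negation below.
    # This difference of means approximates the Wasserstein distance
    # between the real and generated distributions.
    return np.mean(d_fake) - np.mean(d_real)

def generator_loss(d_fake):
    # The generator is trained to make its samples score highly under
    # the critic, i.e. it minimizes -mean(D(fake)).
    return -np.mean(d_fake)

# Hypothetical critic scores for a batch of 64 real and 64 synthetic records:
rng = np.random.default_rng(0)
d_real = rng.normal(loc=1.0, size=64)   # scores on real training records
d_fake = rng.normal(loc=-1.0, size=64)  # scores on generator outputs

# When the critic separates the two sets well, its loss is strongly negative
# and the generator loss is positive, signaling the generator must improve.
print(critic_loss(d_real, d_fake))
print(generator_loss(d_fake))
```

In each training round, the critic's gradients come from `critic_loss` on a labeled batch, and the generator's gradients come from `generator_loss` back-propagated through the critic.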

To ensure that their model satisfies the differential privacy requirement, they apply the Gaussian mechanism to the discriminator's gradient updates. This is achieved in two main steps. First, they compute the gradient update for each example and clip it to ensure bounded sensitivity. Second, the team adds Gaussian noise to the clipped gradients before applying them to the discriminator's model weights. Since the generator is updated solely based on the feedback it receives from the discriminator, the post-processing invariance of differential privacy makes the generator differentially private as well. Finally, the team uses the moments accountant to track the privacy loss and, if needed, abort training once the permissible privacy budget has been reached.
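The two sanitization steps above (per-example clipping, then Gaussian noise) can be sketched in a few lines. This is a minimal numpy illustration of the general DP-SGD-style recipe, not the team's actual implementation; the function name, the `noise_multiplier` parameterization, and the toy gradients are assumptions made for the example.

```python
import numpy as np

def dp_sanitize_gradients(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each example's gradient to L2 norm <= clip_norm, average them,
    then add Gaussian noise calibrated to the clipping bound (sensitivity)."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds clip_norm;
        # gradients already within the bound are left unchanged.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    # Each example can shift the sum by at most clip_norm, so the noise
    # standard deviation scales with clip_norm / batch_size.
    sigma = noise_multiplier * clip_norm / len(per_example_grads)
    return mean_grad + rng.normal(0.0, sigma, size=mean_grad.shape)

# Toy example: one gradient of L2 norm 5 gets rescaled to norm 1.
rng = np.random.default_rng(0)
g = dp_sanitize_gradients([np.array([3.0, 4.0])], clip_norm=1.0,
                          noise_multiplier=0.0, rng=rng)
print(g)  # clipped direction [0.6, 0.8] (no noise, since multiplier is 0)
```

The moments accountant then tracks, across all such noisy updates, how much cumulative privacy loss has accrued for the chosen `noise_multiplier`, which is a much tighter bound than naively summing per-step budgets.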



Created July 31, 2019, Updated July 19, 2022