Abstract: De-identification removes identifying information from a dataset so that individual data cannot be linked with specific individuals. De-identification can reduce the privacy risk associated with collecting, processing, archiving, distributing or publishing information. De-identification thus attempts to balance the contradictory goals of using and sharing personal information while protecting privacy. Several U.S laws, regulations and policies specify that data should be de-identified prior to sharing. In recent years researchers have shown that some de-identified data can sometimes be re-identified. Many different kinds of information can be de-identified, including structured information, free format text, multimedia, and medical imagery. This document summarizes roughly two decades of de-identification research, discusses current practices, and presents opportunities for future research.
Citation: NIST Interagency/Internal Report (NISTIR) - 8053Report Number:
NIST Pub Series: NIST Interagency/Internal Report (NISTIR)
Pub Type: NIST Pubs
De-identification, HIPAA Privacy Rule, k-anonymity, differential privacy, re-identification, privacy