Advanced
Differential Privacy in High-Dimensional Datasets
Implement a strategy to add Laplace noise to a high-dimensional dataset to guarantee epsilon-differential privacy while maintaining statistical utility.
📝 Prompt Content
You are a Data Privacy Engineer working with sensitive medical records. You need to release a dataset for researchers while ensuring differential privacy.
1. Define the concepts of sensitivity (global and local) in the context of a high-dimensional dataset containing continuous variables.
2. Describe the mechanism for calibrating Laplace noise based on the privacy budget (epsilon) and the sensitivity of the queries allowed on the dataset.
3. Address the challenge of the 'curse of dimensionality' in differential privacy (how utility degrades as the number of dimensions increases) and propose advanced composition theorems or mechanisms (like the Matrix Mechanism) to mitigate it.
4. Write pseudo-code or a mathematical function that accepts a vector of query answers and returns the noisy answers adhering to epsilon-differential privacy.
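As a rough illustration of what item 4 is asking for, here is a minimal sketch of the Laplace mechanism applied to a vector of query answers. It is one possible shape for an answer, not a reference solution. The names `laplace_mechanism`, `l1_sensitivity`, and `rng` are illustrative. The sketch assumes the caller has already worked out the global L1 sensitivity of the whole query vector, and it uses NumPy's ordinary floating-point sampler, which is not hardened against floating-point attacks on the Laplace mechanism.

```python
import numpy as np


def laplace_mechanism(answers, l1_sensitivity, epsilon, rng=None):
    """Add i.i.d. Laplace noise with scale l1_sensitivity / epsilon.

    answers        : true answers to the whole query vector
    l1_sensitivity : assumed global L1 sensitivity of that vector, i.e. the
                     largest L1 change between neighbouring datasets
                     (supplied by the caller, not computed here)
    epsilon        : total privacy budget spent on this release
    rng            : optional numpy.random.Generator, for reproducibility
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if l1_sensitivity <= 0:
        raise ValueError("l1_sensitivity must be positive")
    rng = np.random.default_rng() if rng is None else rng
    answers = np.asarray(answers, dtype=float)
    scale = l1_sensitivity / epsilon  # Laplace scale parameter b
    noise = rng.laplace(loc=0.0, scale=scale, size=answers.shape)
    return answers + noise


# Hypothetical usage: d per-column means over n records, each value clipped
# to [0, 1]. With neighbours defined by replacing one record, each mean moves
# by at most 1/n, so the L1 sensitivity of the full vector is d / n.
n, d = 1000, 50
true_means = np.full(d, 0.5)
noisy_means = laplace_mechanism(true_means, l1_sensitivity=d / n, epsilon=1.0)
```

In this usage the noise scale grows linearly with d for a fixed epsilon, which is the utility loss that item 3 asks about.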