AI Glossary
A complete glossary of artificial intelligence
K-Anonymity
Data protection principle ensuring that each record in a dataset is indistinguishable from at least k-1 other records with respect to its quasi-identifying attributes.
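The check behind this definition can be sketched in a few lines. This is a minimal illustration, assuming records are plain dictionaries; the function names and the toy data are hypothetical.

```python
from collections import Counter

def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifiers occurs at least k times."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return all(n >= k for n in sizes.values())

# Made-up, already generalized records.
records = [
    {"age": "30-39", "zip": "130**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "130**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "148**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "148**", "diagnosis": "diabetes"},
]
print(is_k_anonymous(records, ["age", "zip"], k=2))  # True
print(is_k_anonymous(records, ["age", "zip"], k=3))  # False
```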
L-Diversity
Extension of k-anonymity requiring that each equivalence class contains at least l distinct values for sensitive attributes, thus limiting the inference of confidential information.
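A minimal sketch of the distinct-values form of this check, assuming dictionary records and a single sensitive attribute. The names `is_l_diverse` and `min_distinct` are hypothetical.

```python
from collections import defaultdict

def is_l_diverse(records, quasi_identifiers, sensitive, min_distinct):
    """Return True if every equivalence class holds at least
    `min_distinct` different values of the sensitive attribute."""
    classes = defaultdict(set)
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        classes[key].add(r[sensitive])
    return all(len(values) >= min_distinct for values in classes.values())

# Made-up data: the first class is 2-anonymous, but both of its
# records share the same diagnosis, so that diagnosis can be inferred.
records = [
    {"age": "30-39", "zip": "130**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "130**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "148**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "148**", "diagnosis": "diabetes"},
]
print(is_l_diverse(records, ["age", "zip"], "diagnosis", min_distinct=2))  # False
```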
Hierarchical Generalization
Anonymization technique replacing specific values with more general categories according to a predefined hierarchy to reduce the granularity of quasi-identifying data.
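As an illustration, two common generalization steps could look like the sketch below. The ZIP-masking and age-banding rules are assumptions chosen for the example, not a prescribed hierarchy.

```python
def generalize_zip(zip_code, level):
    """Replace the last `level` characters of a ZIP code with '*'."""
    if level <= 0:
        return zip_code
    level = min(level, len(zip_code))
    return zip_code[:len(zip_code) - level] + "*" * level

def generalize_age(age, width):
    """Map an exact age to a band of `width` years, e.g. 34 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

print(generalize_zip("13053", 2))  # 130**
print(generalize_age(34, 10))      # 30-39
```

A higher `level` or a wider band gives stronger protection at the cost of coarser data.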
Cell Suppression
Anonymization method consisting of replacing certain data values with missing or null values to prevent individual identification while preserving overall statistical utility.
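One simple suppression rule is to blank out values that occur too rarely to hide in a crowd. The sketch below assumes dictionary records; the threshold `min_count` and the function name are hypothetical.

```python
from collections import Counter

def suppress_rare_values(records, attribute, min_count):
    """Return copies of the records with rare values of `attribute` set to None."""
    counts = Counter(r[attribute] for r in records)
    return [
        {**r, attribute: None} if counts[r[attribute]] < min_count else dict(r)
        for r in records
    ]

# Made-up data: a single "astronaut" would stand out.
records = [
    {"job": "nurse"}, {"job": "nurse"}, {"job": "nurse"},
    {"job": "astronaut"},
]
masked = suppress_rare_values(records, "job", min_count=2)
print([r["job"] for r in masked])  # ['nurse', 'nurse', 'nurse', None]
```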
Quasi-Identifiers
Set of attributes that, although not individually identifying, can be combined with external data to uniquely re-identify an individual in a dataset.
Equivalence Class
Group of records sharing the same generalized values for quasi-identifiers, forming the fundamental unit for checking compliance with k-anonymity criteria.
Differential Privacy
Mathematical formalization of privacy guaranteeing that the presence or absence of any single individual in a database changes the probability of any query output by at most a bounded factor, controlled by a privacy parameter (commonly written ε).
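A widely used way to achieve this guarantee for numeric queries is the Laplace mechanism. The sketch below applies it to a counting query, assuming a privacy parameter `epsilon` and a sensitivity of 1 (adding or removing one person changes a count by at most 1). The function names are hypothetical, and the noise is drawn with the standard library only.

```python
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    Assumes a counting query (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1 / epsilon, rng)

# Made-up data: ages 0..99; how many are under 40?
ages = list(range(100))
print(private_count(ages, lambda age: age < 40, epsilon=0.5))
```

Smaller values of `epsilon` add more noise and give stronger privacy; larger values give more accurate answers.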
Random Perturbation
Anonymization technique adding controlled random noise to numerical data to mask original values while preserving the overall statistical properties of the dataset.
Aggregation-Based Masking
A protection technique that combines multiple individual records into aggregated statistics to eliminate the possibility of isolating and identifying specific records.
Re-Identification Risk
The probability that an individual could be identified or their sensitive information deduced from anonymized data, often quantified by metrics such as the share of unique records or the inverse of the equivalence class size.
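One simple way to put a number on this risk is to assume an attacker who knows a target's quasi-identifiers and guesses uniformly within the matching group, so each record's risk is 1 divided by the size of its group. The sketch below follows that assumption; the function name and data are hypothetical.

```python
from collections import Counter

def reidentification_risk(records, quasi_identifiers):
    """Return (maximum, average) per-record risk, where a record's risk
    is 1 / (number of records sharing its quasi-identifier values)."""
    if not records:
        raise ValueError("records must not be empty")
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    sizes = Counter(keys)
    per_record = [1 / sizes[key] for key in keys]
    return max(per_record), sum(per_record) / len(per_record)

# Made-up data: four records share a group, one record is unique.
records = [{"age": "30-39"}] * 4 + [{"age": "80-89"}]
worst, average = reidentification_risk(records, ["age"])
print(worst, average)  # 1.0 0.4
```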
Microaggregation
A perturbation technique that partitions records into small groups of at least k similar records and replaces each value with its group aggregate (typically the mean), protecting individual records while approximately preserving overall statistical properties.
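As commonly described, microaggregation sorts or clusters similar records, forms groups of at least k, and replaces each value with its group mean. The sketch below is a simplified univariate version that assumes fixed-size groups over sorted values; the function name is hypothetical.

```python
def microaggregate(values, k):
    """Replace each value with the mean of its group of at least k neighbours."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    # Consecutive chunks of k indices; a short tail joins the previous chunk.
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    result = [0.0] * len(values)
    for group in groups:
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = mean
    return result

ages = [20, 61, 22, 60, 21, 62]
print(microaggregate(ages, k=3))  # [21.0, 61.0, 21.0, 61.0, 21.0, 61.0]
```

Because each value is replaced by a group mean, the overall sum and mean of the column are unchanged.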
Domain Hierarchy
A tree structure defining relationships between attribute values at different generalization levels, essential for implementing consistent generalization strategies.
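Such a tree can be stored as a child-to-parent map, and generalizing a value then means climbing a chosen number of levels. The hierarchy below is a made-up example for a hypothetical "city" attribute, with "*" as the fully suppressed root.

```python
# Hypothetical hierarchy: each value maps to its parent category.
PARENT = {
    "Paris": "France", "Lyon": "France",
    "Berlin": "Germany", "Munich": "Germany",
    "France": "Europe", "Germany": "Europe",
    "Europe": "*",
}

def generalize(value, levels, parent=PARENT):
    """Climb `levels` steps up the hierarchy, stopping at the root."""
    for _ in range(levels):
        if value not in parent:  # already at the root
            break
        value = parent[value]
    return value

print(generalize("Lyon", 1))  # France
print(generalize("Lyon", 2))  # Europe
print(generalize("Lyon", 5))  # *
```

Applying the same number of levels to every record keeps the generalization consistent across the dataset.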
Synthetic Anonymity
An approach that generates artificial data statistically similar to the original but containing no real records, which greatly reduces direct re-identification risk, although a generator that reproduces its training data too closely can still leak information.
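Optimal Partitioning
An algorithm that divides the dataset into equivalence classes of at least k records each, minimizing information loss while satisfying k-anonymity constraints and preserving analytical utility. Finding a truly optimal partition is computationally hard in general, so practical implementations usually rely on heuristics.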
Degradation Factor
A quantitative metric measuring the loss of information or utility of a dataset after applying anonymization techniques, essential for evaluating the privacy-utility tradeoff.
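There is no single standard formula for such a metric. One common choice for numeric attributes, used here as an assumption, is the ratio of the squared error introduced by anonymization (SSE) to the total variation in the original data (SST): 0 means nothing was lost, and 1 means every value was replaced by the overall mean. The function name is hypothetical.

```python
def information_loss(original, anonymized):
    """Return SSE / SST for two equally long lists of numbers."""
    if not original or len(original) != len(anonymized):
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(original) / len(original)
    sst = sum((x - mean) ** 2 for x in original)
    if sst == 0:
        raise ValueError("original values are constant; the ratio is undefined")
    sse = sum((x - y) ** 2 for x, y in zip(original, anonymized))
    return sse / sst

original = [20, 61, 22, 60, 21, 62]
anonymized = [21, 61, 21, 61, 21, 61]  # e.g. each value replaced by a group mean
print(round(information_loss(original, anonymized), 4))  # 0.0017
```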