AI Dictionary
A complete dictionary of artificial intelligence
Robust scaling
Technique using quantiles to resist outliers, typically applying (x - median)/IQR, where IQR is the interquartile range (Q3 - Q1). This approach keeps the transformation stable even in the presence of noisy or extreme data.
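A minimal NumPy sketch of the formula above, on illustrative values with one deliberate outlier:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # 100.0 is an outlier

median = np.median(x)                 # 3.0
q1, q3 = np.percentile(x, [25, 75])   # 2.0, 4.0
iqr = q3 - q1                         # 2.0

# (x - median) / IQR: center and scale are unaffected by the outlier
x_robust = (x - median) / iqr
```

Because median and IQR ignore the tails, the bulk of the data lands in a compact range while the outlier remains visibly extreme.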
L1 normalization
Scaling method dividing each value by the sum of the absolute values of all components of the vector, ensuring the L1 norm equals 1. This transformation is particularly useful for probability-based models and sparse representations.
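A short NumPy sketch (illustrative values):

```python
import numpy as np

v = np.array([3.0, -1.0, 6.0])

l1 = np.sum(np.abs(v))   # L1 norm = 10.0
v_l1 = v / l1            # components now sum to 1 in absolute value
```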
L2 normalization
Process normalizing vectors by dividing each component by the square root of the sum of squares, ensuring a unit Euclidean norm. This technique is essential for algorithms sensitive to vector magnitude like SVMs and neural networks.
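The same operation in NumPy, using the classic 3-4-5 vector as an example:

```python
import numpy as np

v = np.array([3.0, 4.0])

l2 = np.linalg.norm(v)   # sqrt(3^2 + 4^2) = 5.0
v_l2 = v / l2            # unit Euclidean norm
```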
Quantile normalization
Non-parametric technique transforming data to follow a specified uniform or normal distribution by mapping each value through the empirical cumulative distribution function onto the quantiles of the target distribution. This approach is particularly effective for handling strongly skewed or multimodal distributions.
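A rank-based sketch of the idea, assuming SciPy is available: values are converted to empirical CDF positions in (0, 1), then pushed through the inverse normal CDF to obtain a roughly Gaussian shape.

```python
import numpy as np
from scipy import stats

x = np.array([0.1, 1.0, 2.0, 50.0, 1000.0])  # strongly right-skewed

# Empirical CDF positions: ranks mapped into the open interval (0, 1)
ranks = stats.rankdata(x)        # [1, 2, 3, 4, 5]
u = (ranks - 0.5) / len(x)       # [0.1, 0.3, 0.5, 0.7, 0.9]

# Inverse normal CDF maps those positions to normal quantiles
x_gauss = stats.norm.ppf(u)
```

Only the order of the observations survives, which is exactly what makes the method robust to skewness.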
Vector unit scaling
Normalization dividing each vector by its Euclidean norm, resulting in unit-length vectors in multidimensional space. This method is crucial for algorithms based on cosine similarity measures and text representations.
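A row-wise NumPy sketch, which also shows why unit scaling matters for cosine similarity: after normalization, cosine similarity reduces to a plain dot product.

```python
import numpy as np

X = np.array([[3.0, 4.0],
              [1.0, 0.0]])

# Divide each row vector by its Euclidean norm
norms = np.linalg.norm(X, axis=1, keepdims=True)
X_unit = X / norms

# Cosine similarity between the two rows is now just a dot product
cos_sim = X_unit[0] @ X_unit[1]
```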
Decimal normalization
Simple technique dividing values by a power of 10 to bring them into the [-1,1] interval, based on the maximum number of digits before the decimal. This method preserves relative magnitude order while reducing absolute numerical scale.
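A small NumPy sketch: the exponent is the smallest power of 10 that brings every absolute value into [-1, 1].

```python
import numpy as np

x = np.array([12.0, -457.0, 89.0])

# Smallest j such that max(|x|) / 10^j <= 1
j = int(np.ceil(np.log10(np.max(np.abs(x)))))   # 3, since max |x| = 457
x_dec = x / 10 ** j
```

Since every value is divided by the same constant, the relative order and the signs are preserved exactly.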
Robust standardization
Standardization variant using median and median absolute deviation (MAD) as measures of central tendency and dispersion, offering increased resistance to outliers. This approach maintains interpretability while ensuring robustness.
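A NumPy sketch using median and MAD; the 1.4826 factor is the usual consistency constant that makes MAD comparable to a standard deviation under normality.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # includes an outlier

median = np.median(x)                  # 3.0
mad = np.median(np.abs(x - median))    # median absolute deviation = 1.0

# 1.4826 * MAD estimates the std. dev. of a normal distribution
x_std = (x - median) / (1.4826 * mad)
```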
Logarithmic scaling
Transformation applying log(x + c), where c is a constant added to handle zero values, effectively compressing the scale of large values. This method is particularly suited for data following a power law or exhibiting right skewness.
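A NumPy sketch with c = 1, which is equivalent to `np.log1p`:

```python
import numpy as np

x = np.array([0.0, 9.0, 99.0, 999.0])  # spans several orders of magnitude

c = 1.0                   # offset so that log(0 + c) is defined
x_log = np.log(x + c)     # compresses the large values
```

Note how three orders of magnitude in the input collapse to a range of roughly 0 to 7 in the output.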
Rank Normalization
Non-parametric technique replacing each value with its normalized rank in the dataset, eliminating the influence of extreme values. This approach is robust to outliers and preserves only the relative order of observations.
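A minimal NumPy sketch (ties not handled, for brevity): the double `argsort` yields 0-based ranks, which are then scaled into [0, 1].

```python
import numpy as np

x = np.array([10.0, 200.0, 5.0, 1e9])  # the extreme value gets no extra weight

# argsort of argsort gives each element's 0-based rank
ranks = x.argsort().argsort()          # [1, 2, 0, 3]
x_rank = ranks / (len(x) - 1)          # normalized to [0, 1]
```

Replacing 1e9 with 1e12 would leave `x_rank` unchanged, since only the ordering matters.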
Median Standardization
Method centering data around the median rather than the mean, dividing by a robust dispersion measure such as the interquartile range. This approach offers better resistance to skewed distributions and outliers.
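This median/IQR combination is what scikit-learn's `RobustScaler` implements by default; a sketch assuming scikit-learn is available:

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

# RobustScaler centers on the median and scales by the IQR by default
scaler = RobustScaler()
X_scaled = scaler.fit_transform(X)
```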
Maximum Absolute Scaling
Simple technique dividing each value by the maximum absolute value of the feature, preserving signs and zeros while bounding values in [-1,1]. This method is particularly effective for already centered or sparse data.
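A one-line NumPy sketch showing that zeros and signs survive the transformation:

```python
import numpy as np

x = np.array([0.0, -5.0, 2.0, 10.0])

# Divide by the maximum absolute value; zeros stay zero, signs are kept
x_maxabs = x / np.max(np.abs(x))
```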
Variance Normalization
Standardization process dividing variables by their variance, thus equalizing the importance of each feature in scale-sensitive algorithms. This approach is particularly useful for principal component analysis and ridge regression.
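A NumPy sketch following the entry literally, i.e. dividing by the variance (note this differs from the more common practice of dividing by the standard deviation):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])

var = np.var(x)       # population variance = 5.0
x_norm = x / var
```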
Coefficient of Variation Standardization
Advanced method normalizing data by dividing by the coefficient of variation (σ/μ), allowing comparison of variables with different means and variances. This technique is particularly relevant for data where relative variability is more important than absolute variability.
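A NumPy sketch of the ratio itself (illustrative values); note that dividing every value by the same constant leaves the coefficient of variation of the result unchanged, so the CV is mainly useful as a unit-free comparison statistic across variables.

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])

cv = np.std(x) / np.mean(x)   # coefficient of variation = sigma / mu
x_cv = x / cv
```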