AI Glossary
A complete dictionary of Artificial Intelligence
Quantization by Clustering
Model compression technique that groups similar weights into clusters to reduce memory while preserving performance. This approach enables compact weight representation using a limited number of representative centroids.
K-means Quantization
Clustering algorithm applied to neural network weight quantization by partitioning the weight space into K clusters. Weights are then represented by their respective cluster centroids, thus reducing the required precision.
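As a minimal sketch of the idea (assuming NumPy; kmeans_quantize is an illustrative name, not a library function), a few Lloyd iterations over the flattened weights are enough:

```python
import numpy as np

def kmeans_quantize(weights, k=16, iters=10, seed=0):
    """Cluster flattened weights into k centroids (Lloyd's algorithm)."""
    w = weights.ravel()
    rng = np.random.default_rng(seed)
    centroids = rng.choice(w, size=k, replace=False)  # init from the weights
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        codes = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its assigned weights.
        for j in range(k):
            members = w[codes == j]
            if members.size:
                centroids[j] = members.mean()
    return centroids, codes

# Replace each weight by its centroid; only codes + centroids need be stored.
weights = np.random.randn(256, 256).astype(np.float32)
centroids, codes = kmeans_quantize(weights, k=16)
quantized = centroids[codes].reshape(weights.shape)
```

With k=16, each code fits in 4 bits, versus 32 bits for the original float weight.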
Codebook
Set of reference vectors or centroids used to represent quantized weights in a compressed model. The codebook enables mapping original weights to low-precision representations while minimizing reconstruction error.
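To illustrate the mapping (toy values, in NumPy): decompression is a single table lookup, and the saving comes from storing small integer indices plus one tiny codebook instead of full-precision weights:

```python
import numpy as np

codebook = np.array([-0.5, -0.1, 0.1, 0.5], dtype=np.float32)  # 4 centroids
indices = np.array([0, 3, 1, 2, 3, 0], dtype=np.uint8)          # 2 bits each, ideally

reconstructed = codebook[indices]  # maps each index back to its centroid

# Rough footprint: n weights at 2 bits vs. 32 bits, plus the tiny codebook.
n = indices.size
print(n * 2 / 8 + codebook.nbytes, "bytes vs.", n * 4, "bytes uncompressed")
```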
Quantization Centroids
Representative points at the center of each cluster in the quantization space, serving as substitutes for original weights. These centroids are optimized to minimize the model's overall quantization error.
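Under a squared-error criterion, the optimal centroid of a cluster S_k is simply the mean of its assigned weights, which is exactly the update applied at each step of codebook learning; in LaTeX notation:

$$ c_k = \arg\min_{c} \sum_{w \in S_k} (w - c)^2 = \frac{1}{|S_k|} \sum_{w \in S_k} w $$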
Product Quantization
Advanced technique decomposing the vector space into subspaces and quantizing each separately before combining the codes. This method enables extreme compression with minimal information loss for high-dimensional models.
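A minimal encode/decode sketch (NumPy; names are illustrative, and in practice each sub-codebook would be learned by k-means on its own subspace rather than drawn at random):

```python
import numpy as np

def pq_encode(X, codebooks):
    """Encode rows of X with one codebook per subspace.

    X: (n, d) vectors; codebooks: list of M arrays, each (k, d // M).
    Returns an (n, M) array of sub-codes.
    """
    M = len(codebooks)
    subdim = X.shape[1] // M
    codes = np.empty((X.shape[0], M), dtype=np.uint8)
    for m, cb in enumerate(codebooks):
        sub = X[:, m * subdim:(m + 1) * subdim]
        # Nearest centroid in this subspace (squared Euclidean distance).
        d2 = ((sub[:, None, :] - cb[None, :, :]) ** 2).sum(axis=2)
        codes[:, m] = d2.argmin(axis=1)
    return codes

def pq_decode(codes, codebooks):
    """Concatenate each sub-code's centroid to rebuild approximate vectors."""
    return np.hstack([cb[codes[:, m]] for m, cb in enumerate(codebooks)])

X = np.random.randn(100, 32).astype(np.float32)
codebooks = [np.random.randn(256, 8).astype(np.float32) for _ in range(4)]
X_hat = pq_decode(pq_encode(X, codebooks), codebooks)
```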
Optimized Product Quantization
Variant of Product Quantization that applies a linear transformation before subspace decomposition to optimize weight distribution. This pre-transformation significantly improves the final quantization quality.
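A sketch of the pre-transformation step (reusing X, codebooks, pq_encode and pq_decode from the Product Quantization sketch above; real OPQ alternates between updating the rotation and retraining the codebooks on the rotated data, which is omitted here):

```python
import numpy as np

def pca_rotation(X):
    """Orthogonal matrix from PCA, a common OPQ initialization."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data give an orthogonal basis.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt.T  # (d, d) orthogonal

R = pca_rotation(X)        # decorrelates / rebalances variance across subspaces
X_rot = X @ R
codes = pq_encode(X_rot, codebooks)
X_hat = pq_decode(codes, codebooks) @ R.T   # rotate back after decoding
```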
Additive Quantization
Approach where vectors are approximated by the sum of multiple quantized codes from different codebooks. This method offers better representation flexibility than single-codebook approaches.
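The representation itself is simple to state (a sketch with illustrative names; choosing the best combination of codes is the hard part and is typically done with beam search, not shown): each vector is the sum of one full-dimensional codeword per codebook.

```python
import numpy as np

def aq_decode(codes, codebooks):
    """Sum one full-dimensional codeword per codebook.

    codes: (n, M) indices; codebooks: list of M arrays, each (k, d).
    Unlike PQ, every codebook spans all d dimensions.
    """
    return sum(cb[codes[:, m]] for m, cb in enumerate(codebooks))

codebooks = [np.random.randn(256, 32).astype(np.float32) for _ in range(4)]
codes = np.random.randint(0, 256, size=(100, 4))
X_hat = aq_decode(codes, codebooks)  # (100, 32) approximations
```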
Residual Quantization
Iterative technique that successively quantizes the residuals not captured by previous quantization steps. Each iteration refines the approximation by capturing the model's remaining error.
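A minimal NumPy sketch (illustrative names; each stage's codebook would normally be trained on the residuals left by the previous stage):

```python
import numpy as np

def rq_encode(X, codebooks):
    """Quantize X stage by stage; each stage encodes the previous residual."""
    residual = X.copy()
    codes = []
    for cb in codebooks:  # each cb is (k, d), same dimension as X
        d2 = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(axis=2)
        idx = d2.argmin(axis=1)
        codes.append(idx)
        residual = residual - cb[idx]  # what this stage failed to capture
    return np.stack(codes, axis=1)

def rq_decode(codes, codebooks):
    """Reconstruction is the sum of the selected codewords, as in AQ."""
    return sum(cb[codes[:, m]] for m, cb in enumerate(codebooks))

X = np.random.randn(100, 32).astype(np.float32)
codebooks = [np.random.randn(64, 32).astype(np.float32) for _ in range(3)]
X_hat = rq_decode(rq_encode(X, codebooks), codebooks)
```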
Hierarchical Clustering Quantization
Method organizing weights into a tree-structured cluster hierarchy for efficient multi-level quantization. This hierarchy enables an adjustable trade-off between precision and storage complexity.
Subspace Quantization
Technique dividing the weight space into orthogonal subspaces and quantizing each one independently. This approach reduces computational complexity while preserving the model's essential characteristics.
Mahalanobis Distance in Quantization
Adaptive distance metric accounting for covariance between weights for more informative clustering. This approach improves the quality of formed groups by considering the model's structural correlations.
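A small sketch of the metric (NumPy; mahalanobis_sq is an illustrative helper, with the covariance estimated empirically from the weights):

```python
import numpy as np

def mahalanobis_sq(x, centroids, cov):
    """Squared Mahalanobis distance from x to each centroid.

    Correlated weight dimensions are re-weighted via the inverse covariance,
    unlike plain Euclidean distance, which treats all directions equally.
    """
    P = np.linalg.inv(cov)            # precision matrix
    diff = centroids - x              # (k, d)
    return np.einsum('kd,de,ke->k', diff, P, diff)

W = np.random.randn(1000, 8)
cov = np.cov(W, rowvar=False)         # empirical covariance of the weights
centroids = np.random.randn(4, 8)
d2 = mahalanobis_sq(W[0], centroids, cov)   # assign W[0] to cluster d2.argmin()
```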
Codebook Learning
Process of optimizing the centroids to minimize the global reconstruction error of the quantized model. This crucial step determines the final compression quality and model performance.
Coarse Quantizer
First quantization stage that coarsely groups weights into broad clusters. This fast step narrows the search space for the finer quantization stages (see the two-stage sketch after the Fine Quantizer entry).
Fine Quantizer
Detailed quantization stage operating on restricted subspaces for precise weight approximation. This step refines the representation after the initial grouping performed by the coarse quantizer.
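As referenced under Coarse Quantizer, a minimal two-stage sketch (illustrative names, assuming already-trained centroid arrays): the coarse quantizer picks a broad cell, and the fine quantizer encodes the residual within that cell.

```python
import numpy as np

def two_stage_encode(x, coarse_centroids, fine_codebook):
    """Coarse cell id + fine residual code for a single vector x."""
    # Stage 1: cheap, broad assignment over a few coarse centroids.
    c = ((coarse_centroids - x) ** 2).sum(axis=1).argmin()
    # Stage 2: precise encoding of what the coarse stage missed.
    residual = x - coarse_centroids[c]
    f = ((fine_codebook - residual) ** 2).sum(axis=1).argmin()
    return c, f

def two_stage_decode(c, f, coarse_centroids, fine_codebook):
    return coarse_centroids[c] + fine_codebook[f]
```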
IVF with Quantization
Combination of an Inverted File Index with quantization techniques for efficient search in compressed models. This hybrid approach optimizes both indexing and compact weight representation.
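A compact sketch of the hybrid (hypothetical helper names; libraries such as FAISS implement this pattern far more efficiently, typically ranking candidates with PQ codes rather than exact distances): vectors are bucketed by their coarse cell, and a query scans only the few most promising buckets.

```python
import numpy as np

def build_ivf(X, coarse_centroids):
    """Inverted lists: cell id -> indices of the vectors assigned to it."""
    cells = ((X[:, None, :] - coarse_centroids[None, :, :]) ** 2).sum(2).argmin(1)
    return {c: np.where(cells == c)[0] for c in range(len(coarse_centroids))}

def ivf_search(q, X, coarse_centroids, lists, nprobe=2):
    """Scan only the nprobe nearest cells instead of the whole database."""
    order = ((coarse_centroids - q) ** 2).sum(axis=1).argsort()[:nprobe]
    cand = np.concatenate([lists[c] for c in order])
    d2 = ((X[cand] - q) ** 2).sum(axis=1)   # here: exact; in IVF-PQ: PQ codes
    return cand[d2.argmin()]

X = np.random.randn(1000, 16).astype(np.float32)
coarse = X[:8].copy()                        # toy coarse centroids
lists = build_ivf(X, coarse)
nearest = ivf_search(X[42], X, coarse, lists)
```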
PQ-Codes
Compact binary representations produced by Product Quantization for each weight vector. These codes enable fast comparisons and efficient storage while preserving essential information.
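For instance (continuing the hypothetical PQ sketch), with 256 centroids per subspace each sub-code fits in one byte, so a whole vector compresses to M bytes:

```python
import numpy as np

d, M = 128, 8                                    # 128-dim vectors, 8 subspaces
codes = np.zeros((10_000, M), dtype=np.uint8)    # one byte per sub-code

print(codes.nbytes)      # 80,000 bytes for 10k vectors ...
print(10_000 * d * 4)    # ... vs. 5,120,000 bytes at float32 (64x larger)
```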
Lattice Quantization
Method using regular geometric structures (lattices) to partition the weight space uniformly. This approach guarantees optimal theoretical properties for the quantization error.
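The simplest example is the scaled integer lattice delta * Z^n, where quantization is elementwise rounding (a sketch; structured lattices such as D4 or E8 achieve better packing but require their own nearest-point rules):

```python
import numpy as np

def lattice_quantize(w, delta=0.05):
    """Snap each weight to the nearest point of the scaled lattice delta * Z^n."""
    return np.round(w / delta) * delta

w = np.random.randn(4, 4).astype(np.float32)
w_hat = lattice_quantize(w)   # uniform cells; per-coordinate error <= delta / 2
```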