AI Glossary

A Complete Dictionary of Artificial Intelligence

162 categories
2,032 subcategories
23,060 terms

Quantization Aware Training (QAT)

Training method that simulates low-precision quantization during training (typically in the forward pass), so the model adapts its weights to minimize the accuracy loss that quantization would otherwise cause.
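
The simulation step is often implemented as a "fake quantization" operation: a value is rounded to the integer grid and immediately converted back to float, so the rest of the network sees the rounding error. The sketch below is a minimal illustration assuming a symmetric signed grid and a hypothetical `fake_quantize` helper. It covers only the forward simulation; real QAT also needs a gradient rule for the rounding step (a straight-through estimator is a common choice).

```python
def fake_quantize(x: float, scale: float, bits: int = 8) -> float:
    """Quantize x to a symmetric signed integer grid, then dequantize it."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for 4 bits
    qmin = -qmax - 1                      # e.g. -8 for 4 bits
    q = round(x / scale)                  # snap to the nearest integer level
    q = max(qmin, min(qmax, q))           # clamp to the representable range
    return q * scale                      # float again, now carrying the rounding error


# Hypothetical weights, simulated at 4 bits with one shared scale.
weights = [0.31, -0.74, 0.05, 1.20]
scale = max(abs(w) for w in weights) / 7
simulated = [fake_quantize(w, scale, bits=4) for w in weights]
```

During training, the forward pass would use `simulated` instead of `weights`, so the loss already reflects the quantization error the deployed model will see.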

Low-Rank Adaptation (LoRA)

Parameter-efficient fine-tuning method that freezes the weights of a pre-trained model and injects pairs of small trainable low-rank matrices alongside them, drastically reducing the number of trainable parameters while largely preserving performance.
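
As a rough illustration, the adapted layer can be written as y = W x + (alpha / r) * B (A x), where W stays frozen and only A (r x d_in) and B (d_out x r) are trained. The sketch below uses plain Python lists. The names `lora_forward`, `alpha`, and the toy matrices are assumptions made for this example, and the zero initialization of B follows the common convention that the adapted model starts out identical to the base model.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, alpha, r):
    """y = W x + (alpha / r) * B (A x); W is frozen, A and B are the trainable part."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))      # project down to rank r, then back up
    return [b + (alpha / r) * u for b, u in zip(base, update)]


d_out, d_in, r = 4, 3, 1
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
A = [[0.5, -0.5, 1.0]]                    # r x d_in
B = [[0.0], [0.0], [0.0], [0.0]]          # d_out x r, zero-initialized
x = [1.0, 2.0, 3.0]
y = lora_forward(W, A, B, x, alpha=2.0, r=r)

full_params = d_out * d_in                # parameters updated by full fine-tuning
lora_params = r * (d_in + d_out)          # parameters updated by LoRA
```

For a 4096 x 4096 layer with r = 8, the same count gives 16,777,216 parameters for full fine-tuning against 65,536 for the two LoRA matrices.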

8-bit Floating Point Representation (FP8)

Low-precision numerical format that uses 8 bits per floating-point number, commonly in the E4M3 or E5M2 layouts. It enables significant speedups on GPUs with native FP8 support and, with appropriate scaling, can keep the training of large models stable.
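
To make the 8-bit layout concrete, the sketch below decodes one byte under the assumption that it uses the E4M3 variant (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities, and a single NaN bit pattern per sign). The function name `decode_e4m3` is hypothetical, and other FP8 variants such as E5M2 follow different rules.

```python
import math


def decode_e4m3(byte: int) -> float:
    """Decode one byte as an E4M3 float (assumed variant: bias 7, no infinities)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    if exponent == 0x0F and mantissa == 0x07:
        return math.nan                                  # the only NaN pattern
    if exponent == 0:
        return sign * (mantissa / 8) * 2.0 ** -6         # subnormal numbers
    return sign * (1 + mantissa / 8) * 2.0 ** (exponent - 7)


largest = decode_e4m3(0x7E)     # exponent 15, mantissa 6
smallest = decode_e4m3(0x01)    # smallest positive subnormal
```

With only 3 mantissa bits, neighbouring values are far apart, which is why FP8 training setups usually pair the format with per-tensor or finer scaling factors.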

4-bit Integer Quantization (INT4)

Aggressive compression technique that stores model weights in 4 bits each. It generally requires advanced quantization algorithms, and often partial retraining, to compensate for the significant information loss.
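
The storage side of INT4 can be sketched in a few lines: each weight is mapped to an integer in [-8, 7], and two such integers share one byte. This is a minimal illustration with simple round-to-nearest and a single shared scale, not one of the advanced algorithms mentioned above. The helper names and the low-nibble-first packing order are assumptions.

```python
def quantize_int4(weights):
    """Symmetric round-to-nearest quantization to integers in [-8, 7]."""
    scale = max(abs(w) for w in weights) / 7 or 1.0   # fall back to 1.0 for all-zero input
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def pack_int4(q):
    """Pack pairs of signed 4-bit integers into bytes, low nibble first."""
    if len(q) % 2:
        q = q + [0]                                   # pad to an even length
    return bytes((q[i] & 0x0F) | ((q[i + 1] & 0x0F) << 4) for i in range(0, len(q), 2))


def unpack_int4(packed, count):
    """Recover the signed 4-bit integers from packed bytes."""
    out = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            out.append(nibble - 16 if nibble >= 8 else nibble)   # sign-extend
    return out[:count]


weights = [0.9, -0.35, 0.1, -0.7]
q, scale = quantize_int4(weights)
packed = pack_int4(q)
restored = [v * scale for v in unpack_int4(packed, len(q))]
```

Four weights fit in two bytes here. Comparing `restored` with `weights` shows the rounding error that retraining or a better quantization algorithm would have to absorb.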

Quantization Bias Compensation (Q-Bias)

Post-quantization adjustment technique that measures the systematic error (bias) introduced by precision reduction and corrects it, often by modifying normalization layers or the bias terms of linear layers.
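
One common form of this correction for a linear layer y = W x + b is to fold the expected output error into the bias: b' = b + (W - W_q) E[x], where W_q are the dequantized weights and E[x] is the mean input estimated from calibration data. The sketch below assumes that form. The function name and all numeric values are hypothetical.

```python
def corrected_bias(W, W_q, bias, mean_x):
    """Return b' = b + (W - W_q) @ E[x], so the quantized layer matches on average."""
    return [
        b + sum((w - wq) * m for w, wq, m in zip(row, row_q, mean_x))
        for row, row_q, b in zip(W, W_q, bias)
    ]


W = [[0.30, -0.70], [0.55, 0.20]]         # original weights
W_q = [[0.25, -0.75], [0.50, 0.25]]       # hypothetical dequantized weights
bias = [0.10, -0.20]
mean_x = [2.0, 1.0]                       # hypothetical calibration mean of the inputs
new_bias = corrected_bias(W, W_q, bias, mean_x)
```

The correction removes only the mean shift of the output. The input-dependent part of the quantization error remains.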

Quantization Grid Search Optimization

Systematic search over quantization configurations (per-layer, per-group, mixed precision) to identify the scheme offering the best trade-off between model size, speed, and accuracy for a given architecture.
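
A toy version of such a search is sketched below. It tries every combination of bit width and group size on one weight vector, measures the reconstruction error, and keeps the smallest configuration that stays under an error budget. All names, the 16-bit scale overhead, and the use of mean squared error as a proxy for accuracy are assumptions of this example, and speed is not modelled at all.

```python
from itertools import product


def quant_mse(weights, bits, group_size):
    """Mean squared error of symmetric quantization with one scale per group."""
    qmax = 2 ** (bits - 1) - 1
    total = 0.0
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scale = max(abs(w) for w in group) / qmax or 1.0
        for w in group:
            q = max(-qmax - 1, min(qmax, round(w / scale)))
            total += (w - q * scale) ** 2
    return total / len(weights)


def storage_bits(n, bits, group_size, scale_bits=16):
    """Bits for the quantized weights plus one scale per group."""
    n_groups = -(-n // group_size)        # ceiling division
    return n * bits + n_groups * scale_bits


def grid_search(weights, bit_options, group_options, max_mse):
    """Return (size, mse, bits, group_size) of the smallest config under max_mse."""
    best = None
    for bits, group_size in product(bit_options, group_options):
        err = quant_mse(weights, bits, group_size)
        if err > max_mse:
            continue                      # violates the error budget
        candidate = (storage_bits(len(weights), bits, group_size), err, bits, group_size)
        if best is None or candidate < best:
            best = candidate
    return best


weights = [0.02, -0.03, 0.01, 0.04, 0.8, -0.65, 0.72, 0.59]
best = grid_search(weights, bit_options=[4, 8], group_options=[2, 4, 8], max_mse=1e-4)
```

A realistic search would evaluate each configuration on a validation set and measure latency on the target hardware instead of relying on weight-space error alone.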

Speculative Inference

Acceleration technique for generative inference in which a small 'draft' model quickly proposes several tokens that the large target model then verifies in parallel, reducing the number of costly sequential passes through the target model.
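
The propose-then-verify loop can be sketched for the greedy case as follows. The two "models" are stand-in functions that map a token list to the next token, and the names `speculative_step`, `draft_next`, and `target_next` are hypothetical. Sampling-based variants use a probabilistic accept/reject rule instead of the exact-match check shown here.

```python
def speculative_step(prefix, draft_next, target_next, k):
    """One round of greedy speculative decoding; returns the tokens to append."""
    # 1. The draft model proposes k tokens one after another (cheap).
    proposal = []
    for _ in range(k):
        proposal.append(draft_next(prefix + proposal))

    # 2. The target model checks every position. A real system scores all
    #    k positions in one batched forward pass instead of this loop.
    accepted = []
    for i in range(k):
        expected = target_next(prefix + proposal[:i])
        if expected != proposal[i]:
            accepted.append(expected)     # first mismatch: take the target's token
            return accepted
        accepted.append(proposal[i])

    # 3. All k proposals matched, so the target's next token comes for free.
    accepted.append(target_next(prefix + proposal))
    return accepted


def target_next(tokens):
    """Toy target model: counts upward modulo 10."""
    return (tokens[-1] + 1) % 10


def draft_next(tokens):
    """Toy draft model: agrees with the target except right after token 3."""
    return 0 if tokens[-1] == 3 else (tokens[-1] + 1) % 10


new_tokens = speculative_step([0], draft_next, target_next, k=4)
```

In this greedy form the output is always identical to what the target model alone would generate. The gain comes from accepting several tokens per target pass whenever the draft guesses correctly.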

Truncated Singular Value Decomposition (Truncated SVD)

Application of SVD followed by truncation of the smallest singular values, approximating a weight matrix by a sum of a few rank-one terms (a low-rank product). This reduces parameters and computation with a controlled error.
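
The "controlled error" can be stated precisely. For an m x n weight matrix W of rank r, keeping the k largest singular values gives the following, where the symbols m, n, r, and k are notation introduced for this example:

```latex
W = U \Sigma V^{\top} = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{\top},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0

W_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\top} = U_k \Sigma_k V_k^{\top},
\qquad k < r

\lVert W - W_k \rVert_F^2 = \sum_{i=k+1}^{r} \sigma_i^2

\text{parameters: } mn \;\longrightarrow\; k(m + n),
\qquad \text{a saving whenever } k < \frac{mn}{m + n}
```

For example, with m = n = 1024 and k = 64, the layer shrinks from 1,048,576 parameters to 131,072, and the squared Frobenius error equals the sum of the discarded squared singular values.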

Block-wise Quantization

Quantization strategy that divides weight tensors into smaller blocks and quantizes each block independently with its own scale, which better preserves the local value distribution and reduces the overall error compared to quantizing the whole tensor with a single scale.
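
The benefit is easiest to see when a tensor mixes small values with a few large ones. The sketch below compares a single global scale with one scale per block of four values at 4 bits. The helper names and the sample weights are made up for illustration, and symmetric round-to-nearest quantization is assumed.

```python
def quantize_dequantize(values, bits=4):
    """Symmetric round-to-nearest quantization with one scale for all values."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / qmax or 1.0
    return [max(-qmax - 1, min(qmax, round(v / scale))) * scale for v in values]


def blockwise_quantize_dequantize(values, block_size, bits=4):
    """Apply the same quantizer independently to each block of values."""
    out = []
    for start in range(0, len(values), block_size):
        out.extend(quantize_dequantize(values[start:start + block_size], bits))
    return out


def mse(a, b):
    """Mean squared error between two equally long lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# The second half holds large values that dominate a global scale.
weights = [0.02, -0.03, 0.01, 0.04, 8.0, -6.5, 7.2, 5.9]
global_err = mse(weights, quantize_dequantize(weights))
block_err = mse(weights, blockwise_quantize_dequantize(weights, block_size=4))
```

With the global scale, the four small weights all collapse to zero. With per-block scales they keep most of their resolution, at the cost of storing one extra scale per block.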

Structured Sparse Weights

Form of pruning that constrains the pruned weights to regular patterns (whole rows, columns, or blocks), so that CPUs and GPUs can exploit the sparsity efficiently, unlike unstructured sparsity, whose irregular pattern is hard to accelerate.
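
Row pruning is the simplest instance: removing whole rows of a weight matrix (whole output neurons) leaves a smaller dense matrix that ordinary dense kernels can process. The sketch below assumes an L2-norm importance criterion. The function name and the example matrix are hypothetical.

```python
def prune_rows(W, keep):
    """Keep the `keep` rows with the largest L2 norm; return (kept indices, smaller matrix)."""
    norms = [sum(w * w for w in row) ** 0.5 for row in W]
    ranked = sorted(range(len(W)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:keep])          # restore the original row order
    return kept, [W[i] for i in kept]


W = [
    [0.90, -0.80, 0.70],
    [0.01, 0.02, -0.01],
    [0.50, 0.40, -0.60],
    [0.00, 0.03, 0.02],
]
kept, W_small = prune_rows(W, keep=2)
```

Because the result is a dense 2 x 3 matrix, no special sparse format or index lookup is needed at inference time. The next layer only has to drop the input columns that correspond to the removed rows.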
