AI Glossary
The complete dictionary of Artificial Intelligence
Binary Neural Networks
Neural networks whose weights and activations are constrained to binary values (±1), offering extreme compression (one bit per weight) and large inference speedups via bitwise operations.
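A minimal sketch of the idea, assuming a toy fully connected layer: weights and activations are binarized with the sign function, so the dot product reduces to sums of ±1 terms (XNOR plus popcount on real hardware).

```python
import numpy as np

def binarize(w):
    """Map real values to {-1, +1} via the sign function
    (zeros are sent to +1 by convention)."""
    return np.where(w >= 0, 1.0, -1.0)

def binary_linear(x, w_real):
    """Forward pass of a binarized linear layer: both inputs and
    weights are binarized, so every output entry is a sum of +/-1
    terms (an even integer when the input dimension is even)."""
    return binarize(x) @ binarize(w_real).T

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))   # batch of 2 inputs, dimension 8
w = rng.normal(size=(4, 8))   # 4 output units
y = binary_linear(x, w)
print(y.shape)  # (2, 4); each entry lies in [-8, 8]
```

In practice a real-valued scaling factor per layer (as in XNOR-Net) is kept to recover some accuracy; this sketch omits it.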
Structured Pruning
Pruning technique that removes entire structures such as filters, channels, or whole layers, yielding speedups on standard hardware, unlike unstructured pruning, whose irregular sparsity rarely does.
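A small illustration, assuming a convolutional weight tensor and the common L1-norm criterion: whole output filters with the smallest norms are dropped, so the tensor physically shrinks rather than merely containing zeros.

```python
import numpy as np

def prune_filters(w, keep_ratio=0.5):
    """Structured pruning of a conv weight tensor of shape
    (out_channels, in_channels, kH, kW): remove entire output
    filters with the smallest L1 norms."""
    n_keep = max(1, int(w.shape[0] * keep_ratio))
    norms = np.abs(w).reshape(w.shape[0], -1).sum(axis=1)
    keep = np.sort(np.argsort(norms)[-n_keep:])  # surviving filter indices
    return w[keep], keep

rng = np.random.default_rng(1)
w = rng.normal(size=(8, 3, 3, 3))   # 8 filters over 3 input channels
w_pruned, kept = prune_filters(w, keep_ratio=0.5)
print(w_pruned.shape)  # (4, 3, 3, 3): the tensor itself is smaller
```

Because the surviving tensor is dense and smaller, any hardware that runs the original convolution runs the pruned one faster, with no sparse kernels required.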
Dynamic Computation
Strategy that adapts a model's computational cost to each input or to current resource constraints (e.g., early exits or adaptive depth), reducing energy usage and latency on edge devices.
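One common form is early exiting, sketched below under toy assumptions: each block is a small matrix multiply, and after each block a lightweight classifier head checks its own confidence; if it is high enough, the remaining blocks are skipped.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_forward(x, layers, heads, threshold=0.9):
    """Early-exit inference: after each block, a small head makes a
    prediction; if its top probability exceeds `threshold`, stop and
    skip the remaining (more expensive) blocks. `layers` and `heads`
    are plain weight matrices, a toy stand-in for real blocks."""
    h = x
    for i, (w, head) in enumerate(zip(layers, heads)):
        h = np.tanh(h @ w)          # one "block" of computation
        probs = softmax(h @ head)   # cheap intermediate classifier
        if probs.max() >= threshold:
            return probs, i + 1     # exited after i+1 blocks
    return probs, len(layers)

rng = np.random.default_rng(2)
layers = [rng.normal(size=(16, 16)) for _ in range(4)]
heads = [rng.normal(size=(16, 10)) for _ in range(4)]
probs, n_blocks = early_exit_forward(rng.normal(size=16), layers, heads)
print(n_blocks)  # number of blocks actually executed (at most 4)
```

Easy inputs exit early and cheap; hard inputs pay for the full network, which is exactly the latency/energy trade-off the definition describes.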
TensorRT Optimization
NVIDIA's inference optimization SDK, applying layer fusion, precision calibration (FP16/INT8), and kernel auto-tuning to maximize inference performance on edge GPUs such as the Jetson family.
TinyML
Machine learning field targeting the deployment of ultra-compact AI models (<1 MB) on microcontrollers with severely limited resources (often <256 KB of RAM).
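A back-of-the-envelope sizing check makes the constraints concrete; the budgets below simply restate the <1 MB flash and <256 KB RAM figures from the definition, and the helper name and layout are illustrative assumptions.

```python
def fits_microcontroller(n_params, bits_per_weight=8,
                         peak_activation_bytes=0,
                         flash_budget=1_000_000, ram_budget=256_000):
    """Rough TinyML sizing: do the quantized weights fit in flash
    (<1 MB) and the peak activation buffer in RAM (<256 KB)?
    Deliberately ignores code size and runtime overhead."""
    model_bytes = n_params * bits_per_weight // 8
    return model_bytes <= flash_budget and peak_activation_bytes <= ram_budget

# A 300k-parameter model quantized to int8 needs ~300 kB of flash: OK.
print(fits_microcontroller(300_000, bits_per_weight=8,
                           peak_activation_bytes=64_000))   # True
# A 2M-parameter model at int8 (~2 MB) blows the flash budget.
print(fits_microcontroller(2_000_000, bits_per_weight=8))   # False
```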
ONNX Runtime
Cross-platform inference engine optimizing the execution of ONNX format models on various hardware architectures including edge and IoT devices.
Post-training Quantization
Quantization applied after a model is fully trained, using a small calibration set to determine the quantization parameters (scale and zero point) without any retraining.
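The core mechanism can be sketched with affine uint8 quantization, assuming the simplest min/max calibration rule (real toolkits also offer percentile or entropy-based calibration):

```python
import numpy as np

def calibrate(samples):
    """Derive affine uint8 parameters (scale, zero point) from the
    min/max observed on a small calibration set."""
    lo, hi = float(samples.min()), float(samples.max())
    scale = (hi - lo) / 255.0
    zero_point = int(round(-lo / scale)) if scale > 0 else 0
    return scale, zero_point

def quantize(x, scale, zero_point):
    return np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(3)
calib = rng.normal(size=1000).astype(np.float32)   # small calibration set
scale, zp = calibrate(calib)
q = quantize(calib, scale, zp)
err = np.abs(dequantize(q, scale, zp) - calib).max()
print(err <= scale)  # True: error stays within one quantization step
```

No gradients or labels are involved, which is what makes post-training quantization so much cheaper than quantization-aware training.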
Sparse Neural Networks
Neural networks containing a large proportion of zero or near-zero weights, enabling significant computational and storage optimizations on edge platforms.
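A minimal sketch of how such sparsity is induced, assuming simple magnitude pruning: the smallest-magnitude weights are set to exactly zero, after which a sparse storage format pays only for the survivors.

```python
import numpy as np

def sparsify(w, sparsity=0.9):
    """Magnitude pruning: zero out the smallest-magnitude entries so
    that roughly `sparsity` of the tensor is exactly zero."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

rng = np.random.default_rng(4)
w = rng.normal(size=(64, 64))
w_sparse = sparsify(w, sparsity=0.9)
frac_zero = (w_sparse == 0).mean()
print(frac_zero)  # ~0.9: a sparse format stores ~10x fewer values
```

Note the caveat from the Structured Pruning entry: this element-wise sparsity compresses storage well, but speedups require hardware or kernels that can exploit irregular zeros.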
Layer Fusion
Optimization that merges several successive layers (e.g., convolution, batch normalization, activation) into a single computational operation, reducing memory traffic and kernel-launch overhead on edge accelerators.
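The classic example is folding batch normalization into the preceding layer's weights; the sketch below does this for a linear layer (the convolution case is identical per output channel), using the standard folding algebra:

```python
import numpy as np

def fuse_linear_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold batch norm into the preceding linear layer:
    y = gamma * (x @ w.T + b - mean) / sqrt(var + eps) + beta
    becomes the single affine op  y = x @ w_fused.T + b_fused."""
    s = gamma / np.sqrt(var + eps)
    return w * s[:, None], (b - mean) * s + beta

rng = np.random.default_rng(5)
w, b = rng.normal(size=(4, 8)), rng.normal(size=4)
gamma, beta = rng.normal(size=4), rng.normal(size=4)
mean, var = rng.normal(size=4), rng.uniform(0.5, 2.0, size=4)

x = rng.normal(size=(3, 8))
# Reference: two separate ops (linear, then batch norm in inference mode)
y_two_ops = gamma * ((x @ w.T + b) - mean) / np.sqrt(var + 1e-5) + beta
# Fused: one op, numerically identical output
w_f, b_f = fuse_linear_bn(w, b, gamma, beta, mean, var)
y_fused = x @ w_f.T + b_f
print(np.allclose(y_two_ops, y_fused))  # True
```

Two memory round-trips collapse into one, which is where the bandwidth and overhead savings on edge accelerators come from; inference engines such as TensorRT and ONNX Runtime apply this kind of fusion automatically.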