AI Glossary
The complete dictionary of artificial intelligence
MobileNet
Convolutional neural network architecture specifically designed for mobile and embedded applications, using depthwise separable convolutions to significantly reduce the number of parameters and computations.
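To make the idea concrete, here is a minimal sketch of MobileNet's building block in PyTorch (the channel sizes are illustrative): a 3x3 depthwise convolution that filters each channel independently, followed by a 1x1 pointwise convolution that mixes channels.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: depthwise 3x3 + pointwise 1x1."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise: 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn1, self.bn2 = nn.BatchNorm2d(in_ch), nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.bn1(self.depthwise(x)))
        return self.relu(self.bn2(self.pointwise(x)))

block = DepthwiseSeparableConv(32, 64)
out = block(torch.randn(1, 32, 56, 56))  # -> (1, 64, 56, 56)
```

For a 3x3 kernel this replaces C_in * C_out * 9 multiplications per output position with C_in * 9 + C_in * C_out, roughly an 8-9x reduction at typical channel counts.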
SqueezeNet
Ultra-lightweight CNN architecture that achieves AlexNet-level accuracy with 50 times fewer parameters by using 'fire' modules that squeeze and then re-expand the channel dimension.
Fire Module
Fundamental building block of SqueezeNet composed of a 'squeeze' layer (1x1 convolutions) that reduces the number of channels, followed by an 'expand' layer (a mix of 1x1 and 3x3 convolutions) that increases it again.
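A minimal fire module sketch in PyTorch follows; the squeeze and expand channel counts are illustrative rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class FireModule(nn.Module):
    def __init__(self, in_ch, squeeze_ch, expand_ch):
        super().__init__()
        # Squeeze: 1x1 convolution reduces the channel count.
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
        # Expand: parallel 1x1 and 3x3 branches, concatenated on channels.
        self.expand1x1 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=1)
        self.expand3x3 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.squeeze(x))
        return torch.cat([self.relu(self.expand1x1(x)),
                          self.relu(self.expand3x3(x))], dim=1)

fire = FireModule(96, squeeze_ch=16, expand_ch=64)
out = fire(torch.randn(1, 96, 55, 55))  # -> (1, 128, 55, 55)
```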
Quantization
Process of reducing the numerical precision of model weights and activations (typically from 32-bit floats to 8-bit integers) to decrease model size and accelerate inference.
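As a concrete illustration, PyTorch ships a post-training dynamic quantization API; the toy model below is illustrative, and only the Linear weights are converted to int8 (activations are quantized on the fly at inference time).

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 10))

# Convert Linear weights from float32 to int8 after training.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 256)
print(quantized(x).shape)  # same interface, ~4x smaller weights
```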
Pruning
Technique of removing unnecessary connections or neurons in a trained neural network to reduce its complexity without significant loss of performance.
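A minimal sketch using PyTorch's built-in torch.nn.utils.prune utilities on a toy layer; the 30% pruning ratio is illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(128, 64)

# Zero out the 30% of weights with the smallest absolute value.
prune.l1_unstructured(layer, name="weight", amount=0.3)

# Make the pruning permanent: drop the mask and bake the zeros in.
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"sparsity: {sparsity:.0%}")  # ~30%
```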
Model Compression
Set of techniques aimed at reducing the size and computational complexity of AI models while preserving their accuracy, essential for deployment on resource-constrained devices.
Edge Computing
Computing paradigm where processing is performed locally on edge devices rather than in the cloud, reducing latency and preserving data privacy.
On-device Inference
Execution of model predictions directly on the end device (smartphone, IoT hardware) without a server connection, ensuring instant response and offline functionality.
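One common route, sketched below, is exporting a trained model to TorchScript so a mobile runtime can load it without Python; the model and file name are illustrative.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
model.eval()

# Trace the model into a self-contained graph for on-device execution.
example = torch.randn(1, 3, 224, 224)
scripted = torch.jit.trace(model, example)
scripted.save("model_mobile.pt")  # load this file from the mobile runtime
```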
Latency
Time elapsed between data input and result retrieval, critical metric for mobile applications where low latency (<100ms) is required for a smooth user experience.
Throughput
Number of inferences or operations a model can process per unit of time, key indicator for evaluating model performance under resource constraints.
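The sketch below measures both of the preceding metrics, average latency and throughput, for a toy model on CPU; the warm-up and iteration counts are illustrative.

```python
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10)).eval()
x = torch.randn(1, 512)

with torch.no_grad():
    for _ in range(10):              # warm-up runs, excluded from timing
        model(x)
    n = 200
    start = time.perf_counter()
    for _ in range(n):
        model(x)
    elapsed = time.perf_counter() - start

print(f"latency:    {elapsed / n * 1000:.2f} ms/inference")
print(f"throughput: {n / elapsed:.0f} inferences/s")
```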
Parameter Efficiency
Ratio between model performance and its number of parameters, measuring how efficiently the network uses its weights to achieve a given accuracy.
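A quick sketch of the denominator in any parameter-efficiency comparison: counting a model's trainable parameters (the toy model is illustrative).

```python
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Flatten(), nn.Linear(16 * 32 * 32, 10))

n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"{n_params:,} trainable parameters")  # 164,298 for this toy model
```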
FLOPs
Number of floating-point operations required per inference, standard metric for comparing the computational complexity of different CNN architectures.
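For a single convolution the standard estimate is output elements times per-element kernel operations; the helper below assumes the common convention of counting one multiply-accumulate as 2 FLOPs.

```python
def conv2d_flops(h_out, w_out, c_out, c_in, k):
    """FLOPs of a standard k x k convolution (ignoring bias)."""
    macs = h_out * w_out * c_out * (c_in * k * k)  # multiply-accumulates
    return 2 * macs                                # 1 MAC = 2 FLOPs

# e.g. a 3x3 convolution, 64 -> 128 channels, on a 56x56 feature map
print(f"{conv2d_flops(56, 56, 128, 64, 3) / 1e9:.2f} GFLOPs")  # 0.46 GFLOPs
```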
Model Size Optimization
Systematic process of reducing the model's memory size through architectural techniques, quantization, and compression to fit mobile storage constraints.
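As a rough sanity check of the memory side, the sketch below compares the serialized size of a toy model before and after dynamic quantization; the file names are illustrative.

```python
import os
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 512), nn.ReLU(), nn.Linear(512, 10))
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear},
                                                dtype=torch.qint8)

# Serialize both versions and compare on-disk size.
for name, m in [("fp32.pt", model), ("int8.pt", quantized)]:
    torch.save(m.state_dict(), name)
    print(f"{name}: {os.path.getsize(name) / 1e6:.2f} MB")
    os.remove(name)
```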
Neural Architecture Search
Automation of optimal neural architecture design for specific constraints (latency, size, power consumption), particularly useful for mobile models.
EfficientNet
Family of CNN architectures that jointly scale network depth, width, and input resolution with a single compound coefficient, achieving strong accuracy at low computational cost.
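The core idea fits in a few lines: one coefficient phi scales depth, width, and resolution together. The alpha/beta/gamma constants below are the ones reported in the EfficientNet paper (chosen so that FLOPs roughly double per unit of phi); the baseline dimensions are illustrative.

```python
alpha, beta, gamma = 1.2, 1.1, 1.15  # depth, width, resolution multipliers

def compound_scale(phi, base_depth, base_width, base_resolution):
    """EfficientNet-style compound scaling of a baseline network."""
    return (round(base_depth * alpha ** phi),
            round(base_width * beta ** phi),
            round(base_resolution * gamma ** phi))

# Scaling an illustrative baseline up with phi = 3
print(compound_scale(3, base_depth=18, base_width=32, base_resolution=224))
# -> (31, 43, 341)
```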
ShuffleNet
Ultra-lightweight architecture using pointwise group convolutions combined with a channel shuffle operation to cut computational cost while still letting information flow between channel groups.
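The shuffle itself is just a reshape-transpose-reshape, sketched below: split the channels into groups, transpose, and flatten so the next grouped convolution sees channels from every group.

```python
import torch

def channel_shuffle(x, groups):
    """Interleave channels across groups (ShuffleNet's shuffle operation)."""
    n, c, h, w = x.shape
    x = x.view(n, groups, c // groups, h, w)  # split channels into groups
    x = x.transpose(1, 2).contiguous()        # interleave the groups
    return x.view(n, c, h, w)

x = torch.arange(8).view(1, 8, 1, 1)
print(channel_shuffle(x, groups=2).flatten().tolist())
# [0, 4, 1, 5, 2, 6, 3, 7] -- channels from the two groups alternate
```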