AI Glossary
Complete Dictionary of Artificial Intelligence
YOLO (You Only Look Once)
Real-time object detection algorithm that processes the entire image in a single forward pass, dividing it into a grid and simultaneously predicting bounding boxes and class probabilities for each cell.
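As a rough illustration of the grid idea, the sketch below maps an object's center point to the grid cell that would be responsible for it. The function name, the 7 x 7 grid, and the 448-pixel image size are illustrative assumptions, not values taken from this glossary.

```python
def grid_cell_for_center(cx, cy, img_w, img_h, grid_size=7):
    """Return (row, col) of the grid cell containing the point (cx, cy).

    grid_size is a hypothetical default chosen for illustration.
    """
    # Scale pixel coordinates to grid units, then clamp so that points
    # on the right or bottom edge still fall inside the last cell.
    col = min(int(cx / img_w * grid_size), grid_size - 1)
    row = min(int(cy / img_h * grid_size), grid_size - 1)
    return row, col


# An object centered in a 448 x 448 image lands in the middle cell.
print(grid_cell_for_center(224, 224, 448, 448))
```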
R-CNN (Region-based CNN)
Pioneering object detection architecture that uses selective search to generate region proposals, runs a CNN on each proposal to extract features, and then classifies each proposed region with SVMs.
Fast R-CNN
Improvement on R-CNN that runs the CNN once per image and shares the resulting feature map across all region proposals, using RoI pooling to extract a fixed-size feature for each region, and that combines classification and bounding-box regression in a single network.
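The RoI pooling step can be sketched in plain Python on a single-channel feature map. This is a simplified illustration under stated assumptions: the region is given in integer feature-map cells as (x1, y1, x2, y2) with exclusive upper bounds, and the names `roi_max_pool` and `ceil_div` are hypothetical helpers, not a library API.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(feature, roi, out_h=2, out_w=2):
    """Max-pool one region of a 2D feature map into an out_h x out_w grid."""
    x1, y1, x2, y2 = roi
    h, w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        # Row range of bin i: floor of the start, ceiling of the end,
        # so every bin covers at least one cell when h >= 1.
        r0 = y1 + (i * h) // out_h
        r1 = y1 + ceil_div((i + 1) * h, out_h)
        row = []
        for j in range(out_w):
            c0 = x1 + (j * w) // out_w
            c1 = x1 + ceil_div((j + 1) * w, out_w)
            row.append(max(feature[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# A 4 x 4 feature map with values 0..15, pooled to 2 x 2 over the whole map.
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]
print(roi_max_pool(feature_map, (0, 0, 4, 4)))
```

Whatever the size of the region, the output has the same fixed shape, which is what allows regions of different sizes to feed the same fully connected layers.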
Faster R-CNN
Successor to Fast R-CNN that integrates a Region Proposal Network (RPN) sharing convolutional features with the detection network, eliminating the need for external selective search.
Anchor Box
Predefined boxes of various scales and aspect ratios, placed at each feature-map location and used as references from which bounding boxes are predicted as offsets, improving object localization accuracy.
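A minimal sketch of how a set of anchor shapes might be generated from a few scales and aspect ratios. The convention used here (ratio = width / height, area = scale squared) and the example values are assumptions chosen for illustration; real detectors differ in these details.

```python
import math


def make_anchors(scales, ratios):
    """Return (width, height) pairs, one per scale and ratio combination.

    Assumed convention: ratio = width / height and area = scale ** 2.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            width = scale * math.sqrt(ratio)
            height = scale / math.sqrt(ratio)
            anchors.append((width, height))
    return anchors


# Three scales and three ratios give nine anchor shapes per location.
for w, h in make_anchors([128, 256, 512], [0.5, 1.0, 2.0]):
    print(f"{w:7.1f} x {h:7.1f}")
```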
Bounding Box
Rectangle defined by coordinates (x, y, width, height) that delimits the position of a detected object in an image, used for precise spatial localization of elements.
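Many tools store boxes as two corners instead of a corner plus a size, so converting between the two forms is a common chore. The sketch below assumes that (x, y) denotes the top-left corner, which this definition does not specify; some formats use the box center instead.

```python
def xywh_to_xyxy(box):
    """Convert (x, y, width, height) to (x1, y1, x2, y2).

    Assumes (x, y) is the top-left corner of the box.
    """
    x, y, w, h = box
    return (x, y, x + w, y + h)


def xyxy_to_xywh(box):
    """Convert (x1, y1, x2, y2) back to (x, y, width, height)."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)


print(xywh_to_xyxy((10, 20, 30, 40)))
```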
Non-Maximum Suppression (NMS)
Post-processing algorithm that eliminates redundant detections: it repeatedly keeps the highest-scoring remaining box and removes every other box whose IoU with it exceeds a defined threshold.
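The greedy procedure can be sketched in a few lines of plain Python. Boxes are assumed to be (x1, y1, x2, y2) corner tuples, and the 0.5 default threshold is an illustrative choice. The small overlap helper is inlined so that the example is self-contained.

```python
def _iou(a, b):
    """Overlap ratio of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept by greedy NMS, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any
        # higher-scoring box that has already been kept.
        if all(_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))
```

In the example, the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the distant third box survives.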
Region Proposal Network (RPN)
Convolutional neural network that directly generates candidate region proposals by using anchor boxes and predicting, for each location, an objectness probability and box adjustments.
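The "box adjustments" are commonly expressed as offsets relative to an anchor. The sketch below applies the parameterization widely used in the R-CNN family (center shifts scaled by the anchor size, log-space size changes). The function name and the (center x, center y, width, height) box layout are assumptions made for this example.

```python
import math


def decode_box(anchor, deltas):
    """Apply predicted offsets (tx, ty, tw, th) to an anchor box.

    Both anchor and result are (center_x, center_y, width, height).
    """
    xa, ya, wa, ha = anchor
    tx, ty, tw, th = deltas
    return (
        xa + tx * wa,        # shift the center by a fraction of the width
        ya + ty * ha,        # shift the center by a fraction of the height
        wa * math.exp(tw),   # scale the width (log-space offset)
        ha * math.exp(th),   # scale the height (log-space offset)
    )


# Zero offsets leave the anchor unchanged.
print(decode_box((50, 50, 20, 10), (0, 0, 0, 0)))
```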
Intersection over Union (IoU)
Evaluation metric measuring the overlap between the predicted box and the ground truth box, calculated as the ratio of the intersection over the union of the two boxes.
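A short worked example of the ratio, assuming boxes given as (x1, y1, x2, y2) corner tuples (a convention picked for this sketch):

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10 x 10 boxes offset by 5 pixels share a 5 x 5 region:
# intersection 25, union 100 + 100 - 25 = 175.
print(round(iou((0, 0, 10, 10), (5, 5, 15, 15)), 4))
```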
Feature Pyramid Network (FPN)
Architecture combining multi-scale features through top-down and lateral connections, improving the detection of objects of different sizes in the same image.
Single Shot Detector (SSD)
Single-stage object detector that dispenses with region proposals, predicting boxes and classes directly from feature maps at several scales for efficient multi-scale detection.
Mask R-CNN
Extension of Faster R-CNN adding a segmentation branch predicting binary masks for each object, simultaneously performing detection, classification, and instance segmentation.
Object Detection
Computer vision task combining localization and classification to identify and delimit multiple objects in an image with bounding boxes and category labels.
mAP (Mean Average Precision)
Standard evaluation metric in object detection: the mean, across all classes, of each class's average precision (the area under its precision-recall curve), with detections matched to ground truth at one or more IoU thresholds.
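One way to compute the metric is sketched below, under several assumptions: detections for each class are already sorted by descending score and already marked as true or false positives at some IoU threshold, each class has at least one ground-truth object, and average precision is taken as the area under the precision envelope. Benchmarks differ in these details, and the function names are hypothetical.

```python
def average_precision(is_tp, num_gt):
    """AP for one class.

    is_tp: list of booleans for detections sorted by descending score.
    num_gt: number of ground-truth objects of this class (assumed > 0).
    """
    tp = fp = 0
    precisions, recalls = [], []
    for hit in is_tp:
        if hit:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area of the rectangles under the envelope.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class maps a class name to an (is_tp, num_gt) pair."""
    aps = [average_precision(hits, n) for hits, n in per_class.values()]
    return sum(aps) / len(aps)


results = {
    "cat": ([True, True], 2),
    "dog": ([True, False, True], 2),
}
print(round(mean_average_precision(results), 4))
```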
Backbone Network
Base CNN (e.g., ResNet, VGG) that extracts hierarchical features from the image and serves as the foundation for the detection heads in modern architectures.
Strided Convolution
Convolution applied with a stride greater than 1, which reduces the spatial dimensions of the feature map and makes the receptive field of subsequent layers grow faster, so that they capture wider context.
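Both effects can be checked with simple arithmetic. The sketch below uses the standard output-size formula for a convolution along one dimension and a layer-by-layer receptive-field recurrence. The function names and the example layer configurations are illustrative.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output length along one dimension for an input of length n."""
    return (n + 2 * padding - kernel) // stride + 1


def receptive_field(layers):
    """Receptive field of one output unit after a stack of conv layers.

    layers: list of (kernel, stride) pairs, first layer first.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


# A 3 x 3 convolution with stride 2 halves a 224-pixel dimension.
print(conv_output_size(224, kernel=3, stride=2, padding=1))
# Two 3 x 3 layers: stride 1 in the first layer versus stride 2.
print(receptive_field([(3, 1), (3, 1)]), receptive_field([(3, 2), (3, 1)]))
```

The receptive field of the strided layer itself is unchanged; it is the layers stacked on top of it that see a wider region of the input.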