AI Glossary
The complete glossary of Artificial Intelligence
Precision@K
Metric measuring the proportion of relevant items among the top K recommendations, essential for evaluating the quality of top-ranked results.
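As a minimal sketch of the definition above, assuming recommendations arrive as a ranked list of item IDs and the relevant items as a set (the function name `precision_at_k` is illustrative, not taken from any library):

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant)
    hits = sum(1 for item in recommended[:k] if item in relevant)
    # Convention assumed here: divide by k even if fewer than k items were returned.
    return hits / k


# Two of the top three items ("a" and "c") are relevant.
score = precision_at_k(["a", "b", "c", "d"], {"a", "c", "x"}, k=3)
```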
Recall@K
Proportion of all available relevant items that actually appear in the top K recommendations.
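A possible implementation, under the same assumed inputs (a ranked list of item IDs and a set of relevant IDs); returning 0.0 when there are no relevant items is a convention chosen for this sketch:

```python
def recall_at_k(recommended, relevant, k):
    """Share of all relevant items that show up in the top-k recommendations."""
    relevant = set(relevant)
    if not relevant:
        return 0.0  # assumed convention: no relevant items means nothing to recall
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


# "a" and "c" are found, "x" is missed: 2 of 3 relevant items retrieved.
score = recall_at_k(["a", "b", "c", "d"], {"a", "c", "x"}, k=3)
```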
Mean Average Precision (MAP)
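Aggregated metric obtained by averaging, over all users or queries, the Average Precision of each recommendation list. Average Precision is the mean of the precision values measured at the rank of each relevant item, so relevant items placed near the top contribute more.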
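The computation can be sketched as follows. This version assumes each list contains no duplicate items and normalizes by the total number of relevant items, which is one common convention among several; the function names are illustrative.

```python
def average_precision(recommended, relevant):
    """Mean of precision values taken at the rank of each relevant item."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(recommended, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumed convention: relevant items never retrieved count as zero.
    return precision_sum / len(relevant)


def mean_average_precision(rec_lists, rel_sets):
    """Average of the per-user (or per-query) Average Precision scores."""
    scores = [average_precision(r, s) for r, s in zip(rec_lists, rel_sets)]
    return sum(scores) / len(scores) if scores else 0.0
```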
NDCG (Normalized Discounted Cumulative Gain)
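Ranking-quality score that discounts the gain of each item logarithmically by its position, then divides by the score of the ideal ordering, so relevant items placed far from the top of the list are penalized. Well suited to recommendations with graded relevance.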
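A short sketch of the calculation, assuming `gains` holds the graded relevance of each recommended item in ranked order and that the ideal ordering is obtained by sorting that same list (a simplification: relevant items missing from the list are ignored). The log base 2 discount used here is the common choice, not the only one.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain: each gain is divided by log2(position + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k):
    """DCG of the actual ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


perfect = ndcg_at_k([3, 2, 1], k=3)   # already in ideal order
reversed_ = ndcg_at_k([1, 2, 3], k=3)  # best item ranked last, lower score
```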
RMSE (Root Mean Square Error)
Square root of the mean squared difference between predicted and actual ratings, used to evaluate rating prediction accuracy. Because errors are squared before averaging, large errors weigh more heavily than small ones.
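In code, the definition reduces to a few lines. The sketch below assumes two equal-length sequences of numeric ratings:

```python
import math


def rmse(predicted, actual):
    """Square root of the mean squared difference between two rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)
    return math.sqrt(mse)


# Errors of 1 and 2 stars: sqrt((1 + 4) / 2)
error = rmse([1.0, 2.0], [2.0, 4.0])
```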
Hit Rate (HR)
Percentage of sessions where at least one relevant item appears in the top N recommendations, measuring the overall effectiveness of the system.
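One way to compute this, assuming one ranked list and one set of relevant items per session (the name `hit_rate` and its arguments are illustrative):

```python
def hit_rate(rec_lists, rel_sets, n):
    """Share of sessions with at least one relevant item in the top n."""
    if not rec_lists:
        return 0.0
    hits = sum(
        1 for recs, rel in zip(rec_lists, rel_sets)
        if set(recs[:n]) & set(rel)
    )
    return hits / len(rec_lists)


# The first session has a hit ("b"), the second does not.
rate = hit_rate([["a", "b"], ["c", "d"]], [{"b"}, {"x"}], n=2)
```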
Catalog Coverage
Percentage of unique catalog items that can be recommended by the system, crucial to avoid concentration on a limited subset of items.
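A minimal sketch, assuming coverage is measured empirically from the lists the system actually produced rather than from what it could produce in principle:

```python
def catalog_coverage(rec_lists, catalog):
    """Fraction of catalog items that appear in at least one recommendation list."""
    catalog = set(catalog)
    if not catalog:
        return 0.0
    recommended = set().union(*rec_lists) & catalog
    return len(recommended) / len(catalog)


# Items a, b and c are recommended at least once; d never is.
coverage = catalog_coverage([["a", "b"], ["b", "c"]], {"a", "b", "c", "d"})
```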
Intra-List Diversity
Measure of average dissimilarity between items in the same recommendation list, essential to avoid redundancy and enhance user experience.
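The choice of dissimilarity function is left open by the definition. The sketch below assumes each item is described by a set of tags (for example genres) and uses Jaccard distance, which is only one possible choice:

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 minus the overlap ratio of two tag sets."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0


def intra_list_diversity(items, distance=jaccard_distance):
    """Average pairwise distance between all items in one recommendation list."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical list of three items described by genre tags.
diversity = intra_list_diversity([{"rock"}, {"rock", "pop"}, {"jazz"}])
```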
Novelty
Degree to which recommended items are unfamiliar to the user, commonly estimated from the inverse of each item's global popularity in the catalog.
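A frequently used variant scores each item by the negative logarithm of its popularity rather than the plain inverse. The sketch below takes that approach and assumes popularity is the fraction of users who interacted with the item; items with no recorded interactions are skipped, which is a simplification.

```python
import math


def novelty(recommended, interaction_counts, num_users):
    """Mean self-information (-log2 popularity) of the recommended items."""
    scores = []
    for item in recommended:
        popularity = interaction_counts.get(item, 0) / num_users
        if popularity > 0:  # assumption: unseen items are left out of the average
            scores.append(-math.log2(popularity))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical counts: "hit" was seen by half the users, "niche" by 1 percent.
counts = {"hit": 50, "niche": 1}
popular_score = novelty(["hit"], counts, num_users=100)
niche_score = novelty(["niche"], counts, num_users=100)  # higher novelty
```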
Serendipity
Ability of the system to recommend relevant but unexpected items that positively surprise the user beyond simple predictions.
A/B Testing
Experimental methodology comparing the performance of two versions of the system on real user segments to measure business impact.
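When the measured outcome is a conversion rate, the comparison is often made with a two-proportion z-test. The following is a sketch under that assumption, using only the standard library; real experiments typically also need a sample size fixed in advance and a correction for repeated looks at the data.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, nothing to distinguish
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Hypothetical counts: variant A converts 100 of 1000 users, variant B 150 of 1000.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
```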
Leave-One-Out Cross-Validation
Evaluation technique in which each user interaction is held out in turn as test data while the remaining interactions are used for training.
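For a single user, the folds can be generated as below. This is a sketch assuming the user's interactions are given as a list of item IDs; in practice many recommender evaluations hold out only the most recent interaction to keep the cost manageable.

```python
def leave_one_out_folds(interactions):
    """Yield (train, test_item) pairs, holding out each interaction in turn."""
    for i in range(len(interactions)):
        train = interactions[:i] + interactions[i + 1:]
        yield train, interactions[i]


folds = list(leave_one_out_folds(["a", "b", "c"]))
# Three folds: each item is the test item once, the other two form the training set.
```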
Offline vs Online Evaluation
Dual approach evaluating performance on historical data (offline) and with real interactions (online) to validate the complete effectiveness of the system.
Temporal Generalization
Ability of the system to maintain its performance on future data, evaluated sequentially on temporal splits rather than random ones.
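A temporal split can be sketched as follows, assuming each interaction is a `(user, item, timestamp)` tuple and that a single cutoff separates past from future; a sequential evaluation would repeat this with several increasing cutoffs.

```python
def temporal_split(interactions, cutoff):
    """Train on interactions before the cutoff, test on those at or after it."""
    train = [row for row in interactions if row[2] < cutoff]
    test = [row for row in interactions if row[2] >= cutoff]
    return train, test


# Hypothetical log with integer timestamps.
log = [("u1", "a", 1), ("u2", "b", 5), ("u1", "c", 9)]
train, test = temporal_split(log, cutoff=5)
```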
Business Metrics Correlation
Analysis of the relationship between algorithmic metrics (NDCG, Precision) and business indicators (conversion, retention) to validate business relevance.
Cataract Metric
Composite score balancing precision, diversity, novelty, and coverage to holistically evaluate the overall quality of recommendations.
Expected Reciprocal Rank (ERR)
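Metric based on a cascade model of user behavior: the user scans the list from the top and stops at the first item that satisfies them. ERR is the expected reciprocal rank of that stopping position, which heavily weights the first positions.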
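A sketch of the usual formulation, assuming integer relevance grades from 0 to `max_grade` and the common mapping from grade to satisfaction probability, (2^grade - 1) / 2^max_grade:

```python
def expected_reciprocal_rank(grades, max_grade):
    """Expected value of 1/rank at which the user stops, under a cascade model."""
    err = 0.0
    p_reach = 1.0  # probability the user reaches the current position
    for rank, grade in enumerate(grades, start=1):
        p_satisfied = (2 ** grade - 1) / 2 ** max_grade
        err += p_reach * p_satisfied / rank
        p_reach *= 1 - p_satisfied
    return err


top_first = expected_reciprocal_rank([3, 0], max_grade=3)
top_last = expected_reciprocal_rank([0, 3], max_grade=3)  # lower: best item is second
```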
User Coverage
Percentage of users for whom the system can generate recommendations, critical for measuring the universal applicability of the system.
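As an illustration, assuming recommendations are stored in a dictionary keyed by user ID and that a user counts as covered when at least `min_items` recommendations exist (both the structure and the threshold are assumptions of this sketch):

```python
def user_coverage(recommendations, all_users, min_items=1):
    """Fraction of users who receive at least min_items recommendations."""
    all_users = set(all_users)
    if not all_users:
        return 0.0
    covered = sum(
        1 for user in all_users
        if len(recommendations.get(user, [])) >= min_items
    )
    return covered / len(all_users)


# u1 is covered; u2 has an empty list and u3 has no entry at all.
coverage = user_coverage({"u1": ["a"], "u2": []}, ["u1", "u2", "u3"])
```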
Fairness Metrics
Indicators evaluating the equity of recommendation distribution among different demographic groups to avoid algorithmic biases.
Exposure Bias Measurement
Quantification of the exposure disparity between popular and long-tail items, essential for evaluating recommendation balance.
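One simple way to quantify this disparity is the share of recommendation slots that go to long-tail items. The sketch below assumes the set of popular "head" items has already been chosen (for example the top 20 percent by interaction count, a threshold picked by the analyst); more elaborate measures such as a Gini coefficient over item exposure are also used.

```python
def long_tail_exposure_share(rec_lists, head_items):
    """Fraction of all recommendation slots filled by non-head (long-tail) items."""
    head = set(head_items)
    total_slots = sum(len(recs) for recs in rec_lists)
    if total_slots == 0:
        return 0.0
    tail_slots = sum(1 for recs in rec_lists for item in recs if item not in head)
    return tail_slots / total_slots


# "a" is the only head item; it takes 2 of the 4 slots.
share = long_tail_exposure_share([["a", "b"], ["a", "c"]], head_items={"a"})
```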