AI Glossary
The complete dictionary of Artificial Intelligence
Collaborative filtering
A recommendation approach that generates predictions by collecting preferences from many users, based on the principle that users with similar tastes in the past will have similar tastes in the future.
User-item matrix
A fundamental data structure where rows represent users, columns represent items, and cells contain ratings or interactions, typically very sparse as each user interacts with only a few items.
Cosine similarity
A similarity metric that calculates the cosine of the angle between two rating vectors in a multidimensional space, ranging from -1 (total opposition) to 1 (perfect similarity), widely used to compare user or item profiles.
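As a minimal sketch of the definition above, cosine similarity between two rating vectors can be computed with nothing but the standard library (the function name is illustrative):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two rating vectors:
    # dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero profile has no defined direction
    return dot / (norm_a * norm_b)
```

Parallel vectors score 1, orthogonal vectors 0, and opposite vectors -1; note that with nonnegative ratings the result never drops below 0.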
Pearson correlation
A coefficient measuring the linear correlation between two rating vectors, centered on the mean and normalized, particularly effective at capturing relative rating trends between users or items.
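A sketch of the Pearson coefficient over two vectors of co-rated items, assuming equal-length inputs (function name illustrative): center each vector on its own mean, then take the cosine of the centered vectors.

```python
import math

def pearson(a, b):
    # Pearson correlation: mean-center each vector, then normalize.
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da)) * math.sqrt(sum(y * y for y in db))
    return num / den if den else 0.0
```

The mean-centering is what lets it capture relative trends: a user who rates everything one star higher than another still correlates perfectly.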
k-nearest neighbors (k-NN)
An algorithm identifying the k users or items most similar to a given target to form a neighborhood used in prediction calculation, where k is a crucial parameter controlling the granularity of recommendations.
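A minimal neighborhood-selection sketch, generic over the similarity function (all names here are illustrative): score every candidate against the target, sort, and keep the top k.

```python
def k_nearest(target, candidates, similarity, k):
    # Rank candidates by similarity to the target and keep the k best.
    scored = [(other, similarity(target, other)) for other in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In practice the similarity would be cosine or Pearson over co-rated items; a small k gives sharp, local recommendations, a large k smoother but blander ones.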
User-based collaborative filtering
An approach based on similarity between users, recommending items that similar users have liked, requiring the calculation of user-user similarities and efficient management of evolving profiles.
Item-based collaborative filtering
A method based on similarity between items, recommending items similar to those already liked by the user, generally more stable and scalable than the user-based approach because items evolve less frequently.
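The item-based prediction step can be sketched as a similarity-weighted average of the active user's own ratings; the data shapes here are assumptions for illustration, not a fixed interface:

```python
def predict_item_based(user_ratings, item_sims, target):
    # user_ratings: {item: rating} for the active user.
    # item_sims: {(target_item, rated_item): similarity}, precomputed offline.
    num = den = 0.0
    for item, rating in user_ratings.items():
        s = item_sims.get((target, item), 0.0)
        if s > 0:  # ignore dissimilar items
            num += s * rating
            den += s
    return num / den if den else None  # None: no similar item was rated
```

Precomputing the item-item similarities offline is exactly what makes this approach more scalable than its user-based counterpart.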
Cold start problem
A major challenge where the system cannot generate reliable recommendations for new users or items due to a lack of historical data, requiring initialization strategies and active information collection.
Sparse matrix
An inherent characteristic of recommendation systems whereby the vast majority of cells in the user-item matrix are empty, posing computational challenges and requiring optimized data structures and imputation techniques.
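One simple way to avoid storing the empty cells is a dict-of-dicts layout that keeps only observed ratings (toy data, hypothetical helper names):

```python
# Sparse user-item matrix: only observed ratings are stored.
ratings = {
    "alice": {"matrix": 5, "inception": 4},
    "bob": {"inception": 3, "dune": 5},
}

def get_rating(ratings, user, item):
    # A missing cell means "not rated", which is not the same as a 0 rating.
    return ratings.get(user, {}).get(item)

def density(ratings, n_users, n_items):
    # Fraction of the full matrix that is actually filled.
    stored = sum(len(row) for row in ratings.values())
    return stored / (n_users * n_items)
```

Production systems typically use compressed sparse formats for the same reason; the key point is that absence must stay distinguishable from a zero.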
Explicit vs implicit rating
The distinction between direct ratings voluntarily provided by users (explicit) and preferences inferred from observed behaviors (implicit), each with its own biases and specific processing methods.
Popularity bias
A systematic tendency of collaborative filtering to over-recommend popular items at the expense of diversity, potentially creating reinforcement loops and limiting the discovery of relevant niche items.
Rating prediction
The process of estimating the rating a user would give to an unrated item, typically calculated as a weighted combination of ratings from neighboring users or similar items with bias adjustments.
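The "weighted combination with bias adjustments" can be sketched as a mean-centered weighted average over a user's neighbors (the tuple layout is an assumption for illustration):

```python
def predict(user_mean, neighbors):
    # neighbors: list of (similarity, neighbor_rating, neighbor_mean).
    # Mean-centering removes each neighbor's personal rating bias before
    # combining; the result is re-anchored on the active user's own mean.
    num = sum(s * (r - m) for s, r, m in neighbors)
    den = sum(abs(s) for s, _, _ in neighbors)
    return user_mean + num / den if den else user_mean
```

With no usable neighbors, the sketch falls back to the user's mean, which is a common and deliberately conservative default.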
Top-N recommendations
A recommendation approach that generates a ranked list of the N most relevant items for a user, requiring different algorithmic optimizations than exact rating prediction and specific ranking-oriented evaluation metrics.
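A minimal top-N sketch, assuming predicted scores are already available and items the user has seen should be filtered out (names illustrative):

```python
def top_n(scores, n, seen):
    # scores: {item: predicted score}; seen: items to exclude.
    candidates = [(item, s) for item, s in scores.items() if item not in seen]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in candidates[:n]]
```

Because only the ranking matters here, the absolute accuracy of each score is less important than getting the order right, which is why top-N systems are evaluated with ranking metrics rather than RMSE.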
Z-score normalization
A technique for standardizing user ratings by subtracting the mean and dividing by the standard deviation, allowing comparison of users with different rating scales and reducing the impact of individual biases.
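A sketch of the standardization step for one user's ratings (population standard deviation; the constant-rating edge case is handled explicitly):

```python
import math

def z_scores(ratings):
    # Standardize one user's ratings: subtract the mean, divide by std dev.
    mean = sum(ratings) / len(ratings)
    var = sum((r - mean) ** 2 for r in ratings) / len(ratings)
    std = math.sqrt(var)
    if std == 0:
        return [0.0] * len(ratings)  # user rates everything the same
    return [(r - mean) / std for r in ratings]
```

After this transform, a harsh rater's "4" and a generous rater's "5" can land on the same standardized value, which is the point of the technique.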
Confidence score
A metric quantifying the reliability of a calculated similarity or prediction, typically based on the number of common ratings and their variance, used to weight contributions in recommendation calculations.
Temporal weighting
A method that assigns more weight to recent ratings than to older ones, reflecting the evolution of user preferences and ensuring that recommendations remain relevant in the face of changing tastes.
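A common way to implement this is exponential decay on the rating's age; the half-life below is an illustrative parameter, not a standard value:

```python
def time_weight(age_days, half_life=90):
    # Exponential decay: a rating `half_life` days old counts half as much
    # as one made today. `half_life=90` is an assumed example value.
    return 0.5 ** (age_days / half_life)
```

The weight multiplies the rating's contribution in the usual weighted averages, so stale preferences fade out smoothly instead of being cut off.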
Prediction aggregation
The process of combining multiple predictions from different neighbors or methods into a final recommendation, using techniques like weighted averaging, the median, or meta-learning to optimize accuracy.
RMSE (Root Mean Square Error)
A standard evaluation metric measuring the square root of the mean of squared errors between predictions and actual ratings, heavily penalizing large errors, which makes it sensitive to outliers.
MAE (Mean Absolute Error)
A metric calculating the average of the absolute prediction errors, offering a more intuitive interpretation than RMSE and less sensitivity to outliers when evaluating recommendation systems.
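Both metrics are one-liners over paired predictions and ground-truth ratings, sketched here side by side to make the contrast concrete:

```python
import math

def rmse(predictions, actuals):
    # Root mean square error: squaring amplifies large errors.
    errors = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return math.sqrt(sum(errors) / len(errors))

def mae(predictions, actuals):
    # Mean absolute error: every unit of error counts the same.
    errors = [abs(p - a) for p, a in zip(predictions, actuals)]
    return sum(errors) / len(errors)
```

On any dataset RMSE is at least as large as MAE, and the gap between them grows with the variance of the errors, which is a quick diagnostic for outlier-heavy predictions.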