
AI Glossary

The complete AI glossary

162 categories · 2,032 subcategories · 23,060 terms

LinUCB

Contextual bandit algorithm assuming a linear relationship between context and expected reward. Uses an upper confidence bound to optimally balance exploration and exploitation.
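A minimal NumPy sketch of the disjoint LinUCB arm described above, assuming a ridge-regularized linear reward model (class and parameter names are illustrative):

```python
import numpy as np

class LinUCBArm:
    """One arm of disjoint LinUCB: maintains a ridge-regression
    estimate of reward as a linear function of the context."""
    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha            # exploration strength (assumed default)
        self.A = np.eye(dim)          # regularized Gram matrix
        self.b = np.zeros(dim)        # accumulated reward-weighted contexts

    def ucb(self, x):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b                       # point estimate of weights
        bonus = self.alpha * np.sqrt(x @ A_inv @ x)  # confidence width
        return theta @ x + bonus

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x

def select(arms, x):
    """Choose the arm with the highest upper confidence bound."""
    return max(range(len(arms)), key=lambda i: arms[i].ucb(x))
```

The `alpha` parameter controls the width of the confidence bound; larger values explore more aggressively.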


Contextual Thompson Sampling

Bayesian approach for contextual bandits that samples parameters from their posterior distribution and selects the arm maximizing expected reward under the sampled parameters, yielding natural exploration.
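The sampling step can be sketched with a Gaussian posterior over a linear weight vector (a common instantiation; names and the `v` scale are assumptions):

```python
import numpy as np

class LinTSArm:
    """Linear Thompson sampling arm: maintains a Gaussian posterior
    over the weight vector and draws one sample per decision."""
    def __init__(self, dim, v=1.0):
        self.v2 = v * v              # posterior variance scale (assumption)
        self.B = np.eye(dim)         # precision-like matrix
        self.f = np.zeros(dim)       # accumulated reward-weighted contexts

    def sample_reward(self, x, rng):
        B_inv = np.linalg.inv(self.B)
        mu = B_inv @ self.f                                   # posterior mean
        theta = rng.multivariate_normal(mu, self.v2 * B_inv)  # posterior sample
        return theta @ x

    def update(self, x, reward):
        self.B += np.outer(x, x)
        self.f += reward * x
```

At each round the arm with the highest sampled reward is played; arms with little data have wide posteriors and so are still sampled occasionally.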


Context Vector

Vector representation of observable environmental characteristics at a given time. Serves as the input from which contextual bandit models predict conditional rewards.


Contextual Regret Rate

Performance measure quantifying the cumulative difference between the reward obtained and that of the best fixed policy in hindsight. Used to evaluate the effectiveness of contextual bandit algorithms.
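A hypothetical measurement of this quantity on synthetic data, comparing a uniformly random policy to the best fixed arm in hindsight (all names, parameters, and reward models are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
T, dim = 500, 3
contexts = rng.normal(size=(T, dim))
true_theta = np.array([[1.0, 0.0, 0.0],   # arm 0 weight vector (assumed)
                       [0.0, 0.5, 0.0]])  # arm 1 weight vector (assumed)

rewards = contexts @ true_theta.T          # expected reward per (round, arm)
chosen = rng.integers(0, 2, size=T)        # the policy being evaluated
obtained = rewards[np.arange(T), chosen]

best_fixed = rewards.sum(axis=0).argmax()  # best single arm in hindsight
regret = rewards[:, best_fixed].sum() - obtained.sum()
print(f"cumulative regret vs best fixed arm: {regret:.1f}")
```

A good contextual algorithm keeps this cumulative gap growing sublinearly in the number of rounds.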


Kernel Bandits

Extension of contextual bandits using kernel methods to capture non-linear relationships between context and reward. Enables flexible modeling without strict linearity assumptions.
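One way to sketch the non-linear reward estimate is kernel ridge regression with an RBF kernel (a simplified reward model, not a full kernel-UCB algorithm; all names are illustrative):

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """RBF (Gaussian) kernel between two context vectors."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

class KernelRidgeReward:
    """Kernelized reward model: kernel ridge regression replaces the
    linear estimate, capturing non-linear context-reward relations."""
    def __init__(self, gamma=1.0, lam=1.0):
        self.gamma, self.lam = gamma, lam
        self.X, self.y = [], []

    def update(self, x, r):
        self.X.append(np.asarray(x))
        self.y.append(r)

    def predict(self, x):
        if not self.X:
            return 0.0
        # Gram matrix over observed contexts, plus ridge regularization
        K = np.array([[rbf(a, b, self.gamma) for b in self.X] for a in self.X])
        k = np.array([rbf(np.asarray(x), a, self.gamma) for a in self.X])
        alpha = np.linalg.solve(K + self.lam * np.eye(len(self.X)),
                                np.array(self.y))
        return float(k @ alpha)
```

A confidence bonus derived from the kernel matrix can be added on top of `predict` to recover a UCB-style selection rule.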


Matrix Factorization for Bandits

Technique combining contextual bandits and matrix factorization to handle high-dimensional action or context spaces. Efficiently shares information between different contextual configurations.


Hierarchical Bandits

Structure of contextual bandits organized into multiple levels where high-level decisions influence choices available at lower levels. Enables structured and efficient decision-making.


Contextual Exploration

Adaptive exploration strategy taking into account contextual information to optimize data collection. Reduces regret by focusing on the most promising contextual regions.


Bandits with Delayed Feedback

Variant of contextual bandits where the reward is only observed after a significant delay. Requires adapted algorithms to handle temporal uncertainty and maintain efficient learning.


Non-Stationary Bandits

Contextual bandit problem where the reward distribution evolves over time. Requires algorithms capable of adapting to changes to maintain optimal performance.
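A common adaptation is a sliding window that discards old observations so estimates track a drifting distribution; a minimal sketch (window size and names are assumptions):

```python
from collections import deque
import numpy as np

class SlidingWindowArm:
    """Sliding-window reward estimate: only the last `window` rewards
    count, so the arm adapts to a non-stationary distribution."""
    def __init__(self, window=100):
        self.rewards = deque(maxlen=window)  # old rewards evicted automatically

    def update(self, r):
        self.rewards.append(r)

    def ucb(self, t):
        n = len(self.rewards)
        if n == 0:
            return float("inf")  # force initial exploration
        mean = sum(self.rewards) / n
        return mean + np.sqrt(2 * np.log(t + 1) / n)  # UCB1-style bonus
```

Discounted estimates (exponentially down-weighting old rewards) are a common alternative to a hard window.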


Adversarial Bandits

Framework where rewards are generated by an adversary rather than following a fixed stochastic distribution. Requires robust strategies guaranteeing worst-case regret bounds.
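EXP3 is the classic algorithm for this setting; a compact sketch using exponential weights and importance-weighted reward estimates (parameter values are illustrative):

```python
import numpy as np

class Exp3:
    """EXP3: exponential weights with importance-weighted reward
    estimates, giving worst-case regret guarantees against an adversary."""
    def __init__(self, n_arms, gamma=0.1, rng=None):
        self.gamma = gamma                       # exploration rate (assumed)
        self.w = np.ones(n_arms)                 # exponential weights
        self.rng = rng or np.random.default_rng()

    def probs(self):
        p = self.w / self.w.sum()
        # mix in uniform exploration so every arm keeps positive probability
        return (1 - self.gamma) * p + self.gamma / len(self.w)

    def select(self):
        return int(self.rng.choice(len(self.w), p=self.probs()))

    def update(self, arm, reward):
        # importance weighting keeps the reward estimate unbiased
        x_hat = reward / self.probs()[arm]
        self.w[arm] *= np.exp(self.gamma * x_hat / len(self.w))
```

Because only the played arm's reward is observed, the importance-weighted estimate `x_hat` is what allows the regret bound to hold for any reward sequence.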


Bandits with Constraints

Extension of contextual bandits incorporating constraints on resources or costs. Optimizes rewards while respecting limitations imposed by the environment.


Policy Learning

Approach where the algorithm directly learns a policy function mapping contexts to optimal actions. Avoids explicit value estimation for more direct decision-making.


Combinatorial Bandits

Generalization allowing simultaneous selection of multiple arms with combinatorial constraints. Applied to online advertising, set recommendation, and portfolio optimization.


Meta-Learning for Bandits

Approach transferring knowledge acquired across multiple bandit tasks to accelerate learning on new tasks. Particularly useful in contexts with limited initial data.
