AI Glossary

The Complete Dictionary of Artificial Intelligence

162 categories, 2,032 subcategories, 23,060 terms

Binary Feedback

Type of feedback where only a positive/negative indication is observed after each action, without information on the reward magnitude. This format limits the information available to the learning algorithm.
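
A minimal sketch of what is lost under binary feedback. The threshold rule and the function names are assumptions made for illustration, not part of the definition above.

```python
def binary_feedback(reward: float, threshold: float = 0.5) -> int:
    """Collapse a real-valued reward into a 0/1 signal (hypothetical threshold rule)."""
    return 1 if reward >= threshold else 0


def estimate_success_rate(rewards, threshold=0.5):
    """The learner sees only the 0/1 signals, so a success rate is all it can estimate."""
    signals = [binary_feedback(r, threshold) for r in rewards]
    return sum(signals) / len(signals)


# Two reward streams with very different magnitudes produce identical feedback.
small_wins = [0.6, 0.6, 0.4, 0.6]
large_wins = [9.0, 7.5, 0.1, 8.0]
print(estimate_success_rate(small_wins))  # 0.75
print(estimate_success_rate(large_wins))  # 0.75
```

Both streams look the same to the learner, which is the information limit the definition refers to.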

Pairwise Feedback

Comparative information between two actions where only the winner is revealed, masking the absolute reward values. Used in recommendation systems where only relative preference is observable.
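
The idea can be sketched as a duel that returns only the winner's index. The Gaussian noise model, the reward values, and the win-count ranking are illustrative assumptions.

```python
import random


def duel(true_rewards, i, j, rng):
    """Reveal only which of two actions won; the rewards themselves stay hidden.

    Assumed noise model: each reward is perturbed by Gaussian noise before comparison.
    """
    ri = true_rewards[i] + rng.gauss(0.0, 0.1)
    rj = true_rewards[j] + rng.gauss(0.0, 0.1)
    return i if ri >= rj else j


def win_counts(true_rewards, rounds, seed=0):
    """Compare random pairs of actions and count wins per action."""
    rng = random.Random(seed)
    k = len(true_rewards)
    wins = [0] * k
    for _ in range(rounds):
        i, j = rng.sample(range(k), 2)  # two distinct actions
        wins[duel(true_rewards, i, j, rng)] += 1
    return wins


wins = win_counts([0.2, 0.5, 0.9], rounds=3000)
best = max(range(len(wins)), key=wins.__getitem__)
print(wins, "-> preferred action:", best)
```

Win counts recover the ordering of the actions but say nothing about how far apart their rewards are.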

Noisy Feedback

Reward observations contaminated by random noise that degrades the quality of collected information. The noise may come from imperfect measurements or unpredictable user behaviors.
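
A small sketch of the usual remedy, averaging repeated observations. Zero-mean Gaussian noise and the specific numbers are assumptions for illustration.

```python
import random
import statistics


def noisy_observations(true_reward, noise_std, n, seed=0):
    """Simulate n observations of one reward under zero-mean Gaussian noise (assumed)."""
    rng = random.Random(seed)
    return [true_reward + rng.gauss(0.0, noise_std) for _ in range(n)]


obs = noisy_observations(true_reward=1.0, noise_std=0.5, n=10_000)
estimate = statistics.fmean(obs)
# The standard error shrinks as 1 / sqrt(n), so more samples give a tighter estimate.
std_error = statistics.stdev(obs) / len(obs) ** 0.5
print(f"estimate={estimate:.3f} +/- {std_error:.3f}")
```

Averaging only helps when the noise has zero mean; a systematic bias would survive it.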

Censored Feedback

Situation where observed rewards are capped at a certain maximum value, so that any true value beyond this threshold is recorded as the threshold itself. Common in systems with technical or business constraints.
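
A minimal sketch of the bias this introduces, assuming a hypothetical cap of 5.0 and made-up reward values.

```python
def censor(reward, cap):
    """Values above the cap are reported as the cap itself."""
    return min(reward, cap)


true_rewards = [2.0, 4.0, 6.0, 8.0, 10.0]
cap = 5.0
observed = [censor(r, cap) for r in true_rewards]

true_mean = sum(true_rewards) / len(true_rewards)
naive_mean = sum(observed) / len(observed)
censored_fraction = sum(o >= cap for o in observed) / len(observed)

print(observed)           # [2.0, 4.0, 5.0, 5.0, 5.0]
print(true_mean)          # 6.0
print(naive_mean)         # 4.2
print(censored_fraction)  # 0.6
```

The naive mean of the observed values underestimates the true mean, and the fraction of observations sitting at the cap indicates how severe the problem is.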

Truncated Feedback

Partial information where only the rank or relative position of rewards is available, without their absolute values; this setting is more commonly called ordinal or rank-based feedback. Used particularly in ranking systems.

Exploration-Exploitation with Partial Feedback

Fundamental dilemma where the algorithm must balance discovering new actions against exploiting the best known ones while working from incomplete information. Requires strategies that stay robust under the added uncertainty.
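
One simple way to strike this balance is epsilon-greedy, sketched below with binary feedback as the only signal. The choice of epsilon-greedy, the success probabilities, and all names are assumptions for illustration; other strategies such as UCB or Thompson sampling address the same dilemma.

```python
import random


def epsilon_greedy(success_probs, rounds, epsilon=0.1, seed=0):
    """Epsilon-greedy on arms that return only 0/1 feedback.

    success_probs are hidden from the learner and used only to simulate feedback.
    """
    rng = random.Random(seed)
    k = len(success_probs)
    pulls = [0] * k
    successes = [0] * k
    for _ in range(rounds):
        if rng.random() < epsilon or 0 in pulls:
            arm = rng.randrange(k)  # explore: pick any arm uniformly at random
        else:
            # exploit: pick the arm with the best observed success rate
            arm = max(range(k), key=lambda a: successes[a] / pulls[a])
        feedback = 1 if rng.random() < success_probs[arm] else 0
        pulls[arm] += 1
        successes[arm] += feedback
    return pulls


pulls = epsilon_greedy([0.2, 0.5, 0.8], rounds=5000)
print(pulls)  # most pulls should go to the last arm
```

A larger epsilon discovers the best arm sooner but keeps wasting pulls on poor arms afterwards.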

Contextual Bandits with Limited Feedback

Extension of bandits where the learner observes a context before each action and rewards depend on that context, but only partial information on rewards is revealed. Requires sophisticated estimation methods to manage contextual uncertainty.

Reward Distribution Estimation

Process of inferring the underlying reward distribution from partial or noisy observations. Fundamental for making optimal decisions under limited feedback.
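
For binary feedback, one standard way to carry out this inference is a Beta-Bernoulli posterior update. The sketch below assumes a uniform Beta(1, 1) prior and a made-up feedback sequence.

```python
def beta_posterior(feedback, prior_alpha=1.0, prior_beta=1.0):
    """Update a Beta prior on the success probability from 0/1 feedback."""
    successes = sum(feedback)
    failures = len(feedback) - successes
    return prior_alpha + successes, prior_beta + failures


def beta_mean_and_variance(alpha, beta):
    """Mean and variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    return mean, variance


alpha, beta = beta_posterior([1, 0, 1, 1, 0, 1, 1, 1])  # 6 successes, 2 failures
mean, variance = beta_mean_and_variance(alpha, beta)
print(alpha, beta)  # 7.0 3.0
print(mean)         # 0.7
```

The variance shrinks as feedback accumulates, which quantifies how much uncertainty about the reward distribution remains.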

Combinatorial Bandits with Partial Feedback

Problem where multiple actions are selected simultaneously but only aggregated information about their performance is available. Requires algorithms adapted to combinatorial complexity.

Linear Bandit with Noise

Model where the expected reward of an action is a linear combination of its features, but each reward is observed with additive noise. Requires estimation techniques that are robust to these perturbations.
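
Written out, the model and one common estimator look as follows. The symbols (feature vector x_t, unknown parameter theta*, noise eta_t, regularization lambda) are notation assumed here for illustration, and ridge regression is one standard choice of estimator rather than the only one.

```latex
% Observation model: linear in the features, plus zero-mean noise
r_t = x_t^{\top} \theta^{*} + \eta_t

% Regularized least-squares (ridge) estimate after t rounds
\hat{\theta}_t = \Bigl( \lambda I + \sum_{s=1}^{t} x_s x_s^{\top} \Bigr)^{-1} \sum_{s=1}^{t} x_s r_s
```

The regularization term keeps the matrix invertible during early rounds, when few feature directions have been observed.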

Adversarial Bandit with Limited Feedback

Setting where an adversary can manipulate rewards but the learner observes only partial information about these manipulations, typically just the reward of the action it chose. Demands robust adaptive strategies.
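
A compact sketch of EXP3, a classic algorithm for this setting with rewards in [0, 1]. The reward sequence, the value of gamma, and the weight rescaling step (added here only for numerical safety) are illustrative assumptions.

```python
import math
import random


def exp3(reward_fn, k, rounds, gamma=0.1, seed=0):
    """EXP3: exponential weights with importance-weighted reward estimates."""
    rng = random.Random(seed)
    weights = [1.0] * k
    pulls = [0] * k
    for t in range(rounds):
        total = sum(weights)
        # Mix the weight distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / k for w in weights]
        arm = rng.choices(range(k), weights=probs)[0]
        reward = reward_fn(t, arm)         # only the chosen arm's reward is revealed
        estimate = reward / probs[arm]     # unbiased estimate of that arm's reward
        weights[arm] *= math.exp(gamma * estimate / k)
        top = max(weights)                 # rescaling leaves probs unchanged
        weights = [w / top for w in weights]
        pulls[arm] += 1
    return pulls


def rewards(t, arm):
    """Hypothetical reward sequence chosen by the environment, not drawn from a fixed distribution."""
    if arm == 2:
        return 0.0 if t % 3 == 0 else 0.9
    return 0.3


pulls = exp3(rewards, k=3, rounds=5000)
print(pulls)
```

Dividing by the selection probability compensates for never seeing the rewards of the arms that were not chosen.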

Aggregated Feedback

Cumulative information on the performance of a set of actions rather than on each individual action. Typical of systems with measurement or cost constraints.
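
A toy sketch of recovering individual contributions from aggregates by differencing. It assumes noise-free, deterministic rewards and the freedom to query any subset, both simplifications made for illustration.

```python
def aggregate(rewards, subset):
    """Only the sum over the chosen subset is revealed, never individual rewards."""
    return sum(rewards[i] for i in subset)


def leave_one_out_estimates(rewards):
    """Infer each action's reward as (total of all) minus (total without that action)."""
    k = len(rewards)
    everything = aggregate(rewards, range(k))
    estimates = []
    for i in range(k):
        without_i = aggregate(rewards, [j for j in range(k) if j != i])
        estimates.append(everything - without_i)
    return estimates


hidden = [1.0, 3.0, 0.5]  # unknown to the learner, used only to simulate aggregates
print(leave_one_out_estimates(hidden))  # [1.0, 3.0, 0.5]
```

With noisy aggregates the same idea generally needs repeated queries and a regression step instead of exact differences.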

Delayed Feedback

Situation where the reward of an action is only observed after a significant delay, creating temporal uncertainty. Complicates the attribution of rewards to appropriate actions.
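
One way to keep attribution correct is to tag each pending reward with the action that produced it. The sketch below assumes a fixed, known delay and hypothetical action names.

```python
from collections import deque


def run_with_delay(actions, rewards, delay):
    """Credit each reward to its action `delay` steps after the action was taken."""
    pending = deque()  # entries are (arrival_step, action, reward)
    totals = {}
    history = []       # snapshot of what the learner knows after each step
    for step, (action, reward) in enumerate(zip(actions, rewards)):
        pending.append((step + delay, action, reward))
        # Deliver every reward whose arrival step has been reached.
        while pending and pending[0][0] <= step:
            _, past_action, past_reward = pending.popleft()
            totals[past_action] = totals.get(past_action, 0.0) + past_reward
        history.append(dict(totals))
    return history, list(pending)


history, still_pending = run_with_delay(
    actions=["a", "b", "a", "b"], rewards=[1.0, 0.0, 1.0, 1.0], delay=2
)
print(history)        # [{}, {}, {'a': 1.0}, {'a': 1.0, 'b': 0.0}]
print(still_pending)  # [(4, 'a', 1.0), (5, 'b', 1.0)]
```

For the first two steps the learner has no information at all, and two rewards are still in flight when the run ends.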

Regret Bound with Partial Feedback

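Theoretical guarantee bounding the cumulative regret of an algorithm, that is, the gap between the reward it collects and the reward of the best action in hindsight, under limited-information constraints. Provides guarantees on algorithm efficiency despite incomplete feedback.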
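
In symbols, for a stochastic bandit run over T rounds, with notation assumed here for illustration (mu_a is the mean reward of action a, a_t the action chosen at round t, K the number of actions):

```latex
% Cumulative (pseudo-)regret against the best action
R_T = T \mu^{*} - \mathbb{E}\Bigl[ \sum_{t=1}^{T} \mu_{a_t} \Bigr],
\qquad \mu^{*} = \max_{a} \mu_a

% Example of a known guarantee when only the chosen action's reward is seen (EXP3)
R_T = O\bigl( \sqrt{T K \ln K} \bigr)
```

A bound that grows more slowly than T means the average per-round loss relative to the best action vanishes as T increases.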
