AI Glossary

The Complete Dictionary of Artificial Intelligence

162 categories
2,032 subcategories
23,060 terms

Bandit Algorithm

Family of online learning algorithms in which an agent sequentially selects actions with uncertain rewards, observing only the reward of the action it chose, in order to maximize cumulative gain.
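
As an illustration, here is a minimal sketch of one well-known bandit algorithm, UCB1, which adds an optimism bonus to each arm's average reward. The function name `ucb1` and the `pull` callback are hypothetical names chosen for this example, and rewards are assumed to lie in [0, 1].

```python
import math


def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds; `pull(arm)` returns a reward in [0, 1]."""
    counts = [0] * n_arms   # number of times each arm was played
    sums = [0.0] * n_arms   # total reward collected from each arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1     # play every arm once to initialize its estimate
        else:
            # Pick the arm with the largest optimistic index:
            # empirical mean plus a confidence bonus that shrinks with plays.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums
```

Only the reward of the chosen arm is ever observed, which is what distinguishes this setting from full-information online learning.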

Follow the Leader (FTL)

Online optimization strategy where the algorithm chooses at each step the action that would have been optimal on the observed past data up to that point.
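
The rule can be sketched for a finite set of actions as follows. The function name `follow_the_leader` and the input layout (one list of per-action losses per round) are assumptions made for this example, with ties broken in favor of the lowest action index.

```python
def follow_the_leader(loss_rounds):
    """Play, each round, the action with the smallest cumulative past loss."""
    n_actions = len(loss_rounds[0])
    cumulative = [0.0] * n_actions
    choices, total_loss = [], 0.0
    for losses in loss_rounds:
        # The "leader" is the best action in hindsight on the data seen so far.
        leader = min(range(n_actions), key=lambda a: cumulative[a])
        choices.append(leader)
        total_loss += losses[leader]
        for a in range(n_actions):
            cumulative[a] += losses[a]
    # Also return the loss of the best fixed action in hindsight.
    return choices, total_loss, min(cumulative)


# An alternating loss sequence on which the leader changes every round.
rounds = [[0.5, 0.0]] + [[0.0, 1.0], [1.0, 0.0]] * 5
choices, ftl_loss, best_loss = follow_the_leader(rounds)
print(ftl_loss, best_loss)  # 10.5 5.0
```

On this sequence FTL switches actions every round and pays almost the maximum loss each time, which illustrates why the unregularized rule can be unstable against adversarial data.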

Follow the Regularized Leader (FTRL)

Variant of FTL incorporating regularization to stabilize sequential decisions and guarantee better regret bounds in adversarial environments.
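
A minimal sketch, assuming linear losses on the interval [-radius, radius] and a squared (L2) regularizer; the function name `ftrl_l2` and its parameters are hypothetical. Under these assumptions the regularized leader has a closed form: the negated, scaled sum of past gradients, clipped to the interval.

```python
def ftrl_l2(gradients, eta, radius=1.0):
    """FTRL with regularizer w**2 / (2 * eta) on the interval [-radius, radius].

    Each round plays argmin_w  (sum of past gradients) * w + w**2 / (2 * eta),
    whose constrained minimizer in one dimension is a clipped value.
    """
    grad_sum = 0.0
    plays = []
    for g in gradients:
        w = max(-radius, min(radius, -eta * grad_sum))
        plays.append(w)   # decision made before seeing this round's gradient
        grad_sum += g
    return plays


print(ftrl_l2([1.0, -1.0, 1.0, -1.0], eta=0.25))
```

Compared with plain FTL, the regularizer keeps consecutive decisions close to each other, so alternating gradients no longer swing the play between the two extremes.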

Online Gradient Descent

Optimization algorithm that updates model parameters in the direction opposite to the gradient of the loss function computed on each new observation.
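
For a concrete picture, the sketch below applies the update to a one-dimensional parameter under the squared loss 0.5 * (w - y)**2, whose gradient is w - y. The function name `online_gradient_descent`, the fixed step size `eta`, and the choice of loss are assumptions made for this example.

```python
def online_gradient_descent(targets, eta, w0=0.0):
    """Track a stream of targets with one gradient step per observation."""
    w = w0
    history = []
    for y in targets:
        history.append(w)      # prediction made before y is revealed
        grad = w - y           # gradient of 0.5 * (w - y)**2 at the current w
        w = w - eta * grad     # step in the direction opposite to the gradient
    return history, w


history, final_w = online_gradient_descent([2.0, 2.0, 2.0], eta=0.5)
print(history, final_w)  # [0.0, 1.0, 1.5] 1.75
```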

Multiplicative Weights Update

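Online optimization method that keeps one weight per expert and multiplies it each round by a factor that shrinks with that expert's loss, so weights decay exponentially with cumulative loss; the weighted experts are then combined into a single prediction.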
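
One classic instance is the weighted majority rule for binary predictions, sketched below. The function name `weighted_majority`, the penalty factor `1 - eta`, and the tie-breaking toward label 1 are assumptions made for this example.

```python
def weighted_majority(expert_predictions, outcomes, eta=0.5):
    """Predict binary labels by weighted vote; shrink weights of wrong experts."""
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    mistakes = 0
    for preds, y in zip(expert_predictions, outcomes):
        vote_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_zero = sum(weights) - vote_one
        guess = 1 if vote_one >= vote_zero else 0
        if guess != y:
            mistakes += 1
        # Multiplicative update: each wrong expert loses a (1 - eta) factor.
        weights = [w * (1.0 - eta) if p != y else w
                   for w, p in zip(weights, preds)]
    return mistakes, weights
```

An expert that has erred k times holds weight (1 - eta)**k, so its influence on the vote falls off exponentially with its number of mistakes.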

Expert Advice

Online learning framework where the algorithm must aggregate recommendations from multiple experts to minimize regret relative to the best expert.

Online Convex Optimization

Framework for sequential optimization in which a learner picks a point from a convex set each round and then incurs a convex loss that is revealed only after the choice is made.

Adversarial Online Learning

Online learning scenario where data is generated by a potentially malicious adversary seeking to maximize the algorithm's regret.

Exploration-Exploitation Trade-off

Fundamental dilemma in online learning between exploring new actions to discover their rewards and exploiting actions known to be high-performing.
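
One simple way to balance the two is the epsilon-greedy rule, sketched below: with probability `epsilon` try a random action, otherwise play the action with the best estimate so far. The function name `epsilon_greedy`, the `pull` callback, and the default values are hypothetical choices for this example.

```python
import random


def epsilon_greedy(pull, n_arms, horizon, epsilon=0.1, seed=0):
    """Explore a random arm with probability epsilon, else exploit the best."""
    rng = random.Random(seed)
    counts = [0] * n_arms
    means = [0.0] * n_arms   # running average reward of each arm
    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=lambda a: means[a])   # exploit
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]      # incremental mean
    return counts, means
```

A larger `epsilon` discovers good actions sooner but keeps paying for random plays afterwards, while a smaller value risks staying with a mediocre action for a long time.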

Online Mirror Descent

Generalization of gradient descent that measures the distance between iterates with a Bregman divergence induced by a mirror map instead of the Euclidean norm, so that updates can be adapted to the geometry of the constraint set.
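
In symbols, the update is commonly written as below. The notation is an assumption of this sketch: $\mathcal{K}$ is the feasible set, $g_t$ the loss gradient at round $t$, $\eta$ the step size, and $\psi$ a strictly convex mirror map.

```latex
x_{t+1} = \arg\min_{x \in \mathcal{K}} \; \eta \,\langle g_t, x \rangle + D_\psi(x, x_t),
\qquad
D_\psi(x, y) = \psi(x) - \psi(y) - \langle \nabla \psi(y),\, x - y \rangle .

% Two standard special cases:
% Euclidean mirror map: recovers projected online gradient descent.
\psi(x) = \tfrac{1}{2}\lVert x \rVert_2^2
\;\Longrightarrow\;
x_{t+1} = \Pi_{\mathcal{K}}\!\left(x_t - \eta\, g_t\right)

% Negative entropy on the probability simplex: exponentiated gradient.
\psi(x) = \sum_i x_i \ln x_i
\;\Longrightarrow\;
x_{t+1,i} = \frac{x_{t,i}\, e^{-\eta g_{t,i}}}{\sum_j x_{t,j}\, e^{-\eta g_{t,j}}}
```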

Learning with Partial Information

Paradigm where the algorithm only receives information about the chosen action (bandit) rather than all possible actions (full information).

Adaptive Learning Rate

Mechanism that adjusts the step size during learning, typically from the history of observed gradients, to speed up convergence and to cope with non-stationary environments.
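
As one example of such a mechanism, the sketch below follows the AdaGrad rule for a single scalar parameter: the step size is divided by the square root of the sum of squared gradients seen so far. The function name `adagrad`, the `grad_fn` callback, and the default values are assumptions made for this example.

```python
import math


def adagrad(grad_fn, w0, eta=1.0, steps=100, eps=1e-8):
    """Scalar AdaGrad: the step size shrinks as squared gradients accumulate."""
    w, sum_sq = w0, 0.0
    rates = []
    for _ in range(steps):
        g = grad_fn(w)
        sum_sq += g * g
        rate = eta / (math.sqrt(sum_sq) + eps)   # current effective step size
        rates.append(rate)
        w -= rate * g
    return w, rates


# Minimize 0.5 * (w - 3)**2, whose gradient is w - 3.
w, rates = adagrad(lambda w: w - 3.0, w0=0.0)
print(w, rates[0], rates[-1])
```

Because the accumulated sum never decreases, the effective step size is non-increasing, which removes the need to hand-tune a decay schedule in this setting.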

Hedge Algorithm

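Expert aggregation algorithm using exponential multiplicative weight updates; with a suitably tuned learning rate its regret relative to the best expert is of order sqrt(T ln N) over T rounds, growing only logarithmically with the number of experts N.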
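
A minimal sketch of the update, assuming losses in [0, 1] and a known horizon T; the function name `hedge` and the input layout are hypothetical. With the learning rate sqrt(8 ln N / T), the standard analysis bounds the expected regret by sqrt(T ln N / 2), which is logarithmic in the number of experts N.

```python
import math


def hedge(loss_rounds, eta):
    """Exponential-weights aggregation; returns (expected loss, best expert loss)."""
    n_experts = len(loss_rounds[0])
    log_w = [0.0] * n_experts        # log-weights, kept for numerical stability
    cumulative = [0.0] * n_experts
    expected_loss = 0.0
    for losses in loss_rounds:
        top = max(log_w)
        w = [math.exp(lw - top) for lw in log_w]
        total = sum(w)
        probs = [wi / total for wi in w]          # distribution over experts
        expected_loss += sum(p * l for p, l in zip(probs, losses))
        for i, l in enumerate(losses):
            log_w[i] -= eta * l                   # w_i <- w_i * exp(-eta * loss_i)
            cumulative[i] += l
    return expected_loss, min(cumulative)


rounds = [[0.5, 0.0]] + [[0.0, 1.0], [1.0, 0.0]] * 5
T, N = len(rounds), 2
alg, best = hedge(rounds, eta=math.sqrt(8 * math.log(N) / T))
print(alg - best, math.sqrt(T * math.log(N) / 2))  # regret vs. its upper bound
```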

Regret Bound

Theoretical upper limit on the cumulative regret an algorithm may suffer, allowing comparison and performance guarantees for online optimization methods.
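
For reference, cumulative regret against the best fixed action is commonly defined as below. The symbols are assumptions of this sketch: $a_t$ is the action played at round $t$, $\ell_t$ the loss function of that round, $\mathcal{A}$ the action set, and $B(T)$ a generic bound.

```latex
R_T = \sum_{t=1}^{T} \ell_t(a_t) \;-\; \min_{a \in \mathcal{A}} \sum_{t=1}^{T} \ell_t(a)

% A regret bound is a guarantee of the form R_T \le B(T).
% If B(T) is sublinear in T, the average regret vanishes:
R_T \le B(T), \quad B(T) = o(T)
\;\Longrightarrow\;
\limsup_{T \to \infty} \frac{R_T}{T} \le 0
```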

Stochastic Online Learning

Learning framework where data follows a fixed but unknown probability distribution, enabling performance guarantees that hold in expectation rather than in the worst case.
