AI Glossary

The Complete Dictionary of Artificial Intelligence

162
categories
2,032
subcategories
23,060
terms
📖
Term

Model-Based Offline RL

Offline reinforcement learning approach that learns a dynamics model of the environment from a fixed dataset, then uses that model to generate synthetic data and improve the policy without further real interaction.
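
The idea can be illustrated with a minimal sketch, assuming one-dimensional states and actions and a linear dynamics model fitted by least squares. The names `fit_linear_dynamics`, `logged`, and `new_policy`, as well as the toy dynamics used to build the dataset, are hypothetical and chosen only for illustration.

```python
import random

def fit_linear_dynamics(transitions):
    """Least-squares fit of next_state = w_s * state + w_a * action (1-D, no intercept)."""
    ss = sum(s * s for s, a, _ in transitions)
    aa = sum(a * a for s, a, _ in transitions)
    sa = sum(s * a for s, a, _ in transitions)
    sy = sum(s * y for s, a, y in transitions)
    ay = sum(a * y for s, a, y in transitions)
    det = ss * aa - sa * sa  # determinant of the 2x2 normal equations
    w_s = (sy * aa - ay * sa) / det
    w_a = (ay * ss - sy * sa) / det
    return w_s, w_a

# Assumption: a fixed, previously logged dataset of (state, action, next_state).
rng = random.Random(0)
logged = []
for _ in range(200):
    s, a = rng.uniform(-1, 1), rng.uniform(-1, 1)
    logged.append((s, a, 0.9 * s + 0.5 * a))  # toy "true" dynamics

w_s, w_a = fit_linear_dynamics(logged)

def new_policy(state):
    """Hypothetical policy being improved; it differs from the logging policy."""
    return -0.5 * state

# Synthetic transitions: logged start states, new actions, model-predicted next states.
synthetic = []
for s, _, _ in logged:
    a = new_policy(s)
    synthetic.append((s, a, w_s * s + w_a * a))
```

The synthetic transitions can then be mixed with the logged ones when updating the policy; no call to the real environment is made after the dataset is collected.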

Imagination Rollouts

Simulated trajectories generated by a learned model of the environment, used to explore potential future states without interacting with the real environment.
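
A rollout of this kind might look like the following sketch. The `learned_model` and `policy` functions are hypothetical stand-ins: in practice the model would be fitted to data rather than written by hand.

```python
def imagine_rollout(model, policy, start_state, horizon):
    """Roll a learned model forward for `horizon` steps; the real environment is never called."""
    trajectory = []
    state = start_state
    for _ in range(horizon):
        action = policy(state)
        next_state, reward = model(state, action)
        trajectory.append((state, action, reward, next_state))
        state = next_state  # the next step starts from the imagined state
    return trajectory

def learned_model(state, action):
    """Hypothetical stand-in for a fitted dynamics and reward model."""
    next_state = 0.9 * state + 0.5 * action
    reward = -abs(next_state)
    return next_state, reward

def policy(state):
    """Hypothetical policy that pushes the state toward zero."""
    return -state

rollout = imagine_rollout(learned_model, policy, start_state=1.0, horizon=5)
```

Because each imagined state feeds the next prediction, model errors compound with the horizon, which is one reason short rollouts are often preferred.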

Conservative Policy Optimization

Algorithm that explicitly penalizes policies that deviate significantly from the behavior observed in the training data, in order to avoid extrapolation errors.
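
One way to express such a penalty, shown here as a sketch rather than any specific published algorithm, is to subtract a KL divergence between the candidate policy and the behavior policy from the expected value. The discrete action set, the numbers, and the coefficient `alpha` are all assumptions made for illustration.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; q must be positive wherever p is."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def conservative_objective(policy, q_values, behavior, alpha):
    """Expected value minus a penalty for straying from the behavior policy."""
    expected_q = sum(p * q for p, q in zip(policy, q_values))
    return expected_q - alpha * kl_divergence(policy, behavior)

behavior = [0.45, 0.45, 0.10]  # action frequencies in the dataset (assumed)
q_values = [1.0, 1.2, 5.0]     # the estimate for the rarely seen action 2 may be an extrapolation error
greedy = [0.01, 0.01, 0.98]    # chases the suspicious estimate
cautious = [0.30, 0.60, 0.10]  # stays close to the data

unpenalized = [conservative_objective(p, q_values, behavior, alpha=0.0) for p in (greedy, cautious)]
penalized = [conservative_objective(p, q_values, behavior, alpha=2.0) for p in (greedy, cautious)]
```

Without the penalty the greedy policy scores higher; with `alpha=2.0` the cautious policy wins, because the greedy one concentrates on an action the data barely covers.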

Uncertainty Quantification

Technique for estimating the uncertainty of the dynamics model in out-of-distribution regions, used to guide exploration and avoid catastrophic errors.
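
Once an uncertainty estimate is available, one common way to use it is to subtract it from the predicted reward and to stop model rollouts where it grows too large. The sketch below assumes the per-step uncertainty values are already given; `penalty_coef` and `max_uncertainty` are hypothetical tuning parameters.

```python
def penalized_reward(predicted_reward, uncertainty, penalty_coef=1.0):
    """Lower the model's reward in proportion to how unsure the model is."""
    return predicted_reward - penalty_coef * uncertainty

def truncate_rollout(steps, max_uncertainty, penalty_coef=1.0):
    """Keep model-generated steps until uncertainty first exceeds the threshold."""
    kept = []
    for reward, uncertainty in steps:
        if uncertainty > max_uncertainty:
            break  # everything after this point is built on an unreliable prediction
        kept.append(penalized_reward(reward, uncertainty, penalty_coef))
    return kept

# Assumed (predicted_reward, uncertainty) pairs along one imagined trajectory.
steps = [(1.0, 0.05), (1.0, 0.10), (1.2, 0.80), (3.0, 0.20)]
safe = truncate_rollout(steps, max_uncertainty=0.5)
```

Here the third step is rejected, so the attractive reward of 3.0 that follows it never reaches the policy update.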

Ensemble Models

Collection of dynamics models trained with different initializations, whose prediction variance serves as an estimate of epistemic uncertainty.
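
The variance computation itself is simple, as the sketch below suggests. The three `member` functions are hand-written stand-ins for independently trained models (an assumption made to keep the example short): they agree where data was available, here taken to be states in [-1, 1], and drift apart outside that range.

```python
import statistics

def make_member(drift):
    """Hypothetical ensemble member: identical in-distribution, divergent outside it."""
    def member(state, action):
        outside = max(0.0, abs(state) - 1.0)  # distance beyond the assumed data range
        return 0.9 * state + 0.5 * action + drift * outside ** 2
    return member

ensemble = [make_member(d) for d in (-0.3, 0.1, 0.4)]

def ensemble_uncertainty(models, state, action):
    """Return the mean prediction and the variance across ensemble members."""
    predictions = [m(state, action) for m in models]
    return statistics.mean(predictions), statistics.pvariance(predictions)

mean_in, var_in = ensemble_uncertainty(ensemble, state=0.5, action=0.0)
mean_out, var_out = ensemble_uncertainty(ensemble, state=3.0, action=0.0)
```

Low variance inside the data range and high variance outside it is the signal that gets interpreted as epistemic uncertainty.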

Trajectory Transformers

Transformer architecture that models trajectories as sequences of states, actions, and rewards in order to predict future transitions in offline reinforcement learning.
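
The sequence view can be sketched without any neural network: a trajectory is flattened into an interleaved token stream, and the training task becomes next-token prediction. The uniform binning, the value range, and the function names below are assumptions for illustration, not a description of a particular implementation.

```python
def discretize(value, low, high, n_bins):
    """Map a scalar to a bin index in [0, n_bins - 1] using uniform bins."""
    clipped = min(max(value, low), high)
    index = int((clipped - low) / (high - low) * n_bins)
    return min(index, n_bins - 1)  # the upper edge falls into the last bin

def trajectory_to_tokens(trajectory, low=-1.0, high=1.0, n_bins=10):
    """Flatten (state, action, reward) steps into one interleaved token sequence."""
    tokens = []
    for state, action, reward in trajectory:
        tokens.append(("s", discretize(state, low, high, n_bins)))
        tokens.append(("a", discretize(action, low, high, n_bins)))
        tokens.append(("r", discretize(reward, low, high, n_bins)))
    return tokens

trajectory = [(0.0, 0.5, 1.0), (-1.0, 0.99, -0.3)]
tokens = trajectory_to_tokens(trajectory)

# Next-token prediction pairs, as an autoregressive sequence model would consume them.
inputs, targets = tokens[:-1], tokens[1:]
```

A sequence model trained on such pairs predicts the next state, action, or reward token given everything that came before it.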

Offline-to-Online Transfer

Process of transferring a policy learned offline to an online environment, where it is refined and continues to adapt through real interaction.
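
A toy sketch of the two phases, assuming a three-armed bandit instead of a full environment: value estimates are first computed from logged data, then refined with real pulls. The deterministic exploration schedule (`explore_every`) and the reward table are hypothetical choices that keep the example reproducible.

```python
def offline_estimates(logged, n_arms):
    """Average logged reward per arm; arms absent from the data default to 0."""
    totals, counts = [0.0] * n_arms, [0] * n_arms
    for arm, reward in logged:
        totals[arm] += reward
        counts[arm] += 1
    values = [totals[i] / counts[i] if counts[i] else 0.0 for i in range(n_arms)]
    return values, counts

def fine_tune_online(values, counts, env_reward, steps, explore_every=5):
    """Start from the offline estimates and keep updating them with real interaction."""
    values, counts = list(values), list(counts)
    n_arms = len(values)
    for t in range(steps):
        if t % explore_every == 0:
            arm = (t // explore_every) % n_arms  # scheduled exploration, round-robin
        else:
            arm = max(range(n_arms), key=lambda i: values[i])  # exploit current estimate
        reward = env_reward(arm)
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    return values

logged = [(0, 0.2), (1, 0.5), (1, 0.5), (0, 0.2)]  # arm 2 was never tried offline
true_rewards = [0.2, 0.5, 0.9]                     # assumed real environment

offline_values, offline_counts = offline_estimates(logged, n_arms=3)
online_values = fine_tune_online(offline_values, offline_counts, lambda arm: true_rewards[arm], steps=30)

best_offline = max(range(3), key=lambda i: offline_values[i])
best_online = max(range(3), key=lambda i: online_values[i])
```

The offline phase can only rank the arms it saw; the online phase discovers that the unseen arm is better and shifts to it.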

Model Ensembling

Technique that uses multiple dynamics models to capture different hypotheses about state transitions and to improve prediction robustness.
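
The robustness claim can be illustrated by averaging member predictions. In this sketch each member is a hypothetical model with its own fixed bias; the true transition function is assumed known only so that errors can be measured.

```python
def true_next_state(state, action):
    """Assumed ground truth, used here only to measure prediction error."""
    return 0.9 * state + 0.5 * action

def make_biased_model(bias):
    """Hypothetical member: the true dynamics plus a member-specific error."""
    def model(state, action):
        return true_next_state(state, action) + bias
    return model

members = [make_biased_model(b) for b in (0.10, -0.05, -0.02)]

def ensemble_predict(models, state, action):
    """Average the members' predictions into a single estimate."""
    predictions = [m(state, action) for m in models]
    return sum(predictions) / len(predictions)

state, action = 0.4, -0.2
target = true_next_state(state, action)
member_errors = [abs(m(state, action) - target) for m in members]
ensemble_error = abs(ensemble_predict(members, state, action) - target)
```

The absolute error of the average can never exceed that of the worst member, and when member errors partly cancel, as they do here, it is considerably smaller.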

Advantage Weighted Regression

Offline method that weights the actions in the training data according to their estimated advantage, improving the policy beyond simple behavior cloning.
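
A minimal tabular sketch of the weighting step, assuming advantages have already been estimated. The exponential weight with a `temperature`, and the `max_weight` clip, are common choices but are stated here as assumptions; the dataset and advantage values are made up for illustration.

```python
import math
from collections import defaultdict

def advantage_weights(advantages, temperature=1.0, max_weight=20.0):
    """Exponentiated advantages, clipped to keep single samples from dominating."""
    return [min(math.exp(a / temperature), max_weight) for a in advantages]

def fit_tabular_policy(dataset, weights):
    """Weighted maximum likelihood for a tabular policy: normalized weighted counts per state."""
    totals = defaultdict(lambda: defaultdict(float))
    for (state, action), weight in zip(dataset, weights):
        totals[state][action] += weight
    policy = {}
    for state, action_weights in totals.items():
        normalizer = sum(action_weights.values())
        policy[state] = {a: w / normalizer for a, w in action_weights.items()}
    return policy

dataset = [("s0", "left"), ("s0", "left"), ("s0", "right")]
advantages = [-1.0, -1.0, 1.0]  # assumed estimates: "right" was better than average

cloned = fit_tabular_policy(dataset, [1.0] * len(dataset))          # plain behavior cloning
weighted = fit_tabular_policy(dataset, advantage_weights(advantages))
```

Behavior cloning reproduces the data frequencies (one third for `right`), while the advantage-weighted fit shifts most of the probability to the action that performed better.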

Out-of-Distribution Detection

Mechanism for identifying when states generated by the model deviate significantly from the distribution of the original training data.
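
One simple detector, given here as a sketch and not as the only option, flags a state when its distance to the nearest dataset state exceeds a threshold. The two-dimensional states and the `threshold` value are assumptions.

```python
import math

def nearest_distance(state, dataset_states):
    """Euclidean distance from `state` to its closest neighbor in the dataset."""
    return min(math.dist(state, s) for s in dataset_states)

def is_out_of_distribution(state, dataset_states, threshold):
    """Flag states that lie farther than `threshold` from every dataset state."""
    return nearest_distance(state, dataset_states) > threshold

dataset_states = [(0.0, 0.0), (0.1, 0.2), (0.3, 0.1)]

near = is_out_of_distribution((0.15, 0.15), dataset_states, threshold=0.25)
far = is_out_of_distribution((2.0, 2.0), dataset_states, threshold=0.25)
```

A model-generated state flagged this way could be discarded or used to end the rollout early. For large datasets a nearest-neighbor index or a learned density model would likely replace the linear scan.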
