AI Glossary

The complete glossary of artificial intelligence

162 categories
2,032 subcategories
23,060 terms
📖 Terms

Model-Based Offline RL

An offline reinforcement learning approach that learns a dynamics model of the environment from a fixed dataset, then uses that model to generate synthetic data and improve the policy without real interaction.

Imagination Rollouts

Simulated trajectories generated by rolling a policy forward inside the learned model of the environment, used to explore potential future states without interacting with the real environment.
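
As a rough illustration of the idea, the sketch below rolls a policy forward inside a model for a fixed horizon. The names `imagine_rollout`, `toy_model`, and `toy_policy` are hypothetical stand-ins invented for this example; a real system would substitute a dynamics model fitted to the offline dataset.

```python
from typing import Callable, List, Tuple

State = float
Action = float
Transition = Tuple[State, Action, float, State]


def imagine_rollout(
    model: Callable[[State, Action], Tuple[State, float]],
    policy: Callable[[State], Action],
    start_state: State,
    horizon: int,
) -> List[Transition]:
    """Roll the policy forward inside the model for `horizon` steps."""
    trajectory: List[Transition] = []
    state = start_state
    for _ in range(horizon):
        action = policy(state)
        # The model, not the real environment, supplies the next state and reward.
        next_state, reward = model(state, action)
        trajectory.append((state, action, reward, next_state))
        state = next_state
    return trajectory


# Hypothetical stand-in for a learned dynamics model (assumption for this sketch).
def toy_model(state: State, action: Action) -> Tuple[State, float]:
    next_state = state + action
    reward = -abs(next_state)  # reward is higher the closer the state is to zero
    return next_state, reward


# Hypothetical policy that moves the state halfway toward zero.
def toy_policy(state: State) -> Action:
    return -0.5 * state


rollout = imagine_rollout(toy_model, toy_policy, start_state=4.0, horizon=3)
for transition in rollout:
    print(transition)
```

Short horizons are a common choice in practice because model errors compound with every imagined step.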

Conservative Policy Optimization

A family of algorithms that explicitly penalizes policies for deviating significantly from the behavior observed in the training data, in order to avoid extrapolation errors.
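
One simple way to picture such a penalty, offered only as a sketch, is to subtract a term proportional to the squared distance between a candidate action and the action recorded in the dataset. The function `conservative_score`, the squared-distance penalty, and the numbers below are assumptions made for illustration and are not taken from any specific published algorithm.

```python
from typing import List, Tuple


def conservative_score(
    q_value: float,
    candidate_action: float,
    dataset_action: float,
    penalty_coef: float,
) -> float:
    """Estimated value minus a penalty for straying from the dataset action."""
    deviation = (candidate_action - dataset_action) ** 2
    return q_value - penalty_coef * deviation


# Hypothetical candidates as (action, estimated value). The second estimate is
# large but belongs to an action far from anything seen in the data.
candidates: List[Tuple[float, float]] = [(0.1, 1.0), (3.0, 5.0)]
dataset_action = 0.0

greedy_action = max(candidates, key=lambda c: c[1])[0]
conservative_action = max(
    candidates,
    key=lambda c: conservative_score(c[1], c[0], dataset_action, penalty_coef=1.0),
)[0]

print(greedy_action, conservative_action)
```

With the penalty in place, the far-away action loses its advantage because its high value estimate is exactly the kind of extrapolation the penalty is meant to distrust.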

Uncertainty Quantification

A technique for estimating how uncertain the dynamics model is in out-of-distribution regions, used to guide exploration and avoid catastrophic errors.
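
A common way to act on such an estimate is to subtract it from the model's predicted reward, so that uncertain transitions look less attractive to the policy. The sketch below assumes the uncertainty values are already available; `penalize_rewards` and `penalty_coef` are hypothetical names chosen for this example.

```python
from typing import List


def penalize_rewards(
    rewards: List[float],
    uncertainties: List[float],
    penalty_coef: float,
) -> List[float]:
    """Subtract a scaled uncertainty estimate from each predicted reward."""
    if len(rewards) != len(uncertainties):
        raise ValueError("rewards and uncertainties must have the same length")
    return [r - penalty_coef * u for r, u in zip(rewards, uncertainties)]


# Three model-generated transitions with equal predicted reward but
# increasing model uncertainty (illustrative numbers only).
rewards = [1.0, 1.0, 1.0]
uncertainties = [0.0, 0.5, 2.0]
penalized = penalize_rewards(rewards, uncertainties, penalty_coef=1.0)
print(penalized)
```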

Ensemble Models

A collection of dynamics models trained with different initializations; the variance of their predictions serves as an estimate of epistemic uncertainty.
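
The sketch below shows the variance computation in isolation. The ensemble members are hypothetical one-dimensional models that differ only in a slope parameter, standing in for networks that ended up with different weights after training; `ensemble_predict` and `make_model` are names invented for this example.

```python
from statistics import fmean, pvariance
from typing import Callable, List, Tuple

Model = Callable[[float, float], float]


def ensemble_predict(
    models: List[Model], state: float, action: float
) -> Tuple[float, float]:
    """Return the mean prediction and the variance across ensemble members."""
    predictions = [model(state, action) for model in models]
    return fmean(predictions), pvariance(predictions)


def make_model(slope: float) -> Model:
    """Hypothetical member: predicts next_state = state + slope * action."""
    def model(state: float, action: float) -> float:
        return state + slope * action
    return model


# Members agree on small actions and disagree more as the action grows.
models = [make_model(slope) for slope in (0.9, 1.0, 1.1)]
mean_small, var_small = ensemble_predict(models, state=0.0, action=0.1)
mean_large, var_large = ensemble_predict(models, state=0.0, action=10.0)
print(var_small, var_large)
```

The larger variance for the large action reflects disagreement among the members, which is the signal treated as epistemic uncertainty.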

Trajectory Transformers

A Transformer architecture that models trajectories as sequences of states, actions, and rewards in order to predict future transitions in offline learning.
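
To make the sequence view concrete, the sketch below flattens a trajectory into a single list ordered as state dimensions, then action dimensions, then reward, one timestep after another. It covers only this data-layout step and assumes nothing about the model itself; `flatten_trajectory` is a hypothetical helper, and any discretization of the values into tokens is left out.

```python
from typing import List, Sequence, Tuple

Step = Tuple[Sequence[float], Sequence[float], float]


def flatten_trajectory(trajectory: Sequence[Step]) -> List[float]:
    """Lay out each timestep as state dims, action dims, then reward."""
    sequence: List[float] = []
    for state, action, reward in trajectory:
        sequence.extend(state)
        sequence.extend(action)
        sequence.append(reward)
    return sequence


# Two timesteps with a 2-dimensional state and a 1-dimensional action.
trajectory = [
    ((1.0, 2.0), (0.5,), 1.0),
    ((3.0, 4.0), (-0.5,), 0.0),
]
flat = flatten_trajectory(trajectory)
print(flat)
```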

Offline-to-Online Transfer

The process of moving a policy learned offline into an online environment, where it is refined and continuously adapted through real interaction.

Model Ensembling

A technique that uses multiple dynamics models to capture different hypotheses about state transitions and to make predictions more robust.

Advantage Weighted Regression

An offline method that weights the actions in the training data by their estimated advantage, so that the learned policy improves on simple behavior cloning.
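
The weighting is commonly written as an exponential of the advantage divided by a temperature, with a cap to keep a few samples from dominating. The sketch below follows that form; the function names, the default temperature, and the cap of 20 are assumptions chosen for illustration.

```python
import math
from typing import List


def awr_weights(
    advantages: List[float],
    temperature: float = 1.0,
    max_weight: float = 20.0,
) -> List[float]:
    """Exponential advantage weights, capped at `max_weight`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Clamp the exponent rather than the result so large advantages cannot overflow.
    cap = math.log(max_weight)
    return [math.exp(min(a / temperature, cap)) for a in advantages]


def weighted_cloning_loss(log_probs: List[float], weights: List[float]) -> float:
    """Negative weighted log-likelihood of the dataset actions under the policy."""
    if len(log_probs) != len(weights):
        raise ValueError("log_probs and weights must have the same length")
    return -sum(w * lp for w, lp in zip(weights, log_probs)) / len(weights)


# Illustrative advantages for three dataset actions: worse than average,
# average, and better than average.
weights = awr_weights([-1.0, 0.0, 2.0])
loss = weighted_cloning_loss([-0.5, -0.5, -0.5], weights)
print(weights, loss)
```

With all weights equal to 1 the loss reduces to plain behavior cloning, which is why the advantage weighting is what lets the policy do better than the data-collecting behavior.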

Out-of-Distribution Detection

A mechanism for identifying when states generated by the model deviate significantly from the distribution of the original training data.
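
One very simple detector, sketched below, flags a generated state when its distance to the nearest dataset state exceeds a threshold. This nearest-neighbor rule is an assumption chosen for brevity and is only one of many possible mechanisms; the function names and the threshold value are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def nearest_distance(state: Vector, dataset_states: Sequence[Vector]) -> float:
    """Euclidean distance from `state` to its closest neighbor in the dataset."""
    if not dataset_states:
        raise ValueError("dataset_states must not be empty")
    return min(math.dist(state, other) for other in dataset_states)


def is_out_of_distribution(
    state: Vector, dataset_states: Sequence[Vector], threshold: float
) -> bool:
    """Flag states that lie farther than `threshold` from every dataset state."""
    return nearest_distance(state, dataset_states) > threshold


# Illustrative 2-dimensional dataset states.
dataset_states = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
near = is_out_of_distribution((0.1, 0.1), dataset_states, threshold=0.5)
far = is_out_of_distribution((5.0, 5.0), dataset_states, threshold=0.5)
print(near, far)
```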
