AI Glossary

The complete glossary of Artificial Intelligence

162 categories, 2,032 subcategories, 23,060 terms

Vector Reward Function

A reward function that returns a vector of rewards, one component per objective, instead of a single scalar. This lets a reinforcement learning agent track several conflicting objectives at the same time.
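As a minimal sketch of the idea, the function below returns one reward component per objective. The two objectives (distance covered and energy used) and the function name are illustrative assumptions, not part of the glossary entry.

```python
from typing import Tuple


def vector_reward(distance_covered: float, energy_used: float) -> Tuple[float, float]:
    """Return one reward component per objective instead of a single scalar.

    Both objectives are hypothetical: the agent should travel far
    (first component) while using little energy (second component).
    """
    progress_reward = distance_covered   # objective 1: higher is better
    energy_reward = -energy_used         # objective 2: penalize energy use
    return (progress_reward, energy_reward)


reward = vector_reward(distance_covered=2.0, energy_used=0.5)
print(reward)  # (2.0, -0.5)
```

The learning algorithm then decides how to trade the components off, for example through one of the scalarization methods defined further down.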

Multi-Objective Policy Optimization

The process of optimizing either a set of policies or a single policy against several value functions, each corresponding to a different objective.

Continuous Action Space RL

A reinforcement learning setting in which the agent chooses its actions from a continuous, and therefore infinite, set. It calls for algorithms that can handle continuous actions, such as PPO or SAC.

Preference-based RL

An approach in which human preferences about the trade-offs between objectives are integrated into the learning process, guiding the agent toward desirable solutions on the Pareto front.

Convex Pareto Front

A Pareto front that is convex, so that every Pareto-optimal solution can be found by linear scalarization with a suitable choice of weights.

Weighted Sum Method

A scalarization technique that multiplies each objective by a coefficient and sums the results into a single scalar objective function. It is simple, but it can only recover solutions on the convex parts of the Pareto front.
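The limitation to convex fronts can be seen in a small sketch. The three candidate points below are made-up returns of hypothetical policies (both objectives maximized); the middle one is Pareto-optimal but sits in a non-convex region, so no weight vector in the sweep selects it.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

# Hypothetical returns of three Pareto-optimal policies (illustrative data).
CANDIDATES: List[Point] = [(1.0, 0.0), (0.4, 0.4), (0.0, 1.0)]


def weighted_sum(values: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a vector of objective values into one scalar."""
    return sum(w * v for w, v in zip(weights, values))


def best_by_weighted_sum(candidates: List[Point], weights: Sequence[float]) -> Point:
    """Pick the candidate with the highest weighted sum."""
    return max(candidates, key=lambda v: weighted_sum(v, weights))


# Sweep the weight on the first objective from 0 to 1 in steps of 0.1.
selected = {
    best_by_weighted_sum(CANDIDATES, (i / 10, 1 - i / 10)) for i in range(11)
}
print(selected)  # only the two extreme points; (0.4, 0.4) never appears
```

For any weights (w, 1 - w), the middle point scores 0.4 while one of the extremes scores at least 0.5, which is why it is never chosen.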

Chebyshev Scalarization

A scalarization method that minimizes the weighted Chebyshev (maximum-norm) distance to a reference point, so that Pareto-optimal solutions in non-convex regions of the front can also be reached.
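A minimal sketch, assuming a utopia point of (1.0, 1.0) and the same kind of made-up candidate returns used for illustration elsewhere: the candidate with the smallest weighted maximum deviation from the utopia point is selected, and with equal weights that is the point in the non-convex region.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

# Hypothetical returns of three Pareto-optimal policies (illustrative data).
CANDIDATES: List[Point] = [(1.0, 0.0), (0.4, 0.4), (0.0, 1.0)]
UTOPIA: Point = (1.0, 1.0)  # assumed reference point: best value per objective


def chebyshev(values: Sequence[float], weights: Sequence[float],
              utopia: Sequence[float]) -> float:
    """Weighted Chebyshev distance: the largest weighted gap to the utopia point."""
    return max(w * abs(z - v) for w, v, z in zip(weights, values, utopia))


def best_by_chebyshev(candidates: List[Point], weights: Sequence[float]) -> Point:
    """Pick the candidate closest to the utopia point (smaller is better)."""
    return min(candidates, key=lambda v: chebyshev(v, weights, UTOPIA))


best = best_by_chebyshev(CANDIDATES, (0.5, 0.5))
print(best)  # (0.4, 0.4)
```

With equal weights, the two extreme points are at distance 0.5 and the middle point at roughly 0.3, so the middle point wins even though no weighted sum would pick it.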

Nash Equilibrium in MORL

An equilibrium point at which no agent can improve its outcome by unilaterally changing its strategy, applied here to multi-objective games with continuous actions.

Dynamic Weighting

An adaptive strategy that changes the weights of the objectives during learning in order to explore the Pareto front efficiently and avoid local optima.

Non-dominated Solutions

The set of solutions for which no other solution is at least as good on every objective and strictly better on at least one. These solutions make up the Pareto-optimal set.
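The dominance test behind this definition can be sketched in a few lines. The points are made-up objective values, and all objectives are assumed to be maximized.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative data: the last two points are dominated by (0.4, 0.4).
points = [(1.0, 0.0), (0.4, 0.4), (0.0, 1.0), (0.3, 0.3), (0.4, 0.1)]
front = non_dominated(points)
print(front)  # [(1.0, 0.0), (0.4, 0.4), (0.0, 1.0)]
```

This pairwise filter is quadratic in the number of points, which is fine for a sketch but may be too slow for large candidate sets.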

Lexicographic Ordering

A hierarchical approach in which objectives are optimized sequentially in order of absolute priority. A lower-ranked objective only breaks ties among solutions that are equally good on all higher-ranked ones; it is never traded off against them.
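A minimal sketch of the priority rule, relying on the fact that Python compares tuples lexicographically. The candidate names and the two objectives (safety first, then speed) are illustrative assumptions, and ties are taken to be exact rather than within a tolerance.

```python
from typing import Dict, Tuple

# Hypothetical candidates: (safety score, speed score), in priority order.
candidates: Dict[str, Tuple[float, float]] = {
    "cautious": (1.0, 0.2),
    "balanced": (1.0, 0.9),
    "reckless": (0.8, 5.0),
}


def lexicographic_best(options: Dict[str, Tuple[float, ...]]) -> str:
    """Return the option that is best on the first objective, using later
    objectives only to break exact ties on earlier ones."""
    # Tuple comparison in Python is lexicographic, which matches the rule.
    return max(options, key=lambda name: options[name])


best = lexicographic_best(candidates)
print(best)  # balanced
```

The "reckless" option loses despite its much higher speed score, because no amount of a lower-ranked objective compensates for a deficit on a higher-ranked one.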

Stochastic Multi-Objective Policies

Probabilistic policies over continuous action spaces that optimize multiple objectives simultaneously, often implemented as parameterized Gaussian distributions.
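The Gaussian parameterization mentioned here can be sketched as follows. This shows only how an action is sampled from a parameterized Gaussian, not how the parameters are trained against several objectives; the linear mean and the parameter names are assumptions made for illustration.

```python
import math
import random
from typing import Sequence


def gaussian_policy(state: Sequence[float], mean_weights: Sequence[float],
                    log_std: float, rng: random.Random) -> float:
    """Sample a one-dimensional continuous action from N(mean(state), std^2).

    The mean is a linear function of the state (an illustrative choice);
    the standard deviation is stored as a log so it stays positive.
    """
    mean = sum(w * s for w, s in zip(mean_weights, state))
    std = math.exp(log_std)
    return rng.gauss(mean, std)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
state = (1.0, 2.0)
action = gaussian_policy(state, mean_weights=(0.5, 0.25), log_std=math.log(0.1), rng=rng)
print(action)  # a sample close to the mean of 1.0
```

In a multi-objective setting, the parameters (here `mean_weights` and `log_std`) would be updated using a scalarized or vector-valued learning signal.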

Continuous Pareto Optimization

Ongoing optimization of the Pareto front throughout learning, which lets the agent adapt its trade-offs between objectives dynamically.

Multi-Objective Actor-Critic

An actor-critic architecture adapted to multi-objective problems, in which the critic learns vector-valued value functions and the actor learns a multi-objective policy.
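One way to picture a vector-valued critic is a TD(0) update applied to each objective separately. This is a sketch of that single step under assumed hyperparameters, not a full actor-critic implementation.

```python
from typing import Sequence, Tuple


def vector_td0_update(value: Sequence[float], next_value: Sequence[float],
                      reward: Sequence[float], alpha: float = 0.1,
                      gamma: float = 0.99) -> Tuple[float, ...]:
    """Apply the TD(0) rule V <- V + alpha * (r + gamma * V' - V) per objective."""
    return tuple(
        v + alpha * (r + gamma * v_next - v)
        for v, v_next, r in zip(value, next_value, reward)
    )


# Illustrative step: two objectives, value estimates start at zero.
updated = vector_td0_update(
    value=(0.0, 0.0), next_value=(0.0, 0.0), reward=(1.0, -2.0),
    alpha=0.5, gamma=0.9,
)
print(updated)  # (0.5, -1.0)
```

The actor would then be updated from these per-objective estimates, typically after combining them through a scalarization or a preference model.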

Action Space Decomposition

A technique that divides the continuous action space into subspaces, each specialized for one objective, to make multi-objective optimization easier in complex environments.

Multi-Objective Exploration-Exploitation

The exploration-exploitation dilemma extended to multi-objective problems, where exploration must aim to discover a diverse set of optimal trade-offs rather than a single optimal solution.
