Continuous Multi-Objective Reinforcement Learning

Terms

Vector Reward Function

A reward function that returns a vector of rewards, one component per objective, instead of a scalar, so that multiple and possibly conflicting objectives can be tracked simultaneously in reinforcement learning.
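
As a minimal sketch of the idea, the snippet below builds a two-component reward and accumulates a discounted return per objective. The objectives (speed and energy use), the function names, and the discount factor are assumptions chosen for illustration only.

```python
from typing import Sequence, Tuple

def vector_reward(speed: float, energy_used: float) -> Tuple[float, float]:
    """Hypothetical two-objective reward: move fast, use little energy."""
    r_speed = speed           # higher speed is better
    r_energy = -energy_used   # energy use is penalized
    return (r_speed, r_energy)

def discounted_vector_return(rewards: Sequence[Sequence[float]],
                             gamma: float = 0.99) -> Tuple[float, ...]:
    """Discount each objective's reward stream independently."""
    n_objectives = len(rewards[0])
    totals = [0.0] * n_objectives
    for t, reward in enumerate(rewards):
        for i in range(n_objectives):
            totals[i] += (gamma ** t) * reward[i]
    return tuple(totals)

# Two identical steps with gamma = 0.5 give 1 + 0.5 = 1.5 per unit of reward.
episode = [(1.0, -1.0), (1.0, -1.0)]
print(discounted_vector_return(episode, gamma=0.5))  # (1.5, -1.5)
```

The result stays a vector; nothing here decides how to trade the two components against each other.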

Multi-Objective Policy Optimization

The process of optimizing either a set of policies or a single policy against several value functions at once, each value function corresponding to a different objective.

Continuous Action Space RL

A reinforcement learning setting in which the agent selects actions from a continuous, typically real-valued, set rather than from a finite list, which calls for algorithms designed for continuous control such as PPO or SAC.

Preference-based RL

An approach where human preferences on trade-offs between objectives are integrated into the learning process to guide the agent towards desirable solutions on the Pareto front.

Convex Pareto Front

A Pareto front whose shape is convex, so that every Pareto-optimal solution can be recovered by linear scalarization with a suitable choice of weights.

Weighted Sum Method

A scalarization technique that multiplies each objective by a weight and sums the results into a single scalar objective. It is simple, but it can only recover solutions that lie on the convex parts of the Pareto front.
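
The limitation can be seen in a small sketch. The three candidate return vectors below are made-up values for a maximization problem; B is Pareto-optimal but sits in a non-convex dent of the front, and no weight vector in the sweep selects it.

```python
def weighted_sum(values, weights):
    """Collapse an objective vector into one scalar: sum_i w_i * v_i."""
    return sum(w * v for w, v in zip(weights, values))

def best_by_weighted_sum(candidates, weights):
    """Return the candidate with the highest weighted sum (maximization)."""
    return max(candidates, key=lambda v: weighted_sum(v, weights))

# Hypothetical two-objective returns; none of them dominates another.
A, B, C = (1.0, 0.0), (0.4, 0.4), (0.0, 1.0)

# Sweep the weight on the first objective from 0 to 1 in steps of 0.1.
chosen = {
    best_by_weighted_sum([A, B, C], (k / 10, 1 - k / 10))
    for k in range(11)
}
print(B in chosen)  # False: only A and C are ever selected
```

B scores 0.4 under any weights that sum to one, while the better of A and C always scores at least 0.5, so B cannot win.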

Chebyshev Scalarization

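A scalarization method that minimizes the weighted Chebyshev (L-infinity) distance between the objective vector and a reference point, which allows it to reach Pareto-optimal solutions in non-convex regions of the front that a weighted sum cannot recover.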
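
A rough sketch of the selection rule follows. The candidate vectors, the reference (utopia) point of (1.0, 1.0), and the function names are illustrative assumptions; the candidates are chosen so that the middle one lies in a non-convex region of the front.

```python
def chebyshev_distance(values, weights, reference):
    """Weighted L-infinity distance from an objective vector to a reference point."""
    return max(w * abs(z - v) for w, v, z in zip(weights, values, reference))

def best_by_chebyshev(candidates, weights, reference):
    """Return the candidate closest to the reference point (smaller is better)."""
    return min(candidates, key=lambda v: chebyshev_distance(v, weights, reference))

# Hypothetical two-objective returns (maximization); B sits in a non-convex dent.
A, B, C = (1.0, 0.0), (0.4, 0.4), (0.0, 1.0)
utopia = (1.0, 1.0)  # assumed best achievable value per objective

balanced = best_by_chebyshev([A, B, C], (0.5, 0.5), utopia)
skewed = best_by_chebyshev([A, B, C], (0.9, 0.1), utopia)
print(balanced)  # (0.4, 0.4)
print(skewed)    # (1.0, 0.0)
```

With equal weights the balanced point B is selected, and shifting weight toward the first objective moves the choice to A.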

Nash Equilibrium in MORL

An equilibrium point where no agent can improve its position by unilaterally changing its strategy, applied to multi-objective games with continuous actions.

Dynamic Weighting

An adaptive strategy that modifies the objective weights during learning in order to explore the Pareto front more efficiently and avoid local optima.
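
One simple way to vary the weights, shown here only as an assumed example and not as the canonical scheme, is to draw a fresh weight vector from the probability simplex at the start of each episode. The function name and the per-episode schedule are hypothetical.

```python
import random

def sample_simplex_weights(n_objectives, rng):
    """Draw non-negative weights that sum to 1, uniformly over the simplex.

    Normalizing independent Exponential(1) draws yields a flat Dirichlet sample.
    """
    draws = [rng.expovariate(1.0) for _ in range(n_objectives)]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
for episode in range(3):
    weights = sample_simplex_weights(2, rng)
    # A real training loop would roll out the policy here and scalarize
    # the vector reward with `weights` before the policy update.
    print(episode, [round(w, 3) for w in weights])
```

Other schedules, such as increasing the weight of whichever objective is currently lagging, fit the same definition.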

Non-dominated Solutions

The set of solutions for which no other solution is at least as good on every objective and strictly better on at least one; together they make up the set of Pareto-optimal solutions.
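
The dominance test can be sketched in a few lines. The example assumes maximization on every objective, and the candidate points are made-up values.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective
    and strictly better on at least one (maximization)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better

def non_dominated(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

points = [(1.0, 0.0), (0.4, 0.4), (0.0, 1.0), (0.3, 0.3), (0.4, 0.2)]
print(non_dominated(points))  # [(1.0, 0.0), (0.4, 0.4), (0.0, 1.0)]
```

Note that (0.4, 0.2) is removed even though it ties (0.4, 0.4) on the first objective, because it is strictly worse on the second.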

Lexicographic Ordering

A hierarchical approach in which objectives are optimized sequentially in strict priority order, with no trade-off allowed between objectives of different ranks.
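
Applied to a finite set of candidates, the ordering amounts to repeated filtering, as in the sketch below. The candidates are invented, objectives are assumed to be listed from highest to lowest priority, and the small tolerance `tol` is an assumption added only to make floating-point ties robust.

```python
def lexicographic_best(candidates, tol=1e-9):
    """Filter candidates objective by objective, highest priority first.

    Only candidates that are (numerically) optimal on objective i
    are allowed to compete on objective i + 1.
    """
    remaining = list(candidates)
    n_objectives = len(remaining[0])
    for i in range(n_objectives):
        best = max(c[i] for c in remaining)
        remaining = [c for c in remaining if c[i] >= best - tol]
    return remaining

# Objective 0 has absolute priority over objective 1.
candidates = [(1.0, 0.2), (1.0, 0.9), (0.9, 5.0)]
print(lexicographic_best(candidates))  # [(1.0, 0.9)]
```

The third candidate is discarded despite its large second component, since it is worse on the higher-priority objective.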

Stochastic Multi-Objective Policies

Probabilistic policies in continuous action spaces that simultaneously optimize multiple objectives, often implemented as parameterized Gaussian distributions.
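
A diagonal Gaussian policy head can be sketched with the standard library alone. The mean and log standard deviation are passed in as plain lists here; in practice they would come from a learned model, which this sketch leaves out. All names are illustrative assumptions.

```python
import math
import random

def sample_action(mean, log_std, rng):
    """Sample each action dimension from an independent Gaussian."""
    return [rng.gauss(m, math.exp(s)) for m, s in zip(mean, log_std)]

def log_prob(action, mean, log_std):
    """Log-density of an action under the diagonal Gaussian policy."""
    total = 0.0
    for a, m, s in zip(action, mean, log_std):
        std = math.exp(s)
        total += -0.5 * ((a - m) / std) ** 2 - s - 0.5 * math.log(2 * math.pi)
    return total

rng = random.Random(0)          # fixed seed so the sketch is repeatable
mean, log_std = [0.0, 1.0], [0.0, -1.0]
action = sample_action(mean, log_std, rng)
print(action, log_prob(action, mean, log_std))
```

The log-probability is the quantity a policy-gradient update would weight by a scalarized or per-objective advantage.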

Continuous Pareto Optimization

Continuous optimization of the Pareto front during learning, allowing the agent to dynamically adapt its trade-offs between objectives.

Multi-Objective Actor-Critic

An actor-critic architecture adapted to multi-objective problems, in which the critic learns vector-valued value functions (one component per objective) and the actor learns a policy that trades off those objectives.
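
The critic side can be illustrated with a tabular TD(0) update applied to each objective separately. This is a simplified sketch under assumed names and a two-state toy table; it omits the actor and any function approximation.

```python
def td0_vector_update(V, s, reward, s_next, alpha=0.1, gamma=0.99):
    """One TD(0) step applied to every objective component of V[s]."""
    for i, r in enumerate(reward):
        target = r + gamma * V[s_next][i]     # bootstrapped target per objective
        V[s][i] += alpha * (target - V[s][i])
    return V

# Two states, two objectives, all value estimates start at zero.
V = {0: [0.0, 0.0], 1: [0.0, 0.0]}
td0_vector_update(V, s=0, reward=(1.0, -2.0), s_next=1, alpha=0.5, gamma=0.9)
print(V[0])  # [0.5, -1.0]
```

An actor could then combine the per-objective TD errors, for example through a weight vector, to form its update signal.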

Action Space Decomposition

A technique that divides the continuous action space into specialized subspaces, one per objective, to make multi-objective optimization more tractable in complex environments.

Multi-Objective Exploration-Exploitation

The exploration-exploitation dilemma extended to multi-objective problems, where exploration must aim to discover a diverse set of optimal trade-offs rather than a single optimal solution.
