AI Glossary
A Complete Glossary of Artificial Intelligence
Vector Reward Function
A reward function that returns a vector of rewards, one component per objective, instead of a single scalar. This lets reinforcement learning capture several, possibly conflicting, objectives at once.
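As a minimal sketch of the idea, the snippet below returns one reward component per objective. The two objectives (speed and energy use) and the function name `vector_reward` are hypothetical and chosen only for illustration.

```python
import numpy as np

def vector_reward(speed: float, energy_used: float) -> np.ndarray:
    """Hypothetical two-objective reward: favor speed, penalize energy use."""
    # Component 0 rewards speed; component 1 is the negated energy cost,
    # so that "larger is better" holds for every component.
    return np.array([speed, -energy_used], dtype=float)

r = vector_reward(speed=2.0, energy_used=0.5)
print(r.shape)  # one entry per objective
```

Keeping every component in "larger is better" form makes later steps, such as scalarization or dominance checks, easier to write consistently.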
Multi-Objective Policy Optimization
The process of optimizing a single policy, or a set of policies, against several value functions at once, each value function corresponding to a different objective.
Continuous Action Space RL
A reinforcement learning setting in which the agent selects actions from a continuous, and therefore infinite, set, which calls for algorithms suited to continuous control such as PPO or SAC.
Preference-based RL
An approach in which human preferences over trade-offs between objectives are integrated into the learning process to guide the agent toward desirable solutions on the Pareto front.
Convex Pareto Front
A Pareto front whose shape is convex, so that every Pareto-optimal solution can be found by linear scalarization with a suitable choice of weights.
Weighted Sum Method
A scalarization technique that multiplies each objective by a weight and sums the results into a single scalar objective. It is simple, but it can only recover solutions on the convex parts of the Pareto front.
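The weighted sum can be sketched in a few lines. The function name `weighted_sum` and the example values are assumptions made for illustration, not part of any particular library.

```python
import numpy as np

def weighted_sum(rewards, weights) -> float:
    """Collapse a reward vector into a scalar using non-negative weights."""
    rewards = np.asarray(rewards, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if rewards.shape != weights.shape:
        raise ValueError("rewards and weights must have the same shape")
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative")
    return float(np.dot(weights, rewards))

# Hypothetical reward vector (speed, negated energy) and weights.
scalar = weighted_sum([2.0, -0.5], [0.75, 0.25])
print(scalar)  # 0.75 * 2.0 + 0.25 * (-0.5) = 1.375
```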
Chebyshev Scalarization
A scalarization method that minimizes the weighted Chebyshev (maximum) distance to a reference point, so that Pareto-optimal solutions can be reached even on non-convex fronts.
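The following sketch contrasts linear and Chebyshev scalarization on a small, hand-made non-convex front. The three points, the equal weights, and the choice of the component-wise maximum as the reference (ideal) point are all assumptions for illustration; objectives are maximized.

```python
import numpy as np

def chebyshev_distance(values, weights, ideal) -> float:
    """Weighted Chebyshev distance from a solution to the ideal point."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    ideal = np.asarray(ideal, dtype=float)
    return float(np.max(weights * np.abs(ideal - values)))

# Three mutually non-dominated points; the middle one lies in a concave
# region of the front (0.4 + 0.4 < 1.0).
front = np.array([[1.0, 0.0], [0.4, 0.4], [0.0, 1.0]])
weights = np.array([0.5, 0.5])
ideal = front.max(axis=0)  # best value seen for each objective

# Linear scalarization: pick the point with the largest weighted sum.
linear_pick = int(np.argmax(front @ weights))
# Chebyshev scalarization: pick the point closest to the ideal point.
chebyshev_pick = int(np.argmin(
    [chebyshev_distance(p, weights, ideal) for p in front]
))
print(linear_pick, chebyshev_pick)
```

On this toy front, no weight vector lets the weighted sum select the middle point, because one of the two extreme points always scores at least 0.5 while the middle point scores 0.4. The Chebyshev criterion with equal weights does select it.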
Nash Equilibrium in MORL
An equilibrium point where no agent can improve its position by unilaterally changing its strategy, applied to multi-objective games with continuous actions.
Dynamic Weighting
An adaptive strategy that modifies the weights of the objectives during learning to explore the Pareto front efficiently and avoid local optima.
Non-dominated Solutions
The set of solutions for which no other solution is at least as good on every objective and strictly better on at least one. These solutions make up the Pareto-optimal set.
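A simple way to make the dominance relation concrete is a brute-force filter over a small set of candidate solutions. The helper names `dominates` and `non_dominated` and the sample points are hypothetical; all objectives are assumed to be maximized.

```python
import numpy as np

def dominates(a, b) -> bool:
    """True if a is at least as good as b everywhere and better somewhere."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return bool(np.all(a >= b) and np.any(a > b))

def non_dominated(points) -> list:
    """Return the indices of points that no other point dominates."""
    points = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(points):
        dominated = any(
            dominates(q, p) for j, q in enumerate(points) if j != i
        )
        if not dominated:
            keep.append(i)
    return keep

candidates = [[1.0, 0.0], [0.4, 0.4], [0.0, 1.0], [0.3, 0.2]]
pareto_indices = non_dominated(candidates)
print(pareto_indices)  # the last point is dominated by [0.4, 0.4]
```

This pairwise check is quadratic in the number of points, which is fine for a sketch but may be too slow for large candidate sets.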
Lexicographic Ordering
A hierarchical approach in which objectives are optimized sequentially in strict priority order, with no trade-offs between objectives of different ranks.
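Sequential optimization by priority can be sketched as repeated filtering: keep only the candidates that are best on the first objective, then break ties with the second, and so on. The function name `lexicographic_best` and the sample values are assumptions; the list of candidates is assumed to be non-empty, with objectives listed from highest to lowest priority and maximized.

```python
def lexicographic_best(candidates) -> list:
    """Return indices of candidates that survive filtering by priority."""
    survivors = list(range(len(candidates)))
    n_objectives = len(candidates[0])
    for k in range(n_objectives):
        # Best value on objective k among the candidates still in play.
        best = max(candidates[i][k] for i in survivors)
        # Lower-priority objectives only ever break exact ties.
        survivors = [i for i in survivors if candidates[i][k] == best]
    return survivors

# Objective 0 has absolute priority over objective 1.
options = [(1.0, 5.0), (1.0, 7.0), (0.9, 100.0)]
winners = lexicographic_best(options)
print(winners)
```

The third option is discarded despite its large second value, because it is worse on the top-priority objective. In practice a small tolerance is often used instead of exact equality; that variant is not shown here.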
Stochastic Multi-Objective Policies
Probabilistic policies in continuous action spaces that simultaneously optimize multiple objectives, often implemented as parameterized Gaussian distributions.
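As a rough illustration of a parameterized Gaussian policy, the sketch below uses a linear mean and a state-independent log standard deviation. The class name `GaussianPolicy`, the linear parameterization, and the zero initialization are assumptions; real implementations typically use a neural network for the mean.

```python
import numpy as np

class GaussianPolicy:
    """Diagonal Gaussian policy with a linear mean (illustrative only)."""

    def __init__(self, obs_dim: int, act_dim: int, seed: int = 0):
        self.rng = np.random.default_rng(seed)
        self.W = np.zeros((act_dim, obs_dim))  # mean weights
        self.b = np.zeros(act_dim)             # mean bias
        self.log_std = np.zeros(act_dim)       # log of per-action std

    def mean(self, obs: np.ndarray) -> np.ndarray:
        return self.W @ obs + self.b

    def sample(self, obs: np.ndarray) -> np.ndarray:
        """Draw a continuous action from N(mean(obs), std^2)."""
        std = np.exp(self.log_std)
        noise = self.rng.standard_normal(self.b.shape)
        return self.mean(obs) + std * noise

    def log_prob(self, obs: np.ndarray, action: np.ndarray) -> float:
        """Log-density of the action under the diagonal Gaussian."""
        z = (action - self.mean(obs)) / np.exp(self.log_std)
        per_dim = -0.5 * z**2 - self.log_std - 0.5 * np.log(2 * np.pi)
        return float(np.sum(per_dim))

policy = GaussianPolicy(obs_dim=3, act_dim=2)
obs = np.ones(3)
action = policy.sample(obs)
print(action.shape, policy.log_prob(obs, action))
```

In a multi-objective setting, the log-probability would typically be combined with a scalarized or vector-valued advantage when computing the policy gradient; that step is omitted here.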
Continuous Pareto Optimization
Continuous optimization of the Pareto front during learning, allowing the agent to dynamically adapt its trade-offs between objectives.
Multi-Objective Actor-Critic
An algorithmic architecture that combines an actor and a critic adapted to multi-objective problems, with vector-valued value functions and multi-objective policies.
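The vector-valued critic can be illustrated with a tabular TD(0) update applied to each objective in parallel. The function name `vector_td_update`, the tabular representation, and the step size and discount values are assumptions for this sketch; the actor update is not shown.

```python
import numpy as np

def vector_td_update(V, s, r_vec, s_next, alpha=0.1, gamma=0.99, done=False):
    """Apply one TD(0) step to a value table with one column per objective.

    V has shape (n_states, n_objectives) and is modified in place.
    Returns the vector of TD errors, one per objective.
    """
    r_vec = np.asarray(r_vec, dtype=float)
    # No bootstrapping from the next state once the episode has ended.
    bootstrap = np.zeros_like(r_vec) if done else gamma * V[s_next]
    td_error = r_vec + bootstrap - V[s]
    V[s] += alpha * td_error
    return td_error

V = np.zeros((2, 2))  # 2 states, 2 objectives
delta = vector_td_update(V, s=0, r_vec=[1.0, -2.0], s_next=1)
print(delta, V[0])
```

Each objective keeps its own value estimate, so the actor can later weigh the per-objective TD errors with whatever scalarization or preference scheme is in use.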
Action Space Decomposition
A technique that divides the continuous action space into subspaces specialized for each objective, facilitating multi-objective optimization in complex environments.
Multi-Objective Exploration-Exploitation
The exploration-exploitation dilemma extended to multi-objective problems, where exploration must aim to discover a diverse set of optimal trade-offs rather than a single optimal solution.