AI Glossary
The complete glossary of AI
Constrained Reinforcement Learning
Learning paradigm in which the agent optimizes a primary objective while complying with constraints defined on states, actions, or cumulative costs and rewards.
Constraint Function
Mathematical function quantifying constraint violations in the environment, typically expressed as an expectation over trajectories that must remain below a predefined threshold.
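As a rough illustration, the expectation in this definition can be estimated by averaging discounted cost sums over sampled trajectories and comparing the result with the threshold. The function names, the discount factor `gamma`, and the sample data below are assumptions made for this sketch, not part of any particular library.

```python
def discounted_cost(costs, gamma=0.99):
    """Discounted sum of per-step costs for one trajectory."""
    total = 0.0
    for t, c in enumerate(costs):
        total += (gamma ** t) * c
    return total


def estimate_constraint(trajectories, threshold, gamma=0.99):
    """Monte Carlo estimate of the expected cost, plus a flag for threshold compliance."""
    returns = [discounted_cost(costs, gamma) for costs in trajectories]
    expected_cost = sum(returns) / len(returns)
    return expected_cost, expected_cost <= threshold


# Hypothetical per-step costs from three sampled trajectories.
sampled = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 1.0]]
expected_cost, satisfied = estimate_constraint(sampled, threshold=1.0, gamma=0.5)
```

With only a handful of trajectories the estimate is noisy, so in practice the comparison with the threshold is usually made with some margin.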
Augmented Lagrangian
Optimization method combining Lagrange multipliers with quadratic penalty terms, so that constraints in reinforcement learning can be enforced without driving the penalty weight toward infinity.
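A minimal sketch of the idea on a toy problem rather than a full RL setting: minimize (x - 2)^2 subject to x <= 1. The problem, the penalty weight `rho`, and the step sizes are assumptions chosen for illustration; the inner loop is plain gradient descent on the augmented Lagrangian, and the outer loop updates the multiplier.

```python
def augmented_lagrangian_demo(rho=10.0, outer_iters=30, inner_iters=200, lr=0.05):
    """Solve min (x - 2)^2 s.t. x - 1 <= 0 with an augmented Lagrangian."""
    x, lam = 0.0, 0.0
    for _ in range(outer_iters):
        # Inner loop: approximately minimize the augmented Lagrangian in x.
        for _ in range(inner_iters):
            g = x - 1.0                          # constraint value, feasible when g <= 0
            active = max(0.0, lam + rho * g)     # multiplier plus penalty contribution
            grad = 2.0 * (x - 2.0) + active      # d/dx of the augmented Lagrangian
            x -= lr * grad
        # Outer loop: multiplier update, kept non-negative.
        lam = max(0.0, lam + rho * (x - 1.0))
    return x, lam


x_opt, lam_opt = augmented_lagrangian_demo()
```

On this toy problem the iterates approach x = 1 with multiplier 2, the values at which the objective gradient and the constraint gradient balance.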
Interior Point Method
Optimization algorithm navigating within the feasible domain by using barrier functions to strictly maintain constraint satisfaction during learning.
Constrained Policy Optimization
Reinforcement learning algorithm that adapts trust-region policy optimization to maximize expected reward while aiming to keep expected cost within specified cost or safety limits at each policy update.
Constrained Value Function
Extension of Q and V value functions integrating constraints as additional objectives, allowing for simultaneous evaluation of performance and constraint adherence.
Feasible Policy Set
Space of policies that satisfy all specified constraints, forming the search domain in which the algorithm must identify the optimal policy.
Lagrange Multipliers
Scalar variables associated with each constraint in the dual formulation, dynamically adjusted to balance objective optimization and constraint satisfaction.
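The dynamic adjustment mentioned here is often implemented as projected gradient ascent on the multiplier. The sketch below assumes a single constraint of the form expected cost <= threshold; the function names and the `step_size` parameter are hypothetical.

```python
def lagrangian(expected_reward, expected_cost, threshold, lam):
    """Objective the policy maximizes for a fixed multiplier."""
    return expected_reward - lam * (expected_cost - threshold)


def update_multiplier(lam, expected_cost, threshold, step_size=0.1):
    """Raise lam when the constraint is violated, lower it otherwise, never below zero."""
    return max(0.0, lam + step_size * (expected_cost - threshold))
```

A larger multiplier makes constraint violations more expensive in the policy's objective, which is how the two updates balance reward against compliance.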
Constraint Satisfaction
Property that a policy respects all constraints imposed in the reinforcement learning problem; the problem is called feasible when at least one such policy exists.
Projection Method
Technique that iteratively projects policy updates onto the set of admissible policies to maintain constraints during optimization.
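As a simplified example, suppose the admissible set is approximated by a single linear constraint a · theta <= b on the policy parameters, which is an assumption of this sketch (real admissible sets are rarely this simple). The Euclidean projection then has a closed form:

```python
def project_onto_halfspace(theta, a, b):
    """Return the closest point to theta that satisfies a . theta <= b."""
    dot = sum(ai * ti for ai, ti in zip(a, theta))
    norm_sq = sum(ai * ai for ai in a)
    excess = max(0.0, dot - b)       # zero when theta is already admissible
    if norm_sq == 0.0:
        return list(theta)           # degenerate constraint, nothing to project along
    scale = excess / norm_sq
    return [ti - scale * ai for ti, ai in zip(theta, a)]


# Hypothetical parameters after an unconstrained update step.
projected = project_onto_halfspace([2.0, 2.0], a=[1.0, 1.0], b=2.0)
```

Parameters that already satisfy the constraint are returned unchanged, so the projection only intervenes when an update leaves the admissible set.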
Safe Reinforcement Learning
Area of reinforcement learning, closely tied to constrained RL, that focuses on keeping the agent safe during exploration, typically through constraints on critical states.
Logarithmic Barrier Method
Optimization approach adding barrier terms that grow without bound as a constraint boundary is approached from inside, forcing the agent to remain strictly within the admissible domain.
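A minimal sketch for a single constraint expected cost < threshold, written for a reward-maximizing agent so the barrier term tends to minus infinity at the boundary. The function name and the sharpness parameter `t` are assumptions for illustration.

```python
import math


def log_barrier_objective(expected_reward, expected_cost, threshold, t=10.0):
    """Reward plus a log barrier on the remaining constraint slack."""
    slack = threshold - expected_cost
    if slack <= 0.0:
        return float("-inf")         # outside the admissible domain
    return expected_reward + math.log(slack) / t
```

Increasing `t` weakens the barrier away from the boundary, so the barrier objective tracks the original constrained objective more closely.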
Biconvex Optimization
Optimization problem where the objective function is convex in the policy variables when the Lagrange multipliers are held fixed, and convex in the multipliers when the policy variables are held fixed, but not necessarily jointly convex.
Duality in Reinforcement Learning
Mathematical principle that transforms a constrained problem into an unconstrained min-max problem via Lagrange multipliers, which simplifies optimization; feasibility is enforced through the multipliers rather than guaranteed at every iterate.
Penalty Methods
Family of algorithms that incorporate constraint violations into the objective function through penalty terms, turning the constrained problem into an unconstrained one.
Trust Region
Region around the current policy where local approximations are considered valid, limiting updates to respect stability and performance constraints.
Constrained Dynamic Programming
Extension of dynamic programming incorporating constraints on cumulative costs or rewards, requiring modifications to the standard Bellman equations.
Fallback Policy
Predefined policy ensuring constraint compliance when the main policy risks violating them, used as a safety mechanism in critical systems.
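One common way to wire this up is a wrapper that checks the main policy's proposed action and substitutes the fallback when a risk check fires. Everything in the sketch below (the function names, the 1-D state, the position limit of 10) is hypothetical and only meant to show the control flow.

```python
def select_action(state, main_policy, fallback_policy, is_unsafe):
    """Use the main policy unless its proposed action fails the safety check."""
    proposed = main_policy(state)
    if is_unsafe(state, proposed):
        return fallback_policy(state)
    return proposed


# Hypothetical 1-D example: the state is a position that must stay at or below 10.
def main_policy(state):
    return 3                         # always tries to move forward by 3


def fallback_policy(state):
    return 0                         # stays in place


def is_unsafe(state, action):
    return state + action > 10
```

The safety of the overall system then rests on the risk check and the fallback being reliable, not on the learned policy.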
Constraint Sensitivity Analysis
Study of the impact of variations in constraint thresholds on the optimal policy, allowing fine-tuning of trade-offs between performance and safety.
Constraint Regularization
Technique adding constraint-based regularization terms to stabilize learning and steer the policy away from extreme solutions that sit at, or slightly beyond, the constraint boundary.