AI Glossary
A complete artificial intelligence glossary
Hierarchical Reinforcement Learning (HRL)
Reinforcement learning paradigm structuring policies into multiple hierarchical levels where meta-policies control specialized sub-policies to solve complex tasks in a modular manner.
Options Framework
Formalism introduced by Sutton, Precup, and Singh that generalizes primitive actions into temporally extended options, each consisting of an intra-option policy, an initiation set, and a termination condition.
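A minimal sketch of how an option can be represented in code; the Option class and helper names below are illustrative assumptions, not any specific library's API.

from dataclasses import dataclass
from typing import Any, Callable
import random

@dataclass
class Option:
    # Intra-option policy: maps a state to a primitive action.
    policy: Callable[[Any], int]
    # Initiation set: True in states where the option may be started.
    can_initiate: Callable[[Any], bool]
    # Termination condition: probability of stopping in the given state.
    termination_prob: Callable[[Any], float]

def run_option(env, state, option):
    """Execute an option's policy until its termination condition fires."""
    total_reward, done = 0.0, False
    while not done:
        state, reward, done, _ = env.step(option.policy(state))
        total_reward += reward
        if random.random() < option.termination_prob(state):
            break
    return state, total_reward, done
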
Meta-controller
High-level policy in HRL responsible for selecting and activating appropriate sub-policies based on global objectives and the current state of the environment.
Sub-controller
Low-level policy executing primitive actions or specific skills under the supervision of the meta-controller to accomplish localized sub-tasks.
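A hedged sketch of how a meta-controller and its sub-controllers interact in a two-level control loop; the method names (select, act, finished) are assumptions made for illustration.

def hierarchical_episode(env, meta_policy, sub_policies, max_steps=1000):
    """Meta-controller picks a sub-policy; the sub-policy acts until its sub-task ends."""
    state, done, steps = env.reset(), False, 0
    while not done and steps < max_steps:
        # High level: choose which sub-policy (skill) to activate.
        sub_policy = sub_policies[meta_policy.select(state)]
        # Low level: execute primitive actions until the sub-task finishes.
        while not done and steps < max_steps:
            state, reward, done, _ = env.step(sub_policy.act(state))
            steps += 1
            if sub_policy.finished(state):
                break
    return state
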
Temporal Abstraction
Fundamental principle in HRL that groups action sequences into coherent temporal units (options), reducing the effective decision horizon and thus the temporal complexity of learning.
Feudal Reinforcement Learning
Hierarchical architecture, inspired by feudal hierarchies, in which high-level managers set goals for low-level workers, who locally optimize rewards for achieving those goals.
MAXQ Framework
HRL approach (Dietterich) that decomposes the value function of a hierarchical policy into additive sub-task contributions, enabling modular and reusable problem decomposition.
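The decomposition can be written as V(i, s) = V(a, s) + C(i, s, a), where a is the sub-task chosen by task i's policy and C is a learned completion value. A small recursive sketch with hypothetical data structures:

def maxq_value(task, state, primitive_reward, completion, policy, primitives):
    """MAXQ value decomposition: V(i, s) = V(pi_i(s), s) + C(i, s, pi_i(s)),
    bottoming out at the expected immediate reward of primitive actions."""
    if task in primitives:
        return primitive_reward[(task, state)]
    child = policy[task](state)                  # sub-task selected by this task's policy
    return (maxq_value(child, state, primitive_reward, completion, policy, primitives)
            + completion[(task, state, child)])  # learned completion value C(task, state, child)
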
Goal-conditioned Policies
Policies parameterized by specific goals allowing agents to learn generalizable behaviors that can be reused for different sub-objectives.
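A minimal goal-conditioned policy sketch in PyTorch, conditioning on the goal by concatenating it with the state; the architecture and dimensions are illustrative assumptions, not taken from a specific codebase.

import torch
import torch.nn as nn

class GoalConditionedPolicy(nn.Module):
    def __init__(self, state_dim, goal_dim, n_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state, goal):
        # Condition the policy on the goal by concatenation.
        x = torch.cat([state, goal], dim=-1)
        return torch.distributions.Categorical(logits=self.net(x))

# Usage: the same network can be queried for different sub-goals.
policy = GoalConditionedPolicy(state_dim=8, goal_dim=4, n_actions=3)
action = policy(torch.randn(1, 8), torch.randn(1, 4)).sample()
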
Intrinsic Motivation
Mechanism generating internal rewards based on novelty, curiosity or mastery to guide autonomous exploration of hierarchical skills.
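One common instantiation is a curiosity bonus proportional to the prediction error of a learned forward dynamics model; the sketch below is illustrative (the forward_model interface is an assumption) and the bonus is added on top of the extrinsic reward.

import numpy as np

def intrinsic_reward(forward_model, state, action, next_state, scale=0.1):
    """Curiosity bonus: large when the forward model predicts the next state poorly."""
    predicted = forward_model.predict(state, action)
    prediction_error = np.mean((predicted - next_state) ** 2)
    return scale * prediction_error

# During training the agent optimizes: total_reward = extrinsic_reward + intrinsic_reward(...)
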
Skill Discovery
Automatic process of identifying and extracting reusable behaviors (skills) from interaction with the environment without explicit external supervision.
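A widely used recipe (DIAYN-style) rewards a skill-conditioned policy for reaching states from which a discriminator can tell which skill is active; a hedged sketch of that pseudo-reward, assuming the discriminator returns log-probabilities over skills:

import numpy as np

def skill_discovery_reward(skill_logprobs, skill_id, n_skills):
    """DIAYN-style reward: log q(skill | state) - log p(skill), with a uniform skill prior.
    High when the visited state is distinctive of the active skill."""
    log_prior = -np.log(n_skills)
    return skill_logprobs[skill_id] - log_prior
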
Hierarchical Actor-Critic (HAC)
HRL architecture combining multi-level actor-critics where each level simultaneously learns a policy and a value function for its respective time horizon.
Hierarchical Deep Q-Network (hDQN)
Hierarchical extension of DQN using separate value networks for the high-level meta-controller and the low-level controller, with intrinsically rewarded subgoals serving as the meta-controller's abstract actions.
State Abstraction
Technique reducing state dimensionality by grouping similar observations relevant for each hierarchical level, improving learning efficiency.
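A toy illustration, assuming a grid world divided into rooms: the low level sees exact coordinates while the high level only reasons over the room index.

def abstract_state(x, y, room_size=5):
    """High-level state abstraction: collapse exact (x, y) coordinates into a room index."""
    return (x // room_size, y // room_size)

# The sub-controller acts on (x, y); the meta-controller decides over abstract_state(x, y).
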
Termination Function
Function determining when an option should stop and return control to the upper level, crucial for temporal coordination between hierarchical levels.
Initiation Function
Function defining the conditions under which an option can be activated, ensuring that sub-policies only execute in appropriate states.
Policy over Options
High-level policy that selects among available options rather than primitive actions, forming the decision core of HRL systems.
Hindsight Experience Replay (HER)
Technique that augments past experiences by reinterpreting failures as successes for alternative goals, particularly effective in hierarchical frameworks.
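A hedged sketch of the relabeling step at the core of HER, using the common "final" strategy in which the state actually reached at the end of the episode replaces the original goal; the transition layout and the compute_reward function are assumptions.

def her_relabel(episode, compute_reward):
    """Relabel a failed episode as a success for the goal that was actually achieved.
    episode: list of (state, action, next_state, goal) tuples."""
    achieved_goal = episode[-1][2]              # final achieved state stands in as the goal
    relabeled = []
    for state, action, next_state, _ in episode:
        reward = compute_reward(next_state, achieved_goal)
        relabeled.append((state, action, next_state, achieved_goal, reward))
    return relabeled                            # stored in the replay buffer alongside the originals
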
Subgoal Discovery
Process of automatically identifying relevant intermediate states that serve as natural transition points between hierarchical decision-making levels.
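One simple heuristic (among many published approaches) treats states that recur across successful trajectories as bottleneck candidates; the sketch below is illustrative rather than a specific algorithm.

from collections import Counter

def bottleneck_subgoals(successful_trajectories, top_k=5):
    """Return the states visited most often across successful trajectories,
    used as candidate subgoals for the higher level."""
    counts = Counter(s for traj in successful_trajectories for s in traj)
    return [state for state, _ in counts.most_common(top_k)]
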
Hierarchical Policy Gradient
Gradient optimization method adapted for hierarchical policies, propagating gradients through multiple decision levels simultaneously.
Option-Critic Architecture
End-to-end framework (Bacon et al.) that simultaneously learns intra-option policies, termination functions, and the policy over options using gradient descent.
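Of the three learned components, the termination gradient is the most distinctive: terminations are pushed toward 1 where the active option's advantage is negative. A minimal PyTorch-style sketch of that loss term, assuming Q and V estimates are already available; the small margin added as a regularizer and the variable names are illustrative.

import torch

def termination_loss(beta, q_omega, v_omega, margin=0.01):
    """Option-Critic termination objective: minimizing beta * advantage raises the
    termination probability where the active option is no longer advantageous.
    beta: termination probability of the active option at the next state (requires grad).
    q_omega: Q(s', active option); v_omega: value of s' over all options."""
    advantage = (q_omega - v_omega).detach() + margin
    return (beta * advantage).mean()
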