AI Glossary

The complete dictionary of artificial intelligence

162 categories
2,032 subcategories
23,060 terms
📖 Terms

Hierarchical Reinforcement Learning (HRL)

Reinforcement learning paradigm structuring policies into multiple hierarchical levels where meta-policies control specialized sub-policies to solve complex tasks in a modular manner.

Options Framework

Formalism introduced by Sutton, Precup, and Singh that generalizes primitive actions into temporally extended actions (options), each consisting of an initiation set, an intra-option policy, and a termination condition.
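
The three components can be made concrete with a small sketch. The `Option` class, the `run_option` helper, and the six-state corridor below are hypothetical names chosen for illustration; they are not taken from any particular library.

```python
import random
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Option:
    """An option as a triple: initiation set, intra-option policy, termination."""
    initiation_set: Set[int]             # states where the option may start
    policy: Callable[[int], int]         # maps a state to a primitive action
    termination: Callable[[int], float]  # maps a state to a stop probability


def run_option(option, state, step, rng, max_steps=100):
    """Execute an option until it terminates; return (final_state, steps_taken)."""
    if state not in option.initiation_set:
        raise ValueError("option cannot be initiated in this state")
    steps = 0
    while steps < max_steps:
        state = step(state, option.policy(state))
        steps += 1
        # Sample the termination condition in the newly reached state.
        if rng.random() < option.termination(state):
            break
    return state, steps


# Assumed toy environment: a corridor with states 0..5, actions +1 / -1.
def corridor_step(state, action):
    return max(0, min(5, state + action))


go_right = Option(
    initiation_set={0, 1, 2, 3, 4},
    policy=lambda s: +1,
    termination=lambda s: 1.0 if s == 5 else 0.0,
)

final_state, steps = run_option(go_right, 2, corridor_step, random.Random(0))
print(final_state, steps)  # prints: 5 3
```

From the caller's point of view, the whole three-step walk is a single decision, which is what makes the option behave like one extended action.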

Meta-controller

High-level policy in HRL responsible for selecting and activating appropriate sub-policies based on global objectives and the current state of the environment.

Sub-controller

Low-level policy executing primitive actions or specific skills under the supervision of the meta-controller to accomplish localized sub-tasks.

Temporal Abstraction

Fundamental principle in HRL that groups sequences of actions into coherent, temporally extended units (options), so that high-level decisions are made less often and the effective decision horizon of learning is shorter.

Feudal Reinforcement Learning

Hierarchical architecture inspired by feudal systems in which high-level managers set goals for low-level workers, and each worker is rewarded for achieving the goal set by its manager rather than directly for the environment reward.

MAXQ Framework

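HRL approach that decomposes the value function of a hierarchical policy into additive contributions from subtasks. The task hierarchy is supplied by the designer rather than discovered automatically, and the learned subtask values and policies can be reused across parent tasks.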
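
The additive decomposition can be written out as follows. The notation is assumed here for illustration and follows Dietterich's presentation of MAXQ: i is a parent subtask, a is a child subtask or primitive action chosen in state s, V is the value of executing a subtask from s, and C is the completion function, meaning the expected discounted reward for finishing i after a has terminated.

```latex
Q^{\pi}(i, s, a) = V^{\pi}(a, s) + C^{\pi}(i, s, a)

V^{\pi}(i, s) =
\begin{cases}
  Q^{\pi}\bigl(i, s, \pi_i(s)\bigr) & \text{if } i \text{ is a composite subtask} \\
  \sum_{s'} P(s' \mid s, i)\, R(s' \mid s, i) & \text{if } i \text{ is a primitive action}
\end{cases}
```

Applying the first equation recursively from the root task down to the primitive actions expresses the root value as a sum of one primitive-action value plus one completion term per level of the hierarchy.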

Goal-conditioned Policies

Policies parameterized by specific goals allowing agents to learn generalizable behaviors that can be reused for different sub-objectives.
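
A minimal sketch of the idea: the policy takes the goal as an extra input, so the same function serves several sub-objectives. The policy below is hand-coded on an assumed one-dimensional state space purely for illustration; in practice it would be learned, and the names `goal_conditioned_policy` and `rollout` are hypothetical.

```python
def goal_conditioned_policy(state, goal):
    """Return a primitive action (-1, 0, or +1) that moves `state` toward `goal`."""
    if state < goal:
        return +1
    if state > goal:
        return -1
    return 0


def rollout(state, goal, max_steps=20):
    """Follow the policy until the goal is reached; return the visited states."""
    path = [state]
    for _ in range(max_steps):
        action = goal_conditioned_policy(state, goal)
        if action == 0:
            break
        state += action
        path.append(state)
    return path


# The same policy is reused for two different goals.
print(rollout(2, 5))  # prints: [2, 3, 4, 5]
print(rollout(2, 0))  # prints: [2, 1, 0]
```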

Intrinsic Motivation

Mechanism generating internal rewards based on novelty, curiosity or mastery to guide autonomous exploration of hierarchical skills.
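
One simple way to turn novelty into an internal reward is a count-based bonus that shrinks as a state is revisited. This is a sketch of that one variant only, assuming hashable states; the `CountBonus` class and its `scale` parameter are hypothetical names, and curiosity- or mastery-based signals would be computed differently.

```python
import math
from collections import defaultdict


class CountBonus:
    """Novelty bonus of scale / sqrt(visit count) for each visited state."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.counts = defaultdict(int)

    def reward(self, state):
        # Count this visit first, so a first visit yields the full bonus.
        self.counts[state] += 1
        return self.scale / math.sqrt(self.counts[state])


bonus = CountBonus()
rewards = [bonus.reward(s) for s in ["a", "a", "b", "a"]]
print([round(r, 3) for r in rewards])  # prints: [1.0, 0.707, 1.0, 0.577]
```

The bonus would typically be added to the environment reward, so that rarely visited states look more attractive early in training.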

Skill Discovery

Automatic process of identifying and extracting reusable behaviors (skills) from interaction with the environment without explicit external supervision.

Hierarchical Actor-Critic (HAC)

HRL architecture combining multi-level actor-critics where each level simultaneously learns a policy and a value function for its respective time horizon.

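Hierarchical Deep Q-Network (h-DQN)

Hierarchical extension of DQN using separate value networks for two levels: a meta-controller that selects goals, and a controller that selects primitive actions to reach the current goal and is trained with an intrinsic reward for goal completion.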

State Abstraction

Technique that reduces the effective state space at each hierarchical level by grouping states that are equivalent for that level's decisions, or by ignoring state variables irrelevant to it, improving learning efficiency.
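
As an illustration, a high-level policy navigating between rooms may only need to know which room the agent is in, not the exact cell. The sketch below assumes a 10x10 grid split into 5x5 rooms; `room_of` and `room_width` are hypothetical names.

```python
def room_of(cell, room_width=5):
    """Map a grid cell (x, y) to the coordinates of the room containing it."""
    x, y = cell
    return (x // room_width, y // room_width)


cells = [(x, y) for x in range(10) for y in range(10)]
abstract_states = {room_of(c) for c in cells}
print(len(cells), len(abstract_states))  # prints: 100 4
```

The high level then learns over 4 abstract states instead of 100 cells, while the low level can still use the full cell coordinates.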

Termination Function

Function determining when an option should stop and return control to the upper level, crucial for temporal coordination between hierarchical levels.

Initiation Function

Function defining the conditions under which an option can be activated, ensuring that sub-policies only execute in appropriate states.

Policy over Options

High-level policy that selects among available options rather than primitive actions, forming the decision core of HRL systems.
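
A policy over options is often learned with an SMDP-style Q-learning update, in which the bootstrap term is discounted by the number of steps the option ran. The sketch below is tabular and assumes the caller supplies the discounted return accumulated during the option; the function names, state labels, and option labels are hypothetical.

```python
import random
from collections import defaultdict


def select_option(q, state, available, epsilon, rng):
    """Epsilon-greedy choice among the options available in `state`."""
    if rng.random() < epsilon:
        return rng.choice(available)
    return max(available, key=lambda o: q[(state, o)])


def smdp_q_update(q, state, option, discounted_return, duration,
                  next_state, next_available, alpha=0.5, gamma=0.9):
    """Q(s,o) += alpha * (R + gamma^k * max_o' Q(s',o') - Q(s,o))."""
    best_next = max((q[(next_state, o)] for o in next_available), default=0.0)
    target = discounted_return + (gamma ** duration) * best_next
    q[(state, option)] += alpha * (target - q[(state, option)])


q = defaultdict(float)
q[("hall", "exit")] = 10.0

# The option "go_to_hall" ran for 3 steps and collected a discounted return of -3.
smdp_q_update(q, "room", "go_to_hall", discounted_return=-3.0, duration=3,
              next_state="hall", next_available=["exit", "go_to_room"])
print(round(q[("room", "go_to_hall")], 3))  # prints: 2.145

chosen = select_option(q, "hall", ["exit", "go_to_room"], 0.0, random.Random(0))
print(chosen)  # prints: exit
```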

Hindsight Experience Replay (HER)

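Technique that augments stored experience by relabeling transitions with goals that were actually achieved, so that failed attempts become successful examples for alternative goals. It is widely used with goal-conditioned policies and in hierarchical methods such as HAC.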
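
The relabeling step can be sketched as follows, using the simplest strategy of substituting the state reached at the end of the episode as the new goal. The `Transition` tuple, the sparse reward of 0 on success and -1 otherwise, and the integer states are assumptions made for this example.

```python
from typing import List, NamedTuple


class Transition(NamedTuple):
    state: int
    action: int
    next_state: int
    goal: int
    reward: float


def sparse_reward(achieved, goal):
    """Assumed reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0


def relabel_with_final_state(episode: List[Transition]) -> List[Transition]:
    """Copy the episode, pretending its final state was the goal all along."""
    new_goal = episode[-1].next_state
    return [
        t._replace(goal=new_goal, reward=sparse_reward(t.next_state, new_goal))
        for t in episode
    ]


# A failed episode: the goal was state 9, but the agent only walked 0 -> 3.
episode = [Transition(s, +1, s + 1, 9, sparse_reward(s + 1, 9)) for s in range(3)]
relabeled = relabel_with_final_state(episode)
replay_buffer = episode + relabeled

print([t.reward for t in episode])    # prints: [-1.0, -1.0, -1.0]
print([t.reward for t in relabeled])  # prints: [-1.0, -1.0, 0.0]
```

Storing both versions gives the learner at least one rewarded transition per episode, even when the original goal was never reached.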

Subgoal Discovery

Process of automatically identifying relevant intermediate states that serve as natural transition points between hierarchical decision-making levels.

Hierarchical Policy Gradient

Gradient optimization method adapted for hierarchical policies, propagating gradients through multiple decision levels simultaneously.

Option-Critic Architecture

End-to-end framework that learns intra-option policies and termination functions through gradient-based updates, together with the policy over options, without requiring predefined subgoals or pre-specified options.
