AI Glossary

The complete dictionary of artificial intelligence

162 categories · 2,032 subcategories · 23,060 terms

Terms

Offline imitation learning

Learning paradigm where the agent learns to imitate expert behaviors without interacting with the environment, using only a fixed set of pre-recorded demonstrations.
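
As a rough illustration of learning from a fixed demonstration set, the following minimal sketch fits a tabular policy by majority vote over recorded expert actions (a very simple form of behavior cloning). The function name `behavior_clone`, the state and action labels, and the data are hypothetical, chosen only for this example.

```python
from collections import Counter, defaultdict

def behavior_clone(demonstrations):
    """Return a tabular policy mapping each seen state to its most frequent expert action."""
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    # No environment interaction: the policy is built from the recorded pairs only.
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}

# Hypothetical pre-recorded (state, action) demonstrations.
demos = [("s0", "left"), ("s0", "left"), ("s0", "right"), ("s1", "right")]
policy = behavior_clone(demos)
```

States that never appear in the demonstrations (for example a hypothetical `"s2"`) have no entry in `policy`, which is the coverage problem other entries in this glossary refer to.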

Demonstration set

Static collection of trajectories or expert action examples used as the sole source of information for offline imitation learning.

Offline reinforcement learning

Reinforcement learning approach that learns a policy from a fixed, previously collected dataset, without any further interaction with the environment during training.

Importance sampling

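Statistical technique that corrects for the mismatch between the distribution that generated the data (the behavior policy) and the target policy by weighting each sample by the ratio of its probability under the target policy to its probability under the behavior policy.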
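
The reweighting can be sketched as follows. This is a minimal example of the ordinary (unnormalized) importance sampling estimator for a one-step problem; the function name, the two policies, and the samples are assumptions made up for illustration.

```python
def importance_sampling_estimate(samples, target_prob, behavior_prob):
    """Estimate the target policy's expected reward from behavior-policy samples."""
    total = 0.0
    for state, action, reward in samples:
        # Weight = pi_target(a|s) / pi_behavior(a|s); assumes behavior_prob > 0.
        weight = target_prob(state, action) / behavior_prob(state, action)
        total += weight * reward
    return total / len(samples)

# Hypothetical setup: the behavior policy picks "a" or "b" uniformly,
# the target policy always picks "a".
samples = [("s", "a", 1.0), ("s", "b", 0.0), ("s", "a", 1.0), ("s", "b", 0.0)]
behavior = lambda s, a: 0.5
target = lambda s, a: 1.0 if a == "a" else 0.0
estimate = importance_sampling_estimate(samples, target, behavior)
```

Samples the target policy would never produce receive weight 0, while samples it favors are up-weighted, so the average reflects the target policy rather than the data-collecting one.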

Distribution preservation

Constraint imposed on the learned policy to remain close to the demonstration distribution, thus avoiding risky extrapolations in unknown regions.

Offline trajectory

Complete sequence of states, actions, and rewards recorded from an expert policy, constituting the basic unit of learning data.

Expert policy

Reference policy that generated the demonstrations; it serves as the model to imitate and defines the desired behavior.

Offline estimator

Value or policy estimation algorithm specifically designed to work with static data without requiring interaction with the environment.

Conservative bias correction

Bias correction approach that prioritizes safety by penalizing under-represented actions in the demonstration data.
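
The glossary does not specify a formula. One possible way to penalize under-represented actions is a count-based penalty on estimated action values, sketched below. The penalty form `beta / sqrt(count)`, the function name, and all numbers are assumptions for illustration only.

```python
import math

def penalized_values(q_values, counts, beta=1.0):
    """Subtract a penalty that grows as an action's count in the data shrinks."""
    out = {}
    for action, q in q_values.items():
        n = counts.get(action, 0)
        # Actions never seen in the data are ruled out entirely.
        out[action] = q - beta / math.sqrt(n) if n > 0 else float("-inf")
    return out

# Hypothetical value estimates and action counts for a single state.
q = {"a": 1.0, "b": 1.2, "c": 2.0}
counts = {"a": 100, "b": 4}
values = penalized_values(q, counts)
best = max(values, key=values.get)
```

Here `"c"` has the highest raw estimate but never appears in the data, and `"b"` is supported by only four samples, so the well-represented action `"a"` is selected.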

Constrained imitation learning

Method incorporating explicit constraints on the divergence between the learned policy and the data distribution to ensure stability.
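
A divergence constraint of this kind can be sketched with the KL divergence for discrete action distributions. The choice of KL, the threshold `epsilon`, and the example distributions are assumptions; the entry itself does not name a specific divergence.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as dicts over the same actions.

    Assumes q[a] > 0 wherever p[a] > 0.
    """
    return sum(p[a] * math.log(p[a] / q[a]) for a in p if p[a] > 0.0)

def satisfies_constraint(learned, data, epsilon):
    """Check whether the learned policy stays within epsilon of the data distribution."""
    return kl_divergence(learned, data) <= epsilon

# Hypothetical action distributions in a single state.
data_dist = {"left": 0.5, "right": 0.5}
close_policy = {"left": 0.6, "right": 0.4}
far_policy = {"left": 0.99, "right": 0.01}
close_ok = satisfies_constraint(close_policy, data_dist, epsilon=0.05)
far_ok = satisfies_constraint(far_policy, data_dist, epsilon=0.05)
```

In a training loop, a check like this would typically become a penalty term or a projection step rather than a simple pass/fail test.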

Transition set

Data structure storing tuples (state, action, next state, reward) extracted from expert trajectories for offline training.
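
A minimal sketch of such a structure, assuming a trajectory is stored as parallel lists with one more state than actions. The `Transition` type and `to_transitions` helper are hypothetical names introduced for this example.

```python
from typing import Hashable, List, NamedTuple

class Transition(NamedTuple):
    state: Hashable
    action: Hashable
    next_state: Hashable
    reward: float

def to_transitions(states, actions, rewards) -> List[Transition]:
    """Split one trajectory into (state, action, next state, reward) tuples."""
    # Expects len(states) == len(actions) + 1 and len(rewards) == len(actions).
    return [
        Transition(states[t], actions[t], states[t + 1], rewards[t])
        for t in range(len(actions))
    ]

# Hypothetical two-step expert trajectory.
transitions = to_transitions(["s0", "s1", "s2"], ["a", "b"], [0.0, 1.0])
```

Storing individual transitions instead of whole trajectories lets a training procedure sample them independently of their position in the original episode.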

Adaptive importance weighting

Dynamic weighting technique that adjusts importance weights according to how much confidence the data quality warrants in different regions of the state space.

Coverage error

Measure quantifying the mismatch between the support of the data distribution and that of the optimal policy in offline learning.
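
The entry gives no formula. One simple proxy for support mismatch is the probability mass a target policy places on state-action pairs that never occur in the dataset, averaged over states. The function name `uncovered_mass` and the example data are assumptions for illustration.

```python
def uncovered_mass(dataset_pairs, target_policy):
    """Average probability mass the policy puts on (state, action) pairs absent from the data."""
    seen = set(dataset_pairs)
    total = 0.0
    for state, action_probs in target_policy.items():
        for action, prob in action_probs.items():
            if (state, action) not in seen:
                total += prob
    return total / len(target_policy)

# Hypothetical dataset support and target policy.
pairs = [("s0", "left"), ("s1", "left")]
target_policy = {
    "s0": {"left": 1.0},
    "s1": {"left": 0.5, "right": 0.5},
}
error = uncovered_mass(pairs, target_policy)
```

A value of 0 means every action the policy can take is represented in the data; values near 1 mean the policy relies almost entirely on unobserved pairs.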
