Imitation with Partial Observations - Artificial Intelligence Glossary

Partial Observations

A setting in which demonstrations cover only a limited portion of the state space, leaving unexplored regions to which the agent must generalize.

Robust Policy

A learned policy designed to maintain acceptable performance under partial observations and in states not seen during training.

Policy Inference

Process of estimating the expert's underlying policy from a limited set of partial demonstration trajectories.
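
As a minimal sketch of this idea, assuming discrete states and actions, the expert's policy can be estimated by counting how often each action was taken in each demonstrated state. The function name `infer_policy` and the toy trajectories are hypothetical and serve only as illustration.

```python
from collections import Counter, defaultdict


def infer_policy(demonstrations):
    """Estimate pi(a|s) from (state, action) pairs by relative frequency."""
    counts = defaultdict(Counter)
    for trajectory in demonstrations:
        for state, action in trajectory:
            counts[state][action] += 1

    policy = {}
    for state, action_counts in counts.items():
        total = sum(action_counts.values())
        policy[state] = {a: n / total for a, n in action_counts.items()}
    return policy


# Hypothetical toy data: each trajectory is a list of (state, action) pairs.
demos = [
    [("s0", "right"), ("s1", "right"), ("s2", "up")],
    [("s0", "right"), ("s1", "up")],
]
policy = infer_policy(demos)
print(policy["s1"])
```

States that never appear in the demonstrations have no entry in the estimate, which is exactly the gap that partial demonstrations leave open.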

Policy Generalization

The ability of a learned policy to act correctly in states not observed during the demonstrations, which is crucial when the demonstrations are partial.
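
One simple way to act in unseen states, shown here only as an illustrative sketch, is to copy the action of the nearest demonstrated state. The grid-style states, the Manhattan distance, and the name `nearest_neighbor_action` are assumptions made for this example, not part of the glossary entry.

```python
def nearest_neighbor_action(state, demo_actions):
    """Return the action of the demonstrated state closest to `state`.

    States are assumed to be (x, y) tuples; distance is Manhattan distance.
    """
    if state in demo_actions:
        return demo_actions[state]
    nearest = min(
        demo_actions,
        key=lambda s: abs(s[0] - state[0]) + abs(s[1] - state[1]),
    )
    return demo_actions[nearest]


# Hypothetical demonstrated states and the expert's action in each.
demo_actions = {(0, 0): "right", (1, 0): "right", (2, 0): "up"}

# (2, 1) was never demonstrated; its nearest demonstrated state is (2, 0).
print(nearest_neighbor_action((2, 1), demo_actions))
```

Whether such a fallback is actually correct depends on how smoothly the expert's behavior varies across neighboring states.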

State Reconstruction

Technique for estimating missing or unobserved states from the partial information available in the demonstrations.
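
A very simple instance of this, assuming one-dimensional numeric states and smooth motion between observations, is linear interpolation over the gaps. The function `reconstruct_states` and the use of `None` to mark unobserved steps are hypothetical choices for this sketch; practical systems often rely on filters or learned models instead.

```python
def reconstruct_states(trajectory):
    """Fill None entries by linear interpolation between observed neighbors.

    Gaps at the start or end are filled with the nearest observed value.
    """
    known = [i for i, x in enumerate(trajectory) if x is not None]
    if not known:
        raise ValueError("at least one observed state is required")

    filled = list(trajectory)
    for i, value in enumerate(trajectory):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = trajectory[right]
        elif right is None:
            filled[i] = trajectory[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = trajectory[left] + weight * (
                trajectory[right] - trajectory[left]
            )
    return filled


# Hypothetical trajectory with three unobserved time steps.
observed = [0.0, None, None, 3.0, None]
reconstructed = reconstruct_states(observed)
print(reconstructed)
```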

Covered State Space

The subset of the total state space actually explored in the demonstrations, defining the limits of direct imitation learning.
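
For a small, enumerable state space, the covered subset and its share of the whole can be computed directly. The sketch below assumes a hypothetical 3x3 grid world; the helper names `covered_states` and `coverage_fraction` are illustrative.

```python
def covered_states(demonstrations):
    """Set of states visited in any demonstration trajectory."""
    return {state for trajectory in demonstrations for state, _action in trajectory}


def coverage_fraction(demonstrations, all_states):
    """Share of the full state space that the demonstrations visit."""
    covered = covered_states(demonstrations) & set(all_states)
    return len(covered) / len(all_states)


# Hypothetical 3x3 grid world with 9 states in total.
all_states = [(x, y) for x in range(3) for y in range(3)]
demos = [
    [((0, 0), "right"), ((1, 0), "right"), ((2, 0), "up")],
    [((0, 0), "up"), ((0, 1), "right")],
]
print(sorted(covered_states(demos)))
print(coverage_fraction(demos, all_states))
```

Here the demonstrations visit 4 of the 9 states, so direct imitation says nothing about the remaining 5.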

Learning from Demonstration

A synonym for imitation learning; in this glossary it refers specifically to scenarios where the demonstrations are incomplete or noisy.

Out-of-Distribution Evaluation

Methodology for evaluating the policy's performance on states not present in the training data to measure its robustness.
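
A minimal sketch of such an evaluation, assuming an expert is available to label evaluation states, reports agreement with the expert separately for states seen in training and for unseen states. The function `evaluate_by_split`, the integer states, and both toy policies are hypothetical.

```python
def evaluate_by_split(policy, expert, eval_states, train_states):
    """Return (in_distribution_accuracy, ood_accuracy) against the expert."""
    train_states = set(train_states)
    hits = {"in": [], "ood": []}
    for state in eval_states:
        split = "in" if state in train_states else "ood"
        hits[split].append(policy(state) == expert(state))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(hits["in"]), mean(hits["ood"])


def expert(state):
    # Hypothetical expert: move right below state 5, otherwise move left.
    return "right" if state < 5 else "left"


def learned_policy(state):
    # Hypothetical learner that only ever saw "right" in training.
    return "right"


train_states = [0, 1, 2, 3]
in_acc, ood_acc = evaluate_by_split(learned_policy, expert, range(10), train_states)
print(in_acc, ood_acc)
```

A large gap between the two numbers indicates that the policy has memorized the demonstrations rather than generalized from them.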

Policy Function

Mathematical mapping π(a|s) that specifies the probability of choosing action a in state s, learned from partial demonstrations.
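
For discrete states and actions, π(a|s) can be stored as a table of probabilities per state. The sketch below assumes such a hypothetical table and shows both a probability lookup and sampling an action; the names `pi` and `sample_action` are illustrative.

```python
import random

# Hypothetical tabular policy: policy[state][action] = pi(action | state).
policy = {
    "s0": {"left": 0.2, "right": 0.8},
    "s1": {"left": 1.0},
}


def pi(action, state):
    """Probability of choosing `action` in `state` (0.0 if never chosen)."""
    return policy[state].get(action, 0.0)


def sample_action(state, rng):
    """Draw an action according to pi(. | state)."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded for reproducibility
print(pi("right", "s0"))
print(sample_action("s0", rng))
```

In this table a state that was never demonstrated has no row at all, so a lookup raises `KeyError`; a policy learned from partial demonstrations needs some rule for those states.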

State Distribution

Probability distribution describing how often different states occur in the environment; partial demonstrations often provide a biased sample of it.
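
One way to quantify this bias, sketched here under the assumption of a small discrete state space, is to compare the empirical state distribution of the demonstrations with a reference distribution using total variation distance. The uniform reference and the helper names are hypothetical choices for illustration.

```python
from collections import Counter


def empirical_state_distribution(demonstrations):
    """Relative visit frequency of each state across all trajectories."""
    counts = Counter(
        state for trajectory in demonstrations for state, _action in trajectory
    )
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)


# Hypothetical demonstrations that visit s0 far more often than s1.
demos = [
    [("s0", "a"), ("s1", "a")],
    [("s0", "a"), ("s0", "b")],
]
demo_dist = empirical_state_distribution(demos)

# Assumed reference: states s0..s3 occur equally often in the environment.
reference = {"s0": 0.25, "s1": 0.25, "s2": 0.25, "s3": 0.25}
tv = total_variation(demo_dist, reference)
print(demo_dist, tv)
```

A distance of 0 means the demonstrations match the reference; values closer to 1 mean the demonstrations concentrate on states the reference rarely visits, or miss states it visits often.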
