Inverse Reinforcement Learning

Learning method where the agent infers the reward function from expert demonstrations rather than receiving explicit rewards.

Maximum Entropy IRL

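Variant of IRL that resolves reward ambiguity by choosing, among all trajectory distributions that match the expert's feature expectations, the one with maximum entropy. Under this model, trajectories with higher reward are exponentially more likely.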
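As a minimal sketch of the maximum entropy model, assuming a small finite set of candidate trajectories and a linear reward, the probability of each trajectory is proportional to the exponential of its reward. The feature counts, weights, and the function name `maxent_trajectory_probs` are hypothetical and chosen only for illustration.

```python
import math

def maxent_trajectory_probs(features, weights):
    """Return P(tau) proportional to exp(w . f(tau)) over a finite trajectory set."""
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in features]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)  # partition function over the candidate set
    return [e / z for e in exps]

# Hypothetical feature counts for three candidate trajectories.
trajectory_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
reward_weights = [1.0, 0.5]  # hypothetical reward weights
probs = maxent_trajectory_probs(trajectory_features, reward_weights)
```

In a full method the weights would be fitted so that the expected features under this distribution match those of the expert; the sketch only shows the distribution itself.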

Behavioral Cloning

Supervised learning approach that directly learns to imitate expert actions without explicitly inferring the reward function.
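A minimal sketch of behavioral cloning in a tabular setting, assuming discrete states and actions: the learned policy simply picks the action the expert chose most often in each state. The state and action names and the helper `behavioral_cloning` are hypothetical; practical systems typically fit a classifier or regressor instead.

```python
from collections import Counter, defaultdict

def behavioral_cloning(demonstrations):
    """Map each state to the action the expert took most often in it."""
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}

# Hypothetical expert (state, action) pairs.
demos = [("s0", "right"), ("s0", "right"), ("s0", "left"), ("s1", "up")]
policy = behavioral_cloning(demos)
```

No reward function appears anywhere in this procedure, which is what distinguishes it from IRL.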

Expert Trajectory

Sequence of states and actions recorded from an expert, representing an optimal or near-optimal solution to the problem.

Policy Equivalence

Principle that multiple reward functions can lead to the same optimal policy, creating ambiguity in IRL.
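One simple instance of this ambiguity can be sketched as follows: multiplying every reward by a positive constant leaves the optimal policy unchanged. The example assumes a hypothetical three-state deterministic MDP and uses plain value iteration; the helper `greedy_policy` is illustrative only.

```python
def greedy_policy(rewards, transitions, gamma=0.9, iters=200):
    """Value iteration on a deterministic MDP; returns the greedy action per state."""
    n_states = len(transitions)
    values = [0.0] * n_states
    for _ in range(iters):
        values = [
            max(rewards[s][a] + gamma * values[transitions[s][a]]
                for a in range(len(transitions[s])))
            for s in range(n_states)
        ]
    return [
        max(range(len(transitions[s])),
            key=lambda a: rewards[s][a] + gamma * values[transitions[s][a]])
        for s in range(n_states)
    ]

# Hypothetical MDP: transitions[s][a] is the next state, rewards[s][a] the reward.
transitions = [[0, 1], [0, 2], [2, 2]]
rewards = [[0.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
scaled = [[3.0 * r for r in row] for row in rewards]  # same rewards, scaled by 3

policy_a = greedy_policy(rewards, transitions)
policy_b = greedy_policy(scaled, transitions)
```

Both reward tables yield the same greedy policy, so observing that policy alone cannot tell them apart.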

Bayesian Inverse Reinforcement Learning

IRL approach using Bayesian inference to estimate a distribution over possible reward functions.
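A heavily simplified sketch of the idea, assuming a one-step problem, a uniform prior over a handful of candidate reward vectors, and a Boltzmann-rational expert whose action probabilities are proportional to `exp(beta * reward)`. The candidates, the value of `beta`, and the function `reward_posterior` are hypothetical; published methods usually work with continuous reward spaces and sampling.

```python
import math

def reward_posterior(candidates, demos, beta=2.0):
    """Posterior over candidate reward vectors given observed expert actions.

    Assumes P(a | r) is proportional to exp(beta * r[a]) and a uniform prior.
    """
    log_posts = []
    for r in candidates:
        log_z = math.log(sum(math.exp(beta * x) for x in r))
        log_posts.append(sum(beta * r[a] - log_z for a in demos))
    m = max(log_posts)  # stabilise before exponentiating
    unnorm = [math.exp(lp - m) for lp in log_posts]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical candidates: reward per action for a two-action problem.
candidates = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
demos = [0, 0, 1, 0]  # expert mostly picks action 0
posterior = reward_posterior(candidates, demos)
```

The result is a distribution rather than a single reward, so remaining uncertainty about the expert's goal stays explicit.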

Preference Cost

Transformation of the reward function into a cost function, where the agent learns to minimize total cost while following demonstrations.

Adversarial Inverse Reinforcement Learning

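IRL method using an adversarial game where a generator learns the policy and a discriminator distinguishes expert trajectories from those produced by the generator; the trained discriminator is then used to recover a reward function.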
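The discriminator in this setup is commonly written as D = exp(f) / (exp(f) + pi(a|s)), where f is a learned reward-like score and pi(a|s) is the policy's probability of the action. The sketch below evaluates that form for single numbers; the inputs are hypothetical and no learning is performed.

```python
import math

def airl_discriminator(f_value, policy_prob):
    """D = exp(f) / (exp(f) + pi(a|s)) for one state-action pair."""
    ef = math.exp(f_value)
    return ef / (ef + policy_prob)

def airl_reward(f_value, policy_prob):
    """Reward signal log D - log(1 - D), which equals f - log pi(a|s)."""
    d = airl_discriminator(f_value, policy_prob)
    return math.log(d) - math.log(1.0 - d)

# Hypothetical values: learned score 0.0, policy probability 0.25.
d = airl_discriminator(f_value=0.0, policy_prob=0.25)
r = airl_reward(f_value=0.0, policy_prob=0.25)
```

Because the policy probability is built into the discriminator, the learned score f can be read as a reward estimate once training has converged.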

Active Inverse Reinforcement Learning

Variant of IRL where the agent can query the expert to obtain additional demonstrations and reduce uncertainty.

Objective Function Inference

Mathematical process of deducing the underlying objective function from observations of the expert's behavior.

Imitation Bias

Tendency of the agent to over-imitate the expert's actions without understanding the underlying intention, leading to poor generalization.

Reinforcement Learning with Expert Feedback

Combination of RL and IRL where a model first trains on expert data, then is refined with human feedback.

Feature Function

Function that maps state-action pairs to a feature space, so that the reward function can be represented as a linear combination of features.
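A minimal sketch of a feature function and the linear reward built on it, assuming a hypothetical one-dimensional grid with a goal at position 4. The features, weights, and function names are illustrative, not taken from any specific method.

```python
def phi(state, action):
    """Hypothetical feature map: [distance to goal, 1 if the action is 'right']."""
    goal = 4
    return [float(abs(goal - state)), 1.0 if action == "right" else 0.0]

def linear_reward(weights, state, action):
    """Reward as a weighted sum of features: r(s, a) = w . phi(s, a)."""
    return sum(w * f for w, f in zip(weights, phi(state, action)))

def feature_expectations(trajectory, gamma=0.9):
    """Discounted sum of features along one trajectory of (state, action) pairs."""
    totals = [0.0, 0.0]
    for t, (s, a) in enumerate(trajectory):
        for i, f in enumerate(phi(s, a)):
            totals[i] += (gamma ** t) * f
    return totals

weights = [-1.0, 0.5]  # hypothetical: penalise distance, reward moving right
r = linear_reward(weights, state=2, action="right")
mu = feature_expectations([(2, "right"), (3, "right")])
```

With a linear reward, the return of a trajectory is the dot product of the weights with its discounted feature sums, which is why many IRL methods compare feature expectations between expert and learner.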

Multi-task Inverse Reinforcement Learning

Extension of IRL where multiple tasks are learned simultaneously by sharing knowledge between reward functions.
