Planning by Reinforcement Learning

Planning Policy

Function or strategy that maps each environment state to a specific action, defining the agent's behavior to achieve optimal planning objectives.
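
As a minimal sketch, a deterministic policy over a small discrete state space can be stored as a lookup table. The corridor setting, the action names, and the helper `make_corridor_policy` are hypothetical and serve only to illustrate the state-to-action mapping.

```python
from typing import Dict

# Hypothetical 1-D corridor: states 0..n-1, with the goal at the last state.
Policy = Dict[int, str]


def make_corridor_policy(n_states: int) -> Policy:
    """Build a deterministic policy that moves right in every non-goal state."""
    return {state: "right" for state in range(n_states - 1)}


def act(policy: Policy, state: int) -> str:
    """Look up the action the policy prescribes for a state."""
    return policy[state]


policy = make_corridor_policy(5)
print(act(policy, 0))
```

A stochastic policy would instead store a probability distribution over actions for each state.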

Reward Shaping

Reward design technique that modifies the original reward function to more effectively guide the agent toward desired planning behaviors.
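
One well-known form is potential-based shaping, which adds the term gamma * Phi(s') - Phi(s) to the original reward. The sketch below assumes a hypothetical corridor with its goal at state 4 and uses the negative distance to the goal as the potential; both choices are illustrative assumptions, not part of the definition above.

```python
from typing import Callable

GOAL_STATE = 4  # hypothetical goal position in a 1-D corridor


def goal_potential(state: int) -> float:
    """Potential function: states closer to the goal have higher potential."""
    return -float(abs(GOAL_STATE - state))


def shaped_reward(
    reward: float,
    state: int,
    next_state: int,
    potential: Callable[[int], float],
    gamma: float = 0.99,
) -> float:
    """Return the original reward plus the potential-based shaping term."""
    return reward + gamma * potential(next_state) - potential(state)


# Moving from state 2 to state 3 gets a bonus even though the raw reward is 0.
print(shaped_reward(0.0, 2, 3, goal_potential, gamma=1.0))
```

Shaping terms of this form are known to leave the set of optimal policies unchanged, which is why they are a common starting point.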

Hierarchical RL Planning

Approach where the planning policy is decomposed into a hierarchy of sub-tasks or sub-policies, enabling more efficient resolution of complex planning problems.

Meta-Learning for Planning

Paradigm in which the agent learns how to learn, acquiring planning policies that adapt quickly to new environments or planning objectives.

Multi-Agent RL Planning

Extension of RL to scenarios where multiple agents simultaneously learn planning policies, requiring consideration of interactions and cooperation/competition between agents.

Robust RL Planning

Approach aiming to learn planning policies that maintain their performance in the face of uncertainties and variations in the environment or dynamics model.

Transfer Learning in RL Planning

Technique enabling the reuse of knowledge or policies learned in one planning context to accelerate learning in a new similar context.

Constrained RL Planning

RL formulation where the agent must optimize its planning policy while respecting safety, resource, or other domain-specific constraints.
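
One common way to handle such constraints is Lagrangian relaxation: the constraint violation is folded into the objective with a multiplier that grows while the constraint is violated. The sketch below is an assumption-laden illustration; the function names, the scalar cost budget, and the learning rate are hypothetical.

```python
def lagrangian_objective(
    expected_return: float, expected_cost: float, budget: float, lam: float
) -> float:
    """Penalized objective: return minus the weighted constraint violation."""
    return expected_return - lam * (expected_cost - budget)


def update_multiplier(
    lam: float, expected_cost: float, budget: float, lr: float = 0.1
) -> float:
    """Dual ascent step: raise lam when cost exceeds the budget, never below 0."""
    return max(0.0, lam + lr * (expected_cost - budget))


lam = 0.0
# Hypothetical measurements: the policy currently costs 2.0 against a budget of 1.0.
lam = update_multiplier(lam, expected_cost=2.0, budget=1.0, lr=0.5)
print(lam, lagrangian_objective(10.0, 2.0, 1.0, lam))
```

In practice the policy update and the multiplier update alternate, so the penalty tightens only while the constraint is actually violated.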

Model-Based Reinforcement Learning (Model-Based RL)

Approach in which the agent learns or uses an explicit model of the environment's dynamics to improve its planning and decision-making, in contrast to model-free RL.
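
To illustrate how an explicit model supports planning, the sketch below runs value iteration against a known deterministic model of a hypothetical five-state corridor. The model function, the reward of 1.0 for reaching the goal, and the discount factor are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

GOAL = 4  # hypothetical terminal state of a 5-state corridor


def corridor_model(state: int, action: int) -> Tuple[int, float]:
    """Known dynamics model: returns (next_state, reward) for a move of -1 or +1."""
    if state == GOAL:
        return state, 0.0  # terminal state is absorbing with no further reward
    next_state = min(max(state + action, 0), GOAL)
    return next_state, (1.0 if next_state == GOAL else 0.0)


def value_iteration(
    n_states: int,
    actions: Sequence[int],
    model: Callable[[int, int], Tuple[int, float]],
    gamma: float = 0.9,
    sweeps: int = 50,
) -> List[float]:
    """Plan with the model by repeatedly applying Bellman optimality backups."""
    values = [0.0] * n_states
    for _ in range(sweeps):
        new_values = []
        for s in range(n_states):
            backups = []
            for a in actions:
                s2, r = model(s, a)
                backups.append(r + gamma * values[s2])
            new_values.append(max(backups))
        values = new_values
    return values


def greedy_action(state, actions, model, values, gamma=0.9):
    """Pick the action whose one-step lookahead through the model is best."""
    def backup(a):
        s2, r = model(state, a)
        return r + gamma * values[s2]
    return max(actions, key=backup)


values = value_iteration(5, (-1, 1), corridor_model)
print(greedy_action(0, (-1, 1), corridor_model, values))
```

A model-free method would have to estimate the same values from sampled transitions instead of querying `corridor_model` directly.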

Continuous RL Planning

Specialization of RL for planning problems in which the state and action spaces are continuous, requiring dedicated approximation techniques such as actor-critic methods.

Planning Episode

Complete sequence of interactions between the agent and the environment, from an initial state to a terminal state, forming one unit of learning for the planning policy.
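
A rollout loop makes the idea concrete. The `Corridor` class and `run_episode` helper below are hypothetical; the sketch assumes a toy environment with a `reset`/`step` interface and a step cap as a safeguard against policies that never reach the terminal state.

```python
from typing import Callable, List, Tuple


class Corridor:
    """Hypothetical toy environment: walk from state 0 to the last state."""

    def __init__(self, length: int = 5) -> None:
        self.length = length
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        """Apply a move of -1 or +1; reward 1.0 on reaching the terminal state."""
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        return self.state, (1.0 if done else 0.0), done


def run_episode(
    env: Corridor, policy: Callable[[int], int], max_steps: int = 100
) -> List[Tuple[int, int, float]]:
    """Roll out one episode and return the (state, action, reward) trajectory."""
    state = env.reset()
    trajectory = []
    for _ in range(max_steps):
        action = policy(state)
        next_state, reward, done = env.step(action)
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


trajectory = run_episode(Corridor(5), policy=lambda s: 1)
print(len(trajectory), sum(r for _, _, r in trajectory))
```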

Imitation Learning for RL Planning

Method in which the agent learns a planning policy by imitating expert demonstrations, often used to initialize or guide reinforcement learning.

Policy Optimization

Class of RL algorithms that directly optimize the parameters of the planning policy to maximize the expected cumulative reward, including methods such as REINFORCE and PPO.
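
As a minimal sketch of the REINFORCE update, the example below trains a softmax policy on a hypothetical two-armed bandit with deterministic rewards. The bandit setting, the step count, the learning rate, and the function names are assumptions chosen to keep the example small; no baseline or discounting is used.

```python
import math
import random
from typing import List, Sequence


def softmax(prefs: Sequence[float]) -> List[float]:
    """Convert action preferences into a probability distribution."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(
    rewards: Sequence[float], steps: int = 500, lr: float = 0.1, seed: int = 0
) -> List[float]:
    """Run REINFORCE on a bandit and return the final action probabilities."""
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # policy parameters: one preference per arm
    for _ in range(steps):
        probs = softmax(theta)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        ret = rewards[action]  # the return of a one-step episode
        for k in range(len(theta)):
            # Gradient of log softmax: indicator(k == action) - probs[k]
            grad_log = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * ret * grad_log
    return softmax(theta)


probs = reinforce_bandit([0.0, 1.0])
print(probs)
```

The policy shifts probability toward the rewarded arm because each update scales the log-probability gradient by the return that followed the action.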
