Partially Observable MARL (Multi-Agent Reinforcement Learning)

Terms

POMDP (Partially Observable Markov Decision Process)

Theoretical framework modeling environments where the agent perceives only a partial observation of the true state, requiring probabilistic inference about the hidden state to make optimal decisions.
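In the standard textbook notation (the symbols below are an assumption of this note, not taken from the glossary itself), a POMDP is usually written as a tuple of states, actions, a transition model, a reward function, observations, an observation model, and a discount factor:

```latex
\[
\mathcal{M} = \langle S, A, T, R, \Omega, O, \gamma \rangle,
\qquad
T(s' \mid s, a), \quad
O(o \mid s', a), \quad
R(s, a), \quad
\gamma \in [0, 1)
\]
```

The agent never sees $s \in S$ directly; it only receives $o \in \Omega$ drawn from $O$, which is why it has to reason about the hidden state probabilistically.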

Observation Space

Set of partial sensory signals that each agent can perceive from the environment, representing incomplete information about the global state of the system.

Belief State

Probability distribution over the hidden state space that an agent maintains and updates from its successive observations to represent its uncertainty about the true state of the environment.
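As a rough illustration of how such a distribution can be updated from one observation, here is a minimal discrete Bayes-filter sketch. The table layouts `T[a][s][s2]` and `O[a][s2][o]`, the function name `belief_update`, and the two-state example are all hypothetical choices made for this sketch.

```python
def belief_update(belief, action, observation, T, O):
    """One Bayes-filter step: b'(s2) is proportional to
    O[a][s2][o] * sum_s T[a][s][s2] * b(s)."""
    n = len(belief)
    # Prediction: push the current belief through the transition model.
    predicted = [
        sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    # Correction: weight each predicted state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2] for s2 in range(n)]
    z = sum(unnormalized)
    if z == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / z for u in unnormalized]


# Hypothetical example: the hidden state never changes, and the sensor
# reports the correct state 85% of the time.
T = [[[1.0, 0.0], [0.0, 1.0]]]      # T[a][s][s2], a single action
O = [[[0.85, 0.15], [0.15, 0.85]]]  # O[a][s2][o]

b0 = [0.5, 0.5]
b1 = belief_update(b0, 0, 0, T, O)  # observe "0" once
b2 = belief_update(b1, 0, 0, T, O)  # observe "0" again
```

Each repeated observation of the same signal shifts more probability mass toward the matching state, which is the sense in which the belief "represents uncertainty" and shrinks it over time.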

Communication Protocol

Mechanism defining when, how, and what information agents can exchange among themselves to coordinate their actions in a partially observable environment.

Centralized Training with Decentralized Execution

Approach where agents train using global information (states, actions of all agents) but execute their policies individually using only their local observations.

Value Function Factorization

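Technique decomposing the global (joint) value function into per-agent or local value functions, combined for example by a sum (as in VDN) or by a monotonic mixing function (as in QMIX), so that each agent can select actions in a decentralized way while remaining consistent with the joint value.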
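A minimal sketch of the additive case (the form used by VDN-style methods) is shown below. It checks that when the joint value is a plain sum of per-agent utilities, each agent maximizing its own utility yields the same joint action as a centralized search. The utility numbers and function names are hypothetical.

```python
from itertools import product


def joint_q_additive(local_qs, joint_action):
    """Joint value as the sum of each agent's utility for its own action."""
    return sum(q[a] for q, a in zip(local_qs, joint_action))


def greedy_decentralized(local_qs):
    """Each agent independently picks the action maximizing its own utility."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in local_qs)


# Hypothetical per-agent utilities, one list per agent, indexed by action.
local_qs = [[0.1, 0.7, 0.2], [0.5, 0.4]]

decentralized = greedy_decentralized(local_qs)

# Centralized search over every joint action, for comparison only.
all_joint_actions = product(*(range(len(q)) for q in local_qs))
centralized = max(all_joint_actions, key=lambda ja: joint_q_additive(local_qs, ja))
```

The centralized search grows exponentially with the number of agents, while the decentralized choice costs one `max` per agent; this is the practical payoff of the decomposition.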

Adversary Modeling

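Process of inferring the policies or intentions of other agents from their observed behavior, crucial for decision-making in both competitive and cooperative environments. Often called opponent modeling or agent modeling, since the modeled agents need not be adversaries.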

Credit Assignment Problem

Difficulty in correctly attributing the global reward to each agent in a multi-agent system, particularly complex when observations are partial and actions are interdependent.

Joint Action Learning

Method where agents learn to coordinate their actions by explicitly modeling the impact of combined actions on the global reward, despite partial observability.

State Estimation

Algorithmic process allowing an agent to infer the most probable global state from its local observations and its model of the environment.

Information Sharing

Strategy defining how agents distribute and aggregate their local observations to improve the collective knowledge of the environment's state.

Local Observation History

Temporal sequence of an agent's past observations, used as additional context to compensate for the lack of information about the current global state.
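One simple way to keep such a history is a fixed-length sliding window over the most recent observations, sketched below. The class name `ObservationHistory`, the window length, and the padding value are hypothetical; recurrent networks are a common alternative that summarize the history in a hidden state instead.

```python
from collections import deque


class ObservationHistory:
    """Fixed-length window over an agent's most recent local observations."""

    def __init__(self, length, pad):
        # Pre-fill with a padding value so the window has a constant size
        # even before `length` observations have been seen.
        self._buffer = deque([pad] * length, maxlen=length)

    def push(self, observation):
        # Appending to a full deque silently drops the oldest entry.
        self._buffer.append(observation)

    def as_tuple(self):
        # Oldest observation first, newest last.
        return tuple(self._buffer)


history = ObservationHistory(length=3, pad=0)
for obs in [1, 2, 3, 4]:
    history.push(obs)
window = history.as_tuple()
```

A policy can then condition on `window` instead of the latest observation alone, which helps disambiguate states that look identical in a single step.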

Multi-agent Partial Observability

Condition where no individual agent can observe the complete state of the system, requiring coordination and inference strategies to achieve optimal performance.

Decentralized Policy

Decision function for each agent that maps its local observation history to an action, without direct dependence on other agents' information during execution.

Common Knowledge

Information that every agent knows, that every agent knows every other agent knows, and so on at every level of nesting; essential for coordination in partially observable environments.

Coordination Graph

Structure representing interaction dependencies between agents, allowing the global decision problem to be factored into easier-to-solve local subproblems.
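To make the factoring concrete, here is a small sketch for a hypothetical three-agent chain (agent 0, agent 1, agent 2) whose global payoff is the sum of two pairwise payoff tables. For each action of the middle agent, the two end agents can be optimized independently, which is a simple instance of variable elimination. The payoff tables and the function name `chain_argmax` are invented for illustration.

```python
from itertools import product


def chain_argmax(f01, f12):
    """Maximize f01[a0][a1] + f12[a1][a2] on the chain 0 - 1 - 2.

    For every action of the middle agent, agents 0 and 2 are optimized
    independently, so the cost is linear in the number of end-agent actions
    rather than their product.
    """
    n0, n1, n2 = len(f01), len(f12), len(f12[0])
    best_action, best_value = None, float("-inf")
    for a1 in range(n1):
        a0 = max(range(n0), key=lambda a: f01[a][a1])  # local subproblem on edge 0-1
        a2 = max(range(n2), key=lambda a: f12[a1][a])  # local subproblem on edge 1-2
        value = f01[a0][a1] + f12[a1][a2]
        if value > best_value:
            best_action, best_value = (a0, a1, a2), value
    return best_action, best_value


# Hypothetical pairwise payoffs, indexed [first agent's action][second agent's action].
f01 = [[4, 0], [0, 3]]
f12 = [[1, 0], [0, 5]]

best_action, best_value = chain_argmax(f01, f12)

# Brute-force check over all joint actions, feasible only for tiny problems.
brute_force = max(
    product(range(2), range(2), range(2)),
    key=lambda ja: f01[ja[0]][ja[1]] + f12[ja[1]][ja[2]],
)
```

On general graphs the same idea eliminates one agent at a time, and the cost depends on the graph's structure rather than on the total number of agents.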
