Monte Carlo Tree Search Planning

Monte Carlo Tree Search (MCTS)

Heuristic search algorithm for sequential decision problems that incrementally builds a partial search tree, using random simulations (rollouts) to estimate the value of nodes.
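The four usual phases (selection, expansion, simulation, backpropagation) can be sketched as follows. This is a minimal single-agent illustration, not a reference implementation: the `Node` class, the `mcts` signature, the exploration constant `c=1.4`, and the toy bit-picking problem are all assumptions made for the example.

```python
import math
import random

class Node:
    """One state in the partial search tree (hypothetical helper class)."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}   # action -> Node
        self.visits = 0
        self.total = 0.0     # sum of simulation returns seen through this node

def mcts(root_state, actions, step, is_terminal, reward, n_iter=2000, c=1.4, rng=None):
    rng = rng or random.Random(0)
    root = Node(root_state)
    for _ in range(n_iter):
        node = root
        # 1. Selection: descend through fully expanded nodes using UCB1.
        while not is_terminal(node.state) and len(node.children) == len(actions(node.state)):
            parent = node
            node = max(
                parent.children.values(),
                key=lambda ch: ch.total / ch.visits
                + c * math.sqrt(math.log(parent.visits) / ch.visits),
            )
        # 2. Expansion: add one child for an action not tried yet.
        if not is_terminal(node.state):
            untried = [a for a in actions(node.state) if a not in node.children]
            action = rng.choice(untried)
            child = Node(step(node.state, action), parent=node)
            node.children[action] = child
            node = child
        # 3. Simulation: random rollout until a terminal state.
        state = node.state
        while not is_terminal(state):
            state = step(state, rng.choice(actions(state)))
        value = reward(state)
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total += value
            node = node.parent
    # Recommend the most visited root action.
    return max(root.children, key=lambda a: root.children[a].visits)

# Toy problem (assumed for illustration): choose 4 bits, reward is the fraction of ones.
def actions(state): return [0, 1]
def step(state, action): return state + (action,)
def is_terminal(state): return len(state) == 4
def reward(state): return sum(state) / 4

best = mcts((), actions, step, is_terminal, reward)
print("recommended first action:", best)
```

Returning the most visited root action, rather than the one with the highest mean, is a common choice because visit counts are less sensitive to a few lucky rollouts.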

Tree Search Planning

Planning process that uses a tree structure to explore possible future action sequences in order to determine the best action or policy to follow from a given state.

Learned Transition Model

Function or neural network trained to predict the next state of the environment based on the current state and the chosen action, used to simulate the branches of the search tree.
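As a rough illustration of the interface such a model exposes, the sketch below learns transitions by counting observed `(state, action, next_state)` triples. A count table is used here only as a simple stand-in for a trained neural network; the class name `TabularTransitionModel` and its methods are hypothetical.

```python
from collections import Counter, defaultdict

class TabularTransitionModel:
    """Count-based stand-in for a learned transition model (illustrative only)."""

    def __init__(self):
        # (state, action) -> Counter of observed next states
        self.counts = defaultdict(Counter)

    def update(self, state, action, next_state):
        """Record one observed real transition."""
        self.counts[(state, action)][next_state] += 1

    def predict(self, state, action):
        """Return the most frequently observed next state, or None if unseen."""
        seen = self.counts.get((state, action))
        if not seen:
            return None
        return seen.most_common(1)[0][0]

model = TabularTransitionModel()
model.update("s0", "right", "s1")
model.update("s0", "right", "s1")
model.update("s0", "right", "s2")
print(model.predict("s0", "right"))  # s1
```

A tree search would call `predict` in place of the real environment when it simulates a branch.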

Upper Confidence Bound (UCB1)

Formula for balancing exploration and exploitation, used in the selection phase of MCTS to choose the most promising child node: it favors actions with a high average value as well as actions that have been tried only a few times.
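In its usual form, the UCB1 score of a child is its mean value plus an exploration bonus, `mean + c * sqrt(ln(N_parent) / N_child)`. The sketch below is one way to write it; the function names, the default `c = sqrt(2)`, and the convention of scoring unvisited children as infinity are assumptions of this example.

```python
import math

def ucb1(mean_value, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: exploitation term plus exploration bonus."""
    if child_visits == 0:
        return math.inf  # assumed convention: try every child at least once
    return mean_value + c * math.sqrt(math.log(parent_visits) / child_visits)

def select_child(children, parent_visits, c=math.sqrt(2)):
    """children: dict mapping action -> (visits, total_value). Returns the best action."""
    def score(action):
        visits, total = children[action]
        mean = total / visits if visits else 0.0
        return ucb1(mean, visits, parent_visits, c)
    return max(children, key=score)

# Same mean value (0.5), but "b" has been tried far less often, so it wins.
children = {"a": (50, 25.0), "b": (5, 2.5)}
print(select_child(children, parent_visits=55))  # b
```

Larger values of `c` push the search toward rarely tried actions; smaller values make it greedier with respect to the current averages.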

Node Expansion

Phase of MCTS in which a new child node is added to the search tree below a selected node, representing a state-action pair that has not yet been explored.

State Representation

Encoding of the environment's state, often in the form of a tensor or vector, that serves as input to the transition model and the reward model for planning.

Imagination Augmented Agents (I2A)

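Agent architecture that augments a model-free policy with imagined rollouts produced by a learned environment model. The rollouts are encoded and passed to the policy as additional context, rather than being used in an explicit tree search such as MCTS, allowing the agent to imagine and evaluate the future consequences of its actions before making a decision.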

Value-Guided Tree Search

Variant of MCTS where the simulation (rollout) phase is replaced by the direct use of a value neural network to estimate the return of a node, thus speeding up the search.
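The difference between the two ways of evaluating a leaf can be sketched as below. The names `rollout_value` and `evaluate_leaf` are hypothetical, and `value_fn` stands in for a trained value network (here it is just any callable that maps a state to a number).

```python
import random

def rollout_value(state, step, actions, is_terminal, reward, rng):
    """Classic MCTS leaf evaluation: play random actions until the episode ends."""
    while not is_terminal(state):
        state = step(state, rng.choice(actions(state)))
    return reward(state)

def evaluate_leaf(state, value_fn=None, **rollout_kwargs):
    """Value-guided evaluation: one call to value_fn replaces the whole rollout."""
    if value_fn is not None:
        return value_fn(state)
    return rollout_value(state, **rollout_kwargs)

# Toy problem (assumed for illustration): choose 3 bits, reward is the fraction of ones.
toy = dict(
    step=lambda s, a: s + (a,),
    actions=lambda s: [0, 1],
    is_terminal=lambda s: len(s) == 3,
    reward=lambda s: sum(s) / 3,
)

# Rollout-based estimate: noisy, costs one model step per remaining move.
print(evaluate_leaf((1,), rng=random.Random(0), **toy))
# Value-guided estimate: a single call to a (here hand-written) value function.
print(evaluate_leaf((1,), value_fn=lambda s: (sum(s) + (3 - len(s)) / 2) / 3))
```

The saving grows with episode length, since a rollout needs one model step per remaining move while the value function is called once.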

Root Policy Function

Probability distribution over possible actions from the root state, often derived from a neural network, which can be used to bias the MCTS selection phase and accelerate convergence towards optimal actions.
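One common way to let such a prior bias selection is the PUCT rule popularized by AlphaZero-style agents, `Q + c_puct * P * sqrt(N_parent) / (1 + N_child)`. The sketch below assumes that rule; the function names, the value `c_puct=1.5`, and the hand-written prior are illustrative choices, not something the definition above prescribes.

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Mean value plus a prior-weighted exploration bonus (PUCT-style)."""
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

def select_action(stats, priors, c_puct=1.5):
    """stats: action -> (visits, total_value); priors: action -> probability."""
    parent_visits = sum(visits for visits, _ in stats.values())
    def score(action):
        visits, total = stats[action]
        q = total / visits if visits else 0.0
        return puct_score(q, priors[action], parent_visits, visits, c_puct)
    return max(stats, key=score)

priors = {"a": 0.9, "b": 0.1}  # hypothetical policy output at the root

# With equal statistics, the prior decides: "a" is tried first.
print(select_action({"a": (1, 0.5), "b": (1, 0.5)}, priors))   # a
# Once "a" has many visits, its bonus shrinks and "b" gets explored.
print(select_action({"a": (50, 25.0), "b": (1, 0.5)}, priors))  # b
```

Because the bonus is scaled by the prior, actions the policy considers unlikely are visited late, which is how the prior can speed up convergence when it is accurate.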

Online Planning

An approach where the tree search is performed at each time step, starting from the current state, to determine the best immediate action, as opposed to pre-computed offline planning.
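The replanning loop can be sketched as follows. To keep the example short, the planner is an exhaustive depth-limited search rather than MCTS, and the corridor environment, `GOAL`, `model`, and `plan` are all hypothetical names introduced for the illustration.

```python
from itertools import product

ACTIONS = (-1, +1)   # move left or right along a corridor
GOAL = 3

def model(state, action):
    """Transition model used both for planning and for acting (toy assumption)."""
    return state + action

def plan(state, depth=3):
    """Search all action sequences of length `depth`; return the first action of the best."""
    def score(sequence):
        s, total = state, 0
        for action in sequence:
            s = model(s, action)
            total -= abs(GOAL - s)   # cumulative cost: distance to goal at each step
        return total
    best_sequence = max(product(ACTIONS, repeat=depth), key=score)
    return best_sequence[0]

def run_episode(start=0, max_steps=10):
    state, trajectory = start, [start]
    for _ in range(max_steps):
        if state == GOAL:
            break
        action = plan(state)          # replan from the current state at every step
        state = model(state, action)  # execute only the first planned action
        trajectory.append(state)
    return trajectory

print(run_episode())  # [0, 1, 2, 3]
```

Only the first action of each plan is executed; the rest is discarded and recomputed, which lets the agent react if the real state differs from what the plan assumed.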

Asymmetric Search Tree

A characteristic of MCTS where the tree grows non-uniformly, deepening the most promising branches while spending few simulations on the others, which keeps the search tractable when the action space is large.

Model-Based Reinforcement Learning (Model-Based RL)

A reinforcement learning paradigm in which an agent learns a model of its environment and then uses this model in a planning process, such as MCTS, to improve its policy without requiring a real interaction with the environment for each update.
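The idea of improving a policy from imagined rather than real experience can be illustrated with a Dyna-Q-style update. Note that this swaps in Dyna-Q for simplicity instead of MCTS; the function `dyna_q_step`, the deterministic dictionary model, and the hyperparameters are assumptions of this sketch.

```python
import random

def dyna_q_step(q, model, transition, actions, rng,
                alpha=0.5, gamma=0.9, planning_steps=5):
    """One real Q-learning update followed by several updates on model-generated data."""
    def td_update(s, a, r, s2):
        best_next = max(q.get((s2, b), 0.0) for b in actions)
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + alpha * (r + gamma * best_next - old)

    s, a, r, s2 = transition
    td_update(s, a, r, s2)       # learn from the single real transition
    model[(s, a)] = (r, s2)      # update the learned (deterministic) model
    for _ in range(planning_steps):
        # Planning: replay a transition predicted by the model, no real interaction.
        (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
        td_update(ps, pa, pr, ps2)

q, model = {}, {}
dyna_q_step(q, model, ("s0", "go", 1.0, "s1"), actions=["go"], rng=random.Random(0))
print(q[("s0", "go")])  # 0.984375
```

With a single real transition, plain Q-learning would leave the estimate at 0.5; the five model-based replays move it to 0.984375, close to the true value of 1.0, without touching the environment again.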
