
AI Glossary

A complete dictionary of Artificial Intelligence

162 categories · 2,032 subcategories · 23,060 terms

Centralized Q-learning

A variant of Q-learning where agents share a common Q-table to coordinate their actions in a cooperative environment. This approach allows learning an optimal joint policy by considering the global state of the system.
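As a minimal sketch of the idea, the shared table can be indexed by the global state and the joint action of all agents; the sizes below (2 agents, 4 states, 3 actions) are purely illustrative:

```python
import numpy as np

# Illustrative sizes: 2 agents, 4 global states, 3 actions each.
# The SHARED table is indexed by (state, action_1, action_2).
n_states, n_actions = 4, 3
Q = np.zeros((n_states, n_actions, n_actions))

def update(state, joint_action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step on the shared joint-action table."""
    a1, a2 = joint_action
    best_next = Q[next_state].max()          # max over all joint actions
    td_target = reward + gamma * best_next
    Q[state, a1, a2] += alpha * (td_target - Q[state, a1, a2])

def greedy_joint_action(state):
    """Both agents pick the joint action maximizing the shared Q."""
    return np.unravel_index(Q[state].argmax(), Q[state].shape)

update(0, (1, 2), reward=1.0, next_state=3)
```

Because both agents read from the same table, the greedy joint action is coordinated by construction, at the cost of a table that grows exponentially with the number of agents.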

Value Decomposition Networks (VDN)

A neural network architecture that decomposes the team value into the sum of individual values of each agent. This method maintains agent individuality while maximizing collective reward.
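The decomposition can be sketched in a few lines; the per-agent Q-values below are hypothetical numbers standing in for the outputs of each agent's network:

```python
import numpy as np

# Hypothetical per-agent Q-values (normally the outputs of each
# agent's network for its own observation); 2 agents, 3 actions each.
q_agent = [np.array([0.2, 1.5, 0.1]),
           np.array([0.7, 0.3, 2.0])]

def q_team(joint_action):
    """VDN's decomposition: the team value of a joint action is the
    SUM of the individual values of the chosen actions."""
    return sum(q[a] for q, a in zip(q_agent, joint_action))

# Because the mixing is a plain sum, each agent can maximize its own
# Q independently and the resulting joint action maximizes the team value.
greedy = tuple(int(q.argmax()) for q in q_agent)
```

The additive form is what makes decentralized greedy action selection consistent with maximizing the team value.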

QMIX

A multi-agent reinforcement learning algorithm that extends value decomposition with a monotonic mixing network. QMIX keeps individual values consistent with the global value while allowing a richer, nonlinear, state-dependent mixing than a simple sum.
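A minimal numpy sketch of the monotonic-mixing idea, with hypothetical fixed weights (in QMIX these are produced by hypernetworks conditioned on the global state); forcing the weights non-negative keeps the team value non-decreasing in every individual value:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mixing parameters. Taking the absolute value forces the
# weights non-negative, so dQ_tot/dQ_i >= 0 (the monotonicity constraint).
W1 = np.abs(rng.normal(size=(2, 4)))   # 2 agent values -> 4 hidden units
b1 = rng.normal(size=4)
W2 = np.abs(rng.normal(size=4))        # hidden -> scalar team value
b2 = rng.normal()

def q_tot(q_agents):
    """Monotonic mixing: non-negative weights plus a monotone
    activation keep Q_tot non-decreasing in each individual Q_i."""
    h = q_agents @ W1 + b1
    h = np.where(h > 0, h, np.expm1(h))  # ELU, a monotone activation
    return float(h @ W2 + b2)

# Raising one agent's value never lowers the team value.
base = q_tot(np.array([0.5, 1.0]))
higher = q_tot(np.array([0.9, 1.0]))
```

The monotonicity constraint is exactly what preserves the IGM property while permitting nonlinear mixing.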

Counterfactual Regret Minimization (CFR)

An optimization technique that iteratively minimizes each agent's counterfactual regret. In cooperative settings it guides the team toward better joint strategies by evaluating what the outcome would have been had an agent acted differently at each decision point.
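The core update inside counterfactual regret minimization is regret matching: play each action with probability proportional to its positive cumulative regret. The regret values below are illustrative only:

```python
import numpy as np

def regret_matching(cum_regret):
    """Regret matching: probabilities proportional to positive
    cumulative regrets; uniform when no regret is positive."""
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    if total > 0:
        return pos / total
    return np.full(len(cum_regret), 1.0 / len(cum_regret))

# Illustrative cumulative regrets for three actions.
cum_regret = np.array([2.0, -1.0, 3.0])
strategy = regret_matching(cum_regret)
```

Averaging these strategies over iterations is what drives the overall regret toward zero.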

Commutative Monotonicity

A mathematical property, important in value decomposition algorithms, stating that the order of the agents does not affect the total value. This condition ensures that the collective reward is consistent under any permutation of the agents.
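A sum-based decomposition (as in VDN) has this property trivially: shuffling which agent contributed which value leaves the team value unchanged. A quick brute-force check, with illustrative values:

```python
import numpy as np
from itertools import permutations

# Hypothetical individual values for three agents.
q_values = np.array([0.4, 1.2, 0.9])

# Summing is permutation-invariant: every ordering of the agents
# yields the same team value (rounding absorbs float noise).
team_values = {round(float(sum(p)), 9) for p in permutations(q_values)}
```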

Individual-Global-Max (IGM)

A fundamental principle stating that the global maximum of the team value function must be achieved when each agent chooses its maximum individual action. This property ensures consistency between local decisions and global optimality.
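The principle can be checked by brute force for small action spaces: the greedy joint action under the team value must coincide with each agent greedily maximizing its own Q. The per-agent values below are hypothetical:

```python
import numpy as np
from itertools import product

def satisfies_igm(q_tot, q_agent):
    """Brute-force IGM check: argmax of the team value over joint
    actions must equal the tuple of per-agent argmaxes."""
    ranges = (range(len(q)) for q in q_agent)
    joint_best = max(product(*ranges), key=q_tot)
    local_best = tuple(int(q.argmax()) for q in q_agent)
    return joint_best == local_best

# Hypothetical per-agent Q-values; a sum decomposition (VDN-style)
# satisfies IGM by construction.
q_agent = [np.array([0.1, 2.0]), np.array([1.5, 0.2])]

def vdn_q_tot(joint):
    return sum(q[a] for q, a in zip(q_agent, joint))
```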

Multi-Agent Deep Deterministic Policy Gradient (MADDPG)

An extension of DDPG to multi-agent environments that uses centralized critics with decentralized actors during training. MADDPG lets agents learn decentralized policies while their critics have access to every agent's observations and actions during training.
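The information flow can be sketched with linear stand-ins for the networks (all weights and sizes below are hypothetical): the actor consumes only its own observation, while the critic consumes everyone's observations and actions:

```python
import numpy as np

rng = np.random.default_rng(1)
OBS_DIM, ACT_DIM, N_AGENTS = 3, 2, 2   # illustrative sizes

# Decentralized actor: maps ONLY the agent's own observation to its action.
W_actor = rng.normal(size=(OBS_DIM, ACT_DIM))
def actor(obs):
    return np.tanh(obs @ W_actor)

# Centralized critic: sees every agent's observation and action during
# training, which makes the environment stationary from its viewpoint.
W_critic = rng.normal(size=N_AGENTS * (OBS_DIM + ACT_DIM))
def critic(all_obs, all_acts):
    x = np.concatenate(list(all_obs) + list(all_acts))
    return float(x @ W_critic)

obs = [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)]
acts = [actor(o) for o in obs]    # execution: local information only
q_value = critic(obs, acts)       # training signal: global information
```

At deployment only the actors are kept, so execution stays fully decentralized.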

Centralized Training with Decentralized Execution (CTDE)

A learning paradigm where agents train with access to global information but execute decentralized policies during deployment. This approach combines the advantages of centralized coordination during learning with the robustness of distributed execution.

Attention-based Communication

A communication mechanism in which each agent learns to attend selectively to the relevant messages from its teammates. This approach optimizes information flow and reduces computational complexity in agent teams.
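A minimal sketch of the selective-listening step using scaled dot-product attention; the message vectors and the listener's query are random placeholders for learned embeddings:

```python
import numpy as np

rng = np.random.default_rng(2)
D = 4  # illustrative message/embedding dimension

# Messages broadcast by 3 teammates, plus the listening agent's query
# (in practice both come from learned encoders).
messages = rng.normal(size=(3, D))
query = rng.normal(size=D)

def attend(query, messages):
    """Scaled dot-product attention: weight each message by relevance,
    then aggregate them into a single incoming vector."""
    scores = messages @ query / np.sqrt(D)
    weights = np.exp(scores - scores.max())   # stable softmax
    weights /= weights.sum()
    return weights, weights @ messages

weights, incoming = attend(query, messages)
```

The listener receives one fixed-size vector regardless of how many teammates broadcast, which is where the computational saving comes from.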

Mean Field Reinforcement Learning

A theoretical approach that models the behavior of a large population of agents as a mean field rather than as individual pairwise interactions. This method scales multi-agent learning to very large populations while still capturing emergent collective behavior.
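The key reduction is that an agent reacts to the average action of its neighbors instead of to each neighbor individually, so its input size stays constant as the population grows. A sketch with hypothetical one-hot neighbor actions:

```python
import numpy as np

# Hypothetical one-hot actions of four neighbors (3 possible actions).
neighbor_actions = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
])

# The "mean field": the empirical distribution of neighbor actions,
# which replaces the full list of individual neighbor actions.
mean_action = neighbor_actions.mean(axis=0)
```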

Team-Q

A Q-learning algorithm extended to team environments in which the Q-function is defined over the joint actions of all agents. Team-Q allows learning optimal coordinated strategies in problems with discrete state spaces.
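A tabular sketch, keying the table by (state, joint action); the states, actions, and learning rate below are illustrative:

```python
from collections import defaultdict

# Table keyed by (state, joint_action), where joint_action is a tuple
# over all agents; 2 agents with 2 actions each in this toy setup.
Q = defaultdict(float)
JOINT_ACTIONS = [(a1, a2) for a1 in range(2) for a2 in range(2)]

def team_q_update(s, joint_a, r, s_next, alpha=0.5, gamma=0.9):
    """Standard Q-learning backup, but the max runs over JOINT actions."""
    best_next = max(Q[(s_next, ja)] for ja in JOINT_ACTIONS)
    Q[(s, joint_a)] += alpha * (r + gamma * best_next - Q[(s, joint_a)])

team_q_update("s0", (0, 1), r=2.0, s_next="s1")
```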

Distributed Q-learning

A variant of Q-learning where each agent maintains its own Q-table but periodically shares information with the other agents. This approach combines local autonomy with collective learning to achieve effective coordination.
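One simple form of the periodic sharing step is averaging the local tables (one of several possible consensus rules); the table sizes and values below are illustrative:

```python
import numpy as np

# Two agents, each with its own local Q-table (3 states x 2 actions).
q_tables = [np.zeros((3, 2)), np.zeros((3, 2))]
q_tables[0][1, 0] = 4.0   # agent 0 learned something locally
q_tables[1][1, 0] = 2.0   # agent 1 has a different estimate

def sync(q_tables):
    """Periodic information sharing: overwrite every local table with
    the element-wise average of all tables."""
    mean = np.mean(q_tables, axis=0)
    for q in q_tables:
        q[...] = mean

sync(q_tables)
```

Between sync steps each agent updates only its own table, preserving local autonomy.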

Decentralized Partially Observable Markov Decision Process (Dec-POMDP)

A mathematical formalism for modeling multi-agent decision problems with partial, decentralized observation. Dec-POMDPs capture the complexity of cooperative environments in which each agent has only a limited view of the global state.

Cooperative Inverse Reinforcement Learning

An extension of inverse reinforcement learning in which multiple agents collaborate to infer a shared reward function from demonstrations. This approach lets the agents collectively learn which behavior maximizes the shared objective.

Shared Experience Replay

Technique where agents share a common experience buffer to improve learning efficiency. This method allows agents to learn faster by benefiting from the experiences of other team members.
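A minimal sketch of the shared buffer: every agent pushes its own transitions into one pool, and every agent samples from that pool. The transition fields are placeholders:

```python
import random
from collections import deque

# One buffer shared by the whole team.
shared_buffer = deque(maxlen=10_000)

def store(agent_id, obs, action, reward, next_obs):
    """Any agent appends its transition to the COMMON buffer."""
    shared_buffer.append((agent_id, obs, action, reward, next_obs))

def sample(batch_size):
    """Any agent draws a training batch from the pooled experience."""
    return random.sample(list(shared_buffer), batch_size)

store(0, "o0", 1, 0.5, "o1")   # transition from agent 0
store(1, "o2", 0, 1.0, "o3")   # agent 1 contributes to the same pool
batch = sample(2)
```

Pooling works best when agents are homogeneous; heterogeneous agents may need the `agent_id` tag to filter or reweight samples.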

Multi-Agent Actor-Critic

A learning architecture combining decentralized actors with centralized critics for cooperative multi-agent environments. This approach allows the actors to make local decisions guided by global evaluations provided by the critics.
