
AI Glossary

A comprehensive dictionary of Artificial Intelligence

162 categories
2,032 subcategories
23,060 terms

Multi-Objective Q-Learning

Extension of the traditional Q-Learning algorithm that handles reward vectors instead of scalar rewards, enabling simultaneous optimization of multiple conflicting objectives.
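
As an illustration, the vector-valued update can be sketched as an ordinary temporal-difference update applied to each objective separately. This is a minimal sketch, not a reference implementation: the names `make_q_table` and `vector_q_update`, the two-objective setup, and the default learning rate and discount factor are all assumptions made for the example. The action used for the next state is passed in, because how to pick a "best" action from a vector is itself a design choice (weighted, lexicographic, Pareto-based).

```python
from collections import defaultdict

N_OBJECTIVES = 2  # assumed number of objectives for this sketch


def make_q_table(n_objectives=N_OBJECTIVES):
    # Q[(state, action)] holds one Q-value per objective (a Q-value vector).
    return defaultdict(lambda: [0.0] * n_objectives)


def vector_q_update(q, state, action, reward_vec, next_state, next_action,
                    alpha=0.1, gamma=0.9):
    """Apply the temporal-difference update component-wise.

    next_action is the action selected in next_state by whatever
    multi-objective selection rule the agent uses (an assumption here).
    """
    target = q[(next_state, next_action)]
    current = q[(state, action)]
    q[(state, action)] = [
        c + alpha * (r + gamma * t - c)
        for c, r, t in zip(current, reward_vec, target)
    ]
    return q[(state, action)]


q = make_q_table()
# One transition with a reward vector: +1 on objective 0, -2 on objective 1.
updated = vector_q_update(q, "s0", "a", [1.0, -2.0], "s1", "b")
```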

Q-value Vector

Multi-dimensional data structure where each element represents the Q-value for a specific objective, replacing the single scalar value of classical Q-Learning.

Lexicographic Approach

Multi-objective resolution strategy in which objectives are ordered by priority and optimized sequentially: a lower-priority objective is considered only among the choices that are already optimal for every higher-priority objective.
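
One way to picture this is as a tie-breaking filter over actions. The sketch below assumes each action has a Q-value vector ordered from highest to lowest priority; the function name `lexicographic_action` and the optional `tolerance` slack (how far below the best value an action may fall and still count as optimal) are hypothetical choices for the example.

```python
def lexicographic_action(q_vectors, tolerance=0.0):
    """Pick an action by filtering on each objective in priority order.

    q_vectors maps action -> list of Q-values, index 0 = highest priority.
    """
    candidates = list(q_vectors)
    n_objectives = len(next(iter(q_vectors.values())))
    for i in range(n_objectives):
        best = max(q_vectors[a][i] for a in candidates)
        # Keep only actions that are (near-)optimal on objective i.
        candidates = [a for a in candidates
                      if q_vectors[a][i] >= best - tolerance]
        if len(candidates) == 1:
            break
    return candidates[0]


q = {"left": [1.0, 5.0], "right": [1.0, 7.0], "stay": [0.5, 9.0]}
choice = lexicographic_action(q)  # "right": ties on objective 0, wins on 1
```

With `tolerance=0.6`, "stay" also survives the first filter and is then chosen for its higher second-objective value, which shows how the slack trades strict priority for flexibility.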

Multi-objective Trade-off

Necessary balance between improving certain objectives and potential degradation of others, inherent to optimization problems with conflicting objectives.

Weighted Q-value

Linear combination of individual Q-values from each objective using specific weights to reflect the relative importance of each objective in the final decision.
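
A small sketch of this linear combination, with hypothetical function names and made-up Q-values, shows how the weights steer the greedy choice:

```python
def weighted_q(q_vector, weights):
    """Scalarize a Q-value vector as a weighted sum of its components."""
    if len(q_vector) != len(weights):
        raise ValueError("q_vector and weights must have the same length")
    return sum(w * q for w, q in zip(weights, q_vector))


def greedy_action(q_vectors, weights):
    """Return the action whose weighted Q-value is largest."""
    return max(q_vectors, key=lambda a: weighted_q(q_vectors[a], weights))


# Illustrative values only: objective 0 = speed, objective 1 = safety.
q = {"fast": [10.0, -4.0], "safe": [6.0, -1.0]}
balanced = greedy_action(q, [0.5, 0.5])   # "fast" (3.0 vs 2.5)
cautious = greedy_action(q, [0.2, 0.8])   # "safe" (-1.2 vs 0.4)
```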

Pareto Q-Learning Algorithm

Variant of Q-Learning that maintains a set of Pareto-optimal policies and simultaneously learns Q-values for all possible trade-offs between objectives.
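
The building block behind any Pareto-based method is the dominance test and the filter that keeps only non-dominated vectors. The sketch below covers just that step, not the full algorithm; the function names are illustrative, and it assumes every objective is to be maximized.

```python
def dominates(u, v):
    """True if u is at least as good as v everywhere and better somewhere."""
    return (all(a >= b for a, b in zip(u, v))
            and any(a > b for a, b in zip(u, v)))


def pareto_front(vectors):
    """Keep only the vectors that no other vector dominates."""
    return [v for v in vectors
            if not any(dominates(u, v) for u in vectors)]


returns = [(3, 1), (2, 2), (1, 3), (1, 1), (2, 1)]
front = pareto_front(returns)  # [(3, 1), (2, 2), (1, 3)]
```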

Multi-objective Exploration

Exploration strategy adapted to multi-objective environments that must discover the available trade-offs between objectives while maintaining learning efficiency.

Nash Equilibrium in Q-Learning

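Game-theoretic concept applied to multi-objective Q-Learning, for example by treating each objective as a player: a situation in which no player can improve its own outcome by unilaterally changing its strategy. This differs from Pareto optimality, in which no objective can be improved without degrading another.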

Objective Decomposition

Technique transforming a multi-objective problem into several single-objective subproblems optimized simultaneously, facilitating the discovery of diverse solutions on the Pareto front.
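
A common way to decompose, assumed here for illustration, is to give each subproblem its own weight vector and solve it as a weighted-sum problem. The sketch below does this for two objectives over a fixed list of candidate return vectors; the function names and candidate values are hypothetical.

```python
def weight_vectors(n_subproblems):
    """Evenly spaced weight pairs (w, 1 - w) for two objectives."""
    if n_subproblems < 2:
        raise ValueError("need at least two subproblems")
    step = 1.0 / (n_subproblems - 1)
    return [(i * step, 1.0 - i * step) for i in range(n_subproblems)]


def solve_subproblems(candidates, weights_list):
    """For each weight vector, pick the candidate with the best weighted sum."""
    return [
        max(candidates, key=lambda v: sum(w * x for w, x in zip(weights, v)))
        for weights in weights_list
    ]


candidates = [(4, 0), (3, 3), (0, 4)]
solutions = solve_subproblems(candidates, weight_vectors(3))
# One solution per subproblem: [(0, 4), (3, 3), (4, 0)]
```

Weighted-sum subproblems can only reach solutions on the convex part of the Pareto front, so other scalarizing functions are sometimes used when the front is non-convex.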

Reward Vector

Multidimensional reward signal in which each component is the reward associated with a specific objective, replacing the traditional scalar reward signal.

Policy Space Adaptation

Mechanism that dynamically adapts the policy space to efficiently manage the additional complexity introduced by the multi-objective nature of the learning problem.
