AI Glossary

A complete glossary of artificial intelligence

162 categories, 2,032 subcategories, 23,060 terms

DAgger (Dataset Aggregation)

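Imitation learning algorithm that iteratively collects data by querying an expert on the states visited by the current policy, then retrains the policy on the aggregate of all data collected so far. Training on states the learner actually reaches narrows the gap between the training distribution and the deployment distribution.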
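
The loop described above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not a reference implementation: the 1-D environment, the scripted `expert_action`, the tabular `fit`, and all constants are hypothetical, and the learner drives every rollout from the first iteration (no expert mixing).

```python
import random

N_STATES, GOAL, HORIZON = 11, 5, 20  # hypothetical toy constants

def expert_action(state):
    """Hypothetical expert: step toward GOAL, stay once there."""
    return (GOAL > state) - (GOAL < state)

def step(state, action, rng):
    """Toy dynamics: apply the action, with a 20% chance of a random slip."""
    if rng.random() < 0.2:
        action = rng.choice([-1, 0, 1])
    return min(N_STATES - 1, max(0, state + action))

def fit(dataset):
    """'Train' a tabular policy: majority expert label per state."""
    votes = {}
    for s, a in dataset:
        votes.setdefault(s, []).append(a)
    return {s: max(set(v), key=v.count) for s, v in votes.items()}

def act(policy, state):
    """Learned policy; unseen states fall back to 'stay' (action 0)."""
    return policy.get(state, 0)

def rollout(policy, rng):
    """Run the current policy and return the states it visited."""
    state = rng.randrange(N_STATES)
    visited = []
    for _ in range(HORIZON):
        visited.append(state)
        state = step(state, act(policy, state), rng)
    return visited

def dagger(n_iters=10, rollouts_per_iter=5, seed=0):
    rng = random.Random(seed)
    dataset, policy = [], {}
    for _ in range(n_iters):
        for _ in range(rollouts_per_iter):
            # Query the expert on every state the current policy visited.
            for s in rollout(policy, rng):
                dataset.append((s, expert_action(s)))
        # Retrain on the aggregate of all data collected so far.
        policy = fit(dataset)
    return policy, dataset

policy, dataset = dagger()
```

The key design point is that `dataset` is never reset between iterations, so each retraining step sees every expert label gathered so far.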

Data aggregation

Process of collecting and combining multiple datasets from different sources or learning iterations. In DAgger, the policy is retrained on the union of the data from all iterations, which progressively improves the robustness of the learned policy.

Iterative collection

Methodology of gathering data performed in several successive cycles, with each cycle using information from previous cycles. This approach allows for continuously refining the policy and exploring new states.

Behavioral policy

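Policy (a strategy or probability distribution over actions) that the agent follows while collecting data in DAgger. It is typically a mixture of the expert policy and the current learned policy, and it changes across iterations as the learned policy improves and the expert's share shrinks.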
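
A data-collection policy of this kind can be sketched as a probabilistic mixture of expert and learner. The helper name `make_behavior_policy`, the mixing weight `beta`, and the choice to sample the mixture independently at every step are assumptions made for illustration.

```python
import random

def make_behavior_policy(expert, learner, beta, rng=None):
    """Return a policy that follows `expert` with probability `beta`
    and `learner` otherwise (sampled independently at each call)."""
    rng = rng or random.Random(0)

    def behave(state):
        if rng.random() < beta:
            return expert(state)
        return learner(state)

    return behave

# Hypothetical usage: label which policy chose the action.
behave = make_behavior_policy(lambda s: "expert", lambda s: "learner", beta=0.3)
choices = [behave(s) for s in range(10)]
```

With `beta=1.0` the agent always follows the expert, and with `beta=0.0` it always follows the learned policy.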

State distribution

Probability distribution over the states the agent visits while executing its policy. DAgger seeks to align the distribution seen in training with the one encountered in real deployment.

Distribution bias

Mismatch between the distribution of the training data and the distribution encountered in production deployment, also known as distribution shift. DAgger reduces this bias by collecting data on states actually visited by the current policy.
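
One way to make this mismatch concrete is to compare empirical state histograms. The sketch below uses total variation distance over discrete states; the choice of metric and the function name `total_variation` are assumptions for illustration, not something the glossary prescribes.

```python
from collections import Counter

def total_variation(states_a, states_b):
    """Total variation distance between two empirical distributions
    over discrete (hashable) states: 0 = identical, 1 = disjoint."""
    pa, pb = Counter(states_a), Counter(states_b)
    na, nb = len(states_a), len(states_b)
    support = set(pa) | set(pb)
    return 0.5 * sum(abs(pa[s] / na - pb[s] / nb) for s in support)

# Hypothetical samples: states seen in training vs. in deployment.
train_states = [0, 0, 1, 1]
deploy_states = [0, 1, 1, 1]
gap = total_variation(train_states, deploy_states)
```

A value near 0 suggests the training states resemble those seen in deployment; a value near 1 suggests the policy is being run on states it was rarely or never trained on.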

Error correction

Process by which an expert provides the correct actions when the current agent policy makes mistakes. These corrections serve as new training data to improve the policy.

Expert querying

Mechanism for soliciting the correct actions from a human expert or an automated system for specific states visited by the agent. These queries are essential for generating high-quality training data.

Visited state

Specific configuration or situation of the environment that the agent reaches during the execution of its current policy. These states become query points for the expert in DAgger.

Current policy

Latest version of the agent's decision-making strategy, updated at each iteration of the DAgger algorithm. It is used to explore the environment and identify states requiring expert corrections.

Adaptive aggregation

Variant of DAgger that dynamically adjusts the proportion of expert actions versus current policy actions. This adaptation helps balance exploration and exploitation during learning.
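
As a rough illustration of adjusting the expert's share over time, the sketch below uses a fixed geometric decay of the mixing weight. The function name `beta_schedule` and the `decay` parameter are hypothetical, and a fixed schedule is only the simplest stand-in: a genuinely adaptive variant might instead set the weight from how often the learner disagrees with the expert.

```python
def beta_schedule(iteration, decay=0.5):
    """Probability of following the expert at a 1-based iteration:
    1.0 on the first iteration, then shrinking geometrically."""
    if iteration < 1:
        raise ValueError("iteration is 1-based")
    return decay ** (iteration - 1)

# Expert share over the first five iterations with the default decay.
shares = [beta_schedule(i) for i in range(1, 6)]
```

Early iterations lean on the expert to keep trajectories sensible, while later iterations let the learned policy drive so that its own mistakes surface and get labeled.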

Feedback loop

Continuous cycle in which executing the current policy leads to new states, which in turn require expert corrections. This iterative loop is the fundamental improvement mechanism in DAgger.

Online correction

Expert intervention process that occurs during real-time execution of the agent's policy. These immediate corrections help prevent the propagation of errors in trajectories.
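
A minimal sketch of such an intervention loop follows, assuming a deterministic toy environment and an expert that overrides the agent whenever the two disagree. The function `rollout_with_corrections` and the example policies are hypothetical.

```python
def rollout_with_corrections(learner, expert, step, start, horizon):
    """Run the learner, but let the expert override any action it
    disagrees with. Returns the final state and the logged corrections."""
    state, corrections = start, []
    for _ in range(horizon):
        action = learner(state)
        expert_choice = expert(state)
        if action != expert_choice:
            # Record the correction as a new training example,
            # then execute the expert's action instead.
            corrections.append((state, expert_choice))
            action = expert_choice
        state = step(state, action)
    return state, corrections

# Hypothetical setup: the learner always moves right, the expert stops at 3.
learner = lambda s: 1
expert = lambda s: (3 > s) - (3 < s)
final_state, corrections = rollout_with_corrections(
    learner, expert, lambda s, a: s + a, start=0, horizon=6
)
```

Because the expert's action is executed immediately, the trajectory never drifts past the point of the first mistake, and the logged pairs can be added to the training set.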

Trajectory distribution

Distribution over the sequences of states and actions that the agent generates by following its current policy. DAgger aims to align this distribution with the one produced by the expert policy.

Target policy

Policy that the agent seeks to imitate, typically the expert's policy as represented by its demonstrations and corrections, and usually assumed to be optimal or near-optimal. The goal of DAgger is to make the learned policy converge toward this target policy.

Progressive aggregation

Data accumulation strategy where each new iteration adds complementary information to existing data. This approach ensures growing coverage of the relevant state space.

Compaction error

Performance gap between the learned policy and the expert policy due to representation limitations. DAgger minimizes this error by collecting data on the true state distribution.
