AI Glossary

The complete glossary of AI

162 categories, 2,032 subcategories, 23,060 terms

Inverse Reinforcement Learning

Method that infers the reward function an expert is optimizing from the expert's demonstrated trajectories, which are assumed to be optimal or near-optimal. The agent can then learn a policy by optimizing the recovered reward.

State-only Imitation Learning

Learning paradigm in which the agent sees only the states visited by the expert, not the actions taken, so the expert's behavior must be inferred from state transitions alone.

Trajectory Matching

Approach that minimizes a divergence between the distribution of trajectories generated by the agent and that of the expert; often used when the expert's actions are not available.

GAIL

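Generative Adversarial Imitation Learning: a framework combining imitation learning with generative adversarial training. A discriminator learns to distinguish the expert's state-action pairs from the agent's, while the policy is trained to produce behavior the discriminator cannot tell apart from the expert's.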
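
The discriminator side of this setup can be sketched in a few lines. This is a minimal illustration, not the full GAIL algorithm: it assumes each state-action pair is summarized by a single hypothetical scalar feature, uses a logistic-regression discriminator where D(x) is the estimated probability that x came from the expert, and derives a surrogate reward -log(1 - D(x)). Sign conventions differ between implementations, and the policy update (a reinforcement learning step that uses this reward) is omitted.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_discriminator(expert_x, agent_x, lr=0.5, epochs=200):
    """Fit D(x) = sigmoid(w * x + b), the probability that x is expert data."""
    w, b = 0.0, 0.0
    data = [(x, 1.0) for x in expert_x] + [(x, 0.0) for x in agent_x]
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, label in data:
            err = sigmoid(w * x + b) - label  # gradient of the log loss
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return w, b

def surrogate_reward(x, w, b):
    """Reward that grows as the discriminator mistakes x for expert data."""
    d = sigmoid(w * x + b)
    return -math.log(1.0 - d + 1e-8)

# Hypothetical 1-D features of state-action pairs (illustrative values only).
expert_features = [0.9, 1.0, 1.1, 1.2]
agent_features = [-0.2, 0.0, 0.1, 0.3]
w, b = train_discriminator(expert_features, agent_features)
```

With these values, inputs that resemble the expert's features receive a higher surrogate reward than inputs that resemble the agent's, which is the signal the policy would then be trained on.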

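Dataset Aggregation (DAgger)

Iterative algorithm that runs the current agent, asks the expert to label the states the agent actually visits, and adds these labels to a growing dataset on which the agent is retrained. This corrects the agent's errors in states the original demonstrations never covered.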
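
The loop can be sketched on a toy problem. This is an illustrative sketch under stated assumptions, not a reference implementation: the five-state chain, the `expert_action` rule, the lookup-table learner, and its deliberately bad default action are all hypothetical, and the learner is rolled out from the first iteration rather than mixed with the expert.

```python
from collections import Counter, defaultdict

N_STATES, GOAL, START, HORIZON = 5, 4, 2, 6

def expert_action(state):
    """Hypothetical expert: step toward GOAL, then stay there."""
    if state < GOAL:
        return 1
    if state > GOAL:
        return -1
    return 0

def step(state, action):
    """Deterministic chain dynamics, clipped to the valid state range."""
    return min(max(state + action, 0), N_STATES - 1)

def fit_policy(dataset):
    """Majority-vote lookup table; unseen states get a bad default (-1)."""
    votes = defaultdict(Counter)
    for state, action in dataset:
        votes[state][action] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    return lambda state: table.get(state, -1)

def rollout(policy):
    """Return the states visited by the policy over one episode."""
    state, visited = START, []
    for _ in range(HORIZON):
        visited.append(state)
        state = step(state, policy(state))
    return visited

def dagger(iterations=4):
    dataset = []
    policy = fit_policy(dataset)
    for _ in range(iterations):
        visited = rollout(policy)  # states the learner reaches
        dataset += [(s, expert_action(s)) for s in visited]  # expert labels
        policy = fit_policy(dataset)  # retrain on the aggregate
    return policy, dataset

policy, dataset = dagger()
```

Each iteration adds expert labels for states the learner reached on its own, so the table gradually covers the states that matter for the learner's behavior rather than only those the expert would visit.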

Forward-Forward Algorithm

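Self-supervised method that learns to predict future states from current states without requiring action data; used in imitation from observation. The name more commonly refers to Hinton's Forward-Forward training procedure, which replaces backpropagation with two forward passes and is unrelated to imitation learning.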

Observation-based Learning

Learning process in which the agent acquires skills by observing only environment states and outcomes, without direct access to the expert's actions.

State Distribution Matching

Technique aiming to align the distribution of states visited by the agent with that of the expert, used when actions are not observable.
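
One way to make "aligning state distributions" concrete is to estimate both visitation distributions from sampled trajectories and measure a divergence between them. The sketch below assumes a small discrete state space, uses hypothetical trajectories, and picks the KL divergence as the objective; other divergences are equally valid choices.

```python
import math
from collections import Counter

def state_distribution(trajectories, states, smoothing=1e-3):
    """Smoothed empirical visitation distribution over a known state set."""
    counts = Counter(s for trajectory in trajectories for s in trajectory)
    total = sum(counts[s] + smoothing for s in states)
    return {s: (counts[s] + smoothing) / total for s in states}

def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same states."""
    return sum(p[s] * math.log(p[s] / q[s]) for s in p)

STATES = range(4)
# Hypothetical state-only trajectories (no actions recorded).
expert_trajs = [[0, 1, 2, 3], [0, 1, 2, 3]]
agent_trajs = [[0, 1, 1, 0], [0, 0, 1, 2]]

p_expert = state_distribution(expert_trajs, STATES)
p_agent = state_distribution(agent_trajs, STATES)
mismatch = kl_divergence(p_expert, p_agent)
```

An agent trained with this objective would adjust its policy to drive `mismatch` toward zero. The smoothing term only keeps the logarithm finite for states the agent has not visited yet.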

No-action Imitation

A form of imitation learning where the agent must learn to reproduce expert behavior without any information about the actions taken.

Passive Learning

Learning mode in which the agent passively observes demonstrations without actively interacting with the environment; typical of imitation from observation.

Expert Demonstration

Set of trajectories (state-action sequences, or state sequences alone) provided by an expert and used as the reference for imitation learning. In approaches without access to actions, it is the only source of supervision.

State-Action Distribution

Joint distribution over states and actions induced by a policy. The agent seeks to match the expert's; in imitation from observation it often has to be inferred from the state distribution alone.

Trajectory-based Learning

Learning approach that focuses on reproducing complete trajectories rather than individual state-action decisions, which makes it suitable for settings where actions are not observed.

Dynamics Model

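Model of the transitions between consecutive states. A forward model predicts the next state from the current state and action; an inverse model infers the action that links two consecutive states, which is what allows actions to be recovered from expert demonstrations that do not record them.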
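
As a rough illustration of inferring unobserved actions, the sketch below fits a tabular inverse dynamics model on transitions the agent collects itself, then uses it to label a state-only expert sequence. The chain environment, action set, and sample count are hypothetical choices made for the example.

```python
import random
from collections import Counter, defaultdict

ACTIONS = (-1, 0, 1)
N_STATES = 5

def step(state, action):
    """Deterministic chain dynamics, clipped to the valid state range."""
    return min(max(state + action, 0), N_STATES - 1)

def collect_transitions(n_samples=500, seed=0):
    """Agent's own exploratory (state, action, next_state) samples."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        s = rng.randrange(N_STATES)
        a = rng.choice(ACTIONS)
        data.append((s, a, step(s, a)))
    return data

def fit_inverse_dynamics(transitions):
    """Map each (state, next_state) pair to its most frequent action."""
    votes = defaultdict(Counter)
    for s, a, s_next in transitions:
        votes[(s, s_next)][a] += 1
    return {pair: c.most_common(1)[0][0] for pair, c in votes.items()}

def infer_actions(model, states):
    """Label consecutive state pairs; None marks a transition never seen."""
    return [model.get((s, s_next)) for s, s_next in zip(states, states[1:])]

model = fit_inverse_dynamics(collect_transitions())
expert_states = [1, 2, 3, 4]  # state-only demonstration, no actions recorded
inferred = infer_actions(model, expert_states)
```

The inferred labels can then be paired with the expert's states and used for ordinary behavioral cloning. Note that some transitions are ambiguous (staying at a boundary state can result from more than one action), so the recovered labels are only a best guess in those cases.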

Occupancy Measure

Distribution giving the (typically discounted) frequency with which a policy visits each state-action pair. When only state visitations are observable, its state-only marginal is used instead.
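
A minimal sketch of estimating this quantity from sampled trajectories, assuming discrete states and actions and a discount factor `gamma`. The trajectories are hypothetical, and normalizing by the total discounted weight is one convenient choice for finite-length data.

```python
from collections import defaultdict

def empirical_occupancy(trajectories, gamma=0.9):
    """Normalized discounted visitation frequency of each (state, action)."""
    weights = defaultdict(float)
    total = 0.0
    for trajectory in trajectories:
        for t, (state, action) in enumerate(trajectory):
            w = gamma ** t  # later steps count less
            weights[(state, action)] += w
            total += w
    return {pair: w / total for pair, w in weights.items()}

def state_marginal(occupancy):
    """Sum out actions, leaving the state-only visitation distribution."""
    marginal = defaultdict(float)
    for (state, _action), p in occupancy.items():
        marginal[state] += p
    return dict(marginal)

# Hypothetical demonstrations as lists of (state, action) pairs.
demos = [
    [("s0", "right"), ("s1", "right"), ("s2", "stay")],
    [("s0", "right"), ("s1", "stay")],
]
occupancy = empirical_occupancy(demos, gamma=0.5)
states_only = state_marginal(occupancy)
```

The `state_marginal` helper shows what remains when actions are not observed: the per-state totals are still available, but the split across actions is lost.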
