AI Glossary
A Complete Dictionary of Artificial Intelligence
Sequence Modeling
Approach that formalizes reinforcement learning as a sequence modeling problem, where states, actions, and rewards are treated as tokens in a temporal sequence.
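A minimal sketch of this idea, assuming a Decision Transformer-style token layout (the interleaving order and the use of return-to-go tokens are assumptions, not prescribed by the definition above): an episode is flattened into one sequence of (return-to-go, state, action) tokens.

```python
# Minimal sketch: flatten an episode into one token sequence by interleaving
# return-to-go, state, and action tokens (assumed layout, for illustration).

def to_token_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) triples into a flat sequence."""
    # Return-to-go at step t is the sum of rewards from t to the episode end.
    rtg, total = [], 0.0
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    rtg.reverse()
    tokens = []
    for g, s, a in zip(rtg, states, actions):
        tokens.extend([("rtg", g), ("state", s), ("action", a)])
    return tokens

# Example: a 3-step episode.
seq = to_token_sequence(states=[0, 1, 2], actions=["up", "up", "down"],
                        rewards=[1.0, 0.0, 2.0])
# seq[0] == ("rtg", 3.0): the first token carries the full episode return.
```

A sequence model trained on such token streams can then be queried like a language model, predicting the next action given the trajectory so far.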
Temporal Difference Transformer
Transformer variant that incorporates temporal-difference principles into the attention architecture, combining sequence modeling with bootstrapped updates of value estimates.
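To make the "bootstrap updating" part concrete, here is a plain TD(0) value update, shown independently of any transformer architecture (the function name and table-based values are illustrative assumptions):

```python
# Illustrative sketch of a bootstrapped TD(0) update: the value estimate for a
# state is moved toward a target that itself uses the current value estimate
# of the next state (that self-reference is the "bootstrap").

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.99):
    """Move V[s] toward the bootstrapped target r + gamma * V[s_next]."""
    target = r + gamma * V[s_next]
    V[s] += alpha * (target - V[s])
    return V

V = {"A": 0.0, "B": 1.0}
td0_update(V, s="A", r=0.5, s_next="B")
# V["A"] moved from 0.0 toward 0.5 + 0.99 * 1.0 = 1.49
```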
Trajectory Conditioning
Technique where the trajectory generator is conditioned on partial trajectory segments or specific goals, enabling precise control of the generated behavior.
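A sketch of how such conditioning is typically wired in (the goal-token prompt format and function name are hypothetical): the generator's context is a goal token followed by a partial trajectory prefix, and generation continues from there.

```python
# Sketch of trajectory conditioning via a prompt: a goal token is prepended to
# a partial trajectory, so the generator's continuation is steered toward the
# goal (interface is a simplified assumption, not a real library API).

def build_conditioned_prompt(goal, partial_trajectory):
    """Prepend a goal token so generation is conditioned on that goal."""
    return [("goal", goal)] + list(partial_trajectory)

prompt = build_conditioned_prompt(
    goal="reach_exit",
    partial_trajectory=[("state", "room_1"), ("action", "open_door")],
)
# prompt[0] == ("goal", "reach_exit")
```

Swapping the goal token changes the generated behavior without retraining, which is what gives this technique its precise control.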
Multi-step Prediction
Capability of transformer models to predict multiple future steps of a trajectory simultaneously, improving long-term consistency of generated state-action-reward sequences.
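The contrast with step-by-step generation can be sketched as follows (the toy model and both function names are assumptions for illustration): one call that returns k future states versus k autoregressive calls to a one-step model.

```python
# Sketch contrasting multi-step prediction (one call returns k future states)
# with a one-step autoregressive rollout (k sequential calls). The "model"
# here is a deterministic toy, standing in for a trained transformer.

def predict_k_steps(model, state, k):
    """Ask the model for k future states in a single forward call."""
    return model(state, k)

def rollout_one_step(model, state, k):
    """Baseline: apply the one-step model k times autoregressively."""
    out = []
    for _ in range(k):
        state = model(state, 1)[0]
        out.append(state)
    return out

toy = lambda s, k: [s + i + 1 for i in range(k)]  # toy dynamics: add 1 per step
assert predict_k_steps(toy, 0, 3) == rollout_one_step(toy, 0, 3) == [1, 2, 3]
```

Predicting all k steps jointly lets the model keep the generated state-action-reward sequence globally consistent, rather than compounding one-step errors.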
Distributional RL
Extension of reinforcement learning that models the complete distribution of returns rather than just their expectation, capturing uncertainty in trajectory predictions.
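A small sketch of the distributional idea (the categorical, fixed-atom support is an assumption in the style of C51-like methods): the return is represented as probability mass over a fixed set of values, from which both the expectation and the spread can be read off.

```python
# Sketch of a categorical return distribution: probabilities over fixed
# "atoms" of possible return values, instead of a single expected return.

atoms = [0.0, 1.0, 2.0]          # fixed support of possible returns
probs = [0.2, 0.5, 0.3]          # probability mass assigned to each atom

# The usual RL quantity, the expected return, is just the distribution's mean:
expected_return = sum(p * z for p, z in zip(probs, atoms))

# But the full distribution also exposes uncertainty, e.g. via the variance:
variance = sum(p * (z - expected_return) ** 2 for p, z in zip(probs, atoms))
```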
Attention-based Trajectory Embedding
Vector representation of trajectories obtained through attention mechanisms, capturing complex temporal dependencies between successive states, actions, and rewards.
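One common way to obtain such an embedding is attention pooling, sketched below (the dot-product scoring against a single query vector is a simplified assumption): per-step vectors are weighted by softmax attention scores and summed into one trajectory vector.

```python
# Sketch of attention pooling: per-step vectors are combined into a single
# trajectory embedding, with weights given by softmax over query-step scores.
import math

def attention_pool(step_vectors, query):
    """Weight step vectors by softmax(dot(query, step)) and sum them."""
    scores = [sum(q * x for q, x in zip(query, v)) for v in step_vectors]
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(step_vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, step_vectors))
            for i in range(dim)]

steps = [[1.0, 0.0], [0.0, 1.0]]          # two per-step embeddings
emb = attention_pool(steps, query=[1.0, 0.0])
# emb is a convex combination of the step vectors, weighted toward [1, 0]
```

Because the weights depend on the content of each step, the pooled vector can emphasize the states, actions, or rewards most relevant to the query.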