AI Glossary
A comprehensive dictionary of Artificial Intelligence
Autoregressive Model
Generative model architecture that predicts the next token from all previous tokens, building the sequence one token at a time.
Context Window
Maximum number of tokens the model can process at once, limiting the amount of prior context available for prediction.
Next Token Prediction
Fundamental training objective of autoregressive models: maximizing the conditional probability P(token_t | token_1, ..., token_{t-1}) of each token given its preceding context.
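This objective implies a chain-rule factorization of the sequence probability. A minimal sketch in Python, where a toy bigram table (illustrative values, not from any real model) stands in for the learned conditional distribution:

```python
import math

# Toy bigram table standing in for a learned conditional distribution
# (values are illustrative, not from any real model).
bigram_probs = {
    ("<s>", "the"): 0.6,
    ("the", "cat"): 0.5,
    ("cat", "sat"): 0.4,
}

def sequence_log_prob(tokens):
    """Chain rule: log P(t_1..t_T) = sum over t of log P(t_t | t_1..t_{t-1})."""
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log(bigram_probs[(prev, cur)])
    return total
```

Training maximizes exactly this sum of log conditional probabilities over the corpus.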
Temperature Sampling
Generation technique controlling the randomness of next-token selection by rescaling the logits before the softmax: low temperatures sharpen the distribution, high temperatures flatten it.
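A minimal sketch of the rescaling, in plain Python (toy logits, illustrative function names):

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature before the softmax:
    T < 1 sharpens the distribution, T > 1 flattens it."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def sample(probs, rng=random):
    """Draw one token index from the adjusted distribution."""
    r, cum = rng.random(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1
```

At temperature 1.0 this is the ordinary softmax; as the temperature approaches 0 it approaches greedy argmax selection.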
Top-k Sampling
Generation method limiting selection to the k most probable tokens, avoiding low-probability tokens while maintaining diversity.
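A sketch of the filtering step, assuming an already-computed probability list (toy values):

```python
def top_k_filter(probs, k):
    """Keep only the k most probable tokens and renormalize their mass."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return {i: probs[i] / mass for i in top}
```

Sampling then proceeds over the renormalized subset, so low-probability tokens can never be drawn.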
Nucleus Sampling
Dynamic selection strategy (also known as top-p sampling) that keeps the smallest set of tokens whose cumulative probability exceeds a threshold p, adapting the number of candidates to the model's confidence.
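A sketch of the cumulative cutoff (toy values; contrast with top-k, where the candidate count is fixed):

```python
def nucleus_filter(probs, p=0.9):
    """Keep the smallest set of tokens whose cumulative probability >= p,
    then renormalize: a confident model yields few candidates, an
    uncertain one yields many."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}
```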
Beam Search
Decoding algorithm that keeps several candidate sequences in parallel at each step to approximate the most probable overall sequence.
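A toy sketch of the expand-and-prune loop, assuming a `next_probs(sequence)` callable (a hypothetical stand-in for a model) that returns a token-to-probability mapping:

```python
import math

def beam_search(next_probs, start, beam_width=2, steps=3):
    """Expand every beam with every candidate token, then keep only the
    `beam_width` partial sequences with the best total log-probability."""
    beams = [(0.0, [start])]
    for _ in range(steps):
        candidates = []
        for score, seq in beams:
            for token, prob in next_probs(seq).items():
                candidates.append((score + math.log(prob), seq + [token]))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    return beams
```

Because only `beam_width` partial sequences survive each step, the search is a tractable approximation rather than an exhaustive enumeration.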
Causal Language Model
Type of autoregressive model trained to predict future tokens based on past context, without access to future tokens during training.
Decoder-only Transformer
Neural architecture using only decoder layers with causal masking, preferred for modern autoregressive language models.
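The causal masking mentioned here can be sketched as a lower-triangular matrix (a minimal illustration; real implementations build this as a tensor):

```python
def causal_mask(seq_len):
    """Lower-triangular mask: position i may attend only to positions j <= i,
    so no token ever sees information from its future."""
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]
```

Masked (zero) entries are set to a large negative value before the attention softmax, driving their weights to zero.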
Greedy Decoding
Generation strategy that selects the highest-probability token at every step; deterministic and reproducible, but prone to repetitive, less creative output.
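A minimal sketch, assuming the same hypothetical `next_probs(sequence)` interface returning a token-to-probability mapping:

```python
def greedy_decode(next_probs, start, steps):
    """Always append the argmax of the conditional distribution."""
    seq = [start]
    for _ in range(steps):
        dist = next_probs(seq)
        seq.append(max(dist, key=dist.get))
    return seq
```

Given identical inputs this always produces the same sequence, which is why greedy decoding trades diversity for consistency.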
Autoregressive Generation
Text generation process where each produced token is immediately added to the context to influence the generation of subsequent tokens.
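The feedback loop can be sketched as follows (toy setup; `next_probs` is a hypothetical stand-in for a model, here conditioned on the growing context):

```python
import random

def generate(next_probs, start, steps, rng=random):
    """Each sampled token is appended to the context before the next step,
    so it conditions every subsequent prediction."""
    context = [start]
    for _ in range(steps):
        dist = next_probs(context)       # conditioned on the full context so far
        tokens = list(dist)
        weights = [dist[t] for t in tokens]
        context.append(rng.choices(tokens, weights=weights, k=1)[0])
    return context
```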
Language Model Fine-tuning
Process of adapting a pre-trained autoregressive model to domain-specific data to improve its performance on a targeted task.
Zero-shot Learning
Ability of autoregressive models to accomplish tasks not seen during training by leveraging their general language knowledge.
KV Cache
Optimization mechanism storing key-value states of previous tokens to accelerate sequential autoregressive generation.
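A toy single-head illustration of why the cache helps: each step appends one key/value pair and runs attention only for the newest query, instead of recomputing keys and values for the whole prefix (vectors and the `attend` helper are illustrative, not a real library API):

```python
import math

def attend(query, cached_keys, cached_values):
    """Single-head dot-product attention of one new query over cached K/V."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in cached_keys]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(cached_values[0])
    return [sum(w * v[d] for w, v in zip(weights, cached_values))
            for d in range(dim)]

# The cache grows by one key/value pair per generated token; each step
# processes only the newest query against everything stored so far.
keys, values = [], []
for k, v in [([1.0, 0.0], [2.0, 0.0]), ([0.0, 1.0], [0.0, 3.0])]:
    keys.append(k)
    values.append(v)
    output = attend(k, keys, values)
```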
Variable Sequence Length
Ability of autoregressive models to generate sequences of varying length, stopping dynamically (typically on an end-of-sequence token) based on the content generated so far.