AI Glossary
The complete dictionary of Artificial Intelligence
Seq2Seq Architecture
Deep learning model composed of an encoder and a decoder, designed to map a variable-length input sequence to a variable-length output sequence. This architecture is widely used for machine translation, text summarization, and dialogue generation.
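A minimal sketch of the idea, assuming PyTorch and GRU layers; the class name Seq2Seq and all hyperparameters are illustrative, not a reference implementation.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the source sequence into a final hidden state (the context).
        _, context = self.encoder(self.src_emb(src))
        # Decode the target sequence, initializing the decoder with the context.
        dec_out, _ = self.decoder(self.tgt_emb(tgt), context)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab) logits
```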
Teacher Forcing
Training strategy in which the decoder receives the ground-truth previous tokens as input rather than its own predictions, accelerating convergence. This technique stabilizes learning but creates a mismatch between training and inference known as exposure bias.
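A sketch of a teacher-forced training step, assuming the hypothetical Seq2Seq model above: the decoder input is the ground-truth target shifted right, not the model's own generations.

```python
import torch.nn.functional as F

def train_step(model, optimizer, src, tgt):
    optimizer.zero_grad()
    decoder_input = tgt[:, :-1]   # ground-truth tokens fed to the decoder
    target_output = tgt[:, 1:]    # tokens the decoder must predict
    logits = model(src, decoder_input)
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           target_output.reshape(-1))
    loss.backward()
    optimizer.step()
    return loss.item()
```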
Masking
Procedure that hides certain positions of a sequence so the model cannot use irrelevant or future information. Masking is essential for handling variable-length (padded) sequences and for preventing the decoder from seeing future tokens during auto-regressive training.
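An illustrative sketch of the two common masks, assuming PyTorch tensors and an assumed padding token id of 0: a padding mask for variable-length batches and a causal (look-ahead) mask for auto-regressive decoding.

```python
import torch

PAD_ID = 0  # assumed padding token id

def padding_mask(token_ids):
    # True where the token is real, False where it is padding.
    return token_ids != PAD_ID  # shape: (batch, seq_len)

def causal_mask(seq_len):
    # Lower-triangular matrix: position i may only attend to positions <= i.
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

print(causal_mask(4))
```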
Embedding Vector
Dense vector representation of discrete tokens that captures semantic and syntactic relationships in a continuous space. Embeddings are learned during training and form the standard input representation of sequence models.
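A minimal example of a learned embedding table, assuming PyTorch; the vocabulary size, dimension, and token ids are arbitrary.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=10_000, embedding_dim=256)
token_ids = torch.tensor([[12, 845, 3]])  # a batch of one 3-token sequence
vectors = embedding(token_ids)            # shape: (1, 3, 256)
print(vectors.shape)
```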
Gated Recurrent Unit
Simplified variant of the LSTM using two gates (update and reset) to regulate information flow with fewer parameters. GRUs often match LSTM performance while being more computationally efficient.
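A didactic sketch of a single GRU cell written out to expose the update (z) and reset (r) gates; in practice PyTorch's nn.GRU would be used, and the class name GRUCellSketch is illustrative.

```python
import torch
import torch.nn as nn

class GRUCellSketch(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.z = nn.Linear(input_dim + hidden_dim, hidden_dim)  # update gate
        self.r = nn.Linear(input_dim + hidden_dim, hidden_dim)  # reset gate
        self.h = nn.Linear(input_dim + hidden_dim, hidden_dim)  # candidate state

    def forward(self, x, h_prev):
        xh = torch.cat([x, h_prev], dim=-1)
        z = torch.sigmoid(self.z(xh))            # how much to update the state
        r = torch.sigmoid(self.r(xh))            # how much of the past to keep
        h_tilde = torch.tanh(self.h(torch.cat([x, r * h_prev], dim=-1)))
        return (1 - z) * h_prev + z * h_tilde    # new hidden state
```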
Greedy Search
Decoding strategy that, at each generation step, always selects the token with the highest probability. Although fast, this approach can yield suboptimal outputs because it never considers alternative sequences.
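A sketch of greedy decoding with the hypothetical Seq2Seq model above: at each step the single most probable token is appended to the output until an assumed end-of-sequence id is produced.

```python
import torch

def greedy_decode(model, src, bos_id, eos_id, max_len=50):
    model.eval()
    with torch.no_grad():
        generated = torch.tensor([[bos_id]])  # start-of-sequence token
        for _ in range(max_len):
            logits = model(src, generated)    # (1, cur_len, vocab)
            next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
            generated = torch.cat([generated, next_id], dim=1)
            if next_id.item() == eos_id:
                break
    return generated.squeeze(0).tolist()
```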
Bidirectionality
Ability of the encoder to process the input sequence in both directions (forward and backward) to capture the complete context. Bidirectional encoders improve semantic understanding by considering both past and future context.
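A sketch of a bidirectional encoder, assuming PyTorch: the same sequence is read forward and backward, and the two directions' hidden states are concatenated.

```python
import torch
import torch.nn as nn

encoder = nn.GRU(input_size=256, hidden_size=512,
                 batch_first=True, bidirectional=True)
x = torch.randn(8, 20, 256)   # (batch, seq_len, emb_dim)
outputs, h_n = encoder(x)
print(outputs.shape)          # (8, 20, 1024): forward and backward states concatenated
```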
Subword Embeddings
Tokenization technique that splits words into smaller morphological units, allowing the model to handle rare words and an open vocabulary. Subword methods such as BPE or WordPiece have become standard in modern models.
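A toy illustration of subword tokenization using greedy longest-match over a tiny hand-made vocabulary (WordPiece-style matching); real tokenizers learn the vocabulary from data with algorithms such as BPE, so the vocabulary and function below are purely illustrative.

```python
def subword_tokenize(word, vocab):
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        # Find the longest vocabulary entry matching the remaining characters.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:          # no matching piece: unknown token
            return ["[UNK]"]
        pieces.append(word[start:end])
        start = end
    return pieces

vocab = {"un", "believ", "able", "play", "ing", "s"}
print(subword_tokenize("unbelievable", vocab))  # ['un', 'believ', 'able']
print(subword_tokenize("playing", vocab))       # ['play', 'ing']
```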