AI Glossary
The complete dictionary of Artificial Intelligence
Encoder-Decoder
Two-part architecture in which an encoder processes the input sequence and a decoder generates the output sequence. This structure enables sequence-to-sequence transformation tasks such as machine translation or text summarization.
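For concreteness, a minimal sketch of an encoder-decoder forward pass using PyTorch's built-in nn.Transformer; the shapes and hyperparameters below are illustrative assumptions, not prescribed by this glossary.

```python
import torch
import torch.nn as nn

d_model = 64  # illustrative model dimension
model = nn.Transformer(d_model=d_model, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       batch_first=True)

src = torch.randn(1, 10, d_model)  # source sequence: (batch, src_len, d_model)
tgt = torch.randn(1, 7, d_model)   # target sequence: (batch, tgt_len, d_model)

# The encoder reads the full source; the decoder attends to the encoder
# output while producing the target representation.
out = model(src, tgt)
print(out.shape)  # torch.Size([1, 7, 64])
```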
Masked Attention
Attention mechanism in which certain positions are masked to prevent the model from accessing future information. Essential in decoders to preserve the autoregressive property: each position may attend only to itself and earlier positions, during both training and generation.
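A minimal sketch, assuming plain PyTorch, of how a causal (look-ahead) mask blocks attention to future positions inside scaled dot-product attention; all tensors are random placeholders.

```python
import torch
import torch.nn.functional as F

seq_len, d_k = 5, 8
q = torch.randn(seq_len, d_k)
k = torch.randn(seq_len, d_k)
v = torch.randn(seq_len, d_k)

scores = q @ k.T / d_k ** 0.5                     # raw scores: (seq_len, seq_len)
mask = torch.triu(torch.ones(seq_len, seq_len), diagonal=1).bool()
scores = scores.masked_fill(mask, float('-inf'))  # hide all future positions
weights = F.softmax(scores, dim=-1)               # lower-triangular, rows sum to 1
output = weights @ v                              # each position sees only its past
```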
Position-wise Feed-Forward
Two-layer neural network applied identically and independently to each position in the sequence. It transforms the representations produced by the attention mechanism, introducing non-linearity through the activation between its two linear projections.
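A hedged sketch of a typical position-wise feed-forward block in PyTorch; the inner dimension d_ff = 4 × d_model follows common practice and is an assumption here.

```python
import torch
import torch.nn as nn

d_model, d_ff = 64, 256  # d_ff is conventionally about 4x d_model

ffn = nn.Sequential(
    nn.Linear(d_model, d_ff),
    nn.ReLU(),                 # the non-linearity between the two projections
    nn.Linear(d_ff, d_model),
)

x = torch.randn(1, 10, d_model)  # (batch, seq_len, d_model)
y = ffn(x)                       # same shape; each position transformed independently
```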
Attention Weight
Softmax-normalized scores that determine the relative importance of each element when computing attention. These weights serve as the coefficients of the weighted sum over the value vectors.
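An illustrative PyTorch computation showing that the softmax-normalized weights sum to one per query and act as coefficients over the values; the tensors are random placeholders.

```python
import torch
import torch.nn.functional as F

q = torch.randn(4, 8)  # 4 query positions, dimension 8
k = torch.randn(4, 8)
v = torch.randn(4, 8)

scores = q @ k.T / 8 ** 0.5
weights = F.softmax(scores, dim=-1)  # attention weights: each row sums to 1
print(weights.sum(dim=-1))           # tensor([1., 1., 1., 1.])

out = weights @ v  # weighted sum of the value vectors, one per query
```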
Dropout Layer
Regularization technique that randomly deactivates neurons during training to prevent overfitting. Applied after attention and feed-forward layers in Transformers.
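A brief PyTorch illustration of dropout behavior in training versus evaluation mode; the rate p = 0.1 matches the value used in the original Transformer paper but is only illustrative here.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.1)  # each unit zeroed with probability 0.1 during training

x = torch.ones(2, 4)
drop.train()
print(drop(x))  # some entries zeroed; survivors scaled by 1/(1-p)
drop.eval()
print(drop(x))  # identity at evaluation time: all ones
```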