AI Glossary
The complete dictionary of Artificial Intelligence
AutoInt
Deep neural network architecture designed to automatically model high-order feature interactions in tabular data, using a multi-head attention mechanism.
Multi-Head Attention Mechanism
Module that allows the model to simultaneously focus on different positions in the input sequence, learning multiple attention representations in parallel to capture complex dependencies.
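As an illustrative sketch (not AutoInt's exact implementation), a minimal multi-head self-attention over feature embeddings can be written in NumPy; all shapes and weight matrices here are assumptions for the example:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, n_heads):
    """Self-attention over the rows of X, shape (n_features, d_model)."""
    n, d = X.shape
    d_head = d // n_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Split each projection into heads: (n_heads, n, d_head).
    split = lambda M: M.reshape(n, n_heads, d_head).transpose(1, 0, 2)
    Qh, Kh, Vh = split(Q), split(K), split(V)
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores)                    # one attention map per head
    out = weights @ Vh                           # (n_heads, n, d_head)
    return out.transpose(1, 0, 2).reshape(n, d)  # concatenate heads

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                      # four feature embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = multi_head_attention(X, Wq, Wk, Wv, n_heads=2)
```

Each head attends over the same inputs with its own projected queries and keys, so different heads can specialize in different interaction patterns.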
High-Order Feature Interaction
Non-linear combination of three or more input variables, whose capture is essential to improve the predictive power of models on complex structured data.
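A toy illustration of why such interactions matter: if the target is the product of three inputs, each input alone is uncorrelated with the target, so a plain linear model learns essentially nothing (the data here is synthetic and for demonstration only):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.choice([-1.0, 1.0], size=(1000, 3))
y = x[:, 0] * x[:, 1] * x[:, 2]   # purely third-order signal

# Each x_i alone carries no information about y, so least squares
# assigns near-zero weight to every input.
w, *_ = np.linalg.lstsq(x, y, rcond=None)
```

A model that captures the third-order interaction x1*x2*x3 predicts y exactly; the linear model cannot do better than chance.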
Feature Embedding
Dense and low-dimensional vector representation of categorical features, allowing the model to treat these variables as continuous inputs and learn their semantic relationships.
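A minimal sketch of an embedding lookup, assuming a hypothetical categorical feature with five levels; the table would normally be learned during training:

```python
import numpy as np

rng = np.random.default_rng(0)
n_categories, d_embed = 5, 4          # e.g. a "city" feature with 5 levels
E = rng.normal(scale=0.1, size=(n_categories, d_embed))  # embedding table

category_ids = np.array([0, 3, 3, 1])  # raw categorical inputs
embeddings = E[category_ids]           # dense vectors, shape (4, 4)
```

Identical category values map to the identical embedding row, which is how the model learns shared semantics across occurrences.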
Interaction Network
Subnetwork within AutoInt, the interacting layer, that explicitly models interactions between feature embedding vectors by applying the attention mechanism.
Attention Value
Weight computed by the attention mechanism, typically a softmax-normalized compatibility score, that quantifies the importance of a specific feature or interaction for the model's final prediction.
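A minimal sketch of how such weights are obtained, softmax-normalizing one query's compatibility with each key (the function name is illustrative):

```python
import numpy as np

def attention_weights(q, K):
    """Softmax-normalized compatibility of one query with each key row."""
    scores = K @ q / np.sqrt(q.shape[0])  # scaled dot-product scores
    e = np.exp(scores - scores.max())     # subtract max for stability
    return e / e.sum()                    # weights are positive, sum to 1

rng = np.random.default_rng(1)
w = attention_weights(rng.normal(size=8), rng.normal(size=(5, 8)))
```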
Attention Pooling
Aggregation operation that uses attention weights to combine feature representations, producing a context vector that highlights the most relevant information.
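A worked toy example with hand-picked weights (all values are illustrative):

```python
import numpy as np

# Hypothetical feature representations, one row per feature.
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
weights = np.array([0.7, 0.2, 0.1])   # attention weights, sum to 1

context = weights @ V   # attention pooling: weighted sum of the rows
# context == [0.8, 0.3]
```

The first feature dominates the context vector because it received the largest attention weight.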
Automatic Interaction Learning
Paradigm where the model itself discovers and ranks relevant feature interactions, without requiring manual engineering or a priori specification.
Query Vector
In the attention mechanism, a vector derived from the current element (here, a feature embedding) and used to compute a compatibility score with each key vector.
Key Vector
Representation of a feature or candidate interaction, compared against the query vector to determine how much attention that element receives.
Value Vector
Vector containing the actual information of a feature, which is weighted by the attention score and aggregated to form the output of the attention mechanism.
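Putting query, key, and value together, a single-head attention pass over three hypothetical feature embeddings can be sketched as (projection matrices here are random stand-ins for learned parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
X = rng.normal(size=(3, d))        # three feature embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

Q, K, V = X @ Wq, X @ Wk, X @ Wv   # query, key, value per feature
scores = Q @ K.T / np.sqrt(d)      # query-key compatibility
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)  # softmax per row
output = weights @ V               # values aggregated by attention
```

Each output row is a mixture of all value vectors, weighted by how compatible that feature's query was with every key.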
Scaled Dot Product
Similarity function used in attention to calculate scores, where the dot product is divided by the square root of the vector dimension to stabilize training.
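A quick numerical check of why the scaling helps: the standard deviation of raw dot products of random vectors grows like the square root of the dimension, while the scaled scores stay near 1 (dimensions and sample counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 256, 10_000
Q = rng.normal(size=(n, d))
K = rng.normal(size=(n, d))

raw = (Q * K).sum(axis=1)   # dot products: std grows like sqrt(d) = 16
scaled = raw / np.sqrt(d)   # scaled scores: std stays near 1
```

Keeping score variance near 1 prevents the softmax from saturating, which would otherwise produce vanishing gradients in high dimensions.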
Residual and Layer Normalization
Architecture technique where the output of a layer is added to its input (residual connection) and then normalized, facilitating the training of deep networks like AutoInt.
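A minimal NumPy sketch of the add-and-normalize pattern (the learnable gain and bias parameters of layer normalization are omitted for brevity):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def residual_block(x, sublayer):
    """Add the sublayer output to its input, then normalize."""
    return layer_norm(x + sublayer(x))

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 8))
W = rng.normal(size=(8, 8))
out = residual_block(X, lambda x: x @ W)  # e.g. a linear sublayer
```

The residual path lets gradients flow directly through the identity, which is what makes stacking many such layers trainable.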
Cross Interaction
Specific operation in AutoInt that calculates interactions between feature pairs using element-wise multiplication followed by linear transformation.
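A minimal sketch of this pairwise operation on two hypothetical embeddings, assuming a generic learned weight matrix W:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
e_i, e_j = rng.normal(size=(2, d))   # embeddings of two features
W = rng.normal(size=(d, d))          # learned linear transformation

cross = (e_i * e_j) @ W              # element-wise product, then linear map
```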
Attention Head
One of the multiple attention mechanisms working in parallel in a multi-head module, each learning to focus on different aspects of feature interactions.
Head Aggregation
Process of concatenating or averaging the outputs of all attention heads to form a unified representation before passing it to the next layer.
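Both aggregation strategies in a toy setting with three hypothetical head outputs:

```python
import numpy as np

# Outputs of three attention heads, one row per feature (toy values).
heads = [np.full((2, 4), h) for h in (1.0, 2.0, 3.0)]

concat = np.concatenate(heads, axis=-1)  # (2, 12): preserves every head
mean = np.mean(heads, axis=0)            # (2, 4): keeps dimension fixed
```

Concatenation keeps each head's information separate at the cost of a wider representation; averaging keeps the dimension fixed but mixes the heads.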
Long-Range Dependency Modeling
Ability of the attention mechanism to relate any pair of features in a single step, regardless of their position in the input, overcoming the locality limitations of models such as CNNs.
Attention Map Interpretability
Method to visualize and understand model decisions by analyzing attention weights, revealing which feature interactions were most influential.
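A minimal sketch of ranking feature pairs by attention weight; the helper name, weight values, and feature names are all hypothetical:

```python
import numpy as np

def top_interactions(weights, feature_names, k=2):
    """Return the k (query, key, weight) triples with largest weight."""
    flat_desc = np.argsort(weights, axis=None)[::-1]      # largest first
    rows, cols = np.unravel_index(flat_desc, weights.shape)
    return [(feature_names[i], feature_names[j], weights[i, j])
            for i, j in zip(rows[:k], cols[:k])]

weights = np.array([[0.1, 0.8, 0.1],   # toy attention map
                    [0.5, 0.3, 0.2],
                    [0.2, 0.2, 0.6]])
names = ["age", "city", "income"]
top = top_interactions(weights, names, k=2)
# top[0] is ("age", "city", 0.8): the most influential interaction
```

Inspecting such rankings (or plotting the full weight matrix as a heatmap) is the usual way attention maps are turned into explanations.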