AI Glossary
The complete dictionary of Artificial Intelligence
GRU (Gated Recurrent Unit)
Recurrent neural network architecture introduced by Cho et al. in 2014 that simplifies the LSTM by merging the forget and input gates into a single update gate and the cell state into the hidden state, reducing the parameter count while preserving comparable performance.
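For concreteness, here is a minimal sketch of one GRU step in NumPy, following the standard equations; the function and weight names (gru_step, Wz, Uz, and so on) are illustrative, not taken from any particular library:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gru_step(x_t, h_prev, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):
        # Update gate: how much of the previous state to keep vs. overwrite.
        z = sigmoid(Wz @ x_t + Uz @ h_prev + bz)
        # Reset gate: how much of the previous state feeds the candidate.
        r = sigmoid(Wr @ x_t + Ur @ h_prev + br)
        # Candidate state, computed from the input and the reset-gated past.
        h_cand = np.tanh(Wh @ x_t + Uh @ (r * h_prev) + bh)
        # Fusion: per-dimension interpolation between old state and candidate.
        return (1.0 - z) * h_prev + z * h_cand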
Update gate
Control mechanism of the GRU that determines how much information from the previous state should be retained and how much new information should be added to the current state.
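In the standard formulation, for input x_t and previous hidden state h_{t-1}:

    z_t = σ(W_z x_t + U_z h_{t-1} + b_z)

The sigmoid σ keeps every component of z_t in (0, 1), so the gate acts as a per-dimension soft switch between keeping the old state and writing new information.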
Reset gate
Essential component of the GRU that controls how much of the previous state should be forgotten when calculating the new candidate state, allowing the model to reset its memory when necessary.
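Its equation mirrors the update gate, with its own weights:

    r_t = σ(W_r x_t + U_r h_{t-1} + b_r)

r_t multiplies h_{t-1} element-wise inside the candidate computation, so r_t ≈ 0 lets the unit discard its past and restart from the current input.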
Hidden state
Memory vector in a GRU that encodes relevant information from previous time steps and is transmitted to each new step to maintain temporal context.
Candidate vector
Intermediate representation computed in a GRU from the current input and the reset-gated previous state; it holds the potential new information that the update gate then blends into the hidden state.
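With ⊙ denoting element-wise multiplication, the candidate is:

    h̃_t = tanh(W_h x_t + U_h (r_t ⊙ h_{t-1}) + b_h)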
Gating mechanism
Regulation system in GRUs using sigmoid gates to selectively control the flow of information through the network, simulating a form of selective memory.
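The pattern behind every gate can be sketched in a few lines of NumPy (the random arrays here are placeholders for real activations):

    import numpy as np

    rng = np.random.default_rng(0)
    old_info = rng.normal(size=4)
    new_info = rng.normal(size=4)
    # A sigmoid gate yields values in (0, 1): a per-dimension soft switch.
    gate = 1.0 / (1.0 + np.exp(-rng.normal(size=4)))
    # The gate blends the two information sources rather than hard-selecting.
    out = gate * new_info + (1.0 - gate) * old_info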
Temporal propagation
Process by which GRUs process sequences by propagating information from one time step to another through their hidden states and gating mechanisms.
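Processing a sequence is simply a loop that threads the hidden state from one step to the next, reusing the hypothetical gru_step sketch from the GRU entry above (hidden_size, sequence and the weight matrices are assumed to be defined):

    h = np.zeros(hidden_size)          # initial hidden state
    for x_t in sequence:               # iterable of input vectors
        h = gru_step(x_t, h, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh)
    # h now summarizes the entire sequence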
Gate parameters
Weight matrices and bias vectors specific to the update and reset gates in a GRU, learned during training to optimize the control of information flow.
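For input dimension d and hidden dimension n, each of the two gates and the candidate owns a W of shape n×d, a U of shape n×n and a bias of length n, for 3(n² + nd + n) learnable parameters in the standard GRU cell.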
Previous state
Value of the hidden state at time step t-1 used as input to calculate the new state at time step t in a GRU, essential for maintaining temporal continuity.
Information fusion
Operation in GRUs where the update gate linearly combines the previous state and the candidate vector to produce the new hidden state, weighting their relative importance.
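In equation form, with z_t the update gate and h̃_t the candidate:

    h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t

This is a per-dimension convex combination: z_t ≈ 0 copies the old state through unchanged, while z_t ≈ 1 overwrites it with the candidate. Some formulations (e.g. PyTorch's) swap the roles of z_t and 1 - z_t, which is equivalent up to relabeling.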
Computational complexity
Advantage of GRUs over LSTMs: with three gate/candidate computations instead of four, a GRU of the same size has roughly 25% fewer parameters (equivalently, the LSTM has about a third more), resulting in faster training while maintaining comparable performance on most tasks.
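The ratio is easy to check empirically, assuming PyTorch is available (the layer sizes here are arbitrary):

    import torch.nn as nn

    d, n = 128, 256
    gru, lstm = nn.GRU(d, n), nn.LSTM(d, n)
    p_gru = sum(p.numel() for p in gru.parameters())
    p_lstm = sum(p.numel() for p in lstm.parameters())
    print(p_gru, p_lstm, p_gru / p_lstm)   # ratio is exactly 3/4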
Vanishing gradient
Learning problem in traditional RNNs that GRUs mitigate through their gating mechanisms that preserve the gradient over long temporal sequences.
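A sketch of why: differentiating the fusion equation h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t with respect to h_{t-1} yields a direct term diag(1 - z_t), so wherever the update gate stays near 0 the gradient flows backward almost unattenuated, instead of being squashed through a tanh at every step as in a vanilla RNN.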
Bidirectional sequence
Architecture that runs two GRUs over a sequence, one in the forward and one in the backward direction, concatenating their hidden states so that both past and future context inform the representation of each time step.
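A sketch using PyTorch's built-in flag (sizes arbitrary):

    import torch
    import torch.nn as nn

    gru = nn.GRU(input_size=64, hidden_size=128, bidirectional=True)
    x = torch.randn(20, 8, 64)   # (seq_len, batch, input_size)
    out, h_n = gru(x)
    print(out.shape)             # (20, 8, 256): forward and backward hidden
                                 # states concatenated at each time step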
Dropout regularization
Technique applied to GRUs where certain connections are randomly deactivated during training to prevent overfitting and improve generalization.
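In PyTorch, for instance, the dropout argument applies between stacked GRU layers rather than inside the recurrence itself (recurrent "variational" dropout requires a custom cell):

    import torch
    import torch.nn as nn

    gru = nn.GRU(input_size=64, hidden_size=128, num_layers=2, dropout=0.3)
    gru.train()                  # dropout is active only in training mode
    x = torch.randn(20, 8, 64)
    out, h_n = gru(x)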