Video and Temporal Diffusion
3D Attention
An attention mechanism that operates jointly over the spatial (height, width) and temporal (time) dimensions of a video. Each position can attend to every region in every frame, so the model weights the importance of different regions across different moments and captures spatio-temporal dependencies.
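The definition above can be illustrated with a minimal sketch, assuming the simplest possible setup: the video is a nested list indexed [t][h][w], and queries, keys, and values are the raw feature vectors with no learned projections. The function name attention_3d and this layout are hypothetical choices for illustration, not taken from any particular model or library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_3d(video):
    """Joint self-attention over all (t, h, w) positions of a video.

    video: nested list indexed [t][h][w], each entry a feature vector.
    Assumption: Q = K = V = raw features (no learned projections).
    Returns the attended video (same layout) and the attention weights.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])

    # Flatten time and space into a single token sequence of length T*H*W,
    # so every token can attend to every region in every frame.
    tokens = [video[t][h][w]
              for t in range(T) for h in range(H) for w in range(W)]
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)

    out_tokens, weights = [], []
    for q in tokens:
        # Scaled dot-product score between this query and every token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in tokens]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of value vectors.
        out_tokens.append(
            [sum(a * v[j] for a, v in zip(row, tokens)) for j in range(d)]
        )

    # Restore the [t][h][w] layout.
    out = [[[out_tokens[(t * H + h) * W + w] for w in range(W)]
            for h in range(H)]
           for t in range(T)]
    return out, weights


# Toy video: 2 frames of 2x2 positions, 3 features per position.
video = [[[[float(t + h + w), 1.0, 0.5] for w in range(2)]
          for h in range(2)]
         for t in range(2)]
out, weights = attention_3d(video)
print(len(weights), len(weights[0]))
```

Because all T*H*W positions form one sequence, the attention matrix has (T*H*W)^2 entries, which is why joint attention of this kind becomes expensive as resolution or clip length grows.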