Sideways: Depth-Parallel Training of Video Models


DeepMind: Given the smoothness of videos, can we learn models more efficiently than with #backprop? We present Sideways - a step towards a high-throughput, approximate backprop that considers the one-way direction of time and pipelines forward and backward passes.

7 replies, 1051 likes

DeepMind: At this year’s #CVPR2020, our researchers present a more scalable approach to train video models. Sideways is a low-latency training algorithm that uses approximate gradients for updates. Read more about it here:

2 replies, 157 likes

Mateusz Malinowski: What if we drop the assumption that #backpropagation is instantaneous and instead consider it a process that requires time, where time only moves forward? Can we use this property to pipeline forward and backward passes, making training more parallel? @joaocarreira @GrzegorzMS Viorica

1 reply, 26 likes

Andrew Davison: Looks like interesting work on computational patterns for efficient training from sequential video data. Efficient continual learning of persistent models from incremental (real-time) data is exactly what I think I've always been most interested in. #SpatialAI

1 reply, 24 likes

Daisuke Okanohara: When training video models, we can use the activations computed from an adjacent frame to compute the gradient. This approximate gradient works well and even improves generalization, enabling efficient depth parallelization.

0 replies, 6 likes

arXiv CS-CV: Sideways: Depth-Parallel Training of Video Models

0 replies, 3 likes

Alison B Lowndes ✿: Finally! Someone updates research papers. New #LaTeX, new backprop, from @DeepMind @MateuszOnAI, Viorica Pătrăucean, @GrzegorzMS @joaocarreira @sindero. PS double negative in the Conclusion hurting my brain??

0 replies, 2 likes

akira: Proposes Sideways, an efficient learning method for video analysis. Normally, backpropagation is performed for each frame, but in Sideways, parameters are updated using a slightly later frame. It is not only more efficient but also more accurate, thanks to a regularization effect.

0 replies, 1 likes


Found on Jan 20 2020 at

PDF content of a computer science paper: Sideways: Depth-Parallel Training of Video Models