High Frame Rate Optical Flow Estimation from Event Sensors via Intensity Estimation
Prasan Shedligeri
Kaushik Mitra
[Paper]
Teaser (left to right): raw frame, event frame, predicted flow, predicted video.

Abstract

Optical flow estimation forms the core of several computer vision tasks and requires accurate spatial and temporal gradient information. However, if there are fast-moving objects in the scene or the camera moves rapidly, the acquired images suffer from motion blur, leading to poor optical flow estimates. Such challenging cases can be handled by event sensors, a novel generation of sensors that record pixel-level brightness changes as binary events at very high temporal resolution. However, the brightness constancy constraint, which is the basis of several optical flow algorithms, cannot be applied directly to event data, making optical flow estimation challenging. We overcome this challenge by imposing the brightness constancy constraint on intensity images predicted from the event data. For this task, we design a recurrent neural network that jointly predicts sparse optical flow and intensity images from the event data. While intensity estimation is supervised using ground-truth frames, optical flow estimation is self-supervised using the predicted intensity frames. However, in our case the temporal resolution of the ground-truth intensity frames is far lower than that of the predicted intensity frames, making supervision challenging. Since we use a recurrent neural network, this challenge is overcome by sharing the network weights across all of the predicted intensity frames. Quantitatively, our predicted optical flow is better than that of previously proposed algorithms for optical flow estimation from event sensors. We also demonstrate our algorithm's robustness in challenging cases of fast motion and high dynamic range scenes.
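
To make the self-supervision concrete, below is a minimal sketch of a brightness-constancy (photometric) loss between two consecutive predicted intensity frames, computed by warping one frame with the predicted flow. It assumes PyTorch tensors and bilinear warping via grid_sample; the function and variable names are illustrative and not taken from the paper's code.

```python
import torch
import torch.nn.functional as F

def photometric_loss(img_t, img_t1, flow):
    """Brightness-constancy loss: warp the predicted frame at t+1 back to time t
    using the predicted flow and compare it with the predicted frame at t.

    img_t, img_t1: (B, 1, H, W) intensity frames predicted from events
    flow:          (B, 2, H, W) flow in pixels, mapping time t to t+1
    """
    B, _, H, W = img_t.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(H, device=flow.device, dtype=flow.dtype),
        torch.arange(W, device=flow.device, dtype=flow.dtype),
        indexing="ij")
    grid_x = xs.unsqueeze(0) + flow[:, 0]   # shift by horizontal flow
    grid_y = ys.unsqueeze(0) + flow[:, 1]   # shift by vertical flow
    # Normalise to [-1, 1] as required by grid_sample.
    grid = torch.stack((2.0 * grid_x / (W - 1) - 1.0,
                        2.0 * grid_y / (H - 1) - 1.0), dim=-1)  # (B, H, W, 2)
    warped_t1 = F.grid_sample(img_t1, grid, align_corners=True,
                              padding_mode="border")
    return (warped_t1 - img_t).abs().mean()
```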


Talk


[Slides]

Key takeaways

  • We propose a semi-supervised learning algorithm to predict high frame rate optical flow for high dynamic range scenes.
  • Optical flow prediction is self-supervised using the high frame rate and high dynamic range intensity frames predicted directly from the event sensor data. Thus, ground truth optical flow is not necessary for training our proposed algorithm (a sketch of this training scheme follows this list).
  • We also demonstrate the generalizability of our proposed algorithm on a wide variety of open source event datasets captured with different sensors and in different environments.
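
A minimal sketch of how such a semi-supervised training step could look, assuming the model returns an intensity frame, a flow field and its recurrent state at every event frame, and reusing the photometric_loss sketched above. The intensity loss is applied only at the sparse time-steps where a ground-truth frame exists, while the brightness-constancy loss is applied at every step; all names and the flow convention here are illustrative, not the paper's code.

```python
import torch
import torch.nn.functional as F

def training_step(model, events, gt_frames, photometric_loss):
    """One unrolled training step over a sequence of event frames.

    events:    (B, T, C, H, W) event frames at high temporal resolution
    gt_frames: dict mapping a subset of time-steps t -> (B, 1, H, W) GT frame
    The recurrent weights are shared across all T steps, so the intensity loss
    applied only at the sparse GT time-steps still supervises every step.
    """
    state, prev_img = None, None
    flow_loss, img_loss = 0.0, 0.0
    T = events.shape[1]
    for t in range(T):
        img, flow, state = model(events[:, t], state)  # shared weights each step
        if prev_img is not None:
            # Self-supervised: brightness constancy between consecutive
            # predicted frames (flow taken to map step t-1 to step t).
            flow_loss = flow_loss + photometric_loss(prev_img, img, flow)
        if t in gt_frames:
            # Supervised: intensity loss only where a ground-truth frame exists.
            img_loss = img_loss + F.l1_loss(img, gt_frames[t])
        prev_img = img
    return flow_loss / max(T - 1, 1) + img_loss
```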

Algorithm

    Overall flow of our proposed method: Our proposed method takes in a single event frame at each time-step, which is input to a ConvLSTM (Convolutional Long Short-Term Memory) network. The updated hidden state from the ConvLSTM is fed to an encoder network consisting of four strided convolutional layers followed by a ResNet block. The hidden representation from the encoder is then fed to two decoder networks, decoderImg and decoderFlow, which predict the intensity image and the optical flow, respectively.
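
A skeletal PyTorch model following this description is sketched below. The ConvLSTM cell, channel widths and decoder layouts are assumptions made for illustration; only the overall structure (ConvLSTM, four strided convolutions plus a ResNet block, and separate intensity and flow decoders) comes from the description above.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell: gates computed from the concatenated input and hidden state."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        if state is None:
            h = x.new_zeros(x.size(0), self.hid_ch, x.size(2), x.size(3))
            c = torch.zeros_like(h)
        else:
            h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return torch.relu(x + self.body(x))

def make_decoder(in_ch, out_ch):
    """Four transposed convolutions mirroring the strided encoder, then a prediction head."""
    layers, ch = [], in_ch
    for _ in range(4):
        layers += [nn.ConvTranspose2d(ch, ch // 2, 4, stride=2, padding=1),
                   nn.ReLU(inplace=True)]
        ch //= 2
    layers.append(nn.Conv2d(ch, out_ch, 3, padding=1))
    return nn.Sequential(*layers)

class EventFlowNet(nn.Module):
    """ConvLSTM -> strided encoder + ResNet block -> intensity and flow decoders."""
    def __init__(self, in_ch=1, base=32):
        super().__init__()
        self.rnn = ConvLSTMCell(in_ch, base)
        enc, ch = [], base
        for _ in range(4):                       # four strided convolutional layers
            enc += [nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True)]
            ch *= 2
        enc.append(ResBlock(ch))                 # ResNet block on the encoded features
        self.encoder = nn.Sequential(*enc)
        self.decoder_img = make_decoder(ch, 1)   # decoderImg: intensity image
        self.decoder_flow = make_decoder(ch, 2)  # decoderFlow: optical flow (u, v)

    def forward(self, event_frame, state=None):
        h, state = self.rnn(event_frame, state)
        feat = self.encoder(h)
        return self.decoder_img(feat), self.decoder_flow(feat), state
```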


    Paper and Supplementary Material

    Prasan Shedligeri, Kaushik Mitra
    High Frame Rate Optical Flow Estimation from Event Sensors via Intensity Estimation
    Under review
    (Available as Pre-print)


    [Supplementary Material]


    Related Publications

  • Prasan Shedligeri, Anupama S & Kaushik Mitra. (2021) A Unified Framework for Compressive Video Recovery from Coded Exposure Techniques. Accepted at the IEEE/CVF Winter Conference on Applications of Computer Vision, DOI to be assigned. [Preprint] [Slides] [Supplementary] [Code] [Webpage]
  • Prasan Shedligeri, Anupama S & Kaushik Mitra. (2021) CodedRecon: Video reconstruction for coded exposure imaging techniques. Accepted at the Elsevier journal Software Impacts, https://doi.org/10.1016/j.simpa.2021.100064 [Paper] [Code]

Acknowledgements

    The authors would like to thank Matta Gopi Raju for collecting some of the data used in this and related publications.
    This template was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project; the code can be found here.