LIFE: Lighting Invariant Flow Estimation

arXiv 2021

Zhaoyang Huang1, 3*, Xiaokun Pan2*, Runsen Xu2, Yan Xu1,
Ka Chun Cheung3, Guofeng Zhang2, Hongsheng Li1

1CUHK-SenseTime Joint Laboratory, The Chinese University of Hong Kong   
2State Key Lab of CAD & CG, Zhejiang University   
3NVIDIA AI Technology Center, NVIDIA   
* denotes equal contributions



We tackle the problem of estimating flow between two images with large lighting variations. Recent learning-based flow estimation frameworks have shown remarkable performance on image pairs with small displacement and constant illumination, but they do not work well in cases with large viewpoint change and lighting variations because pixel-wise flow annotations are lacking for such cases. We observe that, via Structure-from-Motion (SfM) techniques, one can easily estimate relative camera poses between image pairs with large viewpoint change and lighting variations. We propose a novel weakly supervised framework, LIFE, to train a neural network for estimating accurate lighting-invariant flows between image pairs. Sparse correspondences are conventionally established via feature matching with descriptors that encode local image content. However, local image content is inevitably ambiguous, which makes cross-image feature matching error-prone and hinders downstream tasks. We propose to guide feature matching with the flows predicted by LIFE, which resolves ambiguous matches by exploiting the abundant context information in the image pair. We show that LIFE outperforms previous flow learning frameworks by large margins in challenging scenarios, consistently improves feature matching, and benefits downstream tasks.
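To make the idea of flow-guided feature matching concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name `flow_guided_match`, the `flow_at` callback, and the fixed search `radius` are all assumptions made for illustration. The sketch only considers candidate keypoints in the second image that lie near the location predicted by the flow, and then picks the candidate with the closest descriptor.

```python
import math


def flow_guided_match(kps_a, descs_a, kps_b, descs_b, flow_at, radius=8.0):
    """Match keypoints of image A to image B, guided by a dense flow.

    kps_a, kps_b: lists of (x, y) keypoint locations.
    descs_a, descs_b: lists of descriptor vectors (equal-length tuples).
    flow_at: hypothetical callback, (x, y) -> (dx, dy) flow from A to B.
    radius: assumed search radius (pixels) around the flow-predicted location.
    """
    matches = []
    for i, ((x, y), desc_a) in enumerate(zip(kps_a, descs_a)):
        dx, dy = flow_at(x, y)
        px, py = x + dx, y + dy  # where the flow says this keypoint lands in B
        best_j, best_dist = None, math.inf
        for j, ((bx, by), desc_b) in enumerate(zip(kps_b, descs_b)):
            # Discard candidates far from the flow-predicted location.
            if math.hypot(bx - px, by - py) > radius:
                continue
            dist = math.dist(desc_a, desc_b)  # Euclidean descriptor distance
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_j is not None:
            matches.append((i, best_j))
    return matches


# Toy example: keypoint 0 in B has an identical descriptor but lies far from
# the flow-predicted location; keypoint 1 is the geometrically consistent one.
kps_a, descs_a = [(10, 10)], [(1.0, 0.0)]
kps_b, descs_b = [(50, 50), (21, 10)], [(1.0, 0.0), (0.9, 0.1)]
guided = flow_guided_match(kps_a, descs_a, kps_b, descs_b,
                           flow_at=lambda x, y: (10.0, 0.0))
unguided = flow_guided_match(kps_a, descs_a, kps_b, descs_b,
                             flow_at=lambda x, y: (10.0, 0.0), radius=math.inf)
```

In this toy case the unguided search picks the look-alike keypoint at (50, 50), while the guided search picks the keypoint next to the flow prediction. A real pipeline would likely vectorize the search and combine it with a ratio test or mutual-consistency check.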

Replacing the picture in videos via predicted flows

We select the painting The Starry Night and print it on A4 paper. We record videos that capture the paper in different situations and extract frames from them. We then use LIFE to predict the flow from the painting to each frame individually and warp other images onto the frames via these flows. Note that the images are warped via the flows only; we do not use any other strategies. The visual quality of the videos synthesized via LIFE demonstrates its robustness and accuracy, and points to potential augmented reality (AR) applications of LIFE.
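The warping step can be illustrated with a small sketch. The page does not specify how the warping is implemented, so the following is an assumption: a forward nearest-neighbour splat in plain Python, where the flow is defined on the painting's pixel grid and gives each pixel's displacement into the frame. The function name `splat_image` and the nested-list image representation are hypothetical.

```python
def splat_image(replacement, flow, frame):
    """Forward-warp `replacement` into a copy of `frame` using a dense flow.

    replacement: H x W grid of pixel values, aligned with the reference painting.
    flow: H x W grid of (dx, dy) displacements from painting pixels to frame pixels.
    frame: Hf x Wf grid of pixel values (one extracted video frame).
    """
    out = [row[:] for row in frame]  # leave the input frame untouched
    frame_h, frame_w = len(frame), len(frame[0])
    for y, (pixel_row, flow_row) in enumerate(zip(replacement, flow)):
        for x, (value, (dx, dy)) in enumerate(zip(pixel_row, flow_row)):
            # Nearest frame pixel to the flow-predicted target location.
            tx, ty = round(x + dx), round(y + dy)
            if 0 <= tx < frame_w and 0 <= ty < frame_h:
                out[ty][tx] = value
    return out


# Toy example: a 2x2 replacement image moved one pixel right and one pixel down.
replacement = [[1, 2], [3, 4]]
flow = [[(1.0, 1.0), (1.0, 1.0)], [(1.0, 1.0), (1.0, 1.0)]]
frame = [[0] * 4 for _ in range(3)]
warped = splat_image(replacement, flow, frame)
```

Nearest-neighbour splatting can leave holes where the flow stretches the image. A practical implementation would more likely interpolate, for example by backward warping with bilinear sampling when the flow is available on the frame's pixel grid.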

Captured videos: motion blur, viewpoint variation, lighting variation, and surface variation.


Feature matching with and without flow guidance

Feature matching comparison

Flow comparison on KITTI

KITTI comparison

Image warping via predicted flows

Warping comparison