NeuralMarker: A Framework for Learning General Marker Correspondence

ACM Transactions on Graphics (SIGGRAPH Asia 2022)

Zhaoyang Huang1, 3*, Xiaokun Pan2*, Weihong Pan2, Weikang Bian1, Yan Xu1,
Ka Chun Cheung3, Guofeng Zhang2, Hongsheng Li1

1MMLab, The Chinese University of Hong Kong    2State Key Lab of CAD & CG, Zhejiang University    3NVIDIA AI Technology Center, NVIDIA   
* denotes equal contributions


We tackle the problem of estimating correspondences from a general marker, such as a movie poster, to an image that captures such a marker. Conventionally, this problem is addressed by fitting a homography model based on sparse feature matching. However, such methods can handle only plane-like markers, and sparse features do not sufficiently utilize appearance information. In this paper, we propose a novel framework, NeuralMarker, which trains a neural network to estimate dense marker correspondences under various challenging conditions, such as marker deformation and harsh lighting. Deep learning has shown excellent performance in correspondence learning when provided with sufficient training data. However, annotating pixel-wise dense correspondences for training marker correspondence is too expensive. We observe that the challenges of marker correspondence estimation come from two individual aspects: geometry variation and appearance variation. We therefore design two components in NeuralMarker to address these two challenges. First, we create a synthetic dataset, FlyingMarkers, containing marker-image pairs with ground truth dense correspondences. By training with FlyingMarkers, the neural network is encouraged to capture various marker motions. Second, we propose the novel Symmetric Epipolar Distance (SED) loss, which enables learning dense correspondence from posed images. By learning with the SED loss and the cross-lighting posed images collected by Structure-from-Motion (SfM), NeuralMarker is remarkably robust in harsh lighting environments and avoids the synthetic image bias. We also propose a novel marker correspondence evaluation method that circumvents annotations on real marker-image pairs, and we create a new benchmark. We show that NeuralMarker significantly outperforms previous methods and enables new interesting applications, including Augmented Reality (AR) and video editing.
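To make the idea behind the SED loss more concrete, the following is a minimal sketch of the textbook symmetric epipolar distance between corresponding points in two posed images. It is an illustration under assumptions, not the exact loss used in NeuralMarker: the function name `symmetric_epipolar_distance`, the `eps` stabilizer, and the choice to sum the two unsquared point-to-line distances are all hypothetical here, and the fundamental matrix `F` is assumed to be derived from the known relative pose (for example, from SfM).

```python
import numpy as np


def symmetric_epipolar_distance(F, pts1, pts2, eps=1e-12):
    """Symmetric epipolar distance for N correspondences.

    F    : (3, 3) fundamental matrix satisfying x2^T F x1 = 0.
    pts1 : (N, 2) pixel coordinates in image 1.
    pts2 : (N, 2) pixel coordinates in image 2.
    Returns an (N,) array: distance of each pts2 to its epipolar line
    in image 2 plus distance of each pts1 to its epipolar line in image 1.
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    ones = np.ones((pts1.shape[0], 1))
    x1 = np.hstack([pts1, ones])  # homogeneous points, image 1
    x2 = np.hstack([pts2, ones])  # homogeneous points, image 2

    lines2 = x1 @ F.T  # rows are F x1: epipolar lines in image 2
    lines1 = x2 @ F    # rows are F^T x2: epipolar lines in image 1

    # Algebraic residual |x2^T F x1|, shared by both directions.
    residual = np.abs(np.sum(x2 * lines2, axis=1))

    # Normalize by the line normals to get point-to-line distances.
    d2 = residual / np.sqrt(lines2[:, 0] ** 2 + lines2[:, 1] ** 2 + eps)
    d1 = residual / np.sqrt(lines1[:, 0] ** 2 + lines1[:, 1] ** 2 + eps)
    return d1 + d2


# Toy example: pure horizontal camera translation with identity
# intrinsics, so epipolar lines are horizontal and y1 must equal y2.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
pts1 = np.array([[10.0, 20.0], [5.0, 5.0]])
pts2 = np.array([[30.0, 20.0], [7.0, 8.0]])
sed = symmetric_epipolar_distance(F, pts1, pts2)
```

In the toy example, the first correspondence lies exactly on its epipolar lines and the second is off by three pixels vertically in each direction. A distance of this kind needs only the relative pose between two images, not per-pixel ground truth, which is why posed images can supervise dense correspondence.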


(a) The marker correspondence predicted by our NeuralMarker for an offhand marker. (b) We can easily embed advertisements into movies and TV series via NeuralMarker. (c) We can edit a frame in a video clip and propagate the editing effects to the whole video clip. (d) Marker-based Augmented Reality (AR).


Augmented Reality with Deformed Markers

Fast Video Editing

Augmented Reality with Harsh Lighting and Fast Motion

Lighting-Preserved Image Editing with NIID-Net

Warping Comparison

Qualitative Comparison

Warping Comparison

Motion Blur

Marker Deformation

Harsh Lighting

Extreme Viewpoint

More Comparison


We thank Rensen Xu, Yijin Li and Jundan Luo for their help, and Qianhao Quan for providing excellent materials. Hongsheng Li is also a Principal Investigator of the Centre for Perceptual and Interactive Intelligence Limited (CPII). This work is supported in part by CPII, in part by the General Research Fund through the Research Grants Council of Hong Kong under Grants (Nos. 14204021, 14207319), and in part by the ZJU-SenseTime Joint Lab of 3D Vision.