ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring - UTU Research Information System

A4 Peer-reviewed article in conference proceedings

ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring

Authors: Li Dongxu, Xu Chenchen, Zhang Kaihao, Yu Xin, Zhong Yiran, Ren Wenqi, Suominen Hanna, Li Hongdong

Established conference name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Publication year: 2021

Journal: IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Proceedings title: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Journal name in the database: 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021

Journal acronym: PROC CVPR IEEE

First page: 7717

Last page: 7727

Number of pages: 11

ISSN: 1063-6919

DOI: https://doi.org/10.1109/CVPR46437.2021.00763

Abstract

Video deblurring models exploit consecutive frames to remove blurs from camera shakes and object motions. In order to utilize neighboring sharp patches, typical methods rely mainly on homography or optical flows to spatially align neighboring blurry frames. However, such explicit approaches are less effective in the presence of fast motions with large pixel displacements. In this work, we propose a novel implicit method to learn spatial correspondence among blurry frames in the feature space. To construct distant pixel correspondences, our model builds a correlation volume pyramid among all the pixel-pairs between neighboring frames. To enhance the features of the reference frame, we design a correlative aggregation module that maximizes the pixel-pair correlations with its neighbors based on the volume pyramid. Finally, we feed the aggregated features into a reconstruction module to obtain the restored frame. We design a generative adversarial paradigm to optimize the model progressively. Our proposed method is evaluated on the widely-adopted DVD dataset, along with a newly collected High-Frame-Rate (1000 fps) Dataset for Video Deblurring (HFR-DVD). Quantitative and qualitative experiments show that our model performs favorably on both datasets against previous state-of-the-art methods, confirming the benefit of modeling all-range spatial correspondence for video deblurring.
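The idea of a correlation volume pyramid over all pixel pairs can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the scaling by the square root of the channel count, and the 2x average pooling over the neighbor-frame axes are assumptions, borrowed from the way correlation pyramids are commonly built in optical flow networks. The feature maps here are random stand-ins for learned features.

```python
import numpy as np


def correlation_volume(feat_ref, feat_nbr):
    """All-pairs correlation between two feature maps of shape (C, H, W).

    Returns an array of shape (H, W, H, W). Entry [i, j, k, l] is the scaled
    dot product of the reference feature at (i, j) and the neighbor feature
    at (k, l). (Hypothetical helper, for illustration only.)
    """
    c, h, w = feat_ref.shape
    ref = feat_ref.reshape(c, h * w)
    nbr = feat_nbr.reshape(c, h * w)
    corr = ref.T @ nbr / np.sqrt(c)  # (H*W, H*W)
    return corr.reshape(h, w, h, w)


def correlation_pyramid(corr, num_levels=3):
    """Build a pyramid by 2x average pooling over the neighbor-frame axes.

    Coarser levels summarize larger neighborhoods, so a fixed-size lookup
    at a coarse level covers a larger pixel displacement. (Assumed scheme.)
    """
    pyramid = [corr]
    for _ in range(num_levels - 1):
        h, w, hk, wk = pyramid[-1].shape
        hk2, wk2 = hk // 2, wk // 2
        # Crop to an even size, then average each 2x2 block.
        cropped = pyramid[-1][:, :, : hk2 * 2, : wk2 * 2]
        pooled = cropped.reshape(h, w, hk2, 2, wk2, 2).mean(axis=(3, 5))
        pyramid.append(pooled)
    return pyramid


rng = np.random.default_rng(0)
feat_ref = rng.standard_normal((16, 8, 8))  # stand-in reference-frame features
feat_nbr = rng.standard_normal((16, 8, 8))  # stand-in neighbor-frame features

corr = correlation_volume(feat_ref, feat_nbr)
pyramid = correlation_pyramid(corr)
print([level.shape for level in pyramid])
# [(8, 8, 8, 8), (8, 8, 4, 4), (8, 8, 2, 2)]
```

Each reference pixel keeps its full resolution while the neighbor axes shrink, so every level still stores one correlation map per reference pixel. The aggregation and reconstruction modules described in the abstract are not sketched here.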