A4 Peer-reviewed article in conference proceedings
ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring
Authors: Li Dongxu, Xu Chenchen, Zhang Kaihao, Yu Xin, Zhong Yiran, Ren Wenqi, Suominen Hanna, Li Hongdong
Established name of the conference: IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Publication year: 2021
Journal: IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Title of the compilation: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Journal name in the database: 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021
Journal acronym: PROC CVPR IEEE
First page: 7717
Last page: 7727
Number of pages: 11
ISSN: 1063-6919
DOI: https://doi.org/10.1109/CVPR46437.2021.00763
Video deblurring models exploit consecutive frames to remove blur caused by camera shake and object motion. To utilize neighboring sharp patches, typical methods rely mainly on homography or optical flow to spatially align neighboring blurry frames. However, such explicit approaches are less effective in the presence of fast motion with large pixel displacements. In this work, we propose a novel implicit method to learn spatial correspondence among blurry frames in the feature space. To construct distant pixel correspondences, our model builds a correlation volume pyramid among all the pixel pairs between neighboring frames. To enhance the features of the reference frame, we design a correlative aggregation module that maximizes the pixel-pair correlations with its neighbors based on the volume pyramid. Finally, we feed the aggregated features into a reconstruction module to obtain the restored frame. We design a generative adversarial paradigm to optimize the model progressively. Our proposed method is evaluated on the widely adopted DVD dataset, along with a newly collected High-Frame-Rate (1000 fps) Dataset for Video Deblurring (HFR-DVD). Quantitative and qualitative experiments show that our model performs favorably on both datasets against previous state-of-the-art methods, confirming the benefit of modeling all-range spatial correspondence for video deblurring.
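
The idea of a correlation volume pyramid over all pixel pairs can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `correlation_volume` and `volume_pyramid`, the dot-product similarity, the scaling by the square root of the channel count, and the 2x average pooling are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def correlation_volume(feat_ref, feat_nbr):
    """All-pairs correlation between two feature maps of shape (C, H, W).

    Returns an array of shape (H, W, H, W) where entry [i, j, k, l] is the
    scaled dot product between reference pixel (i, j) and neighbor pixel (k, l).
    The 1/sqrt(C) scaling is an illustrative assumption.
    """
    c, h, w = feat_ref.shape
    ref = feat_ref.reshape(c, h * w)
    nbr = feat_nbr.reshape(c, h * w)
    corr = ref.T @ nbr / np.sqrt(c)  # shape (H*W, H*W)
    return corr.reshape(h, w, h, w)


def volume_pyramid(corr, levels=3):
    """Build a pyramid by 2x average pooling over the neighbor dimensions.

    The reference-pixel dimensions keep full resolution, so each reference
    pixel sees its neighbor frame at progressively coarser scales.
    (Hypothetical pooling scheme, assumed for this sketch.)
    """
    pyramid = [corr]
    for _ in range(levels - 1):
        h, w, h2, w2 = pyramid[-1].shape
        if h2 < 2 or w2 < 2:
            break  # cannot pool further
        # Crop to even size, then average each 2x2 block.
        v = pyramid[-1][:, :, : h2 // 2 * 2, : w2 // 2 * 2]
        v = v.reshape(h, w, h2 // 2, 2, w2 // 2, 2).mean(axis=(3, 5))
        pyramid.append(v)
    return pyramid


# Usage with random stand-in features (8 channels, 4x6 spatial grid).
rng = np.random.default_rng(0)
f_ref = rng.standard_normal((8, 4, 6))
f_nbr = rng.standard_normal((8, 4, 6))
pyr = volume_pyramid(correlation_volume(f_ref, f_nbr), levels=3)
```

Because every reference pixel is compared with every neighbor pixel, the volume costs memory quadratic in the number of pixels, which is why such volumes are usually computed on downsampled feature maps rather than on full-resolution frames.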