Journal of ZheJiang University (Engineering Science)  2025, Vol. 59 Issue (5): 956-963    DOI: 10.3785/j.issn.1008-973X.2025.05.009
    
Video snapshot compressive imaging reconstruction based on temporal super-resolution
Zan CHEN, Ran LI, Yuanjing FENG, Yongqiang LI
College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China

Abstract  

A voxel-flow-based deep unfolding reconstruction framework that performs temporal super-resolution on the reconstructed video frames was proposed, aiming at the high reconstruction hardware burden and poor reconstruction quality of video snapshot compressive imaging (SCI) caused by the small compressive sampling rate. A deep denoising network derived from an optimization iteration was proposed to reconstruct the initial frames iteratively. The video features of the denoising network were converted into voxel flow features in order to estimate the voxel information. A motion regularizer was constructed on the basis of the voxel flow to compute the temporally super-resolved frames from the voxels of the original frames. Group convolution was incorporated into the model to fuse the voxel flow information of different stages and to reduce the loss of motion information. Experimental results showed that the average reconstructed peak signal-to-noise ratio on the benchmark dataset was improved by 0.23 dB compared with the comparison methods, and the visual quality of the reconstructed frames was higher. With the same reconstructed-video frame rate, the proposed method can significantly reduce the compressive sampling rate of the video SCI system while maintaining high-quality reconstruction results.
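For readers unfamiliar with the setting, the following minimal Python (PyTorch-style) sketch illustrates the video SCI measurement model and a generic deep-unfolding loop in which each phase applies a data-fidelity step followed by a learned denoiser. The function and module names (sci_forward, sci_adjoint, UnfoldingStage), the placeholder CNN, and all shapes are illustrative assumptions and do not reproduce the authors' network.

```python
import torch
import torch.nn as nn

def sci_forward(x, masks):
    """Video SCI measurement: collapse B coded frames into one 2D snapshot.
    x:     (B, H, W) video frames
    masks: (B, H, W) binary coding masks (one per frame)
    returns y: (H, W) single compressed measurement
    """
    return (masks * x).sum(dim=0)

def sci_adjoint(y, masks):
    """Transpose operator: spread the 2D measurement back onto B frames."""
    return masks * y.unsqueeze(0)

class UnfoldingStage(nn.Module):
    """One unfolding phase: gradient step on the data term, then a learned
    denoiser (a plain CNN here, standing in for the paper's D^k)."""
    def __init__(self, channels=16):
        super().__init__()
        self.step = nn.Parameter(torch.tensor(1.0))       # learnable step size
        self.denoiser = nn.Sequential(                     # placeholder denoiser
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 3, padding=1))

    def forward(self, x, y, masks):
        # data-fidelity step: x <- x + eta * Phi^T (y - Phi x)
        residual = y - sci_forward(x, masks)
        x = x + self.step * sci_adjoint(residual, masks)
        # frame-wise denoising (B frames treated as a batch of 1-channel images)
        return self.denoiser(x.unsqueeze(1)).squeeze(1)

# toy usage: B = 8 coded frames of size 64x64, K = 5 unfolding phases
B, H, W, K = 8, 64, 64, 5
masks = (torch.rand(B, H, W) > 0.5).float()
video = torch.rand(B, H, W)
y = sci_forward(video, masks)

stages = nn.ModuleList(UnfoldingStage() for _ in range(K))
x = sci_adjoint(y, masks)                  # simple initialization
for stage in stages:
    x = stage(x, y, masks)
```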



Key words: snapshot compressive imaging; compressive sensing; voxel flow; deep learning; super-resolution
Received: 09 April 2024      Published: 25 April 2025
CLC:  TP 391  
Fund: National Natural Science Foundation of China (62002327); Natural Science Foundation of Zhejiang Province (LQ21F020017).
Cite this article:

Zan CHEN,Ran LI,Yuanjing FENG,Yongqiang LI. Video snapshot compressive imaging reconstruction based on temporal super-resolution. Journal of ZheJiang University (Engineering Science), 2025, 59(5): 956-963.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2025.05.009     OR     https://www.zjujournals.com/eng/Y2025/V59/I5/956


Fig.1 Schematic diagram of SCI system video acquisition and reconstruction
Fig.2 Schematic of temporal super-resolution model based on voxel flow
Fig.3 Details of denoiser $D^k$ and motion regularizer $M^k$ in each iteration phase of proposed model
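Fig.3 describes the denoiser and the motion regularizer of each phase. As background, the sketch below shows the voxel-flow warping idea behind such a motion regularizer, following the general deep voxel flow formulation of [23]: a predicted flow field and a temporal blending weight sample voxels from two adjacent reconstructed frames to synthesize an intermediate, temporally super-resolved frame. The function name, tensor shapes, and the flow-direction convention are illustrative assumptions, not the authors' implementation of $M^k$.

```python
import torch
import torch.nn.functional as F

def synthesize_with_voxel_flow(frame0, frame1, flow, blend):
    """Synthesize an intermediate frame from two reconstructed frames with a
    voxel flow field, in the spirit of deep voxel flow [23].
    frame0, frame1: (N, C, H, W) adjacent reconstructed frames
    flow:  (N, 2, H, W) pixel displacements; assumed to point toward frame1
           (and with opposite sign toward frame0)
    blend: (N, 1, H, W) temporal blending weights in [0, 1]
    """
    n, _, h, w = flow.shape
    # base sampling grid in normalized [-1, 1] coordinates, as grid_sample expects
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w),
                            indexing="ij")
    base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    # convert pixel displacements to normalized offsets
    norm_flow = torch.stack((flow[:, 0] * 2 / max(w - 1, 1),
                             flow[:, 1] * 2 / max(h - 1, 1)), dim=-1)
    # bilinear warps of the two source frames along opposite flow directions
    warped0 = F.grid_sample(frame0, base - norm_flow, align_corners=True)
    warped1 = F.grid_sample(frame1, base + norm_flow, align_corners=True)
    # temporal (third) dimension of the trilinear sampling: blend the two warps
    return blend * warped0 + (1 - blend) * warped1

# toy usage: zero motion reduces to a simple 50/50 blend of the two frames
f0, f1 = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
flow = torch.zeros(1, 2, 64, 64)
blend = torch.full((1, 1, 64, 64), 0.5)
mid = synthesize_with_voxel_flow(f0, f1, flow, blend)
```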
Method | Kobe | Traffic | Runner | Drop | Aerial | Crash | Average | tr/s
Tensor-FISTA[19] | 25.02, 0.804 | 22.71, 0.822 | 30.32, 0.942 | 34.36, 0.971 | 25.95, 0.876 | 25.50, 0.891 | 27.31, 0.884 | 0.0166
E2E-CNN[17] | 26.24, 0.820 | 24.53, 0.888 | 33.80, 0.974 | 36.66, 0.989 | 27.29, 0.915 | 26.30, 0.912 | 29.14, 0.917 | 0.0098
RevSCI[18] | 27.51, 0.884 | 24.87, 0.898 | 34.05, 0.976 | 37.70, 0.990 | 26.97, 0.912 | 26.31, 0.915 | 29.57, 0.929 | 0.1412
ISTA-Rev-AE[29] | 25.91, 0.811 | 24.09, 0.870 | 33.07, 0.977 | 38.05, 0.970 | 26.73, 0.903 | 26.18, 0.908 | 28.96, 0.909 | 0.0481
GAP-Unet-S12[20] | 27.48, 0.856 | 25.55, 0.907 | 35.29, 0.980 | 37.18, 0.992 | 27.90, 0.924 | 26.83, 0.925 | 30.04, 0.931 | 0.0327
HQS-RevSCI[30] | 27.59, 0.875 | 25.14, 0.902 | 34.30, 0.977 | 38.15, 0.990 | 27.27, 0.917 | 26.32, 0.914 | 29.79, 0.929 | 0.4136
FISTA-Rev-AE-3D[31] | 27.58, 0.865 | 25.59, 0.909 | 35.03, 0.979 | 38.59, 0.991 | 27.58, 0.921 | 26.57, 0.915 | 30.16, 0.930 | 0.1281
SCI-OF[32] | 29.03, 0.916 | 26.83, 0.933 | 35.32, 0.980 | 39.68, 0.992 | 28.07, 0.932 | 27.34, 0.939 | 31.04, 0.949 | 0.2476
EfficientSCI[33] | 25.25, 0.826 | 22.65, 0.828 | 31.34, 0.962 | 35.51, 0.984 | 26.02, 0.890 | 25.52, 0.898 | 27.71, 0.898 | 0.0206
Res2former[34] | 26.54, 0.858 | 24.32, 0.868 | 33.42, 0.973 | 37.90, 0.989 | 27.30, 0.916 | 26.58, 0.924 | 29.34, 0.921 | 0.0216
Proposed method | 28.87, 0.912 | 27.05, 0.935 | 36.29, 0.982 | 39.82, 0.993 | 28.29, 0.933 | 27.35, 0.941 | 31.27, 0.949 | 0.2509
Tab.1 Comparison results (PSNR/dB, SSIM) and running time of simulated data reconstruction; in each cell, the left value is PSNR and the right is SSIM
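The PSNR values in Tab.1 and Tab.2 are standard full-reference peak signal-to-noise ratios. As a reference point, the minimal sketch below shows how a per-video PSNR could be computed, assuming frames normalized to [0, 1]; it is illustrative only and is not the exact evaluation script used for the reported numbers.

```python
import numpy as np

def psnr(reference, reconstruction, peak=1.0):
    """Peak signal-to-noise ratio in dB between two videos of identical shape,
    with pixel values in [0, peak]."""
    mse = np.mean((reference.astype(np.float64) -
                   reconstruction.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10 * np.log10(peak ** 2 / mse)

# toy usage: average PSNR over the frames of one test video
ref = np.random.rand(8, 256, 256)
rec = ref + 0.01 * np.random.randn(8, 256, 256)
print(f"PSNR = {psnr(ref, rec):.2f} dB")
```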
Fig.4 Visual comparison of reconstruction results of different methods on gray-scale simulated data
Fig.5 Visual comparison of reconstruction results of different methods on real collected data
Method | Kobe | Traffic | Runner | Drop | Aerial | Crash | Average | tr/s
Re-extracted VF | 26.31, 0.847 | 22.89, 0.832 | 31.93, 0.959 | 35.21, 0.979 | 26.66, 0.899 | 25.87, 0.906 | 28.15, 0.904 | 0.4615
w/o MR | 26.52, 0.832 | 25.16, 0.901 | 34.73, 0.978 | 38.22, 0.990 | 27.35, 0.918 | 26.35, 0.913 | 29.72, 0.922 | 0.0718
w/o CF | 28.68, 0.908 | 26.69, 0.931 | 35.13, 0.980 | 39.47, 0.992 | 28.06, 0.931 | 27.27, 0.936 | 30.89, 0.946 | 0.2483
w/o GC | 28.55, 0.901 | 26.65, 0.931 | 35.67, 0.981 | 39.70, 0.992 | 27.99, 0.930 | 27.15, 0.936 | 30.95, 0.945 | 0.2112
Proposed method | 28.87, 0.912 | 27.05, 0.935 | 36.29, 0.982 | 39.82, 0.993 | 28.29, 0.933 | 27.35, 0.941 | 31.27, 0.949 | 0.2509
Tab.2 Comparison of reconstruction results of ablation experiments in terms of PSNR, SSIM, and running time; in each cell, the left value is PSNR and the right is SSIM
Fig.6 Comparison of reconstruction results of ablation studies on number of original observed frames
Fig.7 Comparison of reconstruction results of ablation studies on maximum phase number K
[1] CHEN Z, GUO W, FENG Y, et al. Deep-learned regularization and proximal operator for image compressive sensing [J]. IEEE Transactions on Image Processing, 2021, 30: 7112-7126.
[2] QIAO M, LIU X, YUAN X. Snapshot spatial–temporal compressive imaging [J]. Optics Letters, 2020, 45(7): 1659-1662. doi: 10.1364/OL.386238
[3] LU R, CHEN B, LIU G, et al. Dual-view snapshot compressive imaging via optical flow aided recurrent neural network [J]. International Journal of Computer Vision, 2021, 129(12): 3279-3298. doi: 10.1007/s11263-021-01532-1
[4] LLULL P, LIAO X, YUAN X, et al. Coded aperture compressive temporal imaging [J]. Optics Express, 2013, 21(9): 10526-10545. doi: 10.1364/OE.21.010526
[5] YUAN X, BRADY D, KATSAGGELOS A. Snapshot compressive imaging: theory, algorithms, and applications [J]. IEEE Signal Processing Magazine, 2021, 38(2): 65-88. doi: 10.1109/MSP.2020.3023869
[6] SUN Y, YUAN X, PANG S. Compressive high-speed stereo imaging [J]. Optics Express, 2017, 25(15): 18182-18190. doi: 10.1364/OE.25.018182
[7] ZHANG Z, DENG C, LIU Y, et al. Ten-mega-pixel snapshot compressive imaging with a hybrid coded aperture [J]. Photonics Research, 2021, 9(11): 2277-2287. doi: 10.1364/PRJ.435256
[8] ZHAN C, HU H, SUI X, et al. Joint resource allocation and 3D aerial trajectory design for video streaming in UAV communication systems [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 31(8): 3227-3241.
[9] LIN F, FU C, HE Y, et al. Learning temporary block-based bidirectional incongruity-aware correlation filters for efficient UAV object tracking [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 31(6): 2160-2174.
[10] LIU Y, YUAN X, SUO J, et al. Rank minimization for snapshot compressive imaging [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 41(12): 2990-3006.
[11] YUAN X, LIU Y, SUO J, et al. Plug-and-play algorithms for large-scale snapshot compressive imaging [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 1447-1457.
[12] YUAN X, LIU Y, SUO J, et al. Plug-and-play algorithms for video snapshot compressive imaging [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 44(10): 7093-7111.
[13] YANG J, YUAN X, LIAO X, et al. Video compressive sensing using Gaussian mixture models [J]. IEEE Transactions on Image Processing, 2014, 23(11): 4863-4878. doi: 10.1109/TIP.2014.2344294
[14] SHI B, WANG Y, LI D. Provable general bounded denoisers for snapshot compressive imaging with convergence guarantee [J]. IEEE Transactions on Computational Imaging, 2023, 9(2): 55-69.
[15] SHI B, LI D, WANG Y, et al. Provable deep video denoiser using spatial–temporal information for video snapshot compressive imaging: algorithm and convergence analysis [J]. Signal Processing, 2024, 214(1): 109236.
[16] SHI B, WANG Y, LIAN Q. A trainable bounded denoiser using double tight frame network for snapshot compressive imaging [C]// IEEE International Conference on Acoustics, Speech and Signal Processing. Singapore: IEEE, 2022: 1516-1520.
[17] QIAO M, MENG Z, MA J, et al. Deep learning for video compressive sensing [J]. APL Photonics, 2020, 5(3): 030801. doi: 10.1063/1.5140721
[18] CHENG Z, CHEN B, LIU G, et al. Memory-efficient network for large-scale video compressive sensing [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 16246-16255.
[19] HAN X, WU B, SHOU Z, et al. Tensor FISTA-Net for real-time snapshot compressive imaging [C]// Proceedings of the AAAI Conference on Artificial Intelligence. New York: AAAI, 2020, 34(7): 10933-10940.
[20] MENG Z, YUAN X, JALALI S. Deep unfolding for snapshot compressive imaging [J]. International Journal of Computer Vision, 2023, 131(11): 2933-2958. doi: 10.1007/s11263-023-01844-4
[21] WANG Z, ZHANG H, CHENG Z, et al. MetaSCI: scalable and adaptive reconstruction for video compressive sensing [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 2083-2092.
[22] NIKLAUS S, MAI L, LIU F. Video frame interpolation via adaptive convolution [C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 670-679.
[23] LIU Z, YEH R, TANG X, et al. Video frame synthesis using deep voxel flow [C]// IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 4463-4471.
[24] ZHANG Y, LIU X, WU B, et al. Video synthesis via transform-based tensor neural network [C]// Proceedings of the 28th ACM International Conference on Multimedia. Melbourne: ACM, 2020: 2454-2462.
[25] KRIZHEVSKY A, SUTSKEVER I, HINTON G. ImageNet classification with deep convolutional neural networks [J]. Communications of the ACM, 2017, 60(6): 84-90. doi: 10.1145/3065386
[26] XIE S, GIRSHICK R, DOLLÁR P, et al. Aggregated residual transformations for deep neural networks [C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 1492-1500.
[27] HUANG G, LIU S, MAATEN L, et al. CondenseNet: an efficient DenseNet using learned group convolutions [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 2752-2761.
[28] MIAO Y, ZHAO X, WANG J, et al. Snapshot compressive imaging using domain-factorized deep video prior [J]. IEEE Transactions on Computational Imaging, 2024, 10(1): 93-102.
[29] LI S, ZHENG Z, DAI W, et al. REV-AE: a learned frame set for image reconstruction [C]// IEEE International Conference on Acoustics, Speech and Signal Processing. Barcelona: IEEE, 2020: 1823-1827.
[30] WU Z, ZHANG J, MOU C. Dense deep unfolding network with 3D-CNN prior for snapshot compressive imaging [C]// IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 4872-4881.
[31] LI S, DAI W, ZHENG Z, et al. Reversible autoencoder: a CNN-based nonlinear lifting scheme for image reconstruction [J]. IEEE Transactions on Signal Processing, 2021, 69(5): 3117-3131.
[32] CHEN Z, LI R, LI Y, et al. Video snapshot compressive imaging via optical flow [C]// IEEE International Conference on Multimedia and Expo. Brisbane: IEEE, 2023: 2177-2182.
[33] WANG L, CAO M, YUAN X. EfficientSCI: densely connected network with space-time factorization for large-scale video snapshot compressive imaging [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 18477-18486.