  • A lightweight image super-resolution reconstruction method based on information distillation and shared attention network

       

      Abstract: Lightweight super-resolution reconstruction is an important technology in the field of image processing and has been widely applied in various domains. Although current image super-resolution methods, such as those based on convolutional neural networks and Transformer models, have achieved significant success in this field, they still face considerable challenges and limitations, including excessive parameter counts, high computational complexity, and substantial memory requirements. To address these issues, a novel lightweight image super-resolution reconstruction method named the Information Distillation and Shared Attention Network (IDSA-Net) is proposed, aiming to reduce computational burden while enhancing feature representation capability. First, an attention-sharing mechanism is introduced to construct an Attention-Sharing Distillation Module, allowing subsequent modules to avoid repeatedly computing the spatial attention matrix, which constitutes the main computational overhead in self-attention calculations. This enables cross-layer sharing of matrix information and reduces computational costs. The module integrates local feature extraction with convolutional operations and pairs self-attention calculations with sequence modeling units, thereby achieving information purification during the feature extraction and distillation stages. This enhances the network's ability to capture effective features and suppresses noise introduced during distillation. Furthermore, a Large-Kernel Channel Information Purification Module is designed, which combines channel shuffle operations and utilizes large-kernel depthwise separable convolutions from large-kernel attention to expand the receptive field. This enhances the model's perception of multi-scale features and global context, allowing the distilled features to be reweighted to strengthen useful information and suppress redundancy.
      Additionally, a novel Instance Normalization-based Hybrid Attention Module is proposed to learn channel and spatial features. In the channel attention phase, average pooling is performed separately along the width and height directions of the input features to mitigate the information loss caused by traditional global pooling. Channel features are extracted using one-dimensional depthwise separable convolutions, avoiding the dimensionality reduction and expansion operations commonly used in conventional methods. This better preserves the integrity of channel features and improves the precision of information representation. In the spatial attention phase, instance normalization is applied to each channel of every sample individually, enabling the model to focus on local structural features within the image rather than relying on global statistical information. Finally, sub-pixel convolutional layers are employed for upsampling and image reconstruction. Quantitative and visual analyses through comparative and ablation experiments on four public benchmark datasets—Set5, Set14, BSD100, and Urban100—demonstrate that the proposed method outperforms eleven state-of-the-art image super-resolution methods in terms of Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM), validating its effectiveness and accuracy.
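The channel shuffle operation mentioned in the Large-Kernel Channel Information Purification Module interleaves channels across groups so that grouped operations can exchange information. A minimal NumPy sketch of that standard operation follows; the group count and tensor shapes are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """Interleave channels across `groups` for a tensor of shape (N, C, H, W).

    After grouped convolutions, each group only sees its own channels;
    shuffling redistributes channels so later layers mix information
    from every group.
    """
    n, c, h, w = x.shape
    assert c % groups == 0, "channel count must divide evenly into groups"
    # (N, C, H, W) -> (N, groups, C//groups, H, W)
    x = x.reshape(n, groups, c // groups, h, w)
    # Swap the group axis with the per-group channel axis, then flatten back.
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(n, c, h, w)
```

For example, with 4 channels and 2 groups, channel order [0, 1, 2, 3] becomes [0, 2, 1, 3]: the first channel of each group is emitted before the second channel of any group.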

       
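The sub-pixel convolution layer used for upsampling ends with a pixel-shuffle rearrangement: a convolution produces r² times as many channels, which are then folded into an r×-larger spatial grid. A minimal NumPy sketch of that rearrangement step (the scale factor and shapes below are illustrative):

```python
import numpy as np

def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a tensor of shape (N, C*r*r, H, W) into (N, C, H*r, W*r).

    Each block of r*r channels supplies the r x r sub-pixel values for
    one output channel, which is how sub-pixel convolution upsamples
    without interpolation.
    """
    n, crr, h, w = x.shape
    assert crr % (r * r) == 0, "channels must be divisible by r*r"
    c = crr // (r * r)
    # (N, C, r, r, H, W) -> (N, C, H, r, W, r) -> (N, C, H*r, W*r)
    x = x.reshape(n, c, r, r, h, w)
    x = x.transpose(0, 1, 4, 2, 5, 3)
    return x.reshape(n, c, h * r, w * r)
```

With r = 2, four input channels collapse into one output channel at twice the height and width, each input channel filling one position of every 2×2 output block.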
