VITMST++: Efficient Hyperspectral Reconstruction Through Vision Transformer-Based Spatial Compression

Silveira, Ana C. Caznok; do Carmo, Diedre S.; Ueda, Lucas H.; Fantinato, Denis G.; Costa, Paula D. P.; Rittner, Leticia

Author(s):	Silveira, Ana C. Caznok; do Carmo, Diedre S.; Ueda, Lucas H.; Fantinato, Denis G.; Costa, Paula D. P.; Rittner, Leticia. Total number of authors: 6
Document type:	Scientific article
Source:	IEEE OPEN JOURNAL OF SIGNAL PROCESSING; v. 6, 7 pp., 2025-01-01.
Abstract
Hyperspectral channel reconstruction transforms a subsampled multispectral image into a hyperspectral image, providing higher spectral resolution without dedicated acquisition hardware and cameras. The Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction (MST++) is a state-of-the-art channel reconstruction technique, but it faces memory limitations on images with high spatial resolution. In this context, we introduced VITMST++, a novel architecture incorporating Vision Transformer embeddings for spatial compression, multi-resolution image context, and a custom channel-weighted loss. Developed for the ICASSP 2024 HyperSkin Challenge, VITMST++ outperforms the state-of-the-art MST++ in both reconstruction performance and computational efficiency. In this work, we perform a deeper analysis of the main aspects of VITMST++: efficiency, quantitative performance, and generalization to other datasets. Results show that VITMST++ achieves SAM and SSIM hyperspectral reconstruction values similar to those of state-of-the-art methods, while consuming up to threefold less memory and needing up to 10 times fewer multiply-add operations. (AU)
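
For readers unfamiliar with SAM (spectral angle mapper), one of the two metrics named in the abstract, the following is a minimal sketch of its standard definition: the angle between the predicted and reference spectrum of each pixel, averaged over pixels. It is not taken from the VITMST++ code; the function names `spectral_angle` and `mean_sam`, the list-of-spectra input layout, and the zero-angle convention for all-zero spectra are assumptions made for illustration.

```python
import math


def spectral_angle(pred, target):
    """Angle in radians between two spectra given as sequences of channel values."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    if norm == 0.0:
        # Assumed convention: an all-zero spectrum contributes an angle of 0.
        return 0.0
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos)


def mean_sam(pred_pixels, target_pixels):
    """Average spectral angle over paired pixels (each pixel is one spectrum)."""
    angles = [spectral_angle(p, t) for p, t in zip(pred_pixels, target_pixels)]
    return sum(angles) / len(angles)
```

Because the angle depends only on the direction of each spectrum, SAM is insensitive to a uniform brightness scaling of a pixel; lower values indicate a closer spectral match.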

FAPESP grant:	20/09838-0 - BI0S - Brazilian Institute of Data Science
Grantee:	João Marcos Travassos Romano
Support type:	Research Grants - Research Centers in Engineering Program
