Abstract
Multimodal transformer technology has stood out in recent years for its ability to integrate and process data from multiple modalities simultaneously, such as text, images, audio, and time series. Thanks to technological advances, increasing volumes of multimodal data are being transmitted, and a growing number of multimodal application scenarios are emerging. In this context, multimoda…