RankMean: Module-Level Importance Score for Merging Fine-tuned Large Language Models

Author(s):
Perin, Gabriel J.; Chen, Xuxi; Liu, Shusen; Kailkhura, Bhavya; Wang, Zhangyang; Gallagher, Brian
Total number of authors: 6
Document type: Scientific article
Source: FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024; 7 pp., 2024.
Abstract

Traditionally, developing new language models (LMs) capable of addressing multiple tasks involves fine-tuning pre-trained LMs using a wide collection of datasets, a process that often incurs significant computational expenses. Model merging emerges as a cost-effective alternative, allowing the integration of existing models fine-tuned on different tasks into a single model that performs well across all tasks, eliminating the need for additional training. In this paper, we propose RankMean, an algorithm for merging fine-tuned LMs without requiring any downstream data. RankMean determines merging coefficients based on the relative rankings of weight change magnitudes and applies these coefficients for module-wise integration of various fine-tuned models. Our experimental results demonstrate that RankMean outperforms existing baseline methods on multiple benchmarks. The code is available at github.com/VITA-Group/RankMean. (AU)
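As a rough illustration of the idea described in the abstract, the following sketch merges several fine-tuned models module by module, deriving coefficients from the relative rankings of weight-change magnitudes. The exact ranking scheme, magnitude measure, and normalization here are assumptions for illustration, not the paper's actual formulation (see the linked repository for that).

```python
import numpy as np

def rankmean_merge(base, finetuned_models):
    """Rank-based module-wise merging sketch.

    base: dict mapping module name -> np.ndarray (pre-trained weights)
    finetuned_models: list of dicts with the same keys (fine-tuned weights)

    For each module, the weight change of every fine-tuned model is
    measured, the magnitudes are ranked, and the ranks are normalized
    into merging coefficients (a hypothetical normalization choice).
    """
    merged = {}
    for name, w0 in base.items():
        # Weight change of each fine-tuned model relative to the base
        deltas = [m[name] - w0 for m in finetuned_models]
        # Magnitude of the change for this module in each model
        mags = np.array([np.abs(d).mean() for d in deltas])
        # Rank the magnitudes (1 = smallest change) and normalize
        # the ranks so the coefficients sum to 1
        ranks = mags.argsort().argsort() + 1.0
        coefs = ranks / ranks.sum()
        merged[name] = w0 + sum(c * d for c, d in zip(coefs, deltas))
    return merged
```

Because the coefficients are computed from the weights alone, no downstream data is needed, matching the data-free setting the abstract describes.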

FAPESP Process: 23/15047-4 - Merging of Large Language Models
Grantee: Gabriel Jacob Perin
Support type: Scholarships abroad - Research Internship - Scientific Initiation
FAPESP Process: 22/11645-1 - Classification of stars, galaxies, and quasars based on multiband photometric images
Grantee: Gabriel Jacob Perin
Support type: Scholarships in Brazil - Scientific Initiation
FAPESP Process: 22/15304-4 - Learning context-rich representations for computer vision
Grantee: Nina Sumiko Tomita Hirata
Support type: Research Grants - Thematic Grants