RankMean: Module-Level Importance Score for Merging Fine-tuned Large Language Models

Author(s):
Perin, Gabriel J.; Chen, Xuxi; Liu, Shusen; Kailkhura, Bhavya; Wang, Zhangyang; Gallagher, Brian
Total number of authors: 6
Document type: Scientific article
Source: FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024; 7 pp., 2024.
Abstract

Traditionally, developing new language models (LMs) capable of addressing multiple tasks involves fine-tuning pre-trained LMs using a wide collection of datasets, a process that often incurs significant computational expenses. Model merging emerges as a cost-effective alternative, allowing the integration of existing models fine-tuned on different tasks into a single model that performs well across all tasks, eliminating the need for additional training. In this paper, we propose RankMean, an algorithm for merging fine-tuned LMs without requiring any downstream data. RankMean determines merging coefficients based on the relative rankings of weight change magnitudes and applies these coefficients for module-wise integration of various fine-tuned models. Our experimental results demonstrate that RankMean outperforms existing baseline methods on multiple benchmarks. The code is available at github.com/VITA-Group/RankMean. (AU)
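As a rough illustration of the idea described in the abstract, the following sketch merges several fine-tuned models module by module, deriving coefficients from the relative rankings of weight-change magnitudes. The exact ranking scheme, magnitude measure, and normalization here are assumptions for illustration, not the paper's actual formulation (see the linked repository for that).

```python
import numpy as np

def rankmean_merge(base, finetuned_models):
    """Rank-based module-wise merging sketch.

    base: dict mapping module name -> np.ndarray (pre-trained weights)
    finetuned_models: list of dicts with the same keys (fine-tuned weights)

    For each module, the weight change of every fine-tuned model is
    measured, the magnitudes are ranked, and the ranks are normalized
    into merging coefficients (a hypothetical normalization choice).
    """
    merged = {}
    for name, w0 in base.items():
        # Weight change of each fine-tuned model relative to the base
        deltas = [m[name] - w0 for m in finetuned_models]
        # Magnitude of the change for this module in each model
        mags = np.array([np.abs(d).mean() for d in deltas])
        # Rank the magnitudes (1 = smallest change) and normalize
        # the ranks so the coefficients sum to 1
        ranks = mags.argsort().argsort() + 1.0
        coefs = ranks / ranks.sum()
        merged[name] = w0 + sum(c * d for c, d in zip(coefs, deltas))
    return merged
```

Because the coefficients are computed from the weights alone, no downstream data is needed, matching the data-free setting the abstract describes.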

FAPESP Process: 23/15047-4 - Merging of Large Language Models
Grantee: Gabriel Jacob Perin
Support type: Scholarships abroad - Research Internship - Scientific Initiation
FAPESP Process: 22/11645-1 - Classification of stars, galaxies, and quasars based on multiband photometric images
Grantee: Gabriel Jacob Perin
Support type: Scholarships in Brazil - Scientific Initiation
FAPESP Process: 22/15304-4 - Learning context-rich representations for computer vision
Grantee: Nina Sumiko Tomita Hirata
Support type: Research Grants - Thematic Grants