RankMean: Module-Level Importance Score for Merging Fine-tuned Large Language Models

Author(s):
Perin, Gabriel J.; Chen, Xuxi; Liu, Shusen; Kailkhura, Bhavya; Wang, Zhangyang; Gallagher, Brian
Total Authors: 6
Document type: Journal article
Source: Findings of the Association for Computational Linguistics: ACL 2024; 7 pages, 2024.
Abstract

Traditionally, developing new language models (LMs) capable of addressing multiple tasks involves fine-tuning pre-trained LMs using a wide collection of datasets, a process that often incurs significant computational expenses. Model merging emerges as a cost-effective alternative, allowing the integration of existing models fine-tuned on different tasks into a single model that performs well across all tasks, eliminating the need for additional training. In this paper, we propose RankMean, an algorithm for merging fine-tuned LMs without requiring any downstream data. RankMean determines merging coefficients based on the relative rankings of weight change magnitudes and applies these coefficients for module-wise integration of various fine-tuned models. Our experimental results demonstrate that RankMean outperforms existing baseline methods on multiple benchmarks. The code is available at github.com/VITA-Group/RankMean. (AU)
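The abstract describes the mechanism only at a high level: score each module by how much its weights moved during fine-tuning, turn the within-model rankings of those scores into merging coefficients, and average module by module. The Python sketch below is one hypothetical reading of that description, not the authors' reference implementation; the choice of the L2 norm as the change-magnitude measure and of normalized ranks as coefficients are assumptions, and the exact formulation may differ (see the linked repository for the official code).

```python
import torch

def rankmean_merge(pretrained, finetuned_models):
    """Sketch of a RankMean-style, data-free merge.

    `pretrained` and each entry of `finetuned_models` are state_dict-like
    mappings from module name to weight tensor. This is an assumed
    instantiation of the abstract's description, not the paper's exact method.
    """
    modules = list(pretrained.keys())

    # Magnitude of the weight change for each module of each fine-tuned model
    # (L2 norm of the delta is an assumption; the paper may use another norm).
    mags = [
        {m: (ft[m] - pretrained[m]).norm().item() for m in modules}
        for ft in finetuned_models
    ]

    # Convert each model's magnitudes into rank-based coefficients:
    # modules are ranked within a model (1 = smallest change), and the
    # normalized rank serves as that model's coefficient for the module.
    coeffs = []
    for mag in mags:
        order = sorted(modules, key=lambda m: mag[m])
        rank = {m: i + 1 for i, m in enumerate(order)}
        coeffs.append({m: rank[m] / len(modules) for m in modules})

    # Module-wise integration: a coefficient-weighted mean of the deltas,
    # added back onto the pretrained weights.
    merged = {}
    for m in modules:
        total = sum(c[m] for c in coeffs)
        delta = sum(
            c[m] * (ft[m] - pretrained[m])
            for c, ft in zip(coeffs, finetuned_models)
        )
        merged[m] = pretrained[m] + delta / total
    return merged
```

Because no downstream data enters the procedure, merging plain `model.state_dict()` snapshots of several fine-tuned checkpoints is sufficient input under these assumptions.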

FAPESP's process: 23/15047-4 - Model Merging for Large Language Model
Grantee: Gabriel Jacob Perin
Support Opportunities: Scholarships abroad - Research Internship - Scientific Initiation
FAPESP's process: 22/11645-1 - Classification of stars, galaxies, and quasars based on photometric multiband images
Grantee: Gabriel Jacob Perin
Support Opportunities: Scholarships in Brazil - Scientific Initiation
FAPESP's process: 22/15304-4 - Learning context rich representations for computer vision
Grantee: Nina Sumiko Tomita Hirata
Support Opportunities: Research Projects - Thematic Grants