RankMean: Module-Level Importance Score for Merging Fine-tuned Large Language Models

Author(s):
Perin, Gabriel J.; Chen, Xuxi; Liu, Shusen; Kailkhura, Bhavya; Wang, Zhangyang; Gallagher, Brian
Total Authors: 6
Document type: Journal article
Source: Findings of the Association for Computational Linguistics: ACL 2024; 7 pages, 2024.
Abstract

Traditionally, developing new language models (LMs) capable of addressing multiple tasks involves fine-tuning pre-trained LMs using a wide collection of datasets, a process that often incurs significant computational expenses. Model merging emerges as a cost-effective alternative, allowing the integration of existing models fine-tuned on different tasks into a single model that performs well across all tasks, eliminating the need for additional training. In this paper, we propose RankMean, an algorithm for merging fine-tuned LMs without requiring any downstream data. RankMean determines merging coefficients based on the relative rankings of weight change magnitudes and applies these coefficients for module-wise integration of various fine-tuned models. Our experimental results demonstrate that RankMean outperforms existing baseline methods on multiple benchmarks. The code is available at github.com/VITA-Group/RankMean. (AU)
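The abstract describes the mechanism only at a high level: score each module by how much its weights moved during fine-tuning, turn the within-model rankings of those scores into merging coefficients, and average module by module. The Python sketch below is one hypothetical reading of that description, not the authors' reference implementation; the choice of the L2 norm as the change-magnitude measure and of normalized ranks as coefficients are assumptions, and the exact formulation may differ (see the linked repository for the official code).

```python
import torch

def rankmean_merge(pretrained, finetuned_models):
    """Sketch of a RankMean-style, data-free merge.

    `pretrained` and each entry of `finetuned_models` are state_dict-like
    mappings from module name to weight tensor. This is an assumed
    instantiation of the abstract's description, not the paper's exact method.
    """
    modules = list(pretrained.keys())

    # Magnitude of the weight change for each module of each fine-tuned model
    # (L2 norm of the delta is an assumption; the paper may use another norm).
    mags = [
        {m: (ft[m] - pretrained[m]).norm().item() for m in modules}
        for ft in finetuned_models
    ]

    # Convert each model's magnitudes into rank-based coefficients:
    # modules are ranked within a model (1 = smallest change), and the
    # normalized rank serves as that model's coefficient for the module.
    coeffs = []
    for mag in mags:
        order = sorted(modules, key=lambda m: mag[m])
        rank = {m: i + 1 for i, m in enumerate(order)}
        coeffs.append({m: rank[m] / len(modules) for m in modules})

    # Module-wise integration: a coefficient-weighted mean of the deltas,
    # added back onto the pretrained weights.
    merged = {}
    for m in modules:
        total = sum(c[m] for c in coeffs)
        delta = sum(
            c[m] * (ft[m] - pretrained[m])
            for c, ft in zip(coeffs, finetuned_models)
        )
        merged[m] = pretrained[m] + delta / total
    return merged
```

Because no downstream data enters the procedure, merging plain `model.state_dict()` snapshots of several fine-tuned checkpoints is sufficient input under these assumptions.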

FAPESP's process: 23/15047-4 - Model Merging for Large Language Model
Grantee: Gabriel Jacob Perin
Support Opportunities: Scholarships abroad - Research Internship - Scientific Initiation
FAPESP's process: 22/11645-1 - Classification of stars, galaxies, and quasars based on photometric multiband images
Grantee: Gabriel Jacob Perin
Support Opportunities: Scholarships in Brazil - Scientific Initiation
FAPESP's process: 22/15304-4 - Learning context rich representations for computer vision
Grantee: Nina Sumiko Tomita Hirata
Support Opportunities: Research Projects - Thematic Grants