
Model Merging for Large Language Model

Grant number: 23/15047-4
Support Opportunities: Scholarships abroad - Research Internship - Scientific Initiation
Effective date (Start): April 01, 2024
Effective date (End): July 31, 2024
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Nina Sumiko Tomita Hirata
Grantee: Gabriel Jacob Perin
Supervisor: Zhangyang Wang
Host Institution: Instituto de Matemática e Estatística (IME). Universidade de São Paulo (USP). São Paulo, SP, Brazil
Research place: University of Texas at Austin (UT), United States
Associated to the scholarship: 22/11645-1 - Classification of stars, galaxies, and quasars based on photometric multiband images, BP.IC

Abstract

Ensemble methods have been used successfully to exploit the diversity of different models and improve performance, at the cost of increased inference time, since every member of the ensemble must be evaluated. When the component models are neural networks that share the same architecture, merging methods have emerged as an alternative to ensembles: they combine the models into a single network and therefore avoid the increase in inference time. This research project aims to study these techniques in the context of Large Language Models (LLMs) on Natural Language Processing tasks.
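
The contrast between ensembling and merging can be illustrated with a minimal sketch. It assumes the simplest possible merging rule, uniform averaging of parameters; the abstract does not say which merging methods the project studies. The toy linear model and the names `merge_uniform`, `linear_forward`, and `ensemble_predict` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence

# A "model" here is just a mapping from parameter name to a flat list of floats.
Params = Dict[str, List[float]]


def merge_uniform(models: Sequence[Params]) -> Params:
    """Average parameters element-wise across models of identical architecture."""
    if not models:
        raise ValueError("need at least one model")
    names = set(models[0])
    merged: Params = {}
    for m in models:
        if set(m) != names:
            raise ValueError("models must share the same parameter names")
    for name in models[0]:
        tensors = [m[name] for m in models]
        if any(len(t) != len(tensors[0]) for t in tensors):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [sum(vals) / len(models) for vals in zip(*tensors)]
    return merged


def linear_forward(params: Params, x: Sequence[float]) -> float:
    """Toy stand-in for a network: y = w . x + b."""
    return sum(w * xi for w, xi in zip(params["w"], x)) + params["b"][0]


def ensemble_predict(models: Sequence[Params], x: Sequence[float]) -> float:
    """Ensemble: one forward pass per model, then average the outputs."""
    return sum(linear_forward(m, x) for m in models) / len(models)


# Two hypothetical models with the same architecture but different weights.
model_a: Params = {"w": [1.0, 2.0], "b": [0.5]}
model_b: Params = {"w": [3.0, 0.0], "b": [1.5]}
x = [2.0, 1.0]

merged = merge_uniform([model_a, model_b])
merged_out = linear_forward(merged, x)                  # one forward pass
ensemble_out = ensemble_predict([model_a, model_b], x)  # two forward passes
```

For this linear toy model the merged prediction equals the ensemble prediction exactly, while needing a single forward pass. For nonlinear networks such as LLMs the two generally differ, so merged models have to be evaluated empirically rather than assumed to match the ensemble.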
