Scholarship 23/15047-4 - Deep learning, Natural language processing - BV FAPESP

Model Merging for Large Language Model

Grant number: 23/15047-4
Support Opportunities: Scholarships abroad - Research Internship - Scientific Initiation
Start date: April 01, 2024
End date: July 31, 2024
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Nina Sumiko Tomita Hirata
Grantee: Gabriel Jacob Perin
Supervisor: Zhangyang Wang
Host Institution: Instituto de Matemática e Estatística (IME). Universidade de São Paulo (USP). São Paulo, SP, Brazil
Institution abroad: University of Texas at Austin (UT), United States
Associated to the scholarship: 22/11645-1 - Classification of stars, galaxies, and quasars based on photometric multiband images, BP.IC

Abstract

Ensemble methods have been used successfully to exploit the diversity of different models and improve performance, at the cost of increased inference time, since every model in the ensemble must be evaluated. When the constituent models are neural networks that share the same architecture, merging methods have emerged as an alternative to ensembles that avoids this increase in inference time, because the models are combined into a single network. This research project aims to study these techniques in the context of Large Language Models (LLMs) on Natural Language Processing tasks.
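
The contrast between the two approaches can be illustrated with a minimal sketch. It assumes uniform parameter averaging as the merging rule, which is only one simple merging method and is not stated to be the one studied in this project. The tiny two-layer network, the function names, and the randomly perturbed "models" are all hypothetical and serve only as illustration.

```python
import numpy as np

def forward(params, x):
    """Tiny two-layer ReLU network: the architecture shared by all models."""
    w1, w2 = params
    return np.maximum(x @ w1, 0.0) @ w2

def ensemble_predict(models, x):
    """Ensemble: one forward pass per model, then average the outputs."""
    return np.mean([forward(p, x) for p in models], axis=0)

def merge_uniform(models):
    """Merge (assumed rule): average each parameter tensor across models."""
    return tuple(np.mean(tensors, axis=0) for tensors in zip(*models))

rng = np.random.default_rng(0)
base = (rng.normal(size=(4, 8)), rng.normal(size=(8, 2)))
# Hypothetical stand-ins for fine-tuned models: small perturbations of one base.
models = [
    tuple(w + 0.01 * rng.normal(size=w.shape) for w in base)
    for _ in range(3)
]

x = rng.normal(size=(5, 4))
merged = merge_uniform(models)

y_ensemble = ensemble_predict(models, x)  # three forward passes at inference
y_merged = forward(merged, x)             # a single forward pass at inference
```

The ensemble needs as many forward passes as there are models, while the merged network needs one. Because of the nonlinearity, the two outputs are generally not identical, which is part of why merging methods need dedicated study.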
