Multi-objective optimal selection of benchmarking datasets for unbiased and efficient machine learning algorithm evaluation

Grant number: 23/10419-0
Support Opportunities: Scholarships abroad - Research Internship - Post-doctor
Effective date (Start): January 01, 2024
Effective date (End): December 31, 2024
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Ana Carolina Lorena
Grantee: João Luiz Junho Pereira
Supervisor: Kate Smith-Miles
Host Institution: Divisão de Ciência da Computação (IEC). Instituto Tecnológico de Aeronáutica (ITA). Ministério da Defesa (Brasil). São José dos Campos, SP, Brazil
Research place: University of Melbourne, Australia
Associated to the scholarship: 22/10683-7 - Is my benchmark of datasets challenging enough?, BP.PD

Abstract

Artificial Intelligence has revolutionized several areas of human knowledge and become very popular over the last decade. Supervised Machine Learning (ML) algorithms are the main protagonists of this revolution, and whenever a new supervised ML algorithm is developed or presented, it is crucial to assess its predictive performance across diverse datasets in order to identify its strengths, its weaknesses, and the situations where it is most useful. However, the current literature shows that these benchmarks of test datasets are typically gathered from public repositories, with the selection often ad hoc and lacking specific criteria. Studies are needed to propose benchmark datasets that properly evaluate ML algorithms. Such studies have already begun under the associated postdoctoral scholarship, uniting two major and complex areas: meta-learning and optimization. The former identifies the main characteristics and general aspects of each dataset and relates them to the predictive performance of ML models, while the latter can be applied on top of this information to select a subset of datasets that best meets one or more defined objectives. Numerous studies can be carried out by varying the objectives and optimization techniques, in particular by applying multi-objective optimization, in which several so-called non-dominated solutions are found and choosing among them requires complex decision-making tools. While the Australian group at the University of Melbourne has worked extensively on analyzing testing benchmarks through meta-learning, the Brazilian team is interested in optimizing the selection of subsets of datasets that are diverse enough to challenge ML algorithms in different ways, paving the way towards a better characterization of the domains of competence of each algorithm. This BEPE proposal aims to bring both teams together to advance this narrow and necessary knowledge frontier and so improve the reliability of ML models in use. (AU)
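
As an illustration of what "non-dominated solutions" means when selecting dataset subsets, the following is a minimal sketch, not the project's method. The dataset names, meta-feature vectors, difficulty scores, and the two objectives (diversity in meta-feature space and mean difficulty, both maximized) are all hypothetical choices made for this example, and exhaustive enumeration stands in for a real multi-objective optimizer.

```python
from itertools import combinations
from math import dist

# Hypothetical data: dataset name -> (meta-feature vector, difficulty in [0, 1]).
DATASETS = {
    "A": ((0.0, 0.0), 0.2),
    "B": ((1.0, 0.0), 0.9),
    "C": ((0.0, 1.0), 0.4),
    "D": ((0.1, 0.1), 0.8),
}

def diversity(names):
    """Mean pairwise Euclidean distance between meta-feature vectors."""
    pairs = list(combinations(names, 2))
    return sum(dist(DATASETS[a][0], DATASETS[b][0]) for a, b in pairs) / len(pairs)

def difficulty(names):
    """Mean difficulty score of the datasets in the subset."""
    return sum(DATASETS[n][1] for n in names) / len(names)

def dominates(u, v):
    """True if u is at least as good as v on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))

def pareto_front(scored):
    """Keep the (subset, objectives) pairs that no other pair dominates."""
    return [(s, f) for s, f in scored if not any(dominates(g, f) for _, g in scored)]

# Score every subset of size 2 (assumed fixed size, small enough to enumerate).
scored = [
    (subset, (diversity(subset), difficulty(subset)))
    for subset in combinations(sorted(DATASETS), 2)
]
front = pareto_front(scored)

for subset, (div, diff) in front:
    print(subset, round(div, 3), round(diff, 3))
```

Each subset left in `front` represents a different trade-off between the two objectives, which is why choosing a single one calls for a further decision-making step.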
