
Using the Grouping Operator from SQL for Data Preparation processes (ETL) using similarity

Grant number: 24/22291-1
Support Opportunities: Scholarships in Brazil - Scientific Initiation
Start date: March 01, 2025
End date: February 28, 2026
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Caetano Traina Junior
Grantee: Sandy da Costa Dutra
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Associated research grant: 23/18026-8 - Center for Data Science in Public Statistics, AP.CCD

Abstract

This is an exploratory scientific project focused on data processing activities that support the formulation of public policies. It applies new technologies and data management tools to processes that primarily involve the extraction, preparation, and loading/integration of data (ETL), carried out using Database Management Systems (DBMS). The project will focus on integrating data from multiple sources, represented in various formats and produced under different criteria for generation, representation, and storage, which are intended for use in analytical tools that support decision-making.

The objective of the research project is to introduce the student to the most common clustering-based data mining processes and to guide the student in applying them to integrate data from several environments of multiple partner institutions of the Center for Data Science for Public Statistics (CCDEP-FAPESP), coordinated by Fundação SEADE. This will involve studying the formats, granularity, scope, and quality of the data, while also considering how often the data are collected and integrated.

The algorithms will be coded and executed in SQL on datasets provided by the project and stored in a Relational Database Management System (RDBMS), generating integration processes that can be explored within the context of the overall project.
