
Large-scale data management for biodiversity: modeling and implementation of a relational database

Grant number: 25/04173-4
Support Opportunities: Scholarships in Brazil - Scientific Initiation
Start date: August 01, 2025
End date: July 31, 2026
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computer Systems
Principal Investigator: Dalton de Souza Amorim
Grantee: Leonardo Rizzo Costa
Host Institution: Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP). Universidade de São Paulo (USP). Ribeirão Preto, SP, Brazil
Associated research grant: 21/14092-0 - Insect biodiversity in an Amazon tropical forest: species richness, vertical structure and faunistic turnover, AP.BTA.TEM

Abstract

Modern biodiversity studies of insects, which sequence individuals collected through passive methods, generate massive amounts of data (morphological, biogeographical, behavioral, biological, seasonal, etc.), as well as more complex data such as images and genetic sequences. Accumulating and managing these data poses a particularly relevant challenge: developing and using effective and secure models for data storage, control, and querying. This Scientific Initiation project is related to a study on insect biodiversity in the Amazon (https://bv.fapesp.br/pt/auxilios/113041/biodiversidade-de-insetos-em-uma-floresta-tropical-amazonica-riqueza-de-especies-estrutura-vertical-/), which will collect approximately 4.5 million insects from five different forest strata in a primary forest area of the ZF2 Reserve at the National Institute for Amazonian Research, north of Manaus, and sequence around 320,000 individuals. This project aims to model and implement a relational database with the features needed to accumulate data produced within the project's core and contributed by dozens of participants. The data will then be analyzed to answer the project's key questions and made available in major international databases. The basic information generated is divided into control data (users, experts, sample codes, forest height, location, unique specimen codes, etc.) and product data (morphological data, genetic sequences, identification at various levels, conducted analyses, statistical data, images, etc.). Desired features include ease of database loading, efficiency in data manipulation and querying, agility in global data analyses, access control, and storage security. The system must be built on a comprehensive, open-source database management system and be modeled succinctly and efficiently, employing established principles and techniques.
As a result, the project expects to develop a robust and reliable computational framework that enables optimized storage and management of the thematic project's information, while also offering advanced querying and analysis tools for project managers. The same database will be used to store and analyze data from a sister project, which involves sequencing 280,000 insect specimens collected from two other areas in Central Amazonia (Iranduba, west of the Rio Negro, and Careiro Castanho, south of the Rio Solimões). The architecture and associated applications will later be made available to additional large-scale projects studying insect biodiversity in Brazil, supporting research centers that are currently raising funds to establish mirror laboratories. (AU)
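
The split between control data and product data described in the abstract could be modeled roughly as follows. This is only an illustrative sketch, not the project's actual schema: it uses Python's built-in sqlite3 module for convenience (the abstract does not name the database management system), and every table name, column name, and inserted value is a hypothetical placeholder.

```python
import sqlite3

# Hypothetical schema: all table and column names are assumptions for illustration.
SCHEMA = """
-- Control data: who contributes, where samples come from, which specimens exist.
CREATE TABLE app_user (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    role    TEXT NOT NULL CHECK (role IN ('manager', 'expert', 'contributor'))
);
CREATE TABLE sample (
    sample_code     TEXT PRIMARY KEY,
    locality        TEXT NOT NULL,
    forest_height_m REAL NOT NULL CHECK (forest_height_m >= 0)
);
CREATE TABLE specimen (
    specimen_code TEXT PRIMARY KEY,
    sample_code   TEXT NOT NULL REFERENCES sample(sample_code)
);
-- Product data: results attached to individual specimens.
CREATE TABLE sequence (
    sequence_id   INTEGER PRIMARY KEY,
    specimen_code TEXT NOT NULL REFERENCES specimen(specimen_code),
    nucleotides   TEXT NOT NULL
);
CREATE TABLE identification (
    identification_id INTEGER PRIMARY KEY,
    specimen_code     TEXT NOT NULL REFERENCES specimen(specimen_code),
    identified_by     INTEGER NOT NULL REFERENCES app_user(user_id),
    taxon_rank        TEXT NOT NULL,
    taxon_name        TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database and load a few made-up placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO app_user VALUES (1, 'Example Expert', 'expert')")
    conn.executemany(
        "INSERT INTO sample VALUES (?, ?, ?)",
        [("S-001", "ZF2 Reserve", 0.0), ("S-002", "ZF2 Reserve", 16.0)],
    )
    conn.executemany(
        "INSERT INTO specimen VALUES (?, ?)",
        [("SP-0001", "S-001"), ("SP-0002", "S-001"), ("SP-0003", "S-002")],
    )
    conn.execute(
        "INSERT INTO identification VALUES (1, 'SP-0001', 1, 'order', 'Diptera')"
    )
    conn.commit()
    return conn


def specimens_per_height(conn):
    """Count specimens per collection height, joining control tables."""
    query = """
        SELECT s.forest_height_m, COUNT(*)
        FROM specimen AS sp
        JOIN sample AS s ON s.sample_code = sp.sample_code
        GROUP BY s.forest_height_m
        ORDER BY s.forest_height_m
    """
    return conn.execute(query).fetchall()
```

Keeping product tables keyed on the unique specimen code means that sequences, identifications, and other results can be added by different participants without touching the control tables, while foreign keys reject records that point to unknown specimens.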
