Building a biological knowledge graph via Wikidata with a focus on the Human Cell Atlas

Author(s):
Tiago Lubiana Alves
Total Authors: 1
Document type: Doctoral Thesis
Press: São Paulo.
Institution: Universidade de São Paulo (USP). Instituto de Matemática e Estatística (IME/SBI)
Defense date:
Examining board members:
Helder Takashi Imoto Nakaya; Yesid Cuesta Astroz; Helena Paula Brentani; Jose Eduardo Santarem Segundo
Advisor: Helder Takashi Imoto Nakaya
Abstract

With the advancements in the Human Cell Atlas and single-cell omics technologies (such as single-cell RNA-seq), the need for strategies to systematically organize knowledge about cell types has grown. Formal representation systems are essential for tasks such as managing databases and annotating omics datasets. The Wikidata infrastructure, integrated with Wikipedia, offers a valuable resource for bioinformaticians seeking structured biocurated data. We used it to develop WikiORA, an interactive web platform for functional enrichment analysis. Since WikiORA and similar tools rely on Wikidata's coverage, we enhanced its content using two leading databases: PanglaoDB, for cell markers, and the Complex Portal, for protein complexes. Alongside integrating external sources, we explored how Wikidata could be enriched via de novo biocuration, creating a system to catalog cell diversity. As a result, we transformed Wikidata into the world's largest multi-species catalog of cell classes, assigning unique identifiers to over 6,000 entries. The curated data are publicly accessible through a graphical interface and a SPARQL endpoint. By adhering to the 5-star Linked Open Data standard, we enabled efficient reuse of the data, supporting the development of a multilingual Cell Ontology and powering automated Wikipedia infoboxes. In summary, this case study highlights Wikidata's value as a knowledge representation tool in the life sciences, particularly for organizing information on human cell diversity. (AU)
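
For readers unfamiliar with functional enrichment analysis, the following is a minimal sketch of an over-representation test based on the hypergeometric distribution, a common approach for this kind of analysis. The abstract does not state which statistic WikiORA uses, so this is an illustrative assumption rather than the platform's method; the function name `ora_p_value`, its parameters, and the toy numbers are all hypothetical.

```python
from math import comb


def ora_p_value(population_size, annotated_in_population,
                sample_size, annotated_in_sample):
    """Upper-tail hypergeometric probability P(X >= annotated_in_sample).

    Hypothetical helper: X is the number of annotated genes seen when
    drawing `sample_size` genes without replacement from a background of
    `population_size` genes, of which `annotated_in_population` carry
    the annotation of interest (for example, markers of one cell type).
    """
    # X cannot exceed either the sample size or the annotated gene count.
    upper = min(annotated_in_population, sample_size)
    total = comb(population_size, sample_size)
    favorable = sum(
        comb(annotated_in_population, k)
        * comb(population_size - annotated_in_population, sample_size - k)
        for k in range(annotated_in_sample, upper + 1)
    )
    return favorable / total


if __name__ == "__main__":
    # Toy numbers (assumed, not from the thesis): 20,000 background genes,
    # 100 annotated as markers, 50 genes in the query list, 5 of them markers.
    print(ora_p_value(20000, 100, 50, 5))
```

A small p-value suggests that the query list contains more annotated genes than expected by chance. In practice, such p-values would usually be corrected for multiple testing across all annotation sets.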

FAPESP's process: 19/26284-1 - Building a biological knowledge graph via Wikidata with a focus on the Human Cell Atlas
Grantee: Tiago Lubiana Alves
Support Opportunities: Scholarships in Brazil - Doctorate