Biological Sequence Analysis Using Complex Networks and Entropy Maximization: A Case Study in SARS-CoV-2

Author(s):
Pimenta-Zanon, Matheus H. ; De Souza, Vinicius Augusto ; Hashimoto, Ronaldo Fumio ; Lopes, Fabricio Martins ; Swarnkar, T ; Patnaik, S ; Mitra, P ; Misra, S ; Mishra, M
Total number of authors: 9
Document type: Scientific article
Source: AMBIENT INTELLIGENCE IN HEALTH CARE, ICAIHC 2022; v. 317, 10 pp., 2023-01-01.
Abstract

During the COVID-19 pandemic, several genetic mutations occurred in the SARS-CoV-2 virus, making it more infectious or transmissible. The World Health Organization (WHO) tracks and classifies variants as variants of concern (VOCs) or variants of interest (VOIs), depending on the level of transmissibility and dominance of the variant in the regions. The classification and identification of variants usually rely on sequence alignment techniques, which are computationally expensive and therefore impractical for classifying thousands of sequences simultaneously. In this work, an application of the alignment-free method BASiNETEntropy is proposed for the classification of the variants of concern of SARS-CoV-2. The method first maps each biological sequence to a complex network. The most informative edges are then selected through the entropy maximization principle, yielding a filtered network that contains only those edges. Next, complex network topological measurements are extracted from the filtered network and used as feature vectors in the classification process. Sequences of SARS-CoV-2 variants of concern extracted from NCBI were used to assess the method. Experimental results show that the extracted features can classify the variants of concern with high accuracy using only a few features, which reduces the feature space. Besides classifying the variants of concern, unique patterns (motifs) were also extracted for each variant, relative to the SARS-CoV-2 reference sequence. The proposed method is implemented as open-source software in the R language and is freely available at https://cran.r-project.org/web/packages/BASiNETEntropy/.
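
The pipeline described in the abstract (sequence to network, entropy-based edge selection, topological features) can be illustrated with a small Python sketch. This is not the BASiNETEntropy implementation, and the abstract does not specify its details, so several choices here are assumptions: nodes are overlapping k-mers with `k=3`, edges link consecutive k-mers, and edge selection uses a Kapur-style maximum-entropy threshold on edge weights as a stand-in for the method's own entropy maximization step. All function names and the toy sequence are hypothetical.

```python
from collections import Counter
from math import log2


def build_network(seq, k=3):
    """Map a sequence to a weighted undirected network (assumed mapping).

    Nodes are overlapping k-mers; an edge joins each k-mer to the one that
    starts one position later, weighted by how often that pair occurs.
    """
    words = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    edges = Counter()
    for a, b in zip(words, words[1:]):
        if a != b:  # skip self-loops
            edges[tuple(sorted((a, b)))] += 1
    return edges


def entropy(counts):
    """Shannon entropy in bits of the distribution given by raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def max_entropy_threshold(edges):
    """Kapur-style threshold: the weight t that maximizes the summed entropy
    of the weight histogram below t and at or above t (stand-in rule)."""
    if not edges:
        return 0
    hist = Counter(edges.values())  # edge weight -> number of edges
    levels = sorted(hist)
    best_t, best_h = levels[0], float("-inf")
    for t in levels[1:]:
        low = [hist[v] for v in levels if v < t]
        high = [hist[v] for v in levels if v >= t]
        h = entropy(low) + entropy(high)
        if h > best_h:
            best_t, best_h = t, h
    return best_t


def filter_network(edges):
    """Keep only edges whose weight reaches the selected threshold."""
    t = max_entropy_threshold(edges)
    return {e: w for e, w in edges.items() if w >= t}


def topological_features(edges):
    """A few simple topological measurements usable as a feature vector."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n, m = len(degree), len(edges)
    return {
        "n_nodes": n,
        "n_edges": m,
        "avg_degree": 2 * m / n if n else 0.0,
        "max_degree": max(degree.values(), default=0),
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
    }


if __name__ == "__main__":
    toy = "ACGTACGTTGCAACGTACGA"  # made-up sequence, not real viral data
    print(topological_features(filter_network(build_network(toy))))
```

In a classification setting, one such feature dictionary would be computed per sequence and passed to any standard classifier; the small number of measurements is what keeps the feature space compact.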

FAPESP grant: 15/22308-2 - Intermediate representations in Computational Science for knowledge discovery
Grantee: Roberto Marcondes Cesar Junior
Support type: Research Projects - Thematic Grants