Enhancing Low-Cost Molecular Property Prediction with Contrastive Learning on SMILES Representations

Author(s):
Quiles, Marcos G. ; Ribeiro, Piero A. L. ; Pinheiro, Gabriel A. ; Prati, Ronaldo C. ; da Silva, Juarez L. F.
Total number of authors: 5
Document type: Scientific article
Source: COMPUTATIONAL SCIENCE AND ITS APPLICATIONS-ICCSA 2024 WORKSHOPS, PT IX; v. 14823, p. 15-pg., 2024-01-01.
Abstract

This paper explores self-supervised contrastive learning techniques applied to Simplified Molecular Input Line Entry System (SMILES) representations. The contrastive objective is used to pre-train the model before its downstream application to molecular property prediction. Central to our approach is the use of contrastive learning paradigms in which pairs of distinct SMILES corresponding to the same molecule are treated as positive samples, whereas negative samples comprise pairs of SMILES that represent distinct molecules. This methodology aims to enrich the model's understanding of molecular structures by emphasizing the invariance of molecular properties under permutations of the SMILES representation, thereby enhancing the robustness and predictive accuracy of the model. We conducted a series of experiments on publicly available and proprietary datasets to evaluate the efficacy of our approach. Empirical results underscore the merits of the contrastive learning model over traditional methods, demonstrating improvements in molecular property prediction tasks. This study not only validates the potential of contrastive learning in chemoinformatics, but also sets a promising direction for future research on leveraging SMILES permutations for molecular analysis. (AU)
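
The pairing scheme described in the abstract can be illustrated with a minimal sketch. The toy molecules, the function names `build_pairs` and `info_nce`, and the choice of an InfoNCE-style loss with cosine similarity are all assumptions made for illustration; the record does not state which loss, encoder, or SMILES enumeration procedure the authors used.

```python
import math
from itertools import combinations

# Hypothetical toy data: each key is a molecule, each value lists
# alternative SMILES strings that encode that same molecule.
SMILES_VARIANTS = {
    "ethanol": ["CCO", "OCC", "C(O)C"],
    "acetic_acid": ["CC(=O)O", "OC(C)=O"],
    "methylamine": ["CN", "NC"],
}


def build_pairs(variants):
    """Return (positives, negatives) as lists of SMILES pairs.

    Positives: two distinct SMILES strings of the same molecule.
    Negatives: one SMILES string from each of two different molecules.
    """
    positives, negatives = [], []
    for smiles_list in variants.values():
        positives.extend(combinations(smiles_list, 2))
    for (_, a_list), (_, b_list) in combinations(variants.items(), 2):
        negatives.extend((a, b) for a in a_list for b in b_list)
    return positives, negatives


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor embedding (assumed loss, not
    confirmed by the source): negative log-softmax of the positive's
    similarity against the similarities of all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


if __name__ == "__main__":
    pos, neg = build_pairs(SMILES_VARIANTS)
    print(len(pos), "positive pairs;", len(neg), "negative pairs")
    # Stand-in 2-D embeddings; a real encoder would produce these from SMILES.
    print(info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.2]]))
```

In this sketch the loss is small when the anchor embedding lies close to the embedding of another SMILES string of the same molecule and far from embeddings of other molecules, which is the invariance the abstract describes.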

FAPESP grant: 22/09285-7 - Exploring chemical space via semi-supervised learning for the generation of new materials
Grantee: Marcos Gonçalves Quiles
Support type: Regular Research Grants
FAPESP grant: 17/11631-2 - CINE: computational materials design using atomistic simulations, meso-scale, multi-physics, and artificial intelligence for energy applications
Grantee: Juarez Lopes Ferreira da Silva
Support type: Research Grants - Research Centers in Engineering Program