Enhancing Low-Cost Molecular Property Prediction with Contrastive Learning on SMILES Representations

Author(s):
Quiles, Marcos G. ; Ribeiro, Piero A. L. ; Pinheiro, Gabriel A. ; Prati, Ronaldo C. ; da Silva, Juarez L. F.
Total Authors: 5
Document type: Journal article
Source: COMPUTATIONAL SCIENCE AND ITS APPLICATIONS-ICCSA 2024 WORKSHOPS, PT IX; v. 14823, 15 pp., 2024-01-01.
Abstract

This paper explores self-supervised contrastive learning applied to Simplified Molecular Input Line Entry System (SMILES) representations. Contrastive learning is used to pre-train the model before it is applied to the downstream task of molecular property prediction. Central to our approach is the construction of training pairs: two distinct SMILES strings that correspond to the same molecule form a positive pair, whereas two SMILES strings that represent distinct molecules form a negative pair. This methodology aims to enrich the model's understanding of molecular structures by emphasizing that molecular properties are invariant to permutations of the SMILES representation, thereby enhancing the robustness and predictive accuracy of the model. We conducted a series of experiments on publicly available and proprietary datasets to evaluate the efficacy of our approach. Empirical results underscore the merits of the contrastive learning model over traditional methods, demonstrating improvements in molecular property prediction tasks. This study not only validates the potential of contrastive learning in chemoinformatics but also sets a promising direction for future research on leveraging SMILES permutations for molecular analysis.
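
The pairing scheme described in the abstract can be illustrated with a small sketch. The abstract does not name the loss function or the encoder, so everything below is an assumption: the sketch uses the NT-Xent (normalized temperature-scaled cross-entropy) loss popularized by SimCLR as one common choice, the function names `cosine` and `nt_xent_loss` are hypothetical, and the embeddings are hand-written numbers rather than the output of a trained model. It is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss over a batch of paired embeddings (assumed loss, see lead-in).

    view_a[i] and view_b[i] stand for embeddings of two different SMILES
    strings of the same molecule i (the positive pair). Every other
    embedding in the batch belongs to a different molecule and acts as
    a negative.
    """
    n = len(view_a)
    embeddings = list(view_a) + list(view_b)  # 2n embeddings in total
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same molecule
        # Similarities to every other embedding (positive and negatives).
        logits = [
            cosine(embeddings[i], embeddings[k]) / temperature
            for k in range(2 * n)
            if k != i
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        positive_logit = cosine(embeddings[i], embeddings[positive]) / temperature
        total += log_denominator - positive_logit
    return total / (2 * n)


# Toy batch of two molecules. The vectors are made up for illustration;
# in practice an encoder would map each SMILES string to an embedding.
# Molecule 0: ethanol written as "CCO" and "OCC".
# Molecule 1: benzene written as "c1ccccc1" and "C1=CC=CC=C1".
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]

aligned = nt_xent_loss(view_a, view_b)
# Swapping the second view pairs each molecule with the wrong partner.
mismatched = nt_xent_loss(view_a, list(reversed(view_b)))
print(aligned < mismatched)
```

In this toy batch the loss is lower when the two SMILES views of each molecule have similar embeddings than when the views are mismatched, which is the behavior a contrastive pre-training objective of this kind rewards.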

FAPESP's process: 22/09285-7 - Chemical space exploration via semi-supervised learning for design of new materials
Grantee: Marcos Gonçalves Quiles
Support Opportunities: Regular Research Grants
FAPESP's process: 17/11631-2 - CINE: computational materials design based on atomistic simulations, meso-scale, multi-physics, and artificial intelligence for energy applications
Grantee: Juarez Lopes Ferreira da Silva
Support Opportunities: Research Grants - Research Centers in Engineering Program