Advanced search
Start date
Betweenand

A 5D approach to the calculation of protein structure

Grant number: 17/22465-6
Support type:Regular Research Grants
Duration: May 01, 2018 - April 30, 2020
Field of knowledge:Physical Sciences and Mathematics - Computer Science
Principal Investigator:Carlile Campos Lavor
Grantee:Carlile Campos Lavor
Home Institution: Instituto de Matemática, Estatística e Computação Científica (IMECC). Universidade Estadual de Campinas (UNICAMP). Campinas , SP, Brazil

Abstract

The calculation of the 3D structure of a protein molecule, using distances between nearby atoms from Nuclear Magnetic Resonance (NMR) experiments, is a fundamental problem related to the complex and costly process of development of new drugs by the pharmaceutical industry. The problem is NP-difficult, known in the literature by Molecular Distance Geometry Problem (MDGP). Unlike traditional methods (based on continuous optimization), we are working on a combinatorial model, based on stiffness properties of the graph associated with the problem (each vertex is related to an atom of the protein and when the distance is known between two atoms, we define a edge between the respective vertices, with weight given by the distance value). To solve the MDGP is to obtain an immersion of the graph in the 3D space, in such a way that the calculated Euclidean distances between pairs of atoms are equal to the weights of the corresponding edges. Considering that the distances of the NMR experiments are precise values, the combinatorial approach allows the problem search space to be represented by a binary tree, where an exact Branch & Prune (BP) method is defined to explore the tree for solutions, connected by symmetries that characterize each instance of the MDGP. The first attempt to generalize these results, including the uncertainties of the experimental data (with the distances being represented by "interval" values), was to consider samples at the associated intervals. This idea made the BP algorithm a heuristic, since it can no longer be guaranteed that a solution will be found. Refined samples exponentially increase the search space (the associated tree is no longer binary), and even then, the solution can be "lost" if the correct distance is between the values selected by the sampling process. To maintain the properties of the combinatorial approach and at the same time to consider the interval distances of the experimental data, we are proposing to represent the protein molecule in a space of 5 dimensions (Conformal Space), using a language more powerful than Linear Algebra: Geometric Algebra (also called Clifford's Algebra). The Conforming Model can be seen as an extension of the Projective Model, which uses homogeneous coordinates (4 dimensions), widely used in computational geometry problems. Due to the interdisciplinary character of this proposal, in addition to a collaboration already begun with researchers from French institutions (École Polytechnique, Institut Pasteur and Université de Rennes 1), we will also have the participation of researchers from the University of Colorado, University of Cambridge and Universitat Politècnica de Catalunya . (AU)