Advanced search
Start date
Betweenand

Distance geometry and Clifford algebra for 3D protein structure calculation

Grant number: 19/20047-8
Support type:Scholarships abroad - Research
Effective date (Start): March 02, 2020
Effective date (End): March 01, 2021
Field of knowledge:Physical Sciences and Mathematics - Computer Science - Computational Mathematics
Principal Investigator:Carlile Campos Lavor
Grantee:Carlile Campos Lavor
Host: Jose Luis Aragon Vera
Home Institution: Instituto de Matemática, Estatística e Computação Científica (IMECC). Universidade Estadual de Campinas (UNICAMP). Campinas , SP, Brazil
Local de pesquisa : Universidad Nacional Autónoma de México, Juriquilla (UNAM), Mexico  

Abstract

The problem is the calculation of the 3D structure of a protein molecule using distances between close atoms from Nuclear Magnetic Resonance (NMR) experiments. This is a fundamental problem in the complex and costly process of new drug development by the pharmaceutical industry. It is an NP-hard problem, known in the literature as Molecular Distance Geometry Problem (MDGP). Unlike traditional methods (based on continuous optimization), we are working on a combinatorial model based on rigidity properties of the graph related to the problem (each vertex is related to one atom and when the distance is known between two atoms, we define an edge between the respective vertices, weighted by the distance value). In order to solve the MDGP is necessary to get an immersion of the associated graph in 3D space such that the Euclidean distances calculated between pairs of atoms are equal to the corresponding edge weights. For precised distance values, the combinatorial approach allows the problem search space to be represented by a binary tree, where an exact method, the Branch & Prune (BP), was designed to explore the tree for solutions, connected by symmetries that characterize each instance of the MDGP. However, considering the uncertainties of the experimental data (with distances being represented by interval of real numbers), the BP algorithm becomes a heuristic when samples over such intervals must be selected. As we refine the process, the search space increases exponentially and yet there is no guarantee that a solution will be found as the correct distance may have been "lost" during the sampling procedure. In order to maintain the properties of the combinatorial approach (highlighting the symmetries mentioned above) and, at the same time, to consider the "interval distances" of the experimental data, we are proposing to represent the protein molecule in a 5-dimensional space (the Conformal Space) using a language more powerful than Linear Algebra: the Clifford Algebra. The Conformal Space can be seen as an extension of the Projective Space, which uses homogeneous coordinates (4 dimensions), widely used in computational geometry problems.