The processing of large-scale graphs - so-called planetary-scale, or Web-scale, graphs - has attracted interest from application domains such as e-commerce, computer networks, social networks, and biology (protein interaction), among many others. To process these graphs, which comprise millions of vertices and billions of edges, it has become common to use distributed processing on computing clusters managed by frameworks such as Hadoop. The problem with this approach is that building and managing such clusters often introduces more complexity than processing and analyzing the graphs of interest does. In this scenario, effort is misdirected, because the means require more work than the end goal. It is therefore desirable to be able to process planetary-scale graphs on a single computing node. To this end, in this proposal we will work with vertex-centric techniques combined with asynchronous parallel processing. Vertex-centric techniques allow local properties of a graph to be calculated without traversing the entire graph; asynchronous parallel processing, in turn, allows a graph that occupies terabytes on a single storage disk to be processed gradually, in steps that each fit in main memory. By combining these techniques, we intend to develop new algorithms able to calculate, on a single processing node, properties of large graphs that have not yet been explored; these properties will be used to define an analysis framework able to reveal patterns and to support comprehension and decision making.
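As an illustration only, the combination described above can be sketched in a few lines of Python. This is not the project's algorithm: it is a minimal, sequential sketch of vertex-centric label propagation for connected components, in which vertices are processed one interval at a time so that only that interval's adjacency lists are held in memory. The names `update`, `connected_components`, `make_loader`, and `load_interval` are hypothetical; the loader simulates reading a slice of the graph from disk, and the sketch assumes that the per-vertex labels fit in main memory even when the edges do not. Updates are asynchronous in the sense that a new label is visible to later vertices immediately, within the same pass; true parallel execution is omitted.

```python
from typing import Callable, Dict, List, Tuple

Adjacency = Dict[int, List[int]]


def update(v: int, neighbors: List[int], labels: List[int]) -> bool:
    """Vertex-centric update: v adopts the smallest label among itself and its neighbors."""
    new_label = min([labels[v]] + [labels[n] for n in neighbors])
    changed = new_label != labels[v]
    labels[v] = new_label  # written in place, so later updates see it at once
    return changed


def connected_components(
    num_vertices: int,
    load_interval: Callable[[int, int], Adjacency],
    interval_size: int,
) -> List[int]:
    """Label every vertex with the smallest vertex id in its component."""
    labels = list(range(num_vertices))
    changed = True
    while changed:  # repeat full passes until no label changes
        changed = False
        for start in range(0, num_vertices, interval_size):
            end = min(start + interval_size, num_vertices)
            # Only the adjacency lists of vertices in [start, end) are loaded.
            adjacency = load_interval(start, end)
            for v in range(start, end):
                if update(v, adjacency.get(v, []), labels):
                    changed = True
    return labels


def make_loader(edges: List[Tuple[int, int]]) -> Callable[[int, int], Adjacency]:
    """Hypothetical stand-in for disk access: serves undirected adjacency slices."""
    full: Adjacency = {}
    for u, v in edges:
        full.setdefault(u, []).append(v)
        full.setdefault(v, []).append(u)

    def load_interval(start: int, end: int) -> Adjacency:
        return {v: full[v] for v in range(start, end) if v in full}

    return load_interval


if __name__ == "__main__":
    loader = make_loader([(1, 2), (0, 1), (3, 4)])
    print(connected_components(6, loader, interval_size=2))
```

In this toy run, vertices 0, 1, and 2 end up sharing one label, vertices 3 and 4 share another, and the isolated vertex 5 keeps its own. In a real single-node system the interval size would be chosen so that each slice of edges fits in main memory, and the loader would read from on-disk shards.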