Analysis and extraction of bibliometrics in Big Data

Grant number: 15/20060-3
Support type: Scholarships in Brazil - Scientific Initiation
Effective date (Start): October 01, 2015
Effective date (End): September 30, 2016
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Carlos Roberto Valêncio
Grantee: José Carlos de Freitas
Home Institution: Instituto de Biociências, Letras e Ciências Exatas (IBILCE). Universidade Estadual Paulista (UNESP). Campus de São José do Rio Preto. São José do Rio Preto, SP, Brazil


Co-authorship networks in scientific production are an important resource for characterizing science within a community and the associations among its members [Mena-Chalco, 2014]. Such characterization is the subject of bibliometrics, the study of the quantitative aspects of the production, dissemination, and use of information, which develops mathematical models and patterns [Macias-Chapula, 1998]. Given this importance, this work develops an environment for data mining on academic and scientific production through co-authorship networks. The proposed environment consists of: 1) a unified data set, stored in a graph database, that represents the co-authorship network; 2) data import algorithms that extract data from sources related to scientific production and store them in the graph database; 3) a tool that extracts bibliometrics by applying graph theory and graph analysis. The project aims to contribute a data mining tool based on graph theory and graph analysis over a NoSQL graph database. The resulting environment is expected to allow the extraction of useful information about the degree of interaction between researchers and their respective productions, which in turn allows the characterization of groups of researchers, universities, graduate programs, and even science at the national level.
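
The kind of co-authorship network described above can be illustrated with a minimal sketch. It is not the project's implementation: the project stores the network in a NoSQL graph database, whereas this sketch assumes a plain in-memory dictionary, and the function names (`build_coauthorship`, `degree`, `collaboration_strength`) and the sample author names are hypothetical. It shows two simple measures that could serve as bibliometric indicators: the number of distinct co-authors of a researcher, and the number of papers two researchers share.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthorship(papers):
    """Build an undirected weighted graph from lists of paper authors.

    Returns {author: {coauthor: number_of_shared_papers}}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for authors in papers:
        unique = sorted(set(authors))  # ignore duplicate names on one paper
        for author in unique:
            graph[author]  # touch the key so solo authors become isolated nodes
        for a, b in combinations(unique, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def degree(graph, author):
    """Number of distinct co-authors of the given author."""
    return len(graph[author])


def collaboration_strength(graph, a, b):
    """Number of papers that authors a and b wrote together."""
    return graph[a].get(b, 0)


# Hypothetical sample data: each inner list is the author list of one paper.
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author D"],
]
network = build_coauthorship(papers)
```

In a graph database the same structure would typically map authors to nodes and shared papers to weighted edges, so that measures like these become graph queries rather than dictionary lookups.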