
Organization of open government data based on hierarchies of topics

Grant number: 12/01617-9
Support Opportunities: Scholarships in Brazil - Scientific Initiation
Effective date (Start): May 01, 2012
Effective date (End): April 30, 2013
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Solange Oliveira Rezende
Grantee: Daniel Luiz de Albuquerque
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil


Movements for greater political transparency, mostly American and reinforced by new proposals from the Obama administration, have promoted the idea of open government data. These ideas are slowly spreading around the world, including to Brazil, which already has movements working to open government data. This was the inspiration for this project. Open government data follow publication rules that allow full manipulation and make the data easy to use. In other cases, documents are merely public, with plans to open the data in the future. In those cases the documents are published online without any rule or standard, which discourages initiatives to use them. In general, government data are available in a raw format, which hinders visualization and dampens public interest in consulting the content. Compounding this difficulty, the amount of information grows steadily, which makes manual analysis impractical. Text mining addresses this need for greater accessibility: it makes it possible to organize the content and extract knowledge from it. The project consists of five phases, following the text mining process. First, the documents are collected and processed in raw format. After collection, the data are not yet suitable for the pattern extraction phase; careful preprocessing is required to protect the rest of the process, as far as possible, from inconsistencies and unsatisfactory results. In the pattern extraction phase, the results become more evident: topic hierarchies are built, defining the dependencies between terms and documents. This organization of the data also improves the search process, thereby encouraging use of the information. (AU)
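
As an illustration of the preprocessing and pattern extraction phases described in the abstract, the following is a minimal sketch only. The abstract does not name the algorithm used in the project, so everything here is an assumption: the tiny stopword list, the bag-of-words representation, the cosine similarity, and the simple agglomerative merging are stand-ins, and the function names (preprocess, cosine, build_hierarchy, top_terms) are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real list would be much larger).
STOPWORDS = {"the", "of", "and", "in", "for", "a", "to", "on"}


def preprocess(text):
    """Lowercase, tokenize, and drop stopwords; returns term counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_hierarchy(docs):
    """Agglomerative clustering: repeatedly merge the two most similar clusters.

    Each node is a tuple (term_counts, doc_ids, children). A merged node sums
    the term counts of its two children, so it relates terms to documents.
    """
    if not docs:
        raise ValueError("at least one document is required")
    nodes = [(preprocess(text), [i], []) for i, text in enumerate(docs)]
    while len(nodes) > 1:
        pairs = (
            (i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))
        )
        i, j = max(pairs, key=lambda p: cosine(nodes[p[0]][0], nodes[p[1]][0]))
        left, right = nodes[i], nodes[j]
        merged = (left[0] + right[0], left[1] + right[1], [left, right])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def top_terms(node, n=3):
    """Most frequent terms of a node, usable as a rough topic label."""
    return [term for term, _ in node[0].most_common(n)]
```

Calling build_hierarchy on a list of document strings returns the root node; each node's children can be walked recursively, and top_terms gives a rough label for each level, which is one way a hierarchy could support browsing and search.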
