Scholarship 11/20451-1 - Artificial intelligence - BV FAPESP

Induction of Topic-Based Bayesian Networks from Text for the Prediction of Sugar Cane Yields

Grant number: 11/20451-1
Support Opportunities: Scholarships in Brazil - Post-Doctoral
Start date: August 01, 2013
End date: July 31, 2016
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Alneu de Andrade Lopes
Grantee: Brett Mylo Drury
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil

Abstract

Text can carry timely information that aids business decisions, and it can express information that is difficult to represent in other formats. This information can be hard to interpret because: (1) natural language can be ambiguous or contradictory; (2) natural language can contain euphemisms and invented language; and (3) a single concept can be represented by many words or terms.
The field of Natural Language Processing provides a number of methodologies to address this problem. Topic detection and sentiment classification are the most relevant methodologies for extracting information from textual sources, but both have flaws. Topic detection can identify latent topics in a document collection, but the methodology has no mechanism for applying that information to a specific problem. Sentiment classification, and in particular feature-based sentiment analysis, allows the emotions expressed in text to be tied to a specific feature of a product or target entity. However, feature-based sentiment analysis does not determine the importance or rank of the feature. For example, it cannot determine whether the performance of a car is more or less important than its fuel consumption.
This project seeks to advance the state of the art in extracting and ranking information in text in order to make inferences about an external problem. The external problem is the prediction of yields of future crop harvests. Agricultural news contains information from which an informed forecast of harvest yields can be made, for example weather reports or pest numbers. Textual information contains topics, which are groups of statistically related words. The information contained in these topics and its direct relationship with crop yields is currently unknown. The project will seek to model these relationships by constructing a structured model of the specific domains from topics contained in agricultural news.
The topics will contain features, along with events or sentiment which can be assigned to these features, and therefore each topic can be labelled as negative or positive. The structured model will allow the prediction of relationships between topics. For example, a certain sequence of weather conditions may increase the pest population. The structured model and its inferred relations can be used to make inferences specific to crop yields.
This proposed approach addresses some weaknesses in the application of structured methods to the prediction of crop yields. The literature review conducted for this project concluded that Bayesian Networks for crop prediction are constructed from previously known factors, for example weather or pesticide spraying regimes. The proposed project seeks to learn Bayesian Networks directly from news text through the identification of topics and their interrelations. This unsupervised / semi-supervised approach will allow the identification of latent factors which may improve the predictive capability of a Bayesian Network.
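As a purely illustrative sketch, and not the project's actual method, the kind of inference such a structured model supports can be shown with a three-node chain in which a wet-weather topic influences a pest topic, which in turn influences a low-yield outcome. The variable names and every probability below are invented assumptions; in the proposed approach the structure and parameters would be learned from topics in news text rather than written by hand.

```python
from itertools import product

# Hypothetical conditional probability tables. All numbers are invented
# for illustration and do not come from the project.
P_WET = 0.3                                   # P(wet-weather topic active)
P_PEST_GIVEN_WET = {True: 0.6, False: 0.2}    # P(pest topic active | wet)
P_LOW_GIVEN_PEST = {True: 0.7, False: 0.1}    # P(low yield | pest)


def bernoulli(p_true, value):
    """Probability of a binary value given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(wet, pest, low_yield):
    """Joint probability factorised along the chain wet -> pest -> low_yield."""
    return (bernoulli(P_WET, wet)
            * bernoulli(P_PEST_GIVEN_WET[wet], pest)
            * bernoulli(P_LOW_GIVEN_PEST[pest], low_yield))


def prob_low_yield(wet=None):
    """P(low_yield=True | optional evidence on wet), by exhaustive enumeration."""
    wet_values = [True, False] if wet is None else [wet]
    both = [True, False]
    numerator = sum(joint(w, p, True) for w, p in product(wet_values, both))
    evidence = sum(joint(w, p, y) for w, p, y in product(wet_values, both, both))
    return numerator / evidence


if __name__ == "__main__":
    print("P(low yield)           =", round(prob_low_yield(), 3))
    print("P(low yield | wet)     =", round(prob_low_yield(True), 3))
    print("P(low yield | not wet) =", round(prob_low_yield(False), 3))
```

Enumeration is used here only because the network is tiny; a network learned from many topics would need a dedicated inference library or an approximate method.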


Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
DRURY, BRETT; VALVERDE-REBAZA, JORGE; MOURA, MARIA-FERNANDA; LOPES, ALNEU DE ANDRADE. A survey of the applications of Bayesian networks in agriculture. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, v. 65, p. 29-42, . (13/12191-5, 11/20451-1, 15/14228-9)
VALEJO, ALAN; VALVERDE-REBAZA, JORGE; DRURY, BRETT; LOPES, ALNEU DE ANDRADE; ALMEIDA, A; BERNARDINO, J; GOMES, EF. Multilevel refinement based on neighborhood similarity. PROCEEDINGS OF THE 18TH INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM (IDEAS14), v. N/A, p. 10-pg., . (11/20451-1, 11/22749-8, 13/12191-5)