Semantically Enriched Representations for Portuguese Text Mining: Models and Applications


Text Mining techniques have become essential for supporting text analysis and knowledge discovery as the volume and variety of digital text documents have grown, both on social networks and the Web and inside organizations. Regardless of the application task or the technique applied, handling text semantics remains an important challenge in the Text Mining process. The challenge is even greater for Portuguese texts, owing to the particularities of the language and the scarcity of Portuguese-language resources and research. In this context, this project aims to advance Text Mining research, focusing on the Portuguese language, and to disseminate knowledge of the field by applying Text Mining techniques to different real-world problems. We will investigate and propose semantically enriched text representation models, considering both the vector-space model and network-based representations, as well as their application in one-class learning. As a first step to support this research, we will collect, prepare, and characterize collections of texts written in Portuguese, and make consolidated information about the labeled collections available to the research community. Lastly, we will evaluate and apply semantically enriched text representations to different Text Mining problems, such as sentiment analysis, recommendation systems, fake news detection, literature-based discovery, and event mining. (AU)