
Semantic role labeling in financial market tweets: format definition and lexical resource reuse

Grant number: 25/07948-7
Support Opportunities: Scholarships in Brazil - Scientific Initiation
Start date: August 01, 2025
End date: July 31, 2026
Field of knowledge: Linguistics, Literature and Arts - Linguistics - Linguistic Theory and Analysis
Principal Investigator: Ariani Di Felippo
Grantee: Pedro Henrique Silva
Host Institution: Centro de Educação e Ciências Humanas (CECH). Universidade Federal de São Carlos (UFSCAR). São Carlos, SP, Brazil

Abstract

The relevance of "user-generated content" (UGC) on social media has motivated the creation of annotated corpora for the development of tools capable of processing UGC (e.g., taggers and parsers). For Portuguese, a standout resource is DANTEStocks, with ~4,000 tweets (now X posts) about the financial market. It is the first tweebank annotated according to the Universal Dependencies (UD) grammatical framework. This project aims to add a semantic role annotation layer to support the future development of Semantic Role Labeling (SRL) methods, which seek to identify the core informational content of utterances (who did what, to whom, where, when, etc.). The starting point will be the NounBank.DS repository, in which the nominal predicates of DANTEStocks are described following the English NomBank project (derived from the widely adopted PropBank). Following NomBank, the description of nominal predications in NounBank.DS uses a reduced tagset for core arguments (Arg0 to Arg5) and a broader one for modifiers, with all labels added to the syntactic annotation layer of the corpora. To convert the NounBank.DS data (currently in HTML and JSON formats) into a semantic annotation layer for DANTEStocks, both manual and, especially, semi-automatic methods based on current prompt engineering techniques will be explored. Before that, however, it will be necessary to define a semantic annotation format and a file format that integrates UD annotation with semantic roles.
