(Reference retrieved automatically from SciELO through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

RECOGNIZING THE VOCABULARY OF BRAZILIAN POPULAR NEWSPAPERS WITH A FREE-ACCESS COMPUTATIONAL DICTIONARY

Author(s):
Maria José Bocorny FINATTO [1] ; Oto Araújo VALE [2] ; Éric LAPORTE [3]
Total Authors: 3
Affiliation:
[1] Universidade Federal do Rio Grande do Sul. Programa de Pós-Graduação em Letras - Brazil
[2] Universidade Federal de São Carlos. Centro de Educação e Ciências Humanas - Brazil
[3] Université Paris-Est. Institut d’électronique et d’informatique Gaspard-Monge - France
Total Affiliations: 3
Document type: Journal article
Source: Alfa, rev. linguíst. (São José Rio Preto); v. 63, n. 1, p. 63-80, 2019-05-30.
Abstract

We report an experiment that checks how well a set of words from popular written Portuguese is identified by two versions of a computational dictionary of Brazilian Portuguese, DELAF PB 2004 and DELAF PB 2015. This dictionary is freely available for linguistic analyses of Brazilian Portuguese and for other research, which justifies a critical study. The vocabulary comes from the PorPopular corpus, composed of texts from the popular newspapers Diário Gaúcho (DG) and Massa! (MA). From DG, we retained a set of texts with 984,465 words (tokens), published in 2008, with the spelling used before the Portuguese Language Orthographic Agreement adopted in 2009. From MA, we examined issues from 2012, 2014 and 2015, with 215,776 words (tokens), all with the new spelling. The checking involved: a) generating lists of words (types) occurring in DG and MA; b) comparing them with the entry lists of both versions of DELAF PB; c) assessing the coverage of this vocabulary; d) proposing ways of incorporating the items not covered. The results show that an average of 19% of the types in DG were not found in DELAF PB 2004 or 2015. In MA, this average is 13%. Switching versions of the dictionary only slightly affected performance in recognizing the words. (AU)
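Steps a) to c) of the procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenization rule, the function names (`word_types`, `load_delaf_forms`, `coverage`) and the toy dictionary entries are all assumptions made for illustration. It assumes DELAF-style lines in which the inflected form precedes the first comma, and it ignores details such as escaped commas and multiword entries.

```python
import re


def word_types(text):
    """Return the set of lowercased word forms (types) found in text.

    Tokenization is an assumption: runs of letters, optionally joined
    by internal hyphens; digits and punctuation are ignored.
    """
    pattern = r"[^\W\d_]+(?:-[^\W\d_]+)*"
    return {token.lower() for token in re.findall(pattern, text)}


def load_delaf_forms(lines):
    """Collect inflected forms from lines shaped like 'form,lemma.TAGS'.

    Only the text before the first comma is kept; escaped commas are
    not handled in this sketch.
    """
    forms = set()
    for line in lines:
        line = line.strip()
        if line:
            forms.add(line.split(",", 1)[0].lower())
    return forms


def coverage(types, dictionary_forms):
    """Return (sorted types missing from the dictionary, missing rate)."""
    missing = sorted(types - dictionary_forms)
    rate = len(missing) / len(types) if types else 0.0
    return missing, rate


# Toy data, invented for illustration (not taken from PorPopular or DELAF PB).
sample_text = "O guri comprou abacates na feira."
sample_delaf = [
    "o,o.DET",
    "comprou,comprar.V",
    "abacates,abacate.N",
    "na,em.PREP",
    "feira,feira.N",
]

types = word_types(sample_text)
missing, rate = coverage(types, load_delaf_forms(sample_delaf))
print(missing)          # ['guri']
print(f"{rate:.1%}")    # share of types not found in the toy dictionary
```

Running the same comparison once per dictionary version and once per newspaper would give the kind of per-corpus, per-version rates of unrecognized types that the abstract reports.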

FAPESP's process: 16/24670-3 - One or two portraits of predication in Portuguese
Grantee: Oto Araújo Vale
Support Opportunities: Regular Research Grants