Text/non-text classification of connected components in document images

Author(s):
Julca-Aguilar, Frank D. ; Maia, Ana L. L. M. ; Hirata, Nina S. T. ; IEEE
Total Authors: 4
Document type: Journal article
Source: 2017 30TH SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI); v. N/A, p. 6-pg., 2017-01-01.
Abstract

Text segmentation is an important problem in applications related to document analysis. We address the problem of classifying connected components of a document image as text or non-text. Inspired by previous work in the literature, besides common size- and shape-related features extracted from the components, we also consider component images, with and without context information, as classifier inputs. Multi-layer perceptrons and convolutional neural networks are used to classify the components. High precision and recall are obtained for both text and non-text components. (AU)
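
As an illustration of the kind of input the abstract describes, the following minimal Python sketch extracts connected components from a small binary image and computes a few size- and shape-related features for each. The abstract does not list the features actually used in the paper, so the feature set here (width, height, area, aspect ratio, extent), the 8-connectivity choice, and the function names `connected_components` and `shape_features` are all assumptions made for illustration, not the authors' method.

```python
from collections import deque


def connected_components(image):
    """Return one pixel list per 8-connected group of non-zero pixels."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not seen[r][c]:
                # Breadth-first flood fill starting from an unvisited foreground pixel.
                seen[r][c] = True
                queue = deque([(r, c)])
                pixels = []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and image[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                components.append(pixels)
    return components


def shape_features(pixels):
    """Illustrative (assumed) size/shape features from a component's bounding box."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    area = len(pixels)
    return {
        "width": width,
        "height": height,
        "area": area,
        "aspect_ratio": width / height,
        "extent": area / (width * height),  # fraction of the bounding box that is filled
    }


# Toy binary image: a 2x2 blob and a 1x3 vertical stroke.
image = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]
features = [shape_features(p) for p in connected_components(image)]
```

Feature vectors of this kind could then be passed to a classifier such as a multi-layer perceptron; the component-image inputs mentioned in the abstract would instead use the cropped pixels themselves.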

FAPESP's process: 15/17741-9 - Combination of local and global features in image operator learning
Grantee: Nina Sumiko Tomita Hirata
Support Opportunities: Regular Research Grants