W-operator learning using linear models for both gray-level and binary inputs

Igor dos Santos Montagner
Total Authors: 1
Document type: Doctoral Thesis
Press: São Paulo.
Institution: Universidade de São Paulo (USP). Instituto de Matemática e Estatística
Defense date:
Examining board members: Roberto Hirata Junior; Stéphane Canu; André Carlos Ponce de Leon Ferreira de Carvalho; Roberto Marcondes Cesar Junior; Alexandre Xavier Falcão
Advisor: Roberto Hirata Junior; Nina Sumiko Tomita Hirata

Image Processing techniques can be used to solve a broad range of problems in areas such as medical imaging, document processing and object segmentation. Image operators are usually built by combining basic image operators and tuning their parameters. This requires both experience in Image Processing and trial and error to find the best combination of parameters. An alternative approach to designing image operators is to estimate them from pairs of training images containing examples of the expected input and their processed versions. By restricting the learned operators to those that are translation invariant and locally defined ($W$-operators), we can apply Machine Learning techniques to estimate image transformations. The shape that defines which neighbors are used is called a window. $W$-operators trained with large windows usually overfit due to the lack of sufficient training data. This issue is even more pronounced when training operators with gray-level inputs. Although approaches such as the two-level design, which combines multiple operators trained on smaller windows, partly mitigate these problems, they also require more complicated parameter determination to achieve good results. In this work we present techniques that increase the window sizes we can use and decrease the number of manually defined parameters in $W$-operator learning. The first one, KA, is based on Support Vector Machines and employs kernel approximations to estimate image transformations. We also present adequate kernels for processing binary and gray-level images. The second technique, NILC, automatically finds small subsets of operators that can be successfully combined using the two-level approach. Both methods achieve results competitive with methods from the literature in two different application domains. The first one is a binary document processing problem common in Optical Music Recognition, while the second is a segmentation problem in gray-level images. The same techniques were applied without modification in both domains. (AU)
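
To make the notion of a $W$-operator concrete, the following is a minimal sketch, not the thesis implementation. It assumes NumPy, an odd-sized boolean window, and zero padding at the image border; the names `extract_patterns` and `apply_w_operator` are hypothetical. Every pixel is described by the values seen through the window, and one local function maps each such pattern to an output value. Here the local function is a hand-written erosion; in the learning setting described above, it would instead be a classifier estimated from pairs of training images.

```python
import numpy as np


def extract_patterns(image, window):
    """Return one row per pixel holding the values seen through the window.

    Assumes `window` is a boolean array with odd height and width, centered
    on the pixel being processed; the border is zero-padded.
    """
    h, w = window.shape
    pad_y, pad_x = h // 2, w // 2
    padded = np.pad(image, ((pad_y, pad_y), (pad_x, pad_x)), mode="constant")
    offsets = np.argwhere(window)  # (dy, dx) positions that belong to the window
    rows, cols = image.shape
    patterns = np.empty((rows * cols, len(offsets)), dtype=image.dtype)
    for k, (dy, dx) in enumerate(offsets):
        patterns[:, k] = padded[dy:dy + rows, dx:dx + cols].ravel()
    return patterns


def apply_w_operator(image, window, local_fn):
    """Apply the same local function at every pixel (translation invariance)."""
    patterns = extract_patterns(image, window)
    return local_fn(patterns).reshape(image.shape)


# A 3x3 cross-shaped window.
cross = np.array([[0, 1, 0],
                  [1, 1, 1],
                  [0, 1, 0]], dtype=bool)

# Binary test image: a 3x3 square of ones inside a 7x7 background.
image = np.zeros((7, 7), dtype=np.uint8)
image[2:5, 2:5] = 1

# Hand-written local function: binary erosion (all window pixels must be 1).
def erosion(patterns):
    return patterns.min(axis=1)

eroded = apply_w_operator(image, cross, erosion)

# Shifting the input shifts the output by the same amount.
shifted = np.roll(image, 1, axis=1)
eroded_shifted = apply_w_operator(shifted, cross, erosion)
```

The rows returned by `extract_patterns`, paired with the corresponding pixels of the expected output image, are the kind of training examples a learned local function would be fitted on. The number of columns grows with the window size, which is one way to see why large windows demand more training data.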

FAPESP's process: 11/23310-0 - Automatic design of image operators: extension and contextualization to not necessarily boolean lattices
Grantee: Igor dos Santos Montagner
Support type: Scholarships in Brazil - Doctorate (Direct)