Automated CNN optimization using multi-objective grammatical evolution

Author(s):
da Silva, Cleber A. C. F. ; Rosa, Daniel Carneiro ; Miranda, Pericles B. C. ; Si, Tapas ; Cerri, Ricardo ; Basgalupp, Marcio P.
Total Authors: 6
Document type: Journal article
Source: APPLIED SOFT COMPUTING; v. 151, 10 pp., 2023-12-13.
Abstract

Selecting and optimizing Convolutional Neural Networks (CNNs) has become a very complex task given the number of associated optimizable parameters, as well as the fact that the arrangement of the layers present in a CNN directly influences its performance. Several research areas have used automation techniques to construct and optimize these architectures, with Grammatical Evolution (GE) being one of the most promising techniques. Although several works have proposed solutions to the problem in question, each adopts its own evaluation strategy (e.g., different datasets, evaluation metrics, hardware infrastructure). This divergence makes it difficult to compare the proposed approaches, and consequently, it is not possible to reach safe conclusions about the performance of the solutions. This work proposes an experimental evaluation of several context-free grammars listed in the literature for constructing and optimizing CNN architectures. In addition, we included four well-known CNNs as baselines: DenseNet169, EfficientNetB1, InceptionV3 and ResNet50V2. We aim to identify the best practices for elaborating grammars and to compare their results with consolidated CNNs for image classification problems in the literature. Additionally, we assessed all approaches in the same controlled environment (e.g., datasets, evaluation metrics, software and hardware setup) to guarantee fairness in the evaluation process. The experiments were carried out by investigating the performance of the models generated by different grammars in solving image classification problems on three datasets of variable dimensions: CIFAR-10, EuroSAT, and MNIST.
The experiments have validated several key findings: (i) the significance of optimizing Convolutional Neural Networks (CNNs); (ii) the potential of grammar-based methods as a promising alternative for this task, yielding CNN models that outperform state-of-the-art CNN architectures while possessing fewer trainable parameters, resulting in reduced computational complexity; (iii) grammars incorporating regularization layers (such as dropout and batch normalization) and those that confine the search space (via parameter constraints on CNNs) consistently produce high-performing models with lower complexity, even after a few generations of the evolutionary process; and (iv) the selection of the grammar for optimization can positively or negatively impact the model generation, depending on the specific task requirements. (AU)
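To illustrate the grammar-based approach the abstract describes, the following is a minimal sketch of grammatical evolution's genotype-to-phenotype mapping: a list of integer codons selects productions from a context-free grammar to derive a CNN layer sequence. The toy grammar, symbol names, and wrapping limit here are illustrative assumptions, not the grammars evaluated in the paper.

```python
# Toy context-free grammar for CNN architectures (illustrative only;
# not one of the grammars compared in the paper).
GRAMMAR = {
    "<net>": [["<layer>", "<net>"], ["<layer>"]],
    "<layer>": [["conv", "<filters>"], ["conv", "<filters>", "dropout"]],
    "<filters>": [["16"], ["32"], ["64"]],
}

def map_genotype(genotype, start="<net>", max_wraps=3):
    """Standard GE mapping: each codon picks a production via modulo;
    the genotype wraps around if exhausted, up to max_wraps times."""
    symbols = [start]   # derivation frontier (leftmost-first)
    output = []         # terminal tokens, i.e. the derived architecture
    i, wraps = 0, 0
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:          # terminal symbol
            output.append(sym)
            continue
        if i >= len(genotype):          # wrap the genotype if needed
            i, wraps = 0, wraps + 1
            if wraps > max_wraps:
                raise ValueError("mapping failed: too many wraps")
        choices = GRAMMAR[sym]
        production = choices[genotype[i] % len(choices)]
        i += 1
        symbols = production + symbols  # expand leftmost nonterminal
    return output

# Example: this genotype derives a two-conv network with a dropout layer.
print(map_genotype([0, 0, 0, 1, 1, 2]))
# → ['conv', '16', 'conv', '64', 'dropout']
```

An evolutionary algorithm then evolves the integer genotypes, while fitness evaluation trains and scores the decoded CNN; restricting the grammar (e.g., bounding filter sizes or mandating regularization layers, as in finding (iii) above) directly constrains the reachable search space.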

FAPESP's process: 20/09835-1 - IARA - Artificial Intelligence in the Remaking of Urban Environments
Grantee: André Carlos Ponce de Leon Ferreira de Carvalho
Support Opportunities: Research Grants - Research Centers in Engineering Program
FAPESP's process: 22/07458-1 - Automatic selection and recommendation of machine learning algorithms
Grantee: Márcio Porto Basgalupp
Support Opportunities: Regular Research Grants