On the prediction of long-lived bugs: An analysis and comparative study using FLOSS projects

Ferreira Gomes, Luiz Alberto; Torres, Ricardo da Silva; Cortes, Mario Lucio

Author(s):	Ferreira Gomes, Luiz Alberto [1, 2]; Torres, Ricardo da Silva [3]; Cortes, Mario Lucio [1]. Total Authors: 3
Affiliation:	[1] Univ Estadual Campinas, Inst Comp IC, UNICAMP, Campinas, SP - Brazil; [2] Pontifical Catholic Univ Minas Gerais PUC MG, Inst Exact Sci & Informat ICEI, Belo Horizonte, MG - Brazil; [3] Norwegian Univ Sci & Technol NTNU, Dept ICT & Nat Sci, Trondheim - Norway. Total Affiliations: 3
Document type:	Journal article
Source:	INFORMATION AND SOFTWARE TECHNOLOGY; v. 132, APR 2021.
Web of Science Citations:	0
Abstract
Context: Software evolution and maintenance activities in today's Free/Libre Open Source Software (FLOSS) rely primarily on information extracted from bug reports registered in bug tracking systems. Many studies point out that most bugs that adversely affect the user's experience across versions of FLOSS projects are long-lived bugs. However, proposed approaches that support bug fixing procedures do not consider the real-world lifecycle of a bug, in which bugs are often fixed very fast. This may lead to useless efforts to automate the bug management process.
Objective: This study aims to confirm whether the number of long-lived bugs is significantly high in popular open-source projects and to characterize the population of long-lived bugs by considering the attributes of bug reports. We also aim to conduct a comparative study evaluating the prediction accuracy of five well-known machine learning algorithms and text mining techniques in the task of predicting long-lived bugs.
Methods: We collected bug reports from the repositories of six popular open-source projects (Eclipse, Freedesktop, Gnome, GCC, Mozilla, and WineHQ) and used the following machine learning algorithms to predict long-lived bugs: K-Nearest Neighbor, Naive Bayes, Neural Networks, Random Forest, and Support Vector Machines.
Results: Our results show that long-lived bugs are relatively frequent (varying from 7.2% to 40.7%) and have unique characteristics, confirming the need to study solutions to support bug fixing management. We found that the Neural Network classifier yielded the best results in comparison to the other algorithms evaluated.
Conclusion: Research efforts regarding long-lived bugs are needed, and our results demonstrate that it is possible to predict long-lived bugs with high accuracy (around 70.7%) despite the use of simple prediction algorithms and text mining methods. (AU)

FAPESP's process:	14/50715-9 - Characterizing and predicting biomass production in sugarcane and eucalyptus plantations in Brazil
Grantee:	Rubens Augusto Camargo Lamparelli
Support Opportunities:	Research Grants - Research Partnership for Technological Innovation - PITE


FAPESP's process:	16/50250-1 - The secret of playing football: Brazil versus the Netherlands
Grantee:	Sergio Augusto Cunha
Support Opportunities:	Research Projects - Thematic Grants


FAPESP's process:	14/12236-1 - AnImaLS: Annotation of Images in Large Scale: what can machines and specialists learn from interaction?
Grantee:	Alexandre Xavier Falcão
Support Opportunities:	Research Projects - Thematic Grants


FAPESP's process:	15/24494-8 - Communications and processing of big data in cloud and fog computing
Grantee:	Nelson Luis Saldanha da Fonseca
Support Opportunities:	Research Projects - Thematic Grants


FAPESP's process:	17/20945-0 - Multi-user equipment approved in grant 16/50250-1: local positioning system
Grantee:	Sergio Augusto Cunha
Support Opportunities:	Multi-user Equipment Program


FAPESP's process:	13/50155-0 - Combining new technologies to monitor phenology from leaves to ecosystems
Grantee:	Leonor Patricia Cerdeira Morellato
Support Opportunities:	Research Program on Global Climate Change - University-Industry Cooperative Research (PITE)
