Publications

Publications by Auri Vincenzi

2017

An Acceptance Empirical Assessment of Open Source Test Tools

Authors
Valentim, NMC; Lopes, A; César, E; Conte, T; Vincenzi, AMR; Maldonado, JC;

Publication
ICEIS: PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS - VOL 2

Abstract
Software testing is one of the verification and validation activities of the software development process. Test automation is relevant, since manual application of tests is laborious and more prone to error. The choice of test tools should be based on criteria and evidence of their usefulness and ease of use. This paper presents an acceptance empirical assessment of open source testing tools. Practitioners and graduate students evaluated five tools often used in the industry. The results describe how these tools are perceived in terms of ease of use and usefulness. These results can support software practitioners in the process of choosing testing tools for their projects.


2023

An initial investigation of ChatGPT unit test generation capability

Authors
Guilherme, V; Vincenzi, A;

Publication
SAST

Abstract
Context: Software testing ensures software quality, but developers often disregard it. The use of automated testing generation is pursued to reduce the consequences of overlooked test cases in a software project. Problem: In the context of Java programs, several tools can completely automate generating unit test sets. Additionally, studies are conducted to offer evidence regarding the quality of the generated test sets. However, it is worth noting that these tools rely on machine learning and other AI algorithms rather than incorporating the latest advancements in Large Language Models (LLMs). Solution: This work aims to evaluate the quality of Java unit tests generated by an OpenAI LLM algorithm, using metrics like code coverage and mutation test score. Method: For this study, 33 programs used by other researchers in the field of automated test generation were selected. This approach was employed to establish a baseline for comparison purposes. For each program, 33 unit test sets were generated automatically, without human interference, by changing OpenAI API parameters. After executing each test set, metrics such as code line coverage, mutation score, and success rate of test execution were collected to evaluate the efficiency and effectiveness of each set. Summary of Results: Our findings revealed that the OpenAI LLM test set demonstrated similar performance across all evaluated aspects compared to traditional automated Java test generation tools used in the previous research. These results are particularly remarkable considering the simplicity of the experiment and the fact that the generated test code did not undergo human analysis.


2023

An Experimental Study Evaluating Cost, Adequacy, and Effectiveness of Pynguin's Test Sets

Authors
Guerino, L; Vincenzi, A;

Publication
SAST

Abstract
Context: Software testing is a very relevant step in quality assurance, but developers frequently overlook it. We pursued testing automation to minimize the impact of missing test cases in a software project. Problem: However, for Python programs, there are not many tools able to fully automate the generation of unit test sets, and the one available demands studies to provide evidence of the quality of the generated test set. Solution: This work aims to evaluate the quality of different unit test generation algorithms for Python, implemented in a tool named Pynguin. Method: In the analysis of the selected programs, the Pynguin test generation tool is executed with each of its algorithms, including random, as a way to generate complete unit test sets. Then, we evaluate each generated test set's efficacy, efficiency, and cost. We use four different fault models, implemented by four mutation testing tools, to measure efficacy. We use line and branch coverage to measure efficiency, and the number of test cases and test set execution time to measure cost. Summary of Results: We identified that the RANDOM test set performed worst on all evaluated aspects, while DYNAMOSA and MOSA were the two algorithms that generated the best test sets regarding efficacy, efficiency, and cost. By combining all Pynguin smart algorithms (DYNAMOSA, MIO, MOSA, WHOLE-SUITE), the resultant test set exceeds the individual test sets in efficiency (coverage) by around 1% and in efficacy (mutation score) by 4.5% on average, at a reasonable cost, without test set minimization.


2023

Mock objects: a case study in industry

Authors
Ibarra, CHV; de Faria, DLC; Endo, AT; Beder, DM; Vincenzi, AMR;

Publication
PROCEEDINGS OF THE 19TH BRAZILIAN SYMPOSIUM ON INFORMATION SYSTEMS

Abstract
Context: Mock objects are commonly used in unit tests to isolate a class from its dependencies by substituting a replacement for the original dependency class. Problem: However, although mock objects are used in the vast majority of OO applications, there are some discrepancies in their use. Solution: This work aims to present a case study of an industry application regarding the use of mock objects. The application in question is part of a private flight management system. IS Theory: We conceived this work based on the General Systems Theory, specifically to evaluate a microservice-based system's use of test doubles during testing automation. Method: In the analyzed application, mock object implementations are highly related to the system design. For example, the number of dependencies in a production class is directly related to the number of mocks in the respective test class. As a consequence, poor design choices are harmful to intrinsic quality factors such as the maintainability of the application tests. Our study uses metrics to analyze this practice. The research will consider factors of this system, aiming mainly to contribute to the improvement of test doubles by the test team. Summary of Results: Application of simulated objects in a similar system indicates that developers are making immature use of the technique, possibly due to system design problems. This study concludes that theory and practice are misaligned. Contributions and Impact in the IS area: The main contribution is to indicate points for improving software testing practice by using mocks correctly.


2021

Deep Reinforcement Learning based Android Application GUI Testing

Authors
Collins, EF; Dias Neto, AC; Vincenzi, A; Maldonado, JC;

Publication
SBES

Abstract
The advances in mobile computing and the market demand for new products that reach a growing public highlight the importance of assuring the quality of mobile applications. In this context, automated GUI testing has gained prominence in research. However, studies indicate that there are still limitations in covering a large number of possible combinations of operations, transitions, functionality coverage, and failure reproduction. In this paper, a Deep Q-Network-based Android application GUI testing tool (DeepGUIT) is proposed for test case generation for Android mobile apps, guiding the exploration by code coverage value and new activities. The tool was evaluated with 15 open-source mobile applications. The obtained results showed higher code coverage than the state-of-the-art tools Monkey (61% higher on average) and Q-testing (47% higher on average), in addition to a greater number of failures.


2020

Reducing the Cost of Mutation Testing with the Use of Primitive Arcs Concept

Authors
Kuroishi, PH; Delamaro, ME; Maldonado, JC; Rizzo Vincenzi, AM;

Publication
SBQS

Abstract
Mutation testing is a testing criterion used to measure the quality of a test suite. In mutation, a test suite is executed against the set of mutants of a given program under testing. A score is computed to measure the adequacy of the test suite in detecting faults. Although powerful, mutation testing has two major drawbacks: the high computational cost to generate and execute the set of generated mutants and the existence of equivalent mutants. In this paper, we present a preliminary experimental study to investigate the use of control-flow information, aiming to reduce the number of mutants. For this study, only a subset of mutants, defined by its location, is executed. Such location is determined by the set of primitive arcs of a given program under testing. Next, the relationship between minimal mutants and primitive arcs is analyzed. Results indicate that the approach reduces the number of mutants and equivalent mutants and, in most cases, still maintains a high mutation score concerning full mutation. Moreover, the results also indicate that there is a concentration of minimal mutants on the nodes related to primitive arcs. Finally, we compare the effectiveness of our strategy over random mutant sampling.
