2024
Authors
Pereira, A; Lima, B; Faria, JP;
Publication
CoRR
2024
Authors
Bertolino, A; Pascoal Faria, J; Lago, P; Semini, L;
Publication
Communications in Computer and Information Science
2025
Authors
Rincon, AM; Rizzo Vincenzi, AM; Faria, JP;
Publication
IEEE International Conference on Software Testing, Verification and Validation, ICST 2025 - Workshops, Naples, Italy, March 31 - April 4, 2025
Abstract
This study explores prompt engineering for automated white-box integration testing of RESTful APIs using Large Language Models (LLMs). Four prompt versions were designed and tested across three OpenAI models (GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4o) to assess their impact on code coverage, token consumption, execution time, and financial cost. The results indicate that different prompt versions, especially with the more advanced models, achieved up to 90% coverage, although at higher cost. Additionally, combining test sets from different models increased coverage, reaching 96% in some cases. We also compared the results with EvoMaster, a specialized tool for generating tests for REST APIs; the LLM-generated tests achieved comparable or higher coverage on the benchmark projects. Despite higher execution costs, LLMs demonstrated superior adaptability and flexibility in test generation. © 2025 IEEE.
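To make the kind of pipeline the abstract describes concrete, the minimal sketch below prompts an OpenAI model for pytest integration tests of a toy REST API and writes the result to a file. This is an illustrative sketch, not the paper's tooling: it assumes the OpenAI v1 Python client with an OPENAI_API_KEY environment variable, and the API description, prompt wording, and file names are hypothetical placeholders.

    # Illustrative sketch: prompt-driven test generation for a REST API.
    # Assumes the OpenAI v1 Python client (pip install openai) and an API
    # key in OPENAI_API_KEY. The prompt and API description are placeholders,
    # not the study's actual prompts.
    from openai import OpenAI

    client = OpenAI()

    API_DESCRIPTION = """
    POST /users      creates a user from a JSON body {"name": str, "email": str}
    GET  /users/{id} returns the stored user or 404
    """

    PROMPT = (
        "You are a test engineer. Given the REST API below, write pytest "
        "integration tests that exercise both success and error paths, "
        "using the `requests` library against http://localhost:8000.\n"
        + API_DESCRIPTION
    )

    response = client.chat.completions.create(
        model="gpt-4o",  # one of the three models compared in the study
        messages=[{"role": "user", "content": PROMPT}],
    )

    # Persist the generated suite so coverage can be measured afterwards.
    with open("test_generated_api.py", "w") as f:
        f.write(response.choices[0].message.content)

Token usage (available in response.usage), wall-clock time, and coverage (e.g., pytest --cov against the running service) could then be logged per prompt version and model to reproduce the cost/coverage trade-off the study measures.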
2025
Authors
Silva, M; Faria, JP;
Publication
Proceedings of the 20th International Conference on Evaluation of Novel Approaches to Software Engineering, ENASE 2025, Porto, Portugal, April 4-6, 2025.
2025
Authors
Faria, JP; Trigo, E; Abreu, R;
Publication
Fundamentals of Software Engineering - 11th IFIP WG 2.2 International Conference, FSEN 2025, Västerås, Sweden, April 7-8, 2025, Proceedings
Abstract
Recent verification tools aim to make formal verification more accessible to software engineers by automating most of the verification process. However, the manual work and expertise required to write verification helper code, such as loop invariants and auxiliary lemmas and assertions, remain a barrier. This paper explores the use of Large Language Models (LLMs) to automate the generation of loop invariants for programs in Dafny. We tested the approach on a curated dataset of 100 Dafny programs involving arrays, strings, and numeric types. Using a multi-model approach that combines GPT-4o and Claude 3.5 Sonnet, correct loop invariants (passing the Dafny verifier) were generated on the first attempt for 92% of the programs, and within at most five attempts for 95% of the programs. Additionally, we developed an extension to the Dafny plugin for Visual Studio Code that incorporates automatic loop invariant generation into the IDE. Our work stands out from related approaches by handling a broader class of problems and offering IDE integration. © IFIP International Federation for Information Processing 2025.
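The generate-and-verify loop behind these numbers (first-attempt success for 92% of programs, at most five attempts for 95%) can be approximated with a few lines of glue code. The sketch below is an illustration under stated assumptions, not the paper's implementation: it assumes the Dafny CLI (dafny verify) is on the PATH, uses a single OpenAI model where the paper combines GPT-4o and Claude 3.5 Sonnet, and the prompt text and file names are hypothetical.

    # Illustrative generate-and-verify loop for LLM-produced Dafny loop
    # invariants. Assumes the Dafny CLI is installed and reuses the OpenAI
    # v1 Python client; the prompt below is a placeholder.
    import subprocess
    from openai import OpenAI

    client = OpenAI()
    MAX_ATTEMPTS = 5  # the paper reports success within five attempts for 95%

    def try_verify(program_path: str) -> tuple[bool, str]:
        """Run the Dafny verifier and return (success, diagnostic output)."""
        result = subprocess.run(
            ["dafny", "verify", program_path],
            capture_output=True, text=True,
        )
        return result.returncode == 0, result.stdout + result.stderr

    def annotate(source: str, feedback: str) -> str:
        """Ask the model for a fully annotated program (hypothetical prompt)."""
        prompt = (
            "Add loop invariants so this Dafny program verifies. "
            "Return only the complete program.\n\n" + source
            + ("\n\nVerifier feedback from the last attempt:\n" + feedback
               if feedback else "")
        )
        resp = client.chat.completions.create(
            model="gpt-4o", messages=[{"role": "user", "content": prompt}]
        )
        return resp.choices[0].message.content

    source = open("example.dfy").read()
    feedback = ""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        candidate = annotate(source, feedback)
        with open("candidate.dfy", "w") as f:
            f.write(candidate)
        ok, feedback = try_verify("candidate.dfy")
        if ok:
            print(f"verified on attempt {attempt}")
            break

Feeding the verifier's diagnostics back into the next prompt is what makes the bounded retry loop effective: each attempt is conditioned on the exact proof obligation that failed.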