2025
Authors
Rincon, AM; Vincenzi, AMR; Faria, JP;
Publication
2025 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION WORKSHOPS, ICSTW
Abstract
This study explores prompt engineering for automated white-box integration testing of RESTful APIs using Large Language Models (LLMs). Four versions of prompts were designed and tested across three OpenAI models (GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4o) to assess their impact on code coverage, token consumption, execution time, and financial cost. The results indicate that different prompt versions, especially with more advanced models, achieved up to 90% coverage, although at higher costs. Additionally, combining test sets from different models increased coverage, reaching 96% in some cases. We also compared the results with EvoMaster, a specialized tool for generating tests for REST APIs, where LLM-generated tests achieved comparable or higher coverage in the benchmark projects. Despite higher execution costs, LLMs demonstrated superior adaptability and flexibility in test generation.
2025
Authors
Silva, M; Faria, JP;
Publication
Proceedings of the 20th International Conference on Evaluation of Novel Approaches to Software Engineering, ENASE 2025, Porto, Portugal, April 4-6, 2025.
Abstract
2025
Authors
Faria, JP; Trigo, E; Abreu, R;
Publication
Fundamentals of Software Engineering - 11th IFIP WG 2.2 International Conference, FSEN 2025, Västerås, Sweden, April 7-8, 2025, Proceedings
Abstract
Recent verification tools aim to make formal verification more accessible for software engineers by automating most of the verification process. However, the manual work and expertise required to write verification helper code, such as loop invariants and auxiliary lemmas and assertions, remains a barrier. This paper explores the use of Large Language Models (LLMs) to automate the generation of loop invariants for programs in Dafny. We tested the approach on a curated dataset of 100 programs in Dafny involving arrays, strings, and numeric types. Using a multimodel approach that combines GPT-4o and Claude 3.5 Sonnet, correct loop invariants (passing the Dafny verifier) were generated at the first attempt for 92% of the programs, and in at most five attempts for 95% of the programs. Additionally, we developed an extension to the Dafny plugin for Visual Studio Code to incorporate automatic loop invariant generation into the IDE. Our work stands out from related approaches by handling a broader class of problems and offering IDE integration. © IFIP International Federation for Information Processing 2025.
2025
Authors
Marchesi, L; Goldman, A; Lunesu, MI; Przybylek, A; Aguiar, A; Morgan, L; Wang, X; Pinna, A;
Publication
XP Workshops
Abstract
2025
Authors
Ferreira Ribeiro, JE; Silva, JG; Aguiar, A;
Publication
IEEE Access
Abstract
The development of safety-critical systems is heavily governed by domain-specific standards. In the aerospace industry, DO-178C (Software Considerations in Airborne Systems and Equipment Certification) serves as the primary certification standard used by agencies such as the FAA and EASA to review and approve software-based systems. Although DO-178C aims to ensure system safety while providing evidence for certification, it does not prescribe a specific software development process, allowing flexibility for traditional Waterfall, Agile, or hybrid methods with appropriate adaptations for the aerospace context. This study proposes Scrum4DO178C, an Agile process based on Scrum, to meet the demanding requirements of aerospace software, including safety, robustness, reliability, and integrity. Scrum4DO178C introduces novel process enhancements specifically tailored to meet these criticality needs, while aligning with the standard. Unlike previous proposals that lack detail, this research presents a comprehensive, validated process applied in a real-world industry project at the highest criticality level (Level A - Catastrophic), offering insights beyond theoretical scenarios. The findings demonstrated that the Scrum4DO178C process improves project performance, allows frequent and manageable requirement changes, reduces Verification & Validation (V&V) effort, and increases efficiency while maintaining full compliance with DO-178C. The study also identifies areas for further improvement and suggests exploring the process in additional case studies, both within the aerospace industry and other domains with similarly stringent safety-critical requirements. Finally, it confirms that appropriate automation, namely for documentation production, is a central element to further improve the process. © 2013 IEEE.
2025
Authors
Peter, S; Kropp, M; Aguiar, A; Anslow, C; Lunesu, MI; Pinna, A;
Publication
XP
Abstract