2025
Authors
Faria, JP; Trigo, E; Abreu, R;
Publication
FUNDAMENTALS OF SOFTWARE ENGINEERING, FSEN 2025
Abstract
Recent verification tools aim to make formal verification more accessible for software engineers by automating most of the verification process. However, the manual work and expertise required to write verification helper code, such as loop invariants and auxiliary lemmas and assertions, remains a barrier. This paper explores the use of Large Language Models (LLMs) to automate the generation of loop invariants for programs in Dafny. We tested the approach on a curated dataset of 100 programs in Dafny involving arrays, strings, and numeric types. Using a multimodel approach that combines GPT-4o and Claude 3.5 Sonnet, correct loop invariants (passing the Dafny verifier) were generated at the first attempt for 92% of the programs, and in at most five attempts for 95% of the programs. Additionally, we developed an extension to the Dafny plugin for Visual Studio Code to incorporate automatic loop invariant generation into the IDE. Our work stands out from related approaches by handling a broader class of problems and offering IDE integration.
2025
Authors
Ferreira, M; Viegas, L; Faria, JP; Lima, B;
Publication
2025 IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATION OF SOFTWARE TEST, AST
Abstract
Large language model (LLM)-powered assistants are increasingly used for generating program code and unit tests, but their application in acceptance testing remains underexplored. To help address this gap, this paper explores the use of LLMs for generating executable acceptance tests for web applications through a two-step process: (i) generating acceptance test scenarios in natural language (in Gherkin) from user stories, and (ii) converting these scenarios into executable test scripts (in Cypress), knowing the HTML code of the pages under test. This two-step approach supports acceptance test-driven development, enhances tester control, and improves test quality. The two steps were implemented in the AutoUAT and Test Flow tools, respectively, powered by GPT-4 Turbo, and integrated into a partner company's workflow and evaluated on real-world projects. The users found the acceptance test scenarios generated by AutoUAT helpful 95% of the time, even revealing previously overlooked cases. Regarding Test Flow, 92% of the acceptance test cases generated by Test Flow were considered helpful: 60% were usable as generated, 8% required minor fixes, and 24% needed to be regenerated with additional inputs; the remaining 8% were discarded due to major issues. These results suggest that LLMs can, in fact, help improve the acceptance test process, with appropriate tooling and supervision.
2025
Authors
Marchesi, L; Goldman, A; Lunesu, MI; Przybylek, A; Aguiar, A; Morgan, L; Wang, X; Pinna, A;
Publication
XP Workshops
Abstract
2025
Authors
Ferreira Ribeiro, JE; Silva, JG; Aguiar, A;
Publication
IEEE Access
Abstract
The development of safety-critical systems is heavily governed by domain-specific standards. In the aerospace industry, the DO-178C - Software Considerations in Airborne Systems and Equipment Certification - serves as the primary certification standard used by agencies such as the FAA and EASA to review and approve software-based systems. Although DO-178C aims to ensure system safety while providing evidence for certification, it does not prescribe a specific software development process, allowing flexibility for traditional Waterfall, Agile, or hybrid methods with appropriate adaptations for the aerospace context. This study proposes Scrum4DO178C, an Agile process based on Scrum, to meet the demanding requirements of aerospace software, including safety, robustness, reliability, and integrity. Scrum4DO178C introduces novel process enhancements specifically tailored to meet these criticality needs, while aligning with the standard. Unlike previous proposals that lack detail, this research presents a comprehensive, validated process applied in a real-world industry project at the highest criticality level (Level A - Catastrophic), offering insights beyond theoretical scenarios. The findings demonstrated that the Scrum4DO178C process improves project performance, allows frequent and manageable requirement changes, reduces Verification & Validation (V&V) effort, and increases efficiency while maintaining full compliance with DO-178C. The study also identifies areas for further improvement and suggests exploring the process in additional case studies, both within the aerospace industry and other domains with similarly stringent safety-critical requirements. Finally, it confirms that appropriate automation, namely for documentation production, is a central element to further improve the process.
2025
Authors
Peter, S; Kropp, M; Aguiar, A; Anslow, C; Lunesu, MI; Pinna, A;
Publication
XP
Abstract
2025
Authors
Lemos, D; Aguiar, A; Harrison, NB;
Publication
CoRR
Abstract