Publications - INESC TEC

Publications by HumanISE

2025

LLM Prompt Engineering for Automated White-Box Integration Test Generation in REST APIs

Authors
Rincon, AM; Rizzo Vincenzi, AM; Faria, JP;

Publication
IEEE International Conference on Software Testing, Verification and Validation, ICST 2025 - Workshops, Naples, Italy, March 31 - April 4, 2025

Abstract
This study explores prompt engineering for automated white-box integration testing of RESTful APIs using Large Language Models (LLMs). Four versions of prompts were designed and tested across three OpenAI models (GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4o) to assess their impact on code coverage, token consumption, execution time, and financial cost. The results indicate that different prompt versions, especially with more advanced models, achieved up to 90% coverage, although at higher costs. Additionally, combining test sets from different models increased coverage, reaching 96% in some cases. We also compared the results with EvoMaster, a specialized tool for generating tests for REST APIs, where LLM-generated tests achieved comparable or higher coverage in the benchmark projects. Despite higher execution costs, LLMs demonstrated superior adaptability and flexibility in test generation. © 2025 IEEE.

2025

Automated Social Media Feedback Analysis for Software Requirements Elicitation: A Case Study in the Streaming Industry

Authors
Silva, M; Faria, JP;

Publication
Proceedings of the 20th International Conference on Evaluation of Novel Approaches to Software Engineering, ENASE 2025, Porto, Portugal, April 4-6, 2025.

2025

Automatic Generation of Loop Invariants in Dafny with Large Language Models

Authors
Faria, JP; Trigo, E; Abreu, R;

Publication
Fundamentals of Software Engineering - 11th IFIP WG 2.2 International Conference, FSEN 2025, Västerås, Sweden, April 7-8, 2025, Proceedings

Abstract
Recent verification tools aim to make formal verification more accessible for software engineers by automating most of the verification process. However, the manual work and expertise required to write verification helper code, such as loop invariants and auxiliary lemmas and assertions, remains a barrier. This paper explores the use of Large Language Models (LLMs) to automate the generation of loop invariants for programs in Dafny. We tested the approach on a curated dataset of 100 programs in Dafny involving arrays, strings, and numeric types. Using a multimodel approach that combines GPT-4o and Claude 3.5 Sonnet, correct loop invariants (passing the Dafny verifier) were generated at the first attempt for 92% of the programs, and in at most five attempts for 95% of the programs. Additionally, we developed an extension to the Dafny plugin for Visual Studio Code to incorporate automatic loop invariant generation into the IDE. Our work stands out from related approaches by handling a broader class of problems and offering IDE integration. © IFIP International Federation for Information Processing 2025.

2025

Agile Processes in Software Engineering and Extreme Programming - Workshops - XP 2024 Workshops, Bozen-Bolzano, Italy, June 4-7, 2024, Revised Selected Papers

Authors
Marchesi, L; Goldman, A; Lunesu, MI; Przybylek, A; Aguiar, A; Morgan, L; Wang, X; Pinna, A;

Publication
XP Workshops

2025

Scrum4DO178C: An Agile Process to Enhance Aerospace Software Development for DO-178C Compliance - A Case Study at Criticality Level A

Authors
Ferreira Ribeiro, JE; Silva, JG; Aguiar, A;

Publication
IEEE Access

Abstract
The development of safety-critical systems is heavily governed by domain-specific standards. In the aerospace industry, the DO-178C - Software Considerations in Airborne Systems and Equipment Certification - serves as the primary certification standard used by agencies such as the FAA and EASA to review and approve software-based systems. Although DO-178C aims to ensure system safety while providing evidence for certification, it does not prescribe a specific software development process, allowing flexibility for traditional Waterfall, Agile, or hybrid methods with appropriate adaptations for the aerospace context. This study proposes Scrum4DO178C, an Agile process based on Scrum, to meet the demanding requirements of aerospace software, including safety, robustness, reliability, and integrity. Scrum4DO178C introduces novel process enhancements specifically tailored to meet these criticality needs, while aligning with the standard. Unlike previous proposals that lack detail, this research presents a comprehensive, validated process applied in a real-world industry project at the highest criticality level (Level A - Catastrophic), offering insights beyond theoretical scenarios. The findings demonstrated that the Scrum4DO178C process improves project performance, allows frequent and manageable requirement changes, reduces Verification & Validation (V&V) effort, and increases efficiency while maintaining full compliance with DO-178C. The study also identifies areas for further improvement and suggests exploring the process in additional case studies, both within the aerospace industry and other domains with similarly stringent safety-critical requirements. Finally, it confirms that appropriate automation, namely for documentation production, is a central element to further improve the process. © 2013 IEEE.

2025

Agile Processes in Software Engineering and Extreme Programming - 26th International Conference on Agile Software Development, XP 2025, Brugg-Windisch, Switzerland, June 2-5, 2025, Proceedings

Authors
Peter, S; Kropp, M; Aguiar, A; Anslow, C; Lunesu, MI; Pinna, A;

Publication
XP
