Publications

Publications by HASLab

2026

Software Engineering and Formal Methods. SEFM 2024 Collocated Workshops - ReacTS 2024 and CIFMA 2024, Aveiro, Portugal, November 4-5, 2024, Revised Selected Papers

Authors
Proença, J; Fervari, R; Martins, MA; Kahle, R; Pluck, G;

Publication
SEFM

Abstract

2026

Software Testing Education and Industry Needs- Report from the ENACTEST EU Project

Authors
Saadatmand, M; Khan, A; Marin, B; Paiva, ACR; Van Asch, N; Moran, G; Cammaerts, F; Snoeck, M; Mendes, A;

Publication
PRODUCT-FOCUSED SOFTWARE PROCESS IMPROVEMENT. INDUSTRY, DOCTORAL-SYMPOSIUM, TUTORIAL, AND WORKSHOP PAPERS, PROFES 2025

Abstract
The evolving landscape of software development demands that software testers continuously adapt to new tools, practices, and acquire new skills. This study investigates software testing competency needs in industry, identifies knowledge gaps in current testing education, and highlights competencies and gaps not addressed in academic literature. This is done by conducting two focus group sessions and interviews with professionals across diverse domains, including railway industry, healthcare, and software consulting and performing a curated small-scale scoping review. The study instrument, co-designed by members of the ENACTEST project consortium, was developed collaboratively and refined through multiple iterations to ensure comprehensive coverage of industry needs and educational gaps. In particular, by performing a thematic qualitative analysis, we report our findings and observations regarding: professional training methods, challenges in offering training in industry, different ways of evaluating the quality of training, identified knowledge gaps with respect to academic education and industry needs, future needs and trends in testing education, and knowledge transfer methods within companies. Finally, the scoping review results confirm knowledge gaps in areas such as AI testing, security testing and soft skills.

CloseRead Abstract

2026

Teachers' Perspective on Software Testing Education

Authors
Fasolino, AR; MarIn, B; Vos, TEJ; Mendes, A; Paiva, ACR; Cammaerts, F; Snoeck, M; Saadatmand, M; Tramontana, P;

Publication
ACM TRANSACTIONS ON COMPUTING EDUCATION

Abstract
Context. Software testing is a critical aspect of the software development lifecycle, yet it remains underrepresented in academic curricula. Despite advances in pedagogical practices and increased attention from the academic community, challenges persist in effectively teaching software testing. Understanding these challenges from the teachers' perspective is crucial to aligning education with industry needs. Objective. To analyze the characteristics, practices, tools, and challenges of software testing courses in higher education, from the perspective of educators, and to assess the integration of recent pedagogical approaches in software testing education. Method. A structured survey consisting of 52 questions was distributed to 143 software testing educators across Western European universities, resulting in 49 valid responses. The survey explored topics taught, course organization, teaching practices, tools and materials used, gamification approaches, and teacher satisfaction. Results. The survey revealed significant variability in course content, structure, and teaching methods. Most dedicated software testing courses are offered at the master's level and are elective, whereas testing is introduced earlier in less specialized (NST) courses. There is low adoption of formal guidelines (e.g., ACM, SWEBOK), limited integration of non-functional testing types, and a high diversity in textbooks and tools used. While modern practices like Test-Driven Development and automated assessment are increasingly adopted, gamification and active learning approaches remain underutilized. Teachers expressed a need for improved and more consistent teaching materials. Conclusion. The study highlights a mismatch between academic practices and industry expectations in software testing education. Greater integration of standardized curricula, broader adoption of modern teaching tools, and increased support for teachers through high-quality, adaptable teaching materials are needed to enhance the effectiveness of software testing education.

CloseRead Abstract

2026

Specification-Guided Repair of Arithmetic Errors in Dafny Programs Using LLMs

Authors
Wu, V; Mendes, A; Abreu, A;

Publication
SOFTWARE ENGINEERING AND FORMAL METHODS, SEFM 2025

Abstract
Debugging and repairing faults when programs fail to formally verify can be complex and time-consuming. Automated Program Repair (APR) can ease this burden by automatically identifying and fixing faults. However, traditional APR techniques often rely on test suites for validation, but these may not capture all possible scenarios. In contrast, formal specifications provide strong correctness criteria, enabling more effective automated repair. In this paper, we present an APR tool for Dafny, a verification-aware programming language that uses formal specifications - including preconditions, post-conditions, and invariants - as oracles for fault localization and repair. Assuming the correctness of the specifications and focusing on arithmetic bugs, we localize faults through a series of steps, which include using Hoare logic to determine the state of each statement within the program, and applying Large Language Models (LLMs) to synthesize candidate fixes. The models considered are GPT-4o mini, Llama 3, Mistral 7B, and Llemma 7B. We evaluate our approach using DafnyBench, a benchmark of realworld Dafny programs. Our tool achieves 89.7% fault localization success rate and GPT-4o mini yields the highest repair success rate of 74.18%. These results highlight the potential of combining formal reasoning with LLM-based program synthesis for automated program repair.

CloseRead Abstract

2026

Can Large Language Models Help Students Prove Software Correctness? An Experimental Study with Dafny

Authors
Carreira, C; Silva, A; Abreu, A; Mendes, A;

Publication
SOFTWARE ENGINEERING AND FORMAL METHODS, SEFM 2025

Abstract
Students in computing education increasingly use large language models (LLMs) such as ChatGPT. Yet, the role of LLMs in supporting cognitively demanding tasks, like deductive program verification, remains poorly understood. This paper investigates how students interact with an LLM when solving formal verification exercises in Dafny, a language that supports functional correctness by allowing programmers to write formal specifications and automatically verifying that the implementation satisfies the specification. We conducted a mixed-methods study with master's students enrolled in a formal methods course. Each participant completed two verification problems, one with access to a custom ChatGPT interface that logged all interactions and the other without. We identified strategies used by successful students and assessed the level of trust students place in LLMs. Our findings show that students perform significantly better when using ChatGPT; however, performance gains are tied to prompt quality. We conclude with practical recommendations for integrating LLMs into formal methods courses more effectively, including designing LLM-aware challenges that promote learning.

CloseRead Abstract

2026

Are Users More Willing to Use Formally Verified Password Managers?

Authors
Carreira, C; Ferreira, JF; Mendes, A; Christin, N;

Publication
SOFTWARE ENGINEERING AND FORMAL METHODS, SEFM 2025

Abstract
Formal verification has recently been increasingly used to prove the correctness and security of many applications. It is attractive because it can prove the absence of errors with the same certainty as mathematicians proving theorems. However, while most security experts recognize the value of formal verification, the views of non-technical users on this topic are unknown. We designed and implemented two experiments to address this issue to understand how formal verification impacts users. Our approach started with a formative study involving 15 participants, followed by the main quantitative study with 200 individuals. We focus on the application domain of Password Managers (PMs) since it has been documented that the lack of trust in PMs might lead to lower adoption. Moreover, recent efforts have focused on formally verifying (parts of) PMs. We conclude that formal verification is seen as desirable by users and identify three actionable recommendations to improve formal verification communication efforts.

CloseRead Abstract