Publications by HASLab

2025

Specification-Guided Repair of Arithmetic Errors in Dafny Programs using LLMs

Authors
Wu, V; Mendes, A; Abreu, A;

Publication
CoRR

Abstract
Debugging and repairing faults when programs fail to formally verify can be complex and time-consuming. Automated Program Repair (APR) can ease this burden by automatically identifying and fixing faults. However, traditional APR techniques often rely on test suites for validation, which may not capture all possible scenarios. In contrast, formal specifications provide strong correctness criteria, enabling more effective automated repair. In this paper, we present an APR tool for Dafny, a verification-aware programming language, that uses formal specifications (including pre-conditions, post-conditions, and invariants) as oracles for fault localization and repair. Assuming the correctness of the specifications and focusing on arithmetic bugs, we localize faults through a series of steps, which include using Hoare logic to determine the state of each statement within the program, and applying Large Language Models (LLMs) to synthesize candidate fixes. The models considered are GPT-4o mini, Llama 3, Mistral 7B, and Llemma 7B. We evaluate our approach using DafnyBench, a benchmark of real-world Dafny programs. Our tool achieves an 89.7% fault localization success rate, and GPT-4o mini yields the highest repair success rate of 74.18%. These results highlight the potential of combining formal reasoning with LLM-based program synthesis for automated program repair. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.

2025

Survey about Teachers' Perspective on Software Testing Education

Authors
Tramontana, P; Marín, B; Paiva, ACR; Mendes, A; Vos, TEJ; Cammaerts, F; Snoeck, M; Saadatmand, M; Fasolino, AR;

Publication

2025

What Challenges Do Developers Face When Using Verification-Aware Programming Languages?

Authors
Oliveira, F; Mendes, A; Carreira, C;

Publication
CoRR

2025

Can Large Language Models Help Students Prove Software Correctness? An Experimental Study with Dafny

Authors
Carreira, C; Silva, AF; Abreu, A; Mendes, A;

Publication
CoRR

2025

Detecting Resource Leaks on Android with Alpakka

Authors
Santos, G; Bispo, J; Mendes, A;

Publication
Proceedings of the 18th ACM SIGPLAN International Conference on Software Language Engineering, SLE 2025

Abstract
Mobile devices have become integral to our everyday lives, yet their utility hinges on their battery life. In Android apps, resource leaks caused by inefficient resource management are a significant contributor to battery drain and poor user experience. Our work introduces Alpakka, a source-to-source compiler for Android's Smali syntax. To showcase Alpakka's capabilities, we developed an Alpakka library capable of detecting and automatically correcting resource leaks in Android APK files. We demonstrate Alpakka's effectiveness through empirical testing on 124 APK files from 31 real-world Android apps in the DroidLeaks [12] dataset. In our analysis, Alpakka identified 93 unique resource leaks, of which we estimate 15% are false positives. From these, we successfully applied automatic corrections to 45 of the detected resource leaks.

2025

From "Worse is Better" to Better: Lessons from a Mixed Methods Study of Ansible's Challenges

Authors
Carreira, C; Saavedra, N; Mendes, A; Ferreira, JF;

Publication
CoRR
