2025
Authors
Carreira, C; Silva, AF; Abreu, A; Mendes, A;
Publication
SEFM
Abstract
Students in computing education increasingly use large language models (LLMs) such as ChatGPT. Yet, the role of LLMs in supporting cognitively demanding tasks, like deductive program verification, remains poorly understood. This paper investigates how students interact with an LLM when solving formal verification exercises in Dafny, a language that supports functional correctness by allowing programmers to write formal specifications and automatically verifying that the implementation satisfies the specification. We conducted a mixed-methods study with master’s students enrolled in a formal methods course. Each participant completed two verification problems, one with access to a custom ChatGPT interface that logged all interactions and the other without. We identified strategies used by successful students and assessed the level of trust students place in LLMs. Our findings show that students perform significantly better when using ChatGPT; however, performance gains are tied to prompt quality. We conclude with practical recommendations for integrating LLMs into formal methods courses more effectively, including designing LLM-aware challenges that promote learning.
2025
Authors
Santos, G; Bispo, J; Mendes, A;
Publication
PROCEEDINGS OF SLE 2025 18TH ACM SIGPLAN INTERNATIONAL CONFERENCE ON SOFTWARE LANGUAGE ENGINEERING, SLE 2025
Abstract
Mobile devices have become integral to our everyday lives, yet their utility hinges on their battery life. In Android apps, resource leaks caused by inefficient resource management are a significant contributor to battery drain and poor user experience. Our work introduces Alpakka, a source-to-source compiler for Android's Smali syntax. To showcase Alpakka's capabilities, we developed an Alpakka library capable of detecting and automatically correcting resource leaks in Android APK files. We demonstrate Alpakka's effectiveness through empirical testing on 124 APK files from 31 real-world Android apps in the DroidLeaks [12] dataset. In our analysis, Alpakka identified 93 unique resource leaks, of which we estimate 15% are false positives. From these, we successfully applied automatic corrections to 45 of the detected resource leaks.
2025
Authors
Carreira, C; Saavedra, N; Mendes, A; Ferreira, JF;
Publication
CoRR
Abstract
2025
Authors
Carreira, C; Ferreira, JF; Mendes, A; Christin, N;
Publication
SEFM
Abstract
Formal verification has recently been increasingly used to prove the correctness and security of many applications. It is attractive because it can prove the absence of errors with the same certainty as mathematicians proving theorems. However, while most security experts recognize the value of formal verification, the views of non-technical users on this topic are unknown. We designed and implemented two experiments to address this issue to understand how formal verification impacts users. Our approach started with a formative study involving 15 participants, followed by the main quantitative study with 200 individuals. We focus on the application domain of Password Managers (PMs) since it has been documented that the lack of trust in PMs might lead to lower adoption. Moreover, recent efforts have focused on formally verifying (parts of) PMs. We conclude that formal verification is seen as desirable by users and identify three actionable recommendations to improve formal verification communication efforts.
2025
Authors
Carreira, C; Mendes, A; Ferreira, JF; Christin, N;
Publication
CoRR
Abstract
2025
Authors
Saavedra, N; Ferreira, JF; Mendes, A;
Publication
ISSTA Companion
Abstract
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.