Publications

Publications by HumanISE

2025

Evaluating the Impact of Scaffolding and Visualizations for Mutation Testing Exercises in Software Engineering Education

Authors
Potter, H; Paiva, ACR; Amalfitano, D; Fasolino, AR; Tramontana, P; Just, R;

Publication
COMPANION PROCEEDINGS OF THE 33RD ACM INTERNATIONAL CONFERENCE ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, FSE COMPANION 2025

Abstract
Mutation testing is an effective testing technique for improving how well a test suite can detect small changes to a program under test. This testing technique is seeing increased industry adoption. This paper aims to study the use of mutation testing in an educational setting and understand students' technical and conceptual challenges in applying mutation testing concepts. We report on two case studies of incorporating mutation testing into software engineering curricula. The Scaffolding Study explores the impact of using different mutation analysis tools directly or indirectly via a uniform interface provided by an educational infrastructure. We observe that scaffolding (indirect tool use) improved the consistency of student performance for those using the same mutation analysis tool on the same code as well as helping students perform more effective mutation testing. The Visualization Study explores the impact of different forms of output of a mutation analysis tool. Specifically, it assesses to what extent visualizations support students in reasoning about mutants and writing tests to detect them. We observe that like scaffolding, visualizations helped students perform more effective mutation testing, with lower-performing students seeing a boost in particular. We further explore challenges around automatic assessment of mutation testing exercises. For example, we observe that even with assignment scaffolding, 18-21% of student submissions required manual modifications to successfully execute.

2025

Code change and smell techniques for regression test selection

Authors
Mori, A; Paiva, ACR; Souza, SRS;

Publication
SOFTWARE QUALITY JOURNAL

Abstract
Regression testing is a selective retesting of a system or component to verify that modifications have not induced unintended effects and that the system or component maintains compliance with the specified requirements. However, it can be time-consuming and resource-intensive, especially for large systems. Regression testing selection techniques can help address this issue by selecting a subset of test cases to run. The Change Based technique selects a subset of the existing test cases and executes modified classes. Besides effectively reducing the test suite, this technique may reduce the capability of revealing faults. From this perspective, code smells are known to identify poor design and software quality issues. Some works have explored the association between smells and faults with some promising results. Inspired by these results, we propose combining code change and smell to select regression tests and present eight techniques. Additionally, we developed the Regression Testing Selection Tool (RTST) to automate the selection process using these techniques. We empirically evaluated the approach in Defects4J projects by comparing the techniques' effectiveness with the Change Based and Class Firewall as a baseline. The results show that the Change and Smell Intersection Based technique achieves the highest reduction rate in the test suite size but with less class coverage. On the other hand, Change and Smell Firewall technique achieves the lowest test suite size reduction with the highest fault detection effectiveness test cases, suggesting the combination of smells and changed classes can potentially find more bugs. The Smell Based technique provides a comparable class coverage to the code change and smell approach. Our findings indicate opportunities for improving the efficiency and effectiveness of regression testing and highlight that software quality should be a concern throughout the software evolution.

2025

Teachers’ Perspective on Software Testing Education

Authors
Fasolino, AR; Marin, B; Vos, TEJ; Mendes, A; Paiva, ACR; Cammaerts, F; Snoeck, M; Saadatmand, M; Tramontana, P;

Publication
ACM Transactions on Computing Education

Abstract
Context: Software testing is a critical aspect of the software development lifecycle, yet it remains underrepresented in academic curricula. Despite advances in pedagogical practices and increased attention from the academic community, challenges persist in effectively teaching software testing. Understanding these challenges from the teachers’ perspective is crucial to aligning education with industry needs. Objective: To analyze the characteristics, practices, tools, and challenges of software testing courses in higher education, from the perspective of educators, and to assess the integration of recent pedagogical approaches in software testing education. Method: A structured survey consisting of 52 questions was distributed to 143 software testing educators across Western European universities, resulting in 49 valid responses. The survey explored topics taught, course organization, teaching practices, tools and materials used, gamification approaches, and teacher satisfaction. Results: The survey revealed significant variability in course content, structure, and teaching methods. Most dedicated software testing courses are offered at the master’s level and are elective, whereas testing is introduced earlier in less specialized (NST) courses. There is low adoption of formal guidelines (e.g., ACM, SWEBOK), limited integration of non-functional testing types, and a high diversity in textbooks and tools used. While modern practices like Test-Driven Development and automated assessment are increasingly adopted, gamification and active learning approaches remain underutilized. Teachers expressed a need for improved and more consistent teaching materials. Conclusion: The study highlights a mismatch between academic practices and industry expectations in software testing education. Greater integration of standardized curricula, broader adoption of modern teaching tools, and increased support for teachers through high-quality, adaptable teaching materials are needed to enhance the effectiveness of software testing education.

2025

Can Llama 3 Accurately Assess Readability? A Comparative Study Using Lead Sections from Wikipedia

Authors
Rodrigues, JF; Cardoso, HL; Lopes, CT;

Publication
RESEARCH CHALLENGES IN INFORMATION SCIENCE, RCIS 2025, PT II

Abstract
Text readability is vital for effective communication and learning, especially for those with lower information literacy. This research aims to assess Llama 3's ability to grade readability and compare its alignment with established metrics. For that purpose, we create a new dataset of article lead sections from English and Simple English Wikipedia, covering nine categories. The model is prompted to rate the readability of the texts on a grade-level scale, and an in-depth analysis of the results is conducted. While Llama 3 correlates strongly with most metrics, it may underestimate text grade levels.

2025

Evaluating Llama 3 for Text Simplification: A Study on Wikipedia Lead Sections

Authors
Rodrigues, JF; Cardoso, HL; Lopes, CT;

Publication
COMPANION PROCEEDINGS OF THE ACM WEB CONFERENCE 2025, WWW COMPANION 2025

Abstract
Text simplification converts complex text into simpler language, improving readability and comprehension. This study evaluates the effectiveness of open-source large language models for text simplification across various categories. We created a dataset of 66,620 lead section pairs from English and Simple English Wikipedia, spanning nine categories, and tested Llama 3 for text simplification. We assessed its output for readability, simplicity, and meaning preservation. Results show improved readability, with simplification varying by category. Texts on Time were the most shortened, while Leisure-related texts had the greatest reduction of words/characters and syllables per sentence. Meaning preservation was most effective for the Objects and Education categories.

2025

Cross-Lingual Entity Linking Using GPT Models in Radiology Abstracts

Authors
Dias, M; Lopes, CT;

Publication
RESEARCH CHALLENGES IN INFORMATION SCIENCE, RCIS 2025, PT II

Abstract
Entity linking is an important task in medical natural language processing (NLP) for converting unstructured text into structured data for clinical analysis and semantic interoperability. However, in lower-resource languages, this task is challenging due to the limited availability of domain-specific resources. This paper explores a translation-based cross-lingual entity linking approach using GPT models, GPT-3.5 and GPT-4o, for zero-shot machine translation and entity linking with in-context learning. We evaluate our approach using a Portuguese-English parallel dataset of radiology abstracts. Our results show that chunk-level machine translation outperforms sentence-level translation. Moreover, our translation-based approach to cross-lingual entity linking of UMLS concepts outperformed the multilingual encoder method baseline. However, the in-context learning entity linking approach did not outperform a translation-based approach with a dictionary-based entity linking method.
