2025
Authors
Vaz, B; Figueira, A;
Publication
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS
Abstract
This article focuses on the creation and evaluation of synthetic data to address the challenges of imbalanced datasets in machine learning (ML) applications, using fake news detection as a case study. We conducted a thorough literature review on generative adversarial networks (GANs) for tabular data, synthetic data generation methods, and synthetic data quality assessment. By augmenting a public news dataset with synthetic data generated by different GAN architectures, we demonstrate the potential of synthetic data to improve ML models' performance in fake news detection. Our results show a significant improvement in classification performance, especially in the underrepresented class. We also modify and extend a data usage approach to evaluate the quality of synthetic data and investigate the relationship between synthetic data quality and data augmentation performance in classification tasks. We found a positive correlation between synthetic data quality and performance in the underrepresented class, highlighting the importance of high-quality synthetic data for effective data augmentation.
2025
Authors
Rocha, B; Figueira, A;
Publication
INFORMATICS-BASEL
Abstract
In today's competitive higher education sector, institutions increasingly rely on international rankings to secure financial resources, attract top-tier talent, and elevate their global reputation. Simultaneously, these universities have expanded their presence on social media, utilizing sophisticated posting strategies to disseminate information and boost recognition and engagement. This study examines the relationship between higher education institutions' (HEIs') rankings and their social media posting strategies. We gathered and analyzed publications from 18 HEIs featured in a consolidated ranking system, examining various features of their social media posts. To better understand these strategies, we categorized the posts into five predefined topics: engagement, research, image, society, and education. This categorization, combined with a Long Short-Term Memory (LSTM) network and a Random Forest (RF) algorithm, was used to predict social media output in the last five days of each month, achieving successful results. This paper further explores how variations in these social media strategies correlate with the rankings of HEIs. Our findings suggest a nuanced interaction between social media engagement and the perceived prestige of HEIs.
2025
Authors
Paiva, JC; Leal, JP; Figueira, A;
Publication
ELECTRONICS
Abstract
Automated assessment tools for programming assignments have become increasingly popular in computing education. These tools offer a cost-effective and highly available way to provide timely and consistent feedback to students. However, when evaluating logically incorrect source code, there are reasonable concerns about the formative gap between the feedback generated by such tools and that of human teaching assistants. A teaching assistant pinpoints logical errors, describes how the program fails to perform the proposed task, or suggests possible ways to fix mistakes without revealing the correct code. Automated assessment tools, on the other hand, typically return a measure of the program's correctness, possibly backed by failing test cases and, only in a few cases, fixes to the program. In this paper, we introduce a tool, AsanasAssist, that generates formative feedback messages to help students repair functionality mistakes in the submitted source code, based on the most similar algorithmic strategy solution. These suggestions are delivered with incremental levels of detail according to the student's needs, from identifying the block containing the error to displaying the correct source code. Furthermore, we evaluate how well the automatically generated messages provided by AsanasAssist match those provided by a human teaching assistant. The results demonstrate that the tool achieves feedback comparable to that of a human grader while being able to provide it just in time.
2025
Authors
Silva, VF; Silva, ME; Ribeiro, P; Silva, F;
Publication
DATA MINING AND KNOWLEDGE DISCOVERY
Abstract
Multivariate time series analysis is a vital but challenging task, with multidisciplinary applicability, tackling the characterization of multiple interconnected variables over time and their dependencies. Traditional methodologies often adapt univariate approaches or rely on assumptions specific to certain domains or problems, presenting limitations. A recent promising alternative is to map multivariate time series into high-level network structures such as multiplex networks, with past work relying on connecting successive time series components with interconnections between contemporary timestamps. In this work, we first define a novel cross-horizontal visibility mapping between lagged timestamps of different time series and then introduce the concept of multilayer horizontal visibility graphs. This allows describing cross-dimension dependencies via inter-layer edges, leveraging the entire structure of multilayer networks. To this end, a novel parameter-free topological measure is proposed and common measures are extended for the multilayer setting. Our approach is general and applicable to any kind of multivariate time series data. We provide an extensive experimental evaluation with both synthetic and real-world datasets. We first explore the proposed methodology and the data properties highlighted by each measure, showing that inter-layer edges based on cross-horizontal visibility preserve more information than previous mappings, while also complementing the information captured by commonly used intra-layer edges. We then illustrate the applicability and validity of our approach in multivariate time series mining tasks, showcasing its potential for enhanced data analysis and insights.
2025
Authors
Patrick Daniel; Vanessa Freitas Silva; Pedro Ribeiro;
Publication
Complex Networks & Their Applications XIII
Abstract
2025
Authors
Santos Costa, VMdM; Areias, M;
Publication
Practical Aspects of Declarative Languages - 27th International Symposium, PADL 2025, Denver, CO, USA, January 20-21, 2025, Proceedings
Abstract
Prolog is a programming language that provides a high-level approach to software development. Python is a versatile programming language with a vast range of libraries, including support for data analysis and machine learning tasks. We present a Prolog-Python interface that aims to exploit Prolog’s deduction capabilities and Python’s extensive libraries. Our novel interface was built using a divide-and-conquer methodology. In a first step, we implemented a set of C++ classes that can be matched to Python classes; next, we used an interface generator to export the relevant classes. Finally, we used C code to convert between the two realms. To demonstrate the usefulness of the interface, we enhance an Inductive Logic Programming system with visualization capabilities and show how to interface with a standard classifier.