Publications

Publications by CRACS

2025

On Exploring Safe Memory Reclamation Methods with a Simplified Lock-Free Hash Map Design

Authors
Moreno, P; Areias, M; Rocha, R;

Publication
EURO-PAR 2024: PARALLEL PROCESSING WORKSHOPS, PT II

Abstract
Lock-freedom offers significant advantages in terms of algorithm design, performance and scalability. A fundamental building block in software development is the usage of hash map data structures. This work extends a previous lock-free hash map to support a new simplified design that is able to take advantage of most state-of-the-art safe memory reclamation methods, thus outperforming the previous design.


2025

On Bridging Prolog and Python to Enhance an Inductive Logic Programming System

Authors
Costa, VS; Areias, M;

Publication
PRACTICAL ASPECTS OF DECLARATIVE LANGUAGES, PADL 2025

Abstract
Prolog is a programming language that provides a high-level approach to software development. Python is a versatile programming language that has a vast range of libraries, including support for data analysis and machine learning tasks. We present a Prolog-Python interface that aims at exploiting Prolog's deduction capabilities and Python's extensive libraries. Our novel interface was built using a divide-and-conquer methodology. In a first step, we implemented a set of C++ classes that can be matched to Python classes; next, we used an interface generator to export the relevant classes. Finally, we used C code to actually convert between the two realms. In order to demonstrate the usefulness of the interface, we enhance an Inductive Logic Programming system with visualization capabilities and show how to interface with a standard classifier.


2025

Program Synthesis Using Inductive Logic Programming for the Abstraction and Reasoning Corpus

Authors
Rocha, FM; Dutra, I; Costa, VS;

Publication
INTELLIGENZA ARTIFICIALE

Abstract
The Abstraction and Reasoning Corpus (ARC-AGI) is an Artificial General Intelligence benchmark that is currently unsolved. It demands strong generalization and reasoning capabilities, which are known to be weaknesses of neural-network-based systems. In this work, we propose a program synthesis system to solve it, which casts an ARC-AGI task as a sequence of Inductive Logic Programming tasks. We have implemented a simple Domain Specific Language that corresponds to a small set of object-centric abstractions relevant to the benchmark. This allows for adequate representations to be used to create logic programs, which provide reasoning capabilities to our system. When solving each task, the proposed system can generalize from a few training pairs of input-output grids. The obtained logic programs are able to generate objects present in the output grids and can transform the test input grid into the output grid solution. We developed our system based on some ARC-AGI tasks that do not require more than the small number of primitives that we implemented, and showed that our system can solve unseen tasks that require different reasoning.


2025

Regular Typed Unification

Authors
Barbosa, J; Florido, M; Costa, VS;

Publication
ELECTRONIC PROCEEDINGS IN THEORETICAL COMPUTER SCIENCE

Abstract
Here we define a new unification algorithm for terms interpreted in semantic domains denoted by a subclass of regular types here called deterministic regular types. This reflects our intention not to handle the semantic universe as a homogeneous collection of values, but instead, to partition it in a way that is similar to data types in programming languages. We first define the new unification algorithm which is based on constraint generation and constraint solving, and then prove its main properties: termination, soundness, and completeness with respect to the semantics. Finally, we discuss how to apply this algorithm to a dynamically typed version of Prolog.
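
For orientation, the constraint-based style of unification mentioned in the abstract can be illustrated with a minimal sketch of plain, untyped first-order unification. This is not the paper's algorithm: it leaves out deterministic regular types entirely. The term encoding (capitalised strings as variables, tuples as compound terms) and all function names are assumptions made for this example.

```python
def is_var(term):
    # Assumption: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def walk(term, subst):
    # Follow variable bindings until an unbound variable or a non-variable.
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    # Occurs check: does var appear inside term under the current bindings?
    term = walk(term, subst)
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg, subst) for arg in term[1:])
    return False


def unify(left, right):
    """Return a substitution (dict) unifying the two terms, or None."""
    subst = {}
    constraints = [(left, right)]          # constraint generation
    while constraints:                     # constraint solving
        a, b = constraints.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a):
            if occurs(a, b, subst):
                return None
            subst[a] = b
        elif is_var(b):
            if occurs(b, a, subst):
                return None
            subst[b] = a
        elif (isinstance(a, tuple) and isinstance(b, tuple)
              and a[0] == b[0] and len(a) == len(b)):
            # Same functor and arity: unify the arguments pairwise.
            constraints.extend(zip(a[1:], b[1:]))
        else:
            return None                    # functor or constant clash
    return subst


# f(X, g(b)) unified with f(a, g(Y)) binds X to a and Y to b.
result = unify(("f", "X", ("g", "b")), ("f", "a", ("g", "Y")))
```

A typed variant would additionally have to check each binding against the type of the variable, which is the part this sketch does not attempt.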


2025

Distance-based feature selection using Benford's law for malware detection

Authors
Fernandes, P; Ciardhuáin, SO; Antunes, M;

Publication
COMPUTERS & SECURITY

Abstract
Detecting malware in computer networks and data streams from Android devices remains a critical challenge for cybersecurity researchers. While machine learning and deep learning techniques have shown promising results, these approaches often require large volumes of labelled data, offer limited interpretability, and struggle to adapt to sophisticated threats such as zero-day attacks. Moreover, their high computational requirements restrict their applicability in resource-constrained environments. This research proposes an innovative approach that advances the state of the art by offering practical solutions for dynamic and data-limited security scenarios. By integrating natural statistical laws, particularly Benford's law, with dissimilarity functions, a lightweight, fast, and scalable model is developed that eliminates the need for extensive training and large labelled datasets while improving resilience to data imbalance and scalability for large-scale cybersecurity applications. Although Benford's law has demonstrated potential in anomaly detection, its effectiveness is limited by the difficulty of selecting relevant features. To overcome this, the study combines Benford's law with several distance functions, including Median Absolute Deviation, Kullback-Leibler divergence, Euclidean distance, and Pearson correlation, enabling statistically grounded feature selection. Additional metrics, such as the Kolmogorov test, Jensen-Shannon divergence, and Z statistics, were used for model validation. This approach quantifies discrepancies between expected and observed distributions, addressing classic feature selection challenges like redundancy and imbalance. Validated on both balanced and unbalanced datasets, the model achieved strong results: 88.30% accuracy and 85.08% F1-score in the balanced set, 92.75% accuracy and 95.29% F1-score in the unbalanced set. The integration of Benford's law with distance functions significantly reduced false positives and negatives. 
Compared to traditional Machine Learning methods, which typically require extensive training and large datasets to achieve F1 scores between 92% and 99%, the proposed approach delivers competitive performance while enhancing computational efficiency, robustness, and interpretability. This balance makes it a practical and scalable alternative for real-time or resource-constrained cybersecurity environments.
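
The core idea of comparing an observed first-digit distribution against Benford's law with a distance function can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the function names are hypothetical, only two of the distances named in the abstract are shown, and how the resulting distances drive feature selection is not specified here.

```python
import math
from collections import Counter


def benford_expected():
    # Benford's law: P(d) = log10(1 + 1/d) for leading digits d = 1..9.
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit_distribution(values):
    # Observed relative frequency of leading digits 1..9; zeros are skipped.
    digits = []
    for v in values:
        v = abs(v)
        if v == 0:
            continue
        # Scientific notation places the leading significant digit first.
        digits.append(int(f"{v:.17e}"[0]))
    if not digits:
        raise ValueError("no nonzero values to analyse")
    counts = Counter(digits)
    return [counts.get(d, 0) / len(digits) for d in range(1, 10)]


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kl_divergence(p, q):
    # KL(p || q); terms with p_i = 0 contribute nothing.
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def benford_distances(values):
    # Hypothetical helper: how far one feature's values are from Benford's law.
    observed = first_digit_distribution(values)
    expected = benford_expected()
    return {
        "euclidean": euclidean(observed, expected),
        "kl": kl_divergence(observed, expected),
    }


# Example with made-up values for a single feature.
distances = benford_distances([1, 10, 100, 2, 20, 9.9999999])
```

Computing such distances per feature gives one number per feature and per distance function, which is the kind of quantity a distance-based selection step could then rank or threshold.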


2025

Multi-Class Intrusion Detection in Internet of Vehicles: Optimizing Machine Learning Models on Imbalanced Data

Authors
Palma, A; Antunes, M; Bernardino, J; Alves, A;

Publication
FUTURE INTERNET

Abstract
The Internet of Vehicles (IoV) presents complex cybersecurity challenges, particularly against Denial-of-Service (DoS) and spoofing attacks targeting the Controller Area Network (CAN) bus. This study leverages the CICIoV2024 dataset, comprising six distinct classes of benign traffic and various types of attacks, to evaluate advanced machine learning techniques for intrusion detection systems (IDS). The models XGBoost, Random Forest, AdaBoost, Extra Trees, Logistic Regression, and Deep Neural Network were tested under realistic, imbalanced data conditions, ensuring that the evaluation reflects real-world scenarios where benign traffic dominates. Using hyperparameter optimization with Optuna, we achieved significant improvements in detection accuracy and robustness. Ensemble methods such as XGBoost and Random Forest consistently demonstrated superior performance, achieving perfect accuracy and macro-average F1-scores, even when detecting minority attack classes, in contrast to previous results for the CICIoV2024 dataset. The integration of optimized hyperparameter tuning and a broader methodological scope culminated in an IDS framework capable of addressing diverse attack scenarios with exceptional precision.
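
To see why the abstract reports macro-average F1 alongside accuracy on imbalanced data, the sketch below computes both for a toy prediction. The labels and data are invented for illustration and are not drawn from CICIoV2024; the function names are assumptions.

```python
def accuracy(y_true, y_pred):
    # Fraction of samples whose predicted label matches the true label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    # Unweighted mean of per-class F1, so every class counts equally.
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy imbalanced set: a classifier that always predicts the majority class.
y_true = ["benign"] * 8 + ["dos", "spoofing"]
y_pred = ["benign"] * 10

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Here the always-benign classifier reaches 80% accuracy while its macro-average F1 stays below 0.3, because the two minority classes each score zero. A perfect macro-average F1 therefore requires every class, including the rare attack classes, to be detected correctly.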
