Publications - INESC TEC

Publications

Publications by CRACS

2025

Anomaly Detection and Root Cause Analysis in Cloud-Native Environments Using Large Language Models and Bayesian Networks

Authors
Pedroso, DF; Almeida, L; Pulcinelli, LEG; Aisawa, WAA; Dutra, I; Bruschi, SM

Publication
IEEE ACCESS

Abstract
Cloud computing technologies offer significant advantages in scalability and performance, enabling rapid deployment of applications. The adoption of microservices-oriented architectures has introduced an ecosystem characterized by an increased number of applications, frameworks, abstraction layers, orchestrators, and hypervisors, all operating within distributed systems. This complexity results in the generation of vast quantities of logs from diverse sources, making the analysis of these events an inherently challenging task, particularly in the absence of automation. To address this issue, Machine Learning techniques leveraging Large Language Models (LLMs) offer a promising approach for dynamically identifying patterns within these events. In this study, we propose a novel anomaly detection framework utilizing a microservices architecture deployed on Kubernetes and Istio, enhanced by an LLM. The model was trained on various error scenarios, with Chaos Mesh employed as an error injection tool to simulate faults of different natures, and Locust used as a load generator to create workload stress conditions. After an anomaly is detected by the LLM, we employ a dynamic Bayesian network to provide probabilistic inferences about the incident, proving the relationships between components and assessing the degree of impact among them. Additionally, a ChatBot powered by the same LLM allows users to interact with the AI, ask questions about the detected incident, and gain deeper insights. The experimental results demonstrated the model's effectiveness, reliably identifying all error events across various test scenarios. While it successfully avoided missing any anomalies, it did produce some false positives, which remain within acceptable limits.

2025

EVSOAR: Security Orchestration, Automation and Response via EV Charging Stations

Authors
Freitas, T; Silva, E; Yasmin, R; Shoker, A; Correia, ME; Martins, R; Esteves Veríssimo, PJ

Publication
101st IEEE Vehicular Technology Conference, VTC Spring 2025, Oslo, Norway, June 17-20, 2025

Abstract
Vehicle cybersecurity has emerged as a critical concern, driven by innovation in the automotive industry, e.g., autonomous, electric, or connected vehicles. Current efforts to address these challenges are constrained by the limited computational resources of vehicles and the reliance on connected infrastructures. This motivated the foundation of Vehicle Security Operations Centers (VSOCs) that extend IT-based Security Operations Centers (SOCs) to cover the entire automotive ecosystem, both the in-vehicle and off-vehicle scopes. Security Orchestration, Automation, and Response (SOAR) tools are considered key for implementing an effective cybersecurity solution. However, existing state-of-the-art solutions depend on infrastructure networks such as 4G, 5G, and WiFi, which often face scalability and congestion issues. To address these limitations, we propose EVSOAR, a novel SOAR architecture that leverages EV charging stations for connectivity and computing to enhance vehicle cybersecurity. Our EV-specific SOAR architecture enables real-time analysis and automated responses to cybersecurity threats closer to the EV, reducing cellular latency, bandwidth, and interference limitations. Our experimental results demonstrate a significant improvement in latency, stability, and scalability through the infrastructure and the capacity to deploy computationally intensive applications that are otherwise infeasible within the resource constraints of individual vehicles.

2025

Evaluating Transfer Learning Methods on Real-World Data Streams: A Case Study in Financial Fraud Detection

Authors
Pereira, RR; Bono, J; Ferreira, HM; Ribeiro, P; Soares, C; Bizarro, P

Publication
ECML/PKDD (9)

Abstract
When the available data for a target domain is limited, transfer learning (TL) methods leverage related data-rich source domains to train and evaluate models, before deploying them on the target domain. However, most TL methods assume fixed levels of labeled and unlabeled target data, which contrasts with real-world scenarios where both data and labels arrive progressively over time. As a result, evaluations based on these static assumptions may not reflect how methods perform in practice. To support a more realistic assessment of TL methods in dynamic settings, we propose an evaluation framework that (1) simulates varying data availability over time, (2) creates multiple domains via resampling of a given dataset and (3) introduces inter-domain variability through controlled transformations, e.g., time-dependent covariate and concept shifts. These capabilities enable the systematic simulation of a large number of variants of the experiments, providing deeper insights into how algorithms may behave when deployed. We demonstrate the usefulness of the proposed framework by performing a case study on a proprietary real-world suite of card payment datasets. To support reproducibility, we also apply the framework to the publicly available Bank Account Fraud (BAF) dataset. By providing a methodology for evaluating TL methods over time and in different data availability conditions, our framework supports a better understanding of model behavior in real-world environments, which enables more informed decisions when deploying models in new domains.
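
The three capabilities listed in this abstract can be pictured with a toy sketch. The code below is not the authors' framework. Every name in it (`Record`, `resample_domain`, `covariate_shift`, `available_at`) is hypothetical, and the single numeric feature, the linear drift, and the fixed label delay are all simplifying assumptions made here for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Record:
    t: int      # arrival time step of the transaction
    x: float    # one numeric feature (assumption: real data has many)
    y: int      # label, 1 = fraud


def resample_domain(records: List[Record], size: int, seed: int) -> List[Record]:
    """Create one synthetic domain by sampling with replacement."""
    rng = random.Random(seed)
    return sorted((rng.choice(records) for _ in range(size)), key=lambda r: r.t)


def covariate_shift(records: List[Record], drift_per_step: float) -> List[Record]:
    """Apply a time-dependent shift to the feature (assumed linear in t)."""
    return [Record(r.t, r.x + drift_per_step * r.t, r.y) for r in records]


def available_at(
    records: List[Record], now: int, label_delay: int
) -> Tuple[List[float], List[Tuple[float, int]]]:
    """Return what a model could see at time `now`.

    Features are visible as soon as a record arrives; its label only
    becomes visible `label_delay` steps later.
    """
    features = [r.x for r in records if r.t <= now]
    labeled = [(r.x, r.y) for r in records if r.t + label_delay <= now]
    return features, labeled


base = [Record(t, float(t), t % 2) for t in range(10)]
domain = covariate_shift(resample_domain(base, size=20, seed=1), drift_per_step=0.5)
features, labeled = available_at(base, now=5, label_delay=3)
```

Varying `seed`, `drift_per_step`, `now`, and `label_delay` produces many experiment variants from one dataset, which is the kind of systematic simulation the abstract describes.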

2025

Studying and Improving Graph Neural Network-based Motif Estimation

Authors
Vieira, PC; Silva, MEP; Pinto Ribeiro, PM

Publication
CoRR

2025

Next Higher Point: Two Novel Approaches for Computing Natural Visibility Graphs

Authors
Daniel, P; Silva, VF; Ribeiro, P

Publication
COMPLEX NETWORKS & THEIR APPLICATIONS XIII, COMPLEX NETWORKS 2024, VOL 1

Abstract
With the huge amount of data that has been collected over time, many methods are being developed to allow better understanding and forecasting in several domains. Time series analysis is a powerful tool to achieve this goal. Despite being a well-established area, there are some gaps, and new methods are emerging to overcome these limitations, such as visibility graphs. Visibility graphs allow the analysis of time series as complex networks and make possible the use of more advanced techniques from another well-established area, network science. In this paper, we present two new efficient approaches for computing natural visibility graphs from time series, one for online scenarios in O(n log n) and the other for offline scenarios in O(nm), the latter taking advantage of the number of different values in the time series (m).
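
For readers unfamiliar with the construction, the sketch below builds a natural visibility graph with the standard quadratic baseline. It is not either of the two approaches proposed in the paper. It assumes unit-spaced time stamps (the index is the time), and the function name `natural_visibility_graph` is illustrative. Two points a < b are connected when every intermediate point lies strictly below the straight line joining them, which is equivalent to the slope from a to b being greater than the slope from a to every point in between.

```python
from typing import List, Sequence, Tuple


def natural_visibility_graph(series: Sequence[float]) -> List[Tuple[int, int]]:
    """Return the edges (a, b), a < b, of the natural visibility graph.

    Baseline O(n^2) construction: for each a, scan to the right while
    tracking the largest slope seen so far. Point b is visible from a
    only if its slope strictly exceeds that maximum.
    """
    n = len(series)
    edges = []
    for a in range(n - 1):
        max_slope = float("-inf")
        for b in range(a + 1, n):
            slope = (series[b] - series[a]) / (b - a)
            if slope > max_slope:
                edges.append((a, b))
                max_slope = slope
    return edges


edges = natural_visibility_graph([1, 3, 2, 4])
```

In this small example the point with value 3 blocks the line of sight between the first and last points, so they are not connected even though the last point is the highest.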

2025

Large Language Model Framework for Log Sequence Anomaly Detection

Authors
Reis, J; Areias, M; Barbosa, JG

Publication
Progress in Artificial Intelligence - 24th EPIA Conference on Artificial Intelligence, EPIA 2025, Faro, Portugal, October 1-3, 2025, Proceedings, Part I

Abstract
Log analysis is fundamental to modern software observability systems, playing a key role in improving system reliability. Recently, there has been a growing adoption of Large Language Models (LLMs) for log anomaly detection, due to their ability to learn complex patterns. In this work, we propose a model-agnostic framework that allows seamless plug-and-play integration of different LLMs, making it easy to experiment with and select the model that fits specific needs. These models are first fine-tuned on normal log data, learning their patterns. During inference, the model predicts the most probable next tokens based on the preceding context in each sequence. Anomaly detection is performed using Top-K predictions, where sequences are flagged as anomalous if the actual log entry does not appear among the K most probable next tokens, with K determined using the validation dataset. The proposed framework is evaluated on three widely used benchmark datasets (HDFS, BGL, and Thunderbird), where it consistently achieves competitive results, outperforming state-of-the-art methods in multiple scenarios. These results highlight the effectiveness of LLM-based log analysis and the importance of flexibility when selecting models for specific operational contexts.
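
The Top-K decision rule described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' framework. A bigram frequency model (`fit_bigram`, a hypothetical helper) stands in for the fine-tuned LLM, each log entry is treated as a single token, and K is passed in by hand rather than tuned on a validation set.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Sequence

Predictor = Callable[[Sequence[str]], Dict[str, float]]


def fit_bigram(normal_sequences: List[List[str]]) -> Predictor:
    """Stand-in for a fine-tuned LLM: next-entry frequencies from normal logs."""
    counts = defaultdict(Counter)
    for seq in normal_sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1

    def predict_next(context: Sequence[str]) -> Dict[str, float]:
        c = counts[context[-1]]
        total = sum(c.values())
        return {tok: n / total for tok, n in c.items()} if total else {}

    return predict_next


def is_anomalous(sequence: Sequence[str], predict_next: Predictor, k: int) -> bool:
    """Flag the sequence if any actual entry is outside the top-k predictions."""
    for i in range(1, len(sequence)):
        probs = predict_next(sequence[:i])
        top_k = sorted(probs, key=probs.get, reverse=True)[:k]
        if sequence[i] not in top_k:
            return True
    return False


normal = [
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "read", "read", "close"],
]
predict = fit_bigram(normal)
ok = is_anomalous(["open", "read", "close"], predict, k=2)
bad = is_anomalous(["open", "close", "read"], predict, k=2)
```

A smaller K makes the detector stricter: with `k=1` the rarer but normal sequence `open, write, close` would also be flagged, which is why the abstract selects K on validation data.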
