Publications

Publications by Inês Dutra

2026

Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track and Demo Track - European Conference, ECML PKDD 2025, Porto, Portugal, September 15-19, 2025, Proceedings, Part X

Authors
Dutra, I; Pechenizkiy, M; Cortez, P; Pashami, S; Pasquali, A; Moniz, N; Jorge, AM; Soares, C; Abreu, PH; Gama, J;

Publication
ECML/PKDD (10)

Abstract

2025

A Risk Manager for Intrusion Tolerant Systems: Enhancing HAL 9000 With New Scoring and Data Sources

Authors
Freitas, T; Novo, C; Dutra, I; Soares, J; Correia, ME; Shariati, B; Martins, R;

Publication
SOFTWARE-PRACTICE & EXPERIENCE

Abstract
Background: Intrusion Tolerant Systems (ITS) aim to maintain system security despite adversarial presence by limiting the impact of successful attacks. Current ITS risk managers rely heavily on public databases like NVD and Exploit DB, which suffer from long delays in vulnerability evaluation, reducing system responsiveness. Objective: This work extends the HAL 9000 Risk Manager to integrate additional real-time threat intelligence sources and employ machine learning techniques to automatically predict and reassess vulnerability risk scores, addressing limitations of existing solutions. Methods: A custom-built scraper collects diverse cybersecurity data from multiple Open Source Intelligence (OSINT) platforms, such as NVD, CVE, AlienVault OTX, and OSV. HAL 9000 uses machine learning models for CVE score prediction, vulnerability clustering through scalable algorithms, and reassessment incorporating exploit likelihood and patch availability to dynamically evaluate system configurations. Results: Integration of newly scraped data significantly enhances the risk management capabilities, enabling faster detection and mitigation of emerging vulnerabilities with improved resilience and security. Experiments show HAL 9000 provides lower risk and more resilient configurations compared to prior methods while maintaining scalability and automation. Conclusions: The proposed enhancements position HAL 9000 as a next-generation autonomous Risk Manager capable of effectively incorporating diverse intelligence sources and machine learning to improve ITS security posture in dynamic threat environments. Future work includes expanding data sources, addressing misinformation risks, and real-world deployments.

2025

Anomaly Detection and Root Cause Analysis in Cloud-Native Environments Using Large Language Models and Bayesian Networks

Authors
Pedroso, DF; Almeida, L; Pulcinelli, LEG; Aisawa, WAA; Dutra, I; Bruschi, SM;

Publication
IEEE ACCESS

Abstract
Cloud computing technologies offer significant advantages in scalability and performance, enabling rapid deployment of applications. The adoption of microservices-oriented architectures has introduced an ecosystem characterized by an increased number of applications, frameworks, abstraction layers, orchestrators, and hypervisors, all operating within distributed systems. This complexity results in the generation of vast quantities of logs from diverse sources, making the analysis of these events an inherently challenging task, particularly in the absence of automation. To address this issue, Machine Learning techniques leveraging Large Language Models (LLMs) offer a promising approach for dynamically identifying patterns within these events. In this study, we propose a novel anomaly detection framework utilizing a microservices architecture deployed on Kubernetes and Istio, enhanced by an LLM. The model was trained on various error scenarios, with Chaos Mesh employed as an error injection tool to simulate faults of different natures, and Locust used as a load generator to create workload stress conditions. After an anomaly is detected by the LLM, we employ a dynamic Bayesian network to provide probabilistic inferences about the incident, proving the relationships between components and assessing the degree of impact among them. Additionally, a ChatBot powered by the same LLM allows users to interact with the AI, ask questions about the detected incident, and gain deeper insights. The experimental results demonstrated the model's effectiveness, reliably identifying all error events across various test scenarios. While it successfully avoided missing any anomalies, it did produce some false positives, which remained within acceptable limits.

2014

ExpertBayes: Automatically refining manually built Bayesian networks

Authors
Almeida, E; Ferreira, P; Vinhoza, TTV; Dutra, I; Borges, P; Wu, Y; Burnside, E;

Publication
Proceedings - 2014 13th International Conference on Machine Learning and Applications, ICMLA 2014

Abstract
Bayesian network structures are usually built using only the data and starting from an empty network or from a naive Bayes structure. Very often, in domains such as medicine, a prior structure is already known based on expert knowledge. This structure can be automatically or manually refined in search of better-performing models. In this work, we take Bayesian networks built by specialists and show that minor perturbations to the original network can yield better classifiers, while maintaining most of the interpretability of the original network.

2024

Instance-wise Uncertainty for Class Imbalance in Semantic Segmentation

Authors
Almeida, L; Dutra, I; Renna, F;

Publication
CoRR

Abstract

2025

Program Synthesis Using Inductive Logic Programming for the Abstraction and Reasoning Corpus

Authors
Rocha, FM; Dutra, I; Costa, VS;

Publication
INTELLIGENZA ARTIFICIALE

Abstract
The Abstraction and Reasoning Corpus (ARC-AGI) is an Artificial General Intelligence benchmark that is currently unsolved. It demands strong generalization and reasoning capabilities, which are known to be weaknesses of neural-network-based systems. In this work, we propose a program synthesis system to solve it, which casts an ARC-AGI task as a sequence of Inductive Logic Programming tasks. We have implemented a simple domain-specific language that corresponds to a small set of object-centric abstractions relevant to the benchmark. This allows for adequate representations to be used to create logic programs, which provide reasoning capabilities to our system. When solving each task, the proposed system can generalize from a few training pairs of input-output grids. The obtained logic programs are able to generate objects present in the output grids and can transform the test input grid into the output grid solution. We developed our system based on some ARC-AGI tasks that do not require more than the small number of primitives that we implemented, and showed that our system can solve unseen tasks that require different reasoning.
