Publications - INESC TEC

Publications

Publications by Rita Paula Ribeiro

2026

Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2025, Porto, Portugal, September 15-19, 2025, Proceedings, Part I

Authors
Ribeiro, RP; Pfahringer, B; Japkowicz, N; Larrañaga, P; Jorge, AM; Soares, C; Abreu, PH; Gama, J;

Publication
ECML/PKDD (1)

Abstract

2025

Network-based Anomaly Detection in Waste Transportation Data with Limited Supervision

Authors
Shaji, N; Tabassum, S; Ribeiro, RP; Gama, J; Gorgulho, J; Garcia, A; Santana, P;

Publication
APPLIED NETWORK SCIENCE

Abstract
Detecting anomalies in waste transportation networks is vital for uncovering illegal or unsafe activities that can have serious environmental and regulatory consequences. Identifying anomalies in such networks presents a significant challenge due to the limited availability of labeled data and the subtle nature of illicit activities. Moreover, traditional anomaly detection methods relying solely on individual transaction data may overlook deeper, network-level irregularities that arise from complex interactions between entities, especially in the absence of labeled data. This study explores anomaly detection in a waste transport network using unsupervised learning, enhanced by limited supervision and enriched with network structure information. Initially, unsupervised models such as Isolation Forest, K-Means, LOF, and Autoencoders were applied using statistical and graph-based features. These models detected outliers without prior labels. Later, information on a few confirmed anomalous users enabled weak supervision, guiding feature selection through statistical tests such as Kolmogorov-Smirnov and Anderson-Darling. Results show that models trained on a reduced, graph-focused feature set improved anomaly detection, particularly under extreme class imbalance. Isolation Forest notably ranked known anomalies highly. Ego network visualizations supported these findings, demonstrating the value of integrating structural features and limited labels for identifying subtle, relational anomalies.
