Publications

2025

Reevaluating OSA severity: insights from AHI, Baveno classification, and respiratory events

Authors
Carvalho, M; Amorim, P; Rodrigues, PP; Ferreira-Santos, D;

Publication
EUROPEAN RESPIRATORY JOURNAL

Abstract

2025

Benchmarking Time Series Feature Extraction for Algorithm Selection

Authors
dos Santos, MR; Cerqueira, V; Soares, C;

Publication
EPIA (1)

Abstract
Effective selection of forecasting algorithms for time series data is a challenge in machine learning, impacting both predictive accuracy and efficiency. Metalearning, using features extracted from time series, offers a strategic approach to optimize algorithm selection. The utility of this approach depends on the amount of information the features contain about the behavior of the algorithms. Although there are several methods for systematic time series feature extraction, they have never been compared. This paper empirically analyzes the performance of each feature extraction method for algorithm selection and its impact on forecasting accuracy. Our study reveals that TSFRESH, TSFEATURES, and TSFEL achieve comparable algorithm selection accuracy, adeptly capturing the time series characteristics essential for accurate algorithm selection. In contrast, Catch22 is found to be less effective for this purpose. In particular, TSFEL is identified as the most efficient method, balancing dimensionality and predictive performance. These findings provide insights for enhancing forecasting accuracy and efficiency through judicious selection of meta-feature extractors.

2025

Interpretable Predictive Maintenance: Combining Anomaly Detection with Quantitative Root Cause Analysis

Authors
Barbosa, I; Gama, J; Veloso, B;

Publication
EPIA (2)

Abstract
Predictive Maintenance (PdM) aims to prevent failures through early detection, yet lacks explainability to support decision-making. Current PdM models often identify failures, but fail to explain their root causes, especially in real-world scenarios, with complex and limited labeled data. This study proposes an interpretable framework that combines LSTM-based Anomaly Detection with a dual-layered Root Cause Analysis (RCA) based on SHAP attributions. Applied to a real-world dataset, the method detects degradation transitions, tracks failure patterns over time, and provides interpretable information without explicit root cause labels.

2025

Phenotypic Characterization of Sleep Apnea Using Clusters Derived from Subject-Based SpO2 Weighted Correlation Networks

Authors
Gomez-Pilar, J; Martín-Montero, A; Vaquerizo-Villar, F; Domínguez-Guerrero, M; Ferreira-Santos, D; Pereira-Rodrigues, P; Gozal, D; Hornero, R; Gutiérrez-Tobal, G;

Publication
2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

Abstract

2025

Network-based Anomaly Detection in Waste Transportation Data with Limited Supervision

Authors
Shaji, N; Tabassum, S; Ribeiro, RP; Gama, J; Gorgulho, J; Garcia, A; Santana, P;

Publication
APPLIED NETWORK SCIENCE

Abstract
Detecting anomalies in waste transportation networks is vital for uncovering illegal or unsafe activities that can have serious environmental and regulatory consequences. Identifying anomalies in such networks presents a significant challenge due to the limited availability of labeled data and the subtle nature of illicit activities. Moreover, traditional anomaly detection methods relying solely on individual transaction data may overlook deeper, network-level irregularities that arise from complex interactions between entities, especially in the absence of labeled data. This study explores anomaly detection in a waste transport network using unsupervised learning, enhanced by limited supervision and enriched with network structure information. Initially, unsupervised models such as Isolation Forest, K-Means, LOF, and Autoencoders were applied using statistical and graph-based features. These models detected outliers without prior labels. Later, information on a few confirmed anomalous users enabled weak supervision, guiding feature selection through statistical tests such as Kolmogorov-Smirnov and Anderson-Darling. Results show that models trained on a reduced, graph-focused feature set improved anomaly detection, particularly under extreme class imbalance. Isolation Forest notably ranked known anomalies highly. Ego network visualizations supported these findings, demonstrating the value of integrating structural features and limited labels for identifying subtle, relational anomalies.

2025

Salvador Urban Network Transportation (SUNT): A Landmark Spatiotemporal Dataset for Public Transportation

Authors
Ferreira, MV; Souza, M; Rios, TN; Fernandes, IFC; Nery, J; Gama, J; Bifet, A; Rios, RA;

Publication
SCIENTIFIC DATA

Abstract
Efficient public transportation management is essential for the development of large urban centers, providing several benefits such as comprehensive coverage of population mobility, reduction of transport costs, better control of traffic congestion, and a significant reduction of environmental impact by limiting gas emissions and pollution. Realizing these benefits requires a deep understanding of the population and transit patterns and the adoption of approaches to model multiple relations and characteristics efficiently. This work addresses these challenges by providing a novel dataset that includes various public transportation components from three different systems: regular buses, subway, and BRT (Bus Rapid Transit). Our dataset comprises daily information from about 700,000 passengers in Salvador, one of Brazil's largest cities, and local public transportation data with approximately 2,000 vehicles operating across nearly 400 lines, connecting almost 3,000 stops and stations. With data collected from March 2024 to March 2025 at intervals of less than one minute, SUNT stands as one of the largest, most comprehensive, and openly available urban datasets in the literature.
