Cookies
O website necessita de alguns cookies e outros recursos semelhantes para funcionar. Caso o permita, o INESC TEC irá utilizar cookies para recolher dados sobre as suas visitas, contribuindo, assim, para estatísticas agregadas que permitem melhorar o nosso serviço. Ver mais
Aceitar Rejeitar
  • Menu
Publicações

Publicações por Bruno Miguel Veloso

2021

DEBACER: a method for slicing moderated debates

Autores
Ferraz, TP; Alcoforado, A; Bustos, E; Oliveira, AS; Gerber, R; Müller, N; d'Almeida, AC; Veloso, BM; Reali Costa, AH;

Publicação
CoRR

Abstract
Subjects change frequently in moderated debates with several participants, such as in parliamentary sessions, electoral debates, and trials. Partitioning a debate into blocks with the same subject is essential for understanding. Often a moderator is responsible for defining when a new block begins so that the task of automatically partitioning a moderated debate can focus solely on the moderator's behavior. In this paper, we (i) propose a new algorithm, DEBACER, which partitions moderated debates; (ii) carry out a comparative study between conventional and BERTimbau pipelines; and (iii) validate DEBACER applying it to the minutes of the Assembly of the Republic of Portugal. Our results show the effectiveness of DEBACER.

2022

MetroPT2: A Benchmark dataset for predictive maintenance

Autores
Veloso, B; Gama, J; Ribeiro, RP; Pereira, P;

Publicação

Abstract

2025

Modeling events and interactions through temporal processes: A survey

Autores
Liguori, A; Caroprese, L; Minici, M; Veloso, B; Spinnato, F; Nanni, M; Manco, G; Gama, J;

Publicação
NEUROCOMPUTING

Abstract
In real-world scenarios, numerous phenomena generate a series of events that occur in continuous time. Point processes provide a natural mathematical framework for modeling these event sequences. In this comprehensive survey, we aim to explore probabilistic models that capture the dynamics of event sequences through temporal processes. We revise the notion of event modeling and provide the mathematical foundations that underpin the existing literature on this topic. To structure our survey effectively, we introduce an ontology that categorizes the existing approaches considering three horizontal axes: modeling, inference and estimation, and application. We conduct a systematic review of the existing approaches, with a particular focus on those leveraging deep learning techniques. Finally, we delve into the practical applications where these proposed techniques can be harnessed to address real-world problems related to event modeling. Additionally, we provide a selection of benchmark datasets that can be employed to validate the approaches for point processes.

2022

The MetroPT dataset for predictive maintenance

Autores
Veloso, B; Gama, J; Ribeiro, RP; Pereira, PM;

Publicação
SCIENTIFIC DATA

Abstract
The paper describes the MetroPT data set, an outcome of a Predictive Maintenance project with an urban metro public transportation service in Porto, Portugal. The data was collected in 2022 to develop machine learning methods for online anomaly detection and failure prediction. Several analog sensor signals (pressure, temperature, current consumption), digital signals (control signals, discrete signals), and GPS information (latitude, longitude, and speed) provide a framework that can be easily used and help the development of new machine learning methods. This dataset contains some interesting characteristics and can be a good benchmark for predictive maintenance models.

2026

Turning web data into official statistics: Classifying Portuguese retail products with NLP models

Autores
Machado, JDU; Veloso, B;

Publicação
STATISTICAL JOURNAL OF THE IAOS

Abstract
The growing availability of online data creates new opportunities to improve the timeliness and detail of official statistics, particularly in domains such as price monitoring and inflation measurement. However, leveraging web-scraped data for official use requires alignment with standardized classification frameworks such as the European Classification of Individual Consumption According to Purpose (ECOICOP). We train two natural-language models, a lightweight convolutional neural network (CNN) and a fine-tuned BERTimbau transformer, to classify Portuguese food and beverage items into ECOICOP categories. Using 100,000 product titles scraped from six national supermarket sites and labeled via a human-in-the-loop workflow, the CNN reaches a macro-F1 of 92.19 % with minimal computing cost, while the transformer attains 94.00 %, the first such result for Portuguese. Both models are published on Hugging Face, enabling reproducible inference at scale while the source data remain confidential. The study delivers the first open-source Portuguese ECOICOP classifiers for food and beverage products, a replicable low-resource labeling workflow, and a benchmark of accuracy-speed trade-offs to guide researchers in similar tasks.

  • 16
  • 16