Publications

Publications by Pedro Manuel Ribeiro

2022

Preface

Authors
Ribeiro, P; Silva, F; Mendes, JF; Laureano, R;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract

2021

Similarity of Football Players Using Passing Sequences

Authors
Barbosa, A; Ribeiro, P; Dutra, I;

Publication
MLSA@PKDD/ECML

Abstract
Association football has been the subject of many research studies. In this work we present a study on player similarity using passing sequences extracted from games from the top-5 European football leagues during the 2017/2018 season. We present two different approaches: first, we only count the motifs a player is involved in; then we also take into consideration the specific position a player occupies in each motif. We also present a new way to objectively judge the quality of the generated models in football analytics. Our results show that the study of passing sequences can be used to study player similarity with relative success.
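The two counting approaches described in this abstract can be illustrated with a minimal sketch. It assumes each passing sequence is a list of player identifiers and that a motif is a window of three consecutive passes (four players), relabelled by order of first appearance (for example `ABAC`). The function names, the window size, and the input format are illustrative assumptions, not taken from the paper.

```python
from collections import Counter, defaultdict


def motif_label(window):
    """Relabel players by order of first appearance, e.g. (7, 10, 7, 9) -> 'ABAC'.

    Returns the motif string and the player -> letter mapping.
    """
    letters = {}
    for player in window:
        if player not in letters:
            letters[player] = chr(ord("A") + len(letters))
    return "".join(letters[p] for p in window), letters


def player_motif_features(sequences, passes=3):
    """Count motifs per player, with and without the player's position.

    `passes` (hypothetical parameter) is the number of passes in a motif,
    so each window spans passes + 1 players.
    """
    plain = defaultdict(Counter)       # player -> motif -> count
    positional = defaultdict(Counter)  # player -> (motif, letter) -> count
    size = passes + 1
    for seq in sequences:
        for i in range(len(seq) - size + 1):
            label, roles = motif_label(tuple(seq[i:i + size]))
            for player, role in roles.items():
                plain[player][label] += 1
                positional[player][(label, role)] += 1
    return plain, positional


# Hypothetical toy data: two possessions.
sequences = [["a", "b", "a", "c"], ["a", "b", "c", "d", "a"]]
plain, positional = player_motif_features(sequences)
```

The resulting counters could then be normalised into per-player vectors and compared with any distance measure; that step is left out because the abstract does not specify it.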

2022

Novel features for time series analysis: a complex networks approach

Authors
Silva, VF; Silva, ME; Ribeiro, P; Silva, F;

Publication
Data Mining and Knowledge Discovery

Abstract
Being able to capture the characteristics of a time series with a feature vector is a very important task with a multitude of applications, such as classification, clustering or forecasting. Usually, the features are obtained from linear and nonlinear time series measures, which may present several data-related drawbacks. In this work we introduce NetF as an alternative set of features, incorporating several representative topological measures of different complex network mappings of the time series. Our approach does not require data preprocessing and is applicable regardless of any data characteristics. Exploring our novel feature vector, we are able to connect mapped network features to properties inherent in diversified time series models, showing that NetF can be useful to characterize time data. Furthermore, we also demonstrate the applicability of our methodology in clustering synthetic and benchmark time series sets, comparing its performance with more conventional features, showcasing how NetF can achieve high-accuracy clusters. Our results are very promising, with network features from different mapping methods capturing different properties of the time series, adding a different and rich feature set to the literature.
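As an illustration of the general idea of mapping a time series to a network and reading off a topological measure, here is a minimal sketch. It assumes the natural visibility graph as the mapping and the average degree as the measure; the abstract does not name the mappings or measures NetF uses, so both choices and the function names are assumptions made for this example only.

```python
def natural_visibility_edges(series):
    """Return the edges of the natural visibility graph of a series.

    Points i < j are linked when every intermediate point k lies strictly
    below the straight line joining (i, series[i]) and (j, series[j]).
    Naive implementation, suitable only for short series.
    """
    n = len(series)
    edges = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            visible = all(
                series[k] < series[j] + (series[i] - series[j]) * (j - k) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                edges.append((i, j))
    return edges


def average_degree(series):
    """One scalar network feature: mean node degree of the visibility graph."""
    return 2 * len(natural_visibility_edges(series)) / len(series)


# Hypothetical toy series.
feature = average_degree([1, 3, 2, 4])
```

A feature vector in this spirit would concatenate several such measures computed on several mappings of the same series.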

2018

TensorCast: Forecasting time-evolving networks with contextual information

Authors
Araújo, M; Ribeiro, P; Faloutsos, C;

Publication
IJCAI International Joint Conference on Artificial Intelligence

Abstract
Can we forecast future connections in a social network? Can we predict who will start using a given hashtag on Twitter, leveraging contextual information such as who follows or retweets whom to improve our predictions? In this paper we present an abridged report of TENSORCAST, a method for forecasting time-evolving networks that uses coupled tensors to incorporate multiple information sources. TENSORCAST is scalable (linearithmic in the number of connections), effective (more precise than competing methods) and general (applicable to any data source representable by a tensor). We also showcase our method when applied to forecast two large-scale heterogeneous real-world temporal networks, namely Twitter and DBLP.

2025

Evaluating Transfer Learning Methods on Real-World Data Streams: A Case Study in Financial Fraud Detection

Authors
Pereira, RR; Bono, J; Ferreira, HM; Ribeiro, P; Soares, C; Bizarro, P;

Publication
ECML/PKDD (9)

Abstract
When the available data for a target domain is limited, transfer learning (TL) methods leverage related data-rich source domains to train and evaluate models, before deploying them on the target domain. However, most TL methods assume fixed levels of labeled and unlabeled target data, which contrasts with real-world scenarios where both data and labels arrive progressively over time. As a result, evaluations based on these static assumptions may not reflect how methods perform in practice. To support a more realistic assessment of TL methods in dynamic settings, we propose an evaluation framework that (1) simulates varying data availability over time, (2) creates multiple domains via resampling of a given dataset and (3) introduces inter-domain variability through controlled transformations, e.g., including time-dependent covariate and concept shifts. These capabilities enable the systematic simulation of a large number of variants of the experiments, providing deeper insights into how algorithms may behave when deployed. We demonstrate the usefulness of the proposed framework by performing a case study on a proprietary real-world suite of card payment datasets. To support reproducibility, we also apply the framework on the publicly available Bank Account Fraud (BAF) dataset. By providing a methodology for evaluating TL methods over time and in different data availability conditions, our framework supports a better understanding of model behavior in real-world environments, which enables more informed decisions when deploying models in new domains.

2025

Studying and Improving Graph Neural Network-based Motif Estimation

Authors
Vieira, PC; Silva, MEP; Pinto Ribeiro, PM;

Publication
CoRR

Abstract
