Cookies Policy
The website need some cookies and similar means to function. If you permit us, we will use those means to collect data on your visits for aggregated statistics to improve our service. Find out More
Accept Reject
  • Menu
Publications

Publications by CRACS

2025

Multilayer quantile graph for multivariate time series analysis and dimensionality reduction

Authors
Silva, VF; Silva, ME; Ribeiro, P; Silva, F;

Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS

Abstract
In recent years, there has been a surge in the prevalence of high- and multidimensional temporal data across various scientific disciplines. These datasets are characterized by their vast size and challenging potential for analysis. Such data typically exhibit serial and cross-dependency and possess high dimensionality, thereby introducing additional complexities to conventional time series analysis methods. To address these challenges, a recent and complementary approach has emerged, known as network-based analysis methods for multivariate time series. In univariate settings, quantile graphs have been employed to capture temporal transition properties and reduce data dimensionality by mapping observations to a smaller set of sample quantiles. To confront the increasingly prominent issue of high dimensionality, we propose an extension of quantile graphs into a multivariate variant, which we term Multilayer Quantile Graphs. In this innovative mapping, each time series is transformed into a quantile graph, and inter-layer connections are established to link contemporaneous quantiles of pairwise series. This enables the analysis of dynamic transitions across multiple dimensions. In this study, we demonstrate the effectiveness of this new mapping using synthetic and benchmark multivariate time series datasets. We delve into the resulting network's topological structures, extract network features, and employ these features for original dataset analysis. Furthermore, we compare our results with a recent method from the literature. The resulting multilayer network offers a significant reduction in the dimensionality of the original data while capturing serial and cross-dimensional transitions. This approach facilitates the characterization and analysis of large multivariate time series datasets through network analysis techniques.

2025

Multilayer horizontal visibility graphs for multivariate time series analysis

Authors
Silva, VF; Silva, ME; Ribeiro, P; Silva, F;

Publication
DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Multivariate time series analysis is a vital but challenging task, with multidisciplinary applicability, tackling the characterization of multiple interconnected variables over time and their dependencies. Traditional methodologies often adapt univariate approaches or rely on assumptions specific to certain domains or problems, presenting limitations. A recent promising alternative is to map multivariate time series into high-level network structures such as multiplex networks, with past work relying on connecting successive time series components with interconnections between contemporary timestamps. In this work, we first define a novel cross-horizontal visibility mapping between lagged timestamps of different time series and then introduce the concept of multilayer horizontal visibility graphs. This allows describing cross-dimension dependencies via inter-layer edges, leveraging the entire structure of multilayer networks. To this end, a novel parameter-free topological measure is proposed and common measures are extended for the multilayer setting. Our approach is general and applicable to any kind of multivariate time series data. We provide an extensive experimental evaluation with both synthetic and real-world datasets. We first explore the proposed methodology and the data properties highlighted by each measure, showing that inter-layer edges based on cross-horizontal visibility preserve more information than previous mappings, while also complementing the information captured by commonly used intra-layer edges. We then illustrate the applicability and validity of our approach in multivariate time series mining tasks, showcasing its potential for enhanced data analysis and insights.

2025

Osiris: A Multi-Language Transpiler for Educational Purposes

Authors
Marrão, B; Leal, JP; Queirós, R;

Publication
6th International Computer Programming Education Conference, ICPEC 2025, July 10-11, 2025, PORTIC, Polytechnic of Porto, Portugal

Abstract

2025

Designing a Multi-Narrative Gamified Learning Experience

Authors
Bauer, Y; Leal, JP; Queirós, R; Swacha, J; Paiva, JC;

Publication
6th International Computer Programming Education Conference, ICPEC 2025, July 10-11, 2025, PORTIC, Polytechnic of Porto, Portugal

Abstract

2025

PAP900: A dataset of semantic relationships between affective words in Portuguese

Authors
dos Santos, AF; Leal, JP; Alves, RA; Jacques, T;

Publication
DATA IN BRIEF

Abstract
The PAP900 dataset centers on the semantic relationship between affective words in Portuguese. It contains 900 word pairs, each annotated by at least 30 human raters for both semantic similarity and semantic relatedness. In addition to the semantic ratings, the dataset includes the word categorization used to build the word pairs and detailed sociodemographic information about annotators, enabling the analysis of the influence of personal factors on the perception of semantic relationships. Furthermore, this article describes in detail the dataset construction process, from word selection to agreement metrics. Data was collected from Portuguese university psychology students, who completed two rounds of questionnaires. In the first round annotators were asked to rate word pairs on either semantic similarity or relatedness. The second round switched the relation type for most annotators, with a small percentage being asked to repeat the same relation. The instructions given emphasized the differences between semantic relatedness and semantic similarity, and provided examples of expected ratings of both. There are few semantic relations datasets in Portuguese, and none focusing on affective words. PAP900 is distributed in distinct formats to be easy to use for both researchers just looking for the final averaged values and for researchers looking to take advantage of the individual ratings, the word categorization and the annotator data. This dataset is a valuable resource for researchers in computational linguistics, natural language processing, psychology, and cognitive science. (c) 2025TheAuthors.

2025

Machine Learning Models for Indoor Positioning Using Bluetooth RSSI and Video Data: A Case Study

Authors
Mamede, T; Silva, N; Marques, ERB; Lopes, LMB;

Publication
SENSORS

Abstract
Indoor Positioning Systems (IPSs) are essential for applications requiring accurate location awareness in indoor environments. However, achieving high precision remains challenging due to signal interference and environmental variability. This study proposes a multimodal IPS that integrates Bluetooth Received Signal Strength Indicator (RSSI) measurements and video imagery using machine learning (ML) and ensemble learning techniques. The system was implemented and deployed in the Hall of Biodiversity at the Natural History and Science Museum of the University of Porto. The venue presented significant deployment issues, namely restrictions on beacon placement and lighting conditions. We trained independent ML models on RSSI and video datasets, and combined them through ensemble learning methods. The experimental results from test scenarios, which included simulated visitor trajectories, showed that ensemble models consistently outperformed the RSSI-based and video-based models. These findings demonstrate that the use of multimodal data can significantly improve IPS accuracy despite constraints such as multipath interference, low lighting, and limited beacon infrastructure. Overall, they highlight the potential of multimodal data for deployments in complex indoor environments.

  • 3
  • 208