Publications

Publications by LIAAD

2023

First insight into oral microbiome diversity in Papua New Guineans reveals a specific regional signature

Authors
Pedro, N; Brucato, N; Cavadas, B; Lisant, V; Camacho, R; Kinipi, C; Leavesley, M; Pereira, L; Ricaut, FX;

Publication
MOLECULAR ECOLOGY

Abstract
The oral microbiota is a highly complex and diversified part of the human microbiome. Being located at the interface between the human body and the exterior environment, this microbiota can deepen our understanding of the environmental impacts on the global status of human health. This research topic has been well addressed in Westernized populations, but these populations represent only a fraction of human diversity. Papua New Guinea hosts very diverse environments and some of the most distinctive human biological diversity worldwide. In this study, we performed the first known characterization of the oral microbiome in 85 Papua New Guinean individuals living in different environments, using a qualitative and quantitative approach. We found a significant geographical structure in the Papua New Guinean oral microbiome, especially in the groups most isolated from urban spaces. In comparison to other global populations, two bacterial genera related to iron absorption were significantly more abundant in Papua New Guineans and Aboriginal Australians, which suggests a shared oral microbiome signature. Further studies will be needed to confirm and explore this possible region-specific oral microbiome profile.


2023

An Inductive Logic Programming Approach for Entangled Tube Modeling in Bin Picking

Authors
Leao, G; Camacho, R; Sousa, A; Veiga, G;

Publication
ROBOT2022: FIFTH IBERIAN ROBOTICS CONFERENCE: ADVANCES IN ROBOTICS, VOL 2

Abstract
Bin picking is a challenging problem that involves using a robotic manipulator to remove, one by one, a set of objects randomly stacked in a container. When the objects are prone to entanglement, having an estimate of their pose and shape is highly valuable for more reliable grasp and motion planning. This paper focuses on modeling entangled tubes with varying degrees of curvature. An unconventional machine learning technique, Inductive Logic Programming (ILP), is used to construct sets of rules (theories) capable of modeling multiple tubes when given the cylinders that constitute them. Datasets of entangled tubes are created via simulation in Gazebo. Experiments using Aleph and SWI-Prolog illustrate how ILP can build explainable theories with high performance, using a relatively small dataset and little training time. Therefore, this work serves as a proof of concept that ILP is a valuable method to acquire knowledge and validate heuristics for pose and shape estimation in complex bin picking scenarios.


2023

Interpreting What is Important: An Explainability Approach and Study on Feature Selection

Authors
Rodrigues, EM; Baghoussi, Y; Mendes-Moreira, J;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I

Abstract
Machine learning models are widely used in time series forecasting. One way to reduce their computational cost and increase their efficiency is to select only the relevant exogenous features to be fed into the model. To this end, a study is performed on five feature selection methods: Pearson correlation coefficient, Boruta, Boruta-Shap, IMV-LSTM, and LIME. A new method focused on interpretability, SHAP-LSTM, is proposed, using a deep learning model training process as part of a feature selection algorithm. The methods were compared on two different datasets, showing comparable results at a lower computational cost than using all features. On all datasets, SHAP-LSTM showed competitive results, with comparatively better results on the data with a higher presence of rarely occurring categorical features.
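
As an illustration of the simplest method named in the abstract, the following is a minimal sketch of a Pearson correlation filter for exogenous features. It is not the authors' implementation: the function names, the feature dictionary layout, and the 0.5 threshold are assumptions made for this example only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant series has no linear relationship to report.
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_features(features, target, threshold=0.5):
    """Return the names of features whose |r| with the target meets the threshold.

    `features` maps a (hypothetical) feature name to its list of values;
    the threshold value is an assumption, not taken from the paper.
    """
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, target)) >= threshold
    ]
```

A filter of this kind only captures linear dependence between each feature and the target, which is one reason a study would compare it against model-based methods such as those listed in the abstract.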


2023

Studying the Impact of Sampling in Highly Frequent Time Series

Authors
Ferreira, PJS; Mendes-Moreira, J; Rodrigues, A;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I

Abstract
Nowadays, all kinds of sensors generate data, and more metrics are being measured. These large quantities of data are stored in large data centers and used to create datasets to train Machine Learning algorithms for a wide range of areas. However, processing that data and training the Machine Learning algorithms require more time, and storing all the data requires more space, creating a Big Data problem. In this paper, we propose simple techniques for reducing large time series datasets into smaller versions without compromising the forecasting capability of the generated model and, simultaneously, reducing the time needed to train the models and the space required to store the reduced sets. We tested the proposed approach on three public datasets and one private dataset containing time series with different characteristics. The results show that, for the datasets studied, it is possible to use reduced sets to train the algorithms without affecting the forecasting capability of their models. This approach is more efficient for datasets with higher frequencies and larger seasonalities. With the reduced sets, we obtain decreases of between 40% and 94% in the training time and of between 46% and 65% in the memory needed to store the reduced sets.
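
The abstract does not specify which reduction techniques were used. As a hedged illustration of what a simple reduction of a high-frequency series can look like, the sketch below averages consecutive blocks of samples. Block averaging and the name downsample_mean are assumptions for this example and should not be read as the paper's method.

```python
def downsample_mean(series, factor):
    """Reduce a series by replacing each block of `factor` samples with its mean.

    The final block may be shorter than `factor`; it is averaged over
    the samples it actually contains.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    reduced = []
    for start in range(0, len(series), factor):
        block = series[start:start + factor]
        reduced.append(sum(block) / len(block))
    return reduced
```

With a factor of 3, a series of 7 samples shrinks to 3 values, so both storage and the number of training points drop by roughly the chosen factor.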


2023

A Study on Hyperparameters Configurations for an Efficient Human Activity Recognition System

Authors
Ferreira, PJS; Mendes Moreira, J; Cardoso, JMP;

Publication
PROCEEDINGS OF THE 8TH INTERNATIONAL WORKSHOP ON SENSOR-BASED ACTIVITY RECOGNITION AND ARTIFICIAL INTELLIGENCE, IWOAR 2023

Abstract
Human Activity Recognition (HAR) has been a popular research field due to the widespread availability of devices with sensors and computational power (e.g., smartphones and smartwatches). Applications for HAR systems have been extensively researched in recent literature, mainly due to the benefits of improving quality of life in areas like health and fitness monitoring. However, since people have different motion patterns when performing physical activities, a HAR system would need to adapt to the characteristics of the user in order to maintain or improve accuracy. Mobile devices, such as smartphones, used to implement HAR systems, have limited resources (e.g., battery life). HAR systems also have difficulty adapting to these device constraints in order to work efficiently for long periods. In this work, we present a kNN-based HAR system and an extensive study of the influence of hyperparameters (window size, overlap, distance function, and the value of k) and parameters (sampling frequency) on the system accuracy, energy consumption, and response time. We also study how hyperparameter configurations affect the model's performance for the users and the activities. Experimental results show that adapting the hyperparameters makes it possible to adjust the system's behavior to the user, the device, and the target service. These results motivate the development of a HAR system capable of automatically adapting the hyperparameters for the user, the device, and the service.
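
To make the window size and overlap hyperparameters concrete, here is a minimal sketch of sliding-window segmentation of a sensor stream, the step that typically precedes feature extraction and kNN classification. The function name, the fractional representation of overlap, and the decision to drop a trailing partial window are assumptions for this example, not details taken from the paper.

```python
def sliding_windows(samples, window_size, overlap):
    """Split a sample sequence into fixed-size windows.

    `overlap` is the fraction of each window shared with the next one,
    in the range [0, 1). Trailing samples that do not fill a complete
    window are dropped.
    """
    if window_size < 1:
        raise ValueError("window_size must be positive")
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # Distance between window starts; at least one sample so the loop advances.
    step = max(1, round(window_size * (1 - overlap)))
    return [
        samples[start:start + window_size]
        for start in range(0, len(samples) - window_size + 1, step)
    ]
```

For the same stream, a higher overlap produces more windows and therefore more classifications per second, which is one way these hyperparameters can trade accuracy and response time against energy consumption.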


2023

DyGCN-LSTM: A dynamic GCN-LSTM based encoder-decoder framework for multistep traffic prediction

Authors
Kumar, R; Moreira, JM; Chandra, J;

Publication
APPLIED INTELLIGENCE

Abstract
Intelligent transportation systems (ITS) are gaining traction in large cities for better traffic management. Traffic forecasting is an important part of ITS, but a difficult one due to the intricate spatiotemporal relationships of traffic between different locations. Although remote sensors may have temporal and spatial similarities with the sensor being predicted, existing traffic forecasting research focuses primarily on modeling correlations between neighboring sensors while disregarding correlations between remote sensors. Furthermore, existing methods for capturing spatial dependencies, such as graph convolutional networks (GCNs), are unable to capture the dynamic spatial dependence in traffic systems. Self-attention-based techniques currently used to model dynamic correlations among all sensors overlook the hierarchical features of roads and have quadratic computational complexity. Our paper presents a new Dynamic Graph Convolution LSTM Network (DyGCN-LSTM) to address the aforementioned limitations. The novelty of DyGCN-LSTM is that it can model the underlying non-linear spatial and temporal correlations of remotely located sensors at the same time. Experimental investigations conducted using four real-world traffic data sets show that the suggested approach outperforms state-of-the-art benchmarks by 25% in terms of RMSE.
