Publications

Publications by LIAAD

2024

Machine Learning Data Market Based on Multiagent Systems

Authors
Baghcheband, H; Soares, C; Reis, LP;

Publication
IEEE INTERNET COMPUTING

Abstract
Today, autonomous agents, the Internet of Things, and smart devices produce more and more distributed data and use them to learn models for different purposes. One challenge is that learning from local data only may lead to suboptimal models. Thus, better models are expected if agents can exchange data, leading to approaches such as federated learning. However, these approaches assume that data have no value and, thus, are exchanged for free. A machine learning data market (MLDM), a framework based on multiagent systems with a market-based perspective on data exchange, was recently proposed. In an MLDM, each agent trains its model based on both local data and data bought from other agents. Although the empirical results are interesting, several challenges are still open, including data acquisition and data valuation. The MLDM is an illustrative example of how the value of data can and should be integrated into the design of distributed ML systems.

2024

Corrector LSTM: built-in training data correction for improved time-series forecasting

Authors
Baghoussi, Y; Soares, C; Moreira, JM;

Publication
Neural Comput. Appl.

Abstract
Traditional recurrent neural networks (RNNs) are essential for processing time-series data. However, they function as read-only models, lacking the ability to directly modify the data they learn from. In this study, we introduce the corrector long short-term memory (cLSTM), a Read & Write LSTM architecture that not only learns from the data but also dynamically adjusts it when necessary. The cLSTM model leverages two key components: (a) predicting LSTM’s cell states using Seasonal Autoregressive Integrated Moving Average (SARIMA) and (b) refining the training data based on discrepancies between actual and forecasted cell states. Our empirical validation demonstrates that cLSTM surpasses read-only LSTM models in forecasting accuracy across the Numenta Anomaly Benchmark (NAB) and M4 Competition datasets. Additionally, cLSTM exhibits superior performance in anomaly detection compared to hierarchical temporal memory (HTM) models. © The Author(s) 2024.

2024

Shapley-Based Data Valuation Method for the Machine Learning Data Markets (MLDM)

Authors
Baghcheband, H; Soares, C; Reis, LP;

Publication
FOUNDATIONS OF INTELLIGENT SYSTEMS, ISMIS 2024

Abstract
Data valuation, the process of assigning value to data based on its utility and usefulness, is a critical and largely unexplored aspect of data markets. Within the Machine Learning Data Market (MLDM), a platform that enables data exchange among multiple agents, the challenge of quantifying the value of data becomes particularly prominent. Agents within MLDM are motivated to exchange data based on its potential impact on their individual performance. Shapley Value-based methods have gained traction in addressing this challenge, prompting our study to investigate their effectiveness within the MLDM context. Specifically, we propose the Gain Data Shapley Value (GDSV) method tailored for MLDM and compare it to the original data valuation method used in MLDM. Our analysis focuses on two common learning algorithms, Decision Tree (DT) and K-nearest neighbors (KNN), within a simulated society of five agents, tested on 45 classification datasets. Results show that the GDSV leads to incremental improvements in predictive performance across both DT and KNN algorithms compared to performance-based valuation or the baseline. These findings underscore the potential of Shapley Value-based methods in identifying high-value data within MLDM while indicating areas for further improvement.
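
For readers unfamiliar with Shapley-based valuation, the following is a minimal sketch of the generic Shapley value computation that such methods build on. It is not the GDSV method itself, whose details the abstract does not give. The function name `shapley_values`, the three data batches "A", "B", "C", and the accuracy table are all hypothetical, chosen only for illustration.

```python
from itertools import permutations

def shapley_values(players, utility):
    """Exact Shapley values: average each player's marginal gain over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        previous = utility(coalition)
        for p in order:
            coalition = coalition | {p}
            current = utility(coalition)
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    return {p: total / len(orderings) for p, total in totals.items()}

# Hypothetical utility: model accuracy when trained on each subset of data batches.
ACCURACY = {
    frozenset(): 0.50,
    frozenset({"A"}): 0.70,
    frozenset({"B"}): 0.60,
    frozenset({"C"}): 0.50,
    frozenset({"A", "B"}): 0.80,
    frozenset({"A", "C"}): 0.70,
    frozenset({"B", "C"}): 0.60,
    frozenset({"A", "B", "C"}): 0.80,
}

values = shapley_values(["A", "B", "C"], ACCURACY.__getitem__)
print(values)
```

In this toy table, batch "C" never changes the accuracy, so its Shapley value is zero, and the three values sum to the total gain from the empty set to the full set. Exact enumeration grows factorially with the number of players, so practical methods typically approximate it by sampling orderings.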

2024

Kernel Corrector LSTM

Authors
Tuna, R; Baghoussi, Y; Soares, C; Mendes-Moreira, J;

Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XXII, PT II, IDA 2024

Abstract
Forecasting methods are affected by data quality issues in two ways: 1. they are hard to predict, and 2. they may affect the model negatively when it is updated with new data. The latter issue is usually addressed by pre-processing the data to remove those issues. An alternative approach has recently been proposed, Corrector LSTM (cLSTM), which is a Read & Write Machine Learning (RW-ML) algorithm that changes the data while learning to improve its predictions. Despite promising results being reported, cLSTM is computationally expensive, as it uses a meta-learner to monitor the hidden states of the LSTM. We propose a new RW-ML algorithm, Kernel Corrector LSTM (KcLSTM), that replaces the meta-learner of cLSTM with a simpler method: Kernel Smoothing. We empirically evaluate the forecasting accuracy and the training time of the new algorithm and compare it with cLSTM and LSTM. Results indicate that it is able to decrease the training time while maintaining a competitive forecasting accuracy.
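
As background on the kernel smoothing mentioned above, here is a minimal sketch of a Gaussian kernel smoother applied to a time series. The abstract does not specify which kernel or bandwidth KcLSTM uses, or exactly how the smoother interacts with the LSTM, so the function `gaussian_kernel_smooth`, the bandwidth value, and the example series are assumptions for illustration only.

```python
import math

def gaussian_kernel_smooth(series, bandwidth=1.0):
    """Nadaraya-Watson style smoothing over the time index with a Gaussian kernel."""
    n = len(series)
    smoothed = []
    for t in range(n):
        # Weight every observation by its distance in time from position t.
        weights = [math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in range(n)]
        total = sum(weights)
        smoothed.append(sum(w * y for w, y in zip(weights, series)) / total)
    return smoothed

# Hypothetical series with one spike at index 3.
series = [1.0, 1.0, 1.0, 10.0, 1.0, 1.0, 1.0]
smoothed = gaussian_kernel_smooth(series, bandwidth=1.0)
print([round(v, 2) for v in smoothed])
```

The spike is pulled toward its neighbours while the flat regions stay close to their original level. A larger bandwidth smooths more aggressively.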

2024

Association of Grad-CAM, LIME and Multidimensional Fractal Techniques for the Classification of H&E Images

Authors
Lopes, TRS; Roberto, GF; Soares, C; Tosta, TAA; Silva, AB; Loyola, AM; Cardoso, SV; de Faria, PR; do Nascimento, MZ; Neves, LA;

Publication
Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2024, Volume 2: VISAPP, Rome, Italy, February 27-29, 2024.

Abstract
In this work, a method based on the use of explainable artificial intelligence techniques with multiscale and multidimensional fractal techniques is presented in order to investigate histological images stained with Hematoxylin-Eosin. The CNN GoogLeNet neural activation patterns were explored, obtained from the gradient-weighted class activation mapping and locally-interpretable model-agnostic explanation techniques. The feature vectors were generated with multiscale and multidimensional fractal techniques, specifically fractal dimension, lacunarity and percolation. The features were evaluated by ranking each entry, using the ReliefF algorithm. The discriminative power of each solution was defined via classifiers with different heuristics. The best results were obtained from LIME, with a significant increase in accuracy and AUC rates when compared to those provided by GoogLeNet. The details presented here can contribute to the development of models aimed at the classification of histological images. © 2024 by SCITEPRESS – Science and Technology Publications, Lda.

2024

Detection of Covid-19 in Chest X-Ray Images Using Percolation Features and Hermite Polynomial Classification

Authors
Roberto, GF; Pereira, DC; Martins, AS; Tosta, TAA; Soares, C; Lumini, A; Rozendo, GB; Neves, LA; Nascimento, MZ;

Publication
PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2023, PT I

Abstract
Covid-19 is a serious disease caused by the Sars-CoV-2 virus that was first reported in China in late 2019 and has rapidly spread around the world. As the virus affects mostly the lungs, chest X-rays are one of the safest and most accessible ways of diagnosing the infection. In this paper, we propose the use of an approach for detecting Covid-19 in chest X-ray images through the extraction and classification of local and global percolation-based features. The method was applied to two datasets: one containing 2,002 segmented samples split into two classes (Covid-19 and Healthy); and another containing 1,125 non-segmented samples split into three classes (Covid-19, Healthy and Pneumonia). The 48 obtained percolation features were given as input to six different classifiers and then AUC and accuracy values were evaluated. We employed the 10-fold cross-validation method and evaluated the lesion sub-types with binary and multiclass classification using the Hermite Polynomial classifier, which had never been employed in this context. This classifier provided the best overall results when compared to five other machine learning algorithms. These results, based on the association of percolation features and the Hermite polynomial classifier, can contribute to the detection of the lesions by supporting specialists in clinical practice.
