
Publications by LIAAD

2023

Preface

Authors
Brito, P; Dias, G; Lausen, B; Montanari, A; Nugent, R;

Publication
Studies in Classification, Data Analysis, and Knowledge Organization

Abstract
[No abstract available]

2023

Wavelet-based fuzzy clustering of interval time series

Authors
D'Urso, P; De Giovanni, L; Maharaj, EA; Brito, P; Teles, P;

Publication
INTERNATIONAL JOURNAL OF APPROXIMATE REASONING

Abstract
We investigate the fuzzy clustering of interval time series using wavelet variances and covariances; in particular, we use a fuzzy c-medoids clustering algorithm. Traditional hierarchical and non-hierarchical clustering methods identify mutually exclusive clusters, whereas fuzzy clustering methods identify overlapping clusters, so that one or more series may belong to more than one cluster simultaneously. An interval time series (ITS), which arises when interval-valued observations are recorded over time, captures the variability of values within each interval at each time point, in contrast to the single-point information available in a classical time series. Our main contribution is that, by combining wavelet analysis, interval data analysis and fuzzy clustering, we capture information that would otherwise not be contemplated by traditional crisp clustering methods applied to classical time series, for which just a single value is recorded at each time point. Through simulation studies, we show that under some circumstances fuzzy c-medoids clustering performs better when applied to ITS than when applied to the corresponding traditional time series. Applications to exchange-rate ITS and sea-level ITS show that the fuzzy clustering method reveals different and more meaningful results than when applied to the associated single-point time series.
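The wavelet-plus-fuzzy-c-medoids pipeline described in the abstract can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: the crude Haar wavelet variance, the farthest-first medoid seeding, and all parameter choices are assumptions made for the example.

```python
import numpy as np

def haar_wavelet_variances(x, levels=3):
    """Crude Haar wavelet variance per scale (illustrative, not a full DWT library)."""
    approx = np.asarray(x, dtype=float)
    variances = []
    for _ in range(levels):
        n = len(approx) // 2 * 2
        pairs = approx[:n].reshape(-1, 2)
        detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)  # detail coefficients
        approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)  # approximation for next level
        variances.append(detail.var())
    return np.array(variances)

def fuzzy_c_medoids(features, c=2, m=2.0, n_iter=20):
    """Fuzzy c-medoids: medooids are actual series; memberships can overlap."""
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    # seed with the two most distant points, then farthest-first for c > 2
    medoids = list(np.unravel_index(np.argmax(dist), dist.shape))
    while len(medoids) < c:
        medoids.append(int(np.argmax(dist[:, medoids].min(axis=1))))
    medoids = np.array(medoids[:c])
    for _ in range(n_iter):
        d = dist[:, medoids] + 1e-12
        # standard fuzzy membership: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
        u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1)), axis=2)
        # re-pick each medoid to minimise the membership-weighted distances
        new = [int(np.argmin((u[:, j] ** m) @ dist)) for j in range(c)]
        if set(new) == set(medoids):
            break
        medoids = np.array(new)
    return medoids, u
```

Each series is summarised by its vector of wavelet variances across scales, and clustering is then performed on those feature vectors, so series with similar variability profiles end up with high membership in the same cluster.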

2023

Machine Learning Data Markets: Evaluating the Impact of Data Exchange on the Agent Learning Performance

Authors
Baghcheband, H; Soares, C; Reis, LP;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I

Abstract
In recent years, the increasing availability of distributed data has led to growing interest in learning across multiple nodes. However, local data may not be adequate to learn sufficiently accurate models, and learning from multiple distributed sources remains a challenge. To address this issue, Machine Learning Data Markets (MLDM) have been proposed as a potential solution: autonomous agents exchange relevant data in a cooperative relationship to improve their models. Previous research has shown that data exchange can lead to better models, but this had only been demonstrated with two agents. In this paper, we present an extended evaluation of a simple version of the MLDM framework in a collaborative scenario. Our experiments show that data exchange can improve learning performance even in this simple setting. The findings indicate a direct correlation between the number of agents and the performance gained, and an inverse correlation between performance and data batch size. These results provide insights into the effectiveness of MLDM and how it can be used to improve learning performance in distributed systems: increasing the number of agents yields a more efficient system, while larger data batch sizes can decrease its global performance. Both factors should therefore be considered when designing distributed learning systems with the MLDM framework.
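The core data-exchange idea can be illustrated with a toy example: agents holding non-overlapping local data trade batches, re-train, and the receiving agent's model improves. The `Agent` class, the nearest-centroid learner, and the exchange step below are all invented for illustration; they are not the MLDM framework itself.

```python
import numpy as np

class Agent:
    """Toy learning agent with local labelled data (illustrative, not MLDM)."""

    def __init__(self, X, y):
        self.X, self.y = X, y

    def fit(self):
        # nearest-centroid classifier: one centroid per locally known class
        self.classes = np.unique(self.y)
        self.centroids = np.array([self.X[self.y == c].mean(axis=0) for c in self.classes])

    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids[None, :, :], axis=2)
        return self.classes[d.argmin(axis=1)]

    def receive(self, X, y):
        """Data-exchange step: append a batch traded from another agent."""
        self.X = np.vstack([self.X, X])
        self.y = np.concatenate([self.y, y])
```

With only its own data an agent may never have seen some classes; after an exchange its model covers the full label space, which is the mechanism by which trading data improves performance in this sketch.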

2023

tsMorph: generation of semi-synthetic time series to understand algorithm performance

Authors
dos Santos, MR; de Carvalho, ACPLF; Soares, C;

Publication
CoRR

Abstract
[No abstract available]
2023

Federated Learning for Computer-Aided Diagnosis of Glaucoma Using Retinal Fundus Images

Authors
Baptista, T; Soares, C; Oliveira, T; Soares, F;

Publication
APPLIED SCIENCES-BASEL

Abstract
Deep learning approaches require a large amount of data to be transferred to centralized entities. However, this is often not a feasible option in healthcare, as it raises privacy concerns over sharing sensitive information. Federated Learning (FL) aims to address this issue by allowing machine learning without transferring the data to a centralized entity. FL has shown great potential to ensure privacy in digital healthcare while maintaining performance. Despite this, there is a lack of research on the impact of different types of data heterogeneity on the results. In this study, we investigate the robustness of various FL strategies under different data distributions and data quality for glaucoma diagnosis using retinal fundus images. We use RetinaQualEvaluator to generate quality labels for the datasets and a data distributor to obtain the desired distributions. Finally, we evaluate the performance of the different strategies on local data and on an independent test dataset. We observe that federated learning can enable high-performance models without compromising sensitive data. Furthermore, we find that FedProx is better suited to scenarios where the distributions and quality of the participating clients' data are diverse, and it incurs less communication cost.
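A compact sketch of a FedProx-style communication round on a logistic-regression model may make the setup concrete. The proximal term `(mu/2)||w - w_global||^2` added to each client's local objective is the defining feature of FedProx; everything else below (learning rate, epoch counts, size-weighted averaging as in FedAvg) is an assumed simplification, not the paper's experimental setup.

```python
import numpy as np

def local_update(w_global, X, y, mu=0.1, lr=0.1, epochs=20):
    """FedProx-style local training: logistic loss + (mu/2)||w - w_global||^2.
    Setting mu=0 recovers plain FedAvg local training."""
    w = w_global.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))                    # sigmoid predictions
        grad = X.T @ (p - y) / len(y) + mu * (w - w_global)  # loss grad + proximal pull
        w -= lr * grad
    return w

def federated_round(w_global, clients, mu=0.1):
    """One round: each client trains locally, the server averages by data size."""
    updates = [local_update(w_global, X, y, mu=mu) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    return np.average(updates, axis=0, weights=sizes)
```

The proximal pull keeps each client's local solution from drifting too far from the global model, which is why FedProx tends to cope better than FedAvg when client data is heterogeneous.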

2023

Exploring the Reduction of Configuration Spaces of Workflows

Authors
Freitas, F; Brazdil, P; Soares, C;

Publication
Discovery Science - 26th International Conference, DS 2023, Porto, Portugal, October 9-11, 2023, Proceedings

Abstract
Many current AutoML platforms include a very large space of alternatives (the configuration space), which makes it difficult to identify the best alternative for a given dataset. In this paper we explore a method that can reduce a large configuration space to a significantly smaller one, thereby reducing the search time for the potentially best workflow. We empirically validate the method on a set of workflows involving four ML algorithms (SVM, RF, LogR and LD) with different sets of hyperparameters. Our results show that it is possible to reduce the given space by more than one order of magnitude, from a few thousand workflows to a few tens, while the risk of eliminating the best workflow is nearly zero. The reduced system is about one order of magnitude faster than the original one, while maintaining the same predictive accuracy and loss.
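One simple way to reduce a configuration space in the spirit described above is to keep only the workflows that come within a tolerance of the best result on at least one dataset of a meta-dataset, on the grounds that the rest are unlikely ever to win. The rule below is an illustrative stand-in, not the paper's exact procedure; the performance matrix and the `eps` threshold are assumptions.

```python
import numpy as np

def reduce_configurations(perf, eps=0.01):
    """Return indices of workflows worth keeping.

    perf: (n_workflows, n_datasets) matrix of accuracies on a meta-dataset.
    A workflow survives if it is within eps of the best accuracy on at
    least one dataset; dominated workflows are pruned.
    """
    best = perf.max(axis=0)                  # best accuracy per dataset
    keep = (perf >= best - eps).any(axis=1)  # near-best somewhere => keep
    return np.flatnonzero(keep)
```

Search then runs only over the surviving workflows, which is where the order-of-magnitude reduction in search time would come from under this rule.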
