2020
Authors
Koprinska, I; Kamp, M; Appice, A; Loglisci, C; Antonie, L; Zimmermann, A; Guidotti, R; Özgöbek, O; Ribeiro, RP; Gavaldà, R; Gama, J; Adilova, L; Krishnamurthy, Y; Ferreira, PM; Malerba, D; Medeiros, I; Ceci, M; Manco, G; Masciari, E; Ras, ZW; Christen, P; Ntoutsi, E; Schubert, E; Zimek, A; Monreale, A; Biecek, P; Rinzivillo, S; Kille, B; Lommatzsch, A; Gulla, JA;
Publication
PKDD/ECML Workshops
Abstract
2020
Authors
Teixeira, S; Gama, J; Amorim, P; Figueira, G;
Publication
ERCIM NEWS
Abstract
Algorithmic systems based on artificial intelligence (AI) increasingly play a role in decision-making processes, both in government and industry. These systems are used in areas such as retail, finance, and manufacturing. In manufacturing, the main priority is that solutions are interpretable, as this characteristic correlates with the adoption rate among users (e.g., schedulers). More recently, however, these systems have been applied in areas of public interest, such as education, health, public administration, and criminal justice. Their adoption in the public domain, in particular of data-driven decision models, has raised questions about the risks associated with this technology, from which ethical problems may emerge. We analyse two important characteristics of AI-based systems, interpretability and trustability, in the industrial and public domains, respectively.
2021
Authors
Corizzo, R; Ceci, M; Fanaee T, H; Gama, J;
Publication
INFORMATION SCIENCES
Abstract
The increasing presence of renewable energy plants has created new challenges such as grid integration, load balancing and energy trading, making effective prediction models essential. Recent approaches in the literature have shown that exploiting spatio-temporal autocorrelation in data coming from multiple plants can lead to better predictions. Although tensor models and techniques are well suited to spatio-temporal data, they have received little attention in the energy domain. In this paper, we propose a new method based on the Tucker tensor decomposition, capable of extracting a new feature space for the learning task. For evaluation purposes, we have investigated the performance of predictive clustering trees with the new feature space, compared to the original feature space, on three renewable energy datasets. The results favor the proposed method, including when it is compared with state-of-the-art algorithms.
2021
Authors
Veloso, B; Gama, J; Malheiro, B;
Publication
Encyclopedia of Information Science and Technology, Fifth Edition - Advances in Information Quality and Management
Abstract
2021
Authors
Goncalves, C; Cavalcante, L; Brito, M; Bessa, RJ; Gama, J;
Publication
ELECTRIC POWER SYSTEMS RESEARCH
Abstract
Probabilistic forecasting of distribution tails (i.e., quantiles below 0.05 and above 0.95) is challenging for non-parametric approaches since data for extreme events are scarce. A poor forecast of extreme quantiles can have a high impact in various power system decision-aid problems. An alternative approach that is more robust to data sparsity is extreme value theory (EVT), which uses parametric functions to model the tails of a distribution. In this work, we apply conditional EVT estimators to historical data by directly combining gradient boosting trees with a truncated generalized Pareto distribution. The parameters of the parametric function are conditioned on covariates such as wind speed or direction from a numerical weather prediction grid. The results for a wind power plant located in Galicia, Spain, show that the proposed method outperforms state-of-the-art methods in terms of quantile score.
2020
Authors
Bahri, M; Veloso, B; Bifet, A; Gama, J;
Publication
2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA)
Abstract
The last few decades have witnessed a significant evolution of technology in different domains, changing the way the world operates and leading to an overwhelming amount of data generated in an open-ended way as streams. Over the past years, several machine learning algorithms have been developed to process big data streams. However, the accuracy of these algorithms is very sensitive to their hyper-parameters, which require expertise and extensive trials to tune. Another relevant aspect is the high dimensionality of the data, which can degrade computational performance. To cope with these issues, this paper proposes a stream k-nearest neighbors (kNN) algorithm that applies an internal dimension reduction to the stream in order to reduce resource usage, and that uses an automatic monitoring system to dynamically tune the configuration of the kNN algorithm and the output dimension size on big data streams. Experiments over a wide range of datasets show that the predictive and computational performance of the kNN algorithm is improved.