Publications

Publications by Luís Torgo

2018

REBAGG: REsampled BAGGing for Imbalanced Regression

Authors
Branco, P; Torgo, L; Ribeiro, RP;

Publication
Second International Workshop on Learning with Imbalanced Domains: Theory and Applications, LIDTA@ECML/PKDD 2018, Dublin, Ireland, September 10, 2018

Abstract

2019

A review on web content popularity prediction: Issues and open challenges

Authors
Moniz, N; Torgo, L;

Publication
Online Social Networks and Media

Abstract
With the profusion of web content, researchers have avidly studied and proposed new approaches to enable the anticipation of its impact on social media, presenting many distinct approaches throughout the last decade. Diverse approaches have been presented to tackle the problem of web content popularity prediction, including standard classification and regression approaches. Furthermore, these approaches have also taken into consideration distinct scenarios of data availability, where one may target the prediction of popularity before or after the publication of the items, which is highly interesting for different objectives from a user standpoint. This work aims at reviewing previous work and discussing open issues and challenges that could foster impactful research on this topic. Five areas are identified that require further research, covering the full spectrum of the problem: social media data, the learning task, recommendation and evaluation. © 2019 Elsevier B.V.

2019

A Brief Overview on the Strategies to Fight Back the Spread of False Information

Authors
Figueira, A; Guimaraes, N; Torgo, L;

Publication
Journal of Web Engineering

Abstract
The proliferation of false information on social networks is one of the hardest challenges in today's society, with implications capable of changing users' perception of what is a fact or a rumor. Due to its complexity, there has been an overwhelming number of contributions from the research community, such as the analysis of specific events where rumors are spread, the analysis of the propagation of false content on the network, and machine learning algorithms to distinguish what is a fact from what is "fake news". In this paper, we identify and summarize some of the most prevalent works in the different categories studied. Finally, we also discuss the methods applied to deceive users and the next main challenges in this area.

2019

Diversity and Composition of Pelagic Prokaryotic and Protist Communities in a Thin Arctic Sea-Ice Regime

Authors
de Sousa, AGG; Tomasino, MP; Duarte, P; Fernandez Mendez, M; Assmy, P; Ribeiro, H; Surkont, J; Leite, RB; Pereira Leal, JB; Torgo, L; Magalhaes, C;

Publication
Microbial Ecology

Abstract
One of the most prominent manifestations of climate change is the changing Arctic sea-ice regime with a reduction in the summer sea-ice extent and a shift from thicker, perennial multiyear ice towards thinner, first-year ice. These changes in the physical environment are likely to impact microbial communities, a key component of Arctic marine food webs and biogeochemical cycles. During the Norwegian young sea ICE expedition (N-ICE2015) north of Svalbard, seawater samples were collected at the surface (5m), subsurface (20 or 50m), and mesopelagic (250m) depths on 9 March, 27 April, and 16 June 2015. In addition, several physical and biogeochemical data were recorded to contextualize the collected microbial communities. Through the massively parallel sequencing of the small subunit ribosomal RNA amplicon and metagenomic data, this work allows studying the Arctic's microbial community structure during the late winter to early summer transition. Results showed that, at compositional level, Alpha- (30.7%) and Gammaproteobacteria (28.6%) are the most frequent taxa across the prokaryotic N-ICE2015 collection, and also the most phylogenetically diverse. Winter to early summer trends were quite evident since there was a high relative abundance of thaumarchaeotes in the under-ice water column in late winter while this group was nearly absent during early summer. Moreover, the emergence of Flavobacteria and the SAR92 clade in early summer might be associated with the degradation of a spring bloom of Phaeocystis. High relative abundance of hydrocarbonoclastic bacteria, particularly Alcanivorax (54.3%) and Marinobacter (6.3%), was also found. Richness showed different patterns along the depth gradient for prokaryotic (highest at mesopelagic depth) and protistan communities (higher at subsurface depths). The microbial N-ICE2015 collection analyzed in the present study provides comprehensive new knowledge about the pelagic microbiota below drifting Arctic sea-ice. 
The higher microbial diversity found in late winter/early spring communities reinforces the need to continue with further studies to properly characterize the winter microbial communities under the pack-ice.

2015

Resampling strategies for regression

Authors
Torgo, L; Branco, P; Ribeiro, RP; Pfahringer, B;

Publication
Expert Systems

Abstract
Several real world prediction problems involve forecasting rare values of a target variable. When this variable is nominal, we have a problem of class imbalance that was thoroughly studied within machine learning. For regression tasks, where the target variable is continuous, few works exist addressing this type of problem. Still, important applications involve forecasting rare extreme values of a continuous target variable. This paper describes a contribution to this type of tasks. Namely, we propose to address such tasks by resampling approaches that change the distribution of the given data set to decrease the problem of imbalance between the rare target cases and the most frequent ones. We present two modifications of well-known resampling strategies for classification tasks: the under-sampling and the synthetic minority over-sampling technique (SMOTE) methods. These modifications allow the use of these strategies on regression tasks where the goal is to forecast rare extreme values of the target variable. In an extensive set of experiments, we provide empirical evidence for the superiority of our proposals for these particular regression tasks. The proposed resampling methods can be used with any existing regression algorithm, which means that they are general tools for addressing problems of forecasting rare extreme values of a continuous target variable.
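The under-sampling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `undersample_regression`, the `is_rare` predicate, and the `ratio` parameter are all assumptions made for illustration, and the simple threshold used below stands in for whatever notion of rarity the paper actually defines.

```python
import random


def undersample_regression(X, y, is_rare, ratio=1.0, seed=0):
    """Keep every rare case and a random subset of the common ones.

    X, y    : parallel lists of predictors and continuous target values
    is_rare : hypothetical predicate marking a target value as rare/extreme
    ratio   : desired number of common cases per rare case after resampling
    """
    rng = random.Random(seed)
    rare = [i for i, v in enumerate(y) if is_rare(v)]
    common = [i for i, v in enumerate(y) if not is_rare(v)]

    # Never ask for more common cases than are available.
    n_keep = min(len(common), round(ratio * len(rare)))
    kept = sorted(rare + rng.sample(common, n_keep))

    return [X[i] for i in kept], [y[i] for i in kept]


# Toy data: 17 ordinary targets and 3 extreme ones.
X = list(range(20))
y = [0.1] * 17 + [5.0, -4.0, 6.0]

# Assumed rarity rule for this example only: |y| > 3 counts as extreme.
X_bal, y_bal = undersample_regression(X, y, is_rare=lambda v: abs(v) > 3)
```

The resampled data can then be passed to any regression learner, which matches the abstract's point that the approach is independent of the learning algorithm.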

2019

The CURE for Class Imbalance

Authors
Bellinger, C; Branco, P; Torgo, L;

Publication
Discovery Science - 22nd International Conference, DS 2019, Split, Croatia, October 28-30, 2019, Proceedings

Abstract
