About

Ricardo Sousa has worked in the field of signal processing, with experience in research,
development and technical projects. His current interests lie in signal processing, machine
learning and data mining.

Details

  • Name

    Ricardo Teixeira Sousa
  • Cluster

    Computer Science
  • Role

    Researcher
  • Since

    16th September 2005
Publications

2018

Multi-label classification from high-speed data streams with adaptive model rules and random rules

Authors
Sousa, R; Gama, J;

Publication
Progress in Artificial Intelligence

Abstract

2018

Co-training study for online regression

Authors
Sousa, R; Gama, J;

Publication
Proceedings of the 33rd Annual ACM Symposium on Applied Computing, SAC 2018, Pau, France, April 09-13, 2018

Abstract
This paper describes the development of a Co-training method (a semi-supervised approach) that uses multiple learners for single-target regression on data streams. The experimental evaluation focused on comparing a realistic supervised scenario (all unlabelled examples are discarded) with scenarios where unlabelled examples are used to improve the regression model. The results provide fair evidence of error reduction when the proposed Co-training method is used. However, the error reduction is still relatively small. © 2018 Authors.

2017

Comparison Between Co-training and Self-training for Single-target Regression in Data Streams using AMRules

Authors
Sousa, R; Gama, J;

Publication
Proceedings of the Workshop on IoT Large Scale Learning from Data Streams co-located with the 2017 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2017), Skopje, Macedonia, September 18-22, 2017.

Abstract
A comparison between Co-training and Self-training methods for single-target regression based on multiple learners is performed. Data streaming systems can produce a significant amount of unlabeled data, because label assignment may be impossible, labeling may be costly, or labeling tasks may take a long time. In supervised learning, this data is wasted. To take advantage of unlabeled data, semi-supervised approaches such as Co-training and Self-training have been created to benefit from the input information contained in unlabeled data. However, these approaches have been applied to classification and batch training scenarios. For this reason, this paper presents a comparison between Co-training and Self-training methods for single-target regression in data streams. Rule learning is used in this context since this methodology makes it possible to explore the input information. The experimental evaluation consisted of a comparison between the standard real-world scenario, where all unlabeled data is discarded, and scenarios where unlabeled data is used to improve the regression model. The results show evidence of better performance in terms of error reduction and at high levels of unlabeled examples in the stream. Despite this, the improvements are not substantial.

2017

Co-training Semi-supervised Learning for Single-Target Regression in Data Streams Using AMRules

Authors
Sousa, R; Gama, J;

Publication
Foundations of Intelligent Systems - 23rd International Symposium, ISMIS 2017, Warsaw, Poland, June 26-29, 2017, Proceedings

Abstract

2016

Online Multi-label Classification with Adaptive Model Rules

Authors
Sousa, R; Gama, J;

Publication
Advances in Artificial Intelligence, CAEPIA 2016

Abstract
Interest in online classification has been increasing due to the growth of data stream systems, and the need for Multi-label Classification applications has followed the same trend. However, most classification methods do not operate online. Moreover, data streams produce huge amounts of data, and the available processing resources may not be sufficient. This work-in-progress paper proposes an algorithm for Multi-label Classification applications in data stream scenarios. The proposed method is derived from the multi-target structured regressor AMRules, which produces models using subsets of output attributes (output specialization strategy). Performance tests were conducted in which the global, local and subset operation modes of the proposed method were compared to each other and to other online multi-label classifiers described in the literature. Three datasets from real scenarios were used for evaluation. The results indicate that the subset specialization mode is competitive in comparison to the local and global approaches and to other online multi-label classifiers.