Publications

Publications by LIAAD

2017

Ensemble learning for data stream analysis: A survey

Authors
Krawczyk, B; Minku, LL; Gama, J; Stefanowski, J; Wozniak, M;

Publication
INFORMATION FUSION

Abstract
In many applications of information systems, learning algorithms have to act in dynamic environments where data are collected in the form of transient data streams. Compared to static data mining, processing streams imposes new computational requirements for algorithms to incrementally process incoming examples while using limited memory and time. Furthermore, due to the non-stationary characteristics of streaming data, prediction models are often also required to adapt to concept drifts. Among the newly proposed stream algorithms, ensembles play an important role, in particular for non-stationary environments. This paper surveys research on ensembles for data stream classification as well as regression tasks. Besides presenting a comprehensive spectrum of ensemble approaches for data streams, we also discuss advanced learning concepts such as imbalanced data streams, novelty detection, active and semi-supervised learning, complex data representations and structured outputs. The paper concludes with a discussion of open research problems and lines of future research. Published by Elsevier B.V.

2017

Proceedings of the First Workshop on Data Science for Social Good co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, SoGood@ECML-PKDD 2016, Riva del Garda, Italy, September 19, 2016

Authors
Gavaldà, Ricard; Zliobaite, Indre; Gama, Joao;

Publication
SoGood@ECML-PKDD

Abstract

2017

Credit Scoring in Microfinance Using Non-traditional Data

Authors
Ruiz, S; Gomes, P; Rodrigues, L; Gama, J;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE (EPIA 2017)

Abstract
Emerging markets contain the vast majority of the world's population. Despite the huge number of inhabitants, these markets still lack a proper finance infrastructure. One of the main difficulties felt by customers is access to loans. This limitation arises from the fact that most customers lack a verifiable credit history. As such, traditional banks are unable to provide loans. This paper proposes credit scoring modeling based on non-traditional data, acquired from smartphones, for loan classification processes. We use Logistic Regression (LR) and Support Vector Machine (SVM) models, which are the top performers in traditional banking. We then compare two transformations of the training datasets: creating boolean indicators and recoding using Weight of Evidence (WoE). Our models surpassed the performance of the manual loan application selection process: loans granted through the models' criteria presented fewer overdues, and the models' approval criteria substantially increased the amount of granted loans. Compared to the baseline, the loans approved by meeting the criteria of the SVM model presented a -196.80% overdue rate. At the same time, the approval criteria of the SVM model generated 251.53% more loans. This paper shows that credit scoring can be useful in emerging markets. Non-traditional data can be used to build algorithms that identify good borrowers, as in traditional banking.

2017

Progress in Artificial Intelligence - 18th EPIA Conference on Artificial Intelligence, EPIA 2017, Porto, Portugal, September 5-8, 2017, Proceedings

Authors
Oliveira, Eugenio; Gama, Joao; Vale, Zita A.; Cardoso, Henrique Lopes;

Publication
EPIA

Abstract

2017

Computational Models for Social and Technical Interactions

Authors
Gama, J; Oliveira, E; Cardoso, HL;

Publication
NEW GENERATION COMPUTING

Abstract

2017

Clustering from Data Streams

Authors
Gama, J;

Publication
Encyclopedia of Machine Learning and Data Mining

Abstract
Clustering is one of the most popular data mining techniques. In this article, we review the relevant methods and algorithms for designing cluster algorithms under the data streams computational model, and discuss research directions in tracking evolving clusters. © Springer Science+Business Media New York 2011, 2017
