Publications

Publications by João Gama

2018

Dynamic graph summarization: a tensor decomposition approach

Authors
Fernandes, S; Fanaee T, H; Gama, J;

Publication
DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Due to the scale and complexity of today's social networks, it becomes infeasible to mine them with traditional approaches. A possible solution to reduce such scale and complexity is to produce a compact (lossy) version of the network that represents its major properties. This task is known as graph summarization, which is the subject of this research. Our focus is on time-evolving graphs, a more complex scenario where the dynamics of the network should also be taken into account. We address this problem using tensor decomposition, which enables us to capture the multi-way structure of the time-evolving network. This property is unique and is impossible to obtain with other approaches such as matrix factorization. Experimental evaluation on five real-world networks shows promising results, demonstrating that tensor decomposition is quite useful for summarizing dynamic networks.

2018

Social network analysis: An overview

Authors
Tabassum, S; Pereira, FSF; Fernandes, S; Gama, J;

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Social network analysis (SNA) is a core pursuit of analyzing social networks today. In addition to the usual statistical techniques of data analysis, these networks are investigated using SNA measures. SNA helps in understanding the dependencies between social entities in the data, characterizing their behaviors and their effect on the network as a whole and over time. Therefore, this article attempts to provide a succinct overview of SNA in diverse topological networks (static, temporal, and evolving networks) and perspectives (ego-networks). As one of the primary applications of SNA is networked data mining, we provide a brief overview of network mining models as well; by this, we present the readers with a concise guided tour from analysis to mining of networks. This article is categorized under: Application Areas > Science and Technology; Technologies > Machine Learning; Fundamental Concepts of Data and Knowledge > Human Centricity and User Interaction; Commercial, Legal, and Ethical Issues > Social Considerations.

2018

Online bagging for recommender systems

Authors
Vinagre, J; Jorge, AM; Gama, J;

Publication
EXPERT SYSTEMS

Abstract
Ensemble methods have been successfully used in the past to improve recommender systems; however, they have never been studied with incremental recommendation algorithms. Many online recommender systems deal with continuous, potentially fast, and unbounded flows of data (big data streams) and often need to be responsive to fresh user feedback, adjusting recommendations accordingly. This is clear in tasks such as social network feeds, news recommender systems, automatic playlist completion, and other similar applications. Batch ensemble approaches are not suitable to perform continuous learning, given the complexity of retraining new models on demand. In this paper, we adapt a general purpose online bagging algorithm for top-N recommendation tasks and propose two novel online bagging methods specifically tailored for recommender systems. We evaluate the three approaches, using an incremental matrix factorization algorithm for top-N recommendation with positive-only user feedback data as the base model. Our results show that online bagging is able to improve accuracy up to 55% over the baseline, with manageable computational overhead.
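
As a rough illustration of the general idea of online bagging mentioned in the abstract above, the sketch below lets every base model process each incoming event k times, with k drawn from a Poisson(1) distribution. This is the widely used Oza-Russell style of online bagging; whether it matches the paper's exact algorithm is an assumption. The `PopularityModel` base learner, the class names, and the method names are hypothetical stand-ins (the paper uses incremental matrix factorization as the base model).

```python
import math
import random
from collections import Counter


def poisson(lam, rng):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class PopularityModel:
    """Hypothetical base learner: scores an item by how often it was seen."""

    def __init__(self):
        self.counts = Counter()

    def update(self, user, item):
        self.counts[item] += 1

    def score(self, user, item):
        return self.counts[item]


class OnlineBagging:
    """Assumed Oza-Russell style online bagging over incremental models."""

    def __init__(self, n_models, seed=0):
        self.rng = random.Random(seed)
        self.models = [PopularityModel() for _ in range(n_models)]

    def update(self, user, item):
        # Each base model sees the event k ~ Poisson(1) times, which
        # approximates bootstrap resampling without storing the stream.
        for model in self.models:
            for _ in range(poisson(1.0, self.rng)):
                model.update(user, item)

    def recommend(self, user, candidates, n):
        # Average the base-model scores and return the top-n candidates.
        def average_score(item):
            total = sum(m.score(user, item) for m in self.models)
            return total / len(self.models)

        return sorted(candidates, key=average_score, reverse=True)[:n]
```

A caller would feed each (user, item) interaction to `update` as it arrives and call `recommend` whenever a top-N list is needed; no retraining pass over past data is required.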

2019

Processing Evolving Social Networks for Change Detection Based on Centrality Measures

Authors
Pereira, FSF; Tabassum, S; Gama, J; de Amo, S; Oliveira, GMB;

Publication
Studies in Big Data

Abstract
Social networks have an evolving characteristic due to the continuous interaction between users, with nodes associating and disassociating with each other over time. The analysis of such networks is especially challenging, because it needs to be performed with an online approach, under the one-pass constraint of data streams. Such evolving behavior leads to changes in the network topology that can be investigated under different perspectives. In this work we focus on the analysis of the evolution of node positions, a node-centric perspective. Our goal is to spot change-points in an evolving network at which a node deviates from its normal behavior. Therefore, we propose a change detection model for processing evolving network streams which employs three different aggregating mechanisms for tracking the evolution of centrality metrics of a node. Our model is space and time efficient: some mechanisms are memoryless, and the others require at most the network of the current time step T. Additionally, we compare the influence of the dynamics of real-world preferences on the fluctuations of different centralities. Subsequently, we apply our model to the user preference change detection task, reaching competitive levels of accuracy on a Twitter network. © 2019, Springer International Publishing AG, part of Springer Nature.

2018

Online Gradient Boosting for Incremental Recommender Systems

Authors
Vinagre, J; Jorge, AM; Gama, J;

Publication
DS

Abstract
Ensemble models have been proven successful for batch recommendation algorithms; however, they have not been well studied in streaming applications. Such applications typically use incremental learning, to which standard ensemble techniques are not trivially applicable. In this paper, we study the application of three variants of online gradient boosting to top-N recommendation tasks with implicit data, in a streaming data environment. Weak models are built using a simple incremental matrix factorization algorithm for implicit feedback. Our results show a significant improvement of up to 40% over the baseline standalone model. We also show that the overhead of running multiple weak models is easily manageable in stream-based applications.

2018

Self Hyper-Parameter Tuning for Data Streams

Authors
Veloso, B; Gama, J; Malheiro, B;

Publication
DS

Abstract
The widespread use of smart devices and sensors, together with ubiquitous Internet access, is behind the exponential growth of data streams. Nowadays, there are hundreds of machine learning algorithms able to process high-speed data streams. However, these algorithms rely on human expertise to perform complex processing tasks like hyper-parameter tuning. This paper addresses the problem of data variability modelling in data streams. Specifically, we propose and evaluate a new parameter tuning algorithm called Self Parameter Tuning (SPT). SPT consists of an online adaptation of the Nelder-Mead optimisation algorithm for hyper-parameter tuning. The method uses a dynamic-size sample method to evaluate the current solution, and uses the Nelder-Mead operators to update the current set of parameters. The main contribution is the adaptation of the Nelder-Mead algorithm to automatically tune regression hyper-parameters for data streams. Additionally, whenever concept drift occurs in the data stream, it re-initiates the search for new hyper-parameters. The proposed method has been evaluated in a regression scenario. Experiments with well-known time-evolving data streams show that the proposed SPT hyper-parameter optimisation outperforms the results of previous expert hyper-parameter tuning efforts.
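
For readers unfamiliar with the Nelder-Mead operators referred to in the abstract above, the following is a minimal sketch of one iteration of a common textbook variant of the classical (batch) algorithm: reflection, expansion, contraction, and shrink. It is not the SPT method itself; the dynamic sampling and drift-triggered restart described in the abstract are not modelled. The function name, the coefficient defaults, and the toy objective in the usage note are assumptions chosen for illustration.

```python
def nelder_mead_step(simplex, f, alpha=1.0, gamma=2.0, rho=0.5, sigma=0.5):
    """Run one classical Nelder-Mead iteration and return the new simplex.

    simplex is a list of points (lists of floats); f maps a point to a cost.
    """
    simplex = sorted(simplex, key=f)
    best, second_worst, worst = simplex[0], simplex[-2], simplex[-1]
    dim = len(best)
    # Centroid of every point except the worst one.
    centroid = [
        sum(p[i] for p in simplex[:-1]) / (len(simplex) - 1) for i in range(dim)
    ]

    def along(t):
        # Point on the line through worst and centroid: centroid + t * (centroid - worst).
        return [c + t * (c - w) for c, w in zip(centroid, worst)]

    # Reflection: mirror the worst point through the centroid.
    reflected = along(alpha)
    if f(best) <= f(reflected) < f(second_worst):
        return simplex[:-1] + [reflected]

    # Expansion: the reflected point is the new best, so try going further.
    if f(reflected) < f(best):
        expanded = along(alpha * gamma)
        better = expanded if f(expanded) < f(reflected) else reflected
        return simplex[:-1] + [better]

    # Contraction: pull the worst point toward the centroid.
    contracted = along(-rho)
    if f(contracted) < f(worst):
        return simplex[:-1] + [contracted]

    # Shrink: move every point except the best toward the best.
    return [best] + [
        [b + sigma * (x - b) for b, x in zip(best, p)] for p in simplex[1:]
    ]
```

Calling `nelder_mead_step` repeatedly on a small simplex, for example `[[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]` with a quadratic cost, moves the simplex toward the minimiser; in a hyper-parameter tuning setting, each point would be a candidate parameter vector and `f` an error estimate.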
