Publications

Publications by João Gama

2023

Social network analytics and visualization: Dynamic topic-based influence analysis in evolving micro-blogs

Authors
Tabassum, S; Gama, J; Azevedo, PJ; Cordeiro, M; Martins, C; Martins, A;

Publication
EXPERT SYSTEMS

Abstract
Influence Analysis is one of the well-known areas of Social Network Analysis. However, discovering influencers from micro-blog networks based on topics has gained recent popularity due to its specificity. Besides, these data networks are massive, continuous and evolving. Therefore, to address the above challenges we propose a dynamic framework for topic modelling and identifying influencers in the same process. It incorporates dynamic sampling, community detection and network statistics over a graph data stream from a social media activity management application. Further, we compare the graph measures against each other empirically and observe that there is no evidence of correlation between the sets of users having a large number of friends and the users whose posts achieve high acceptance (i.e., highly liked, commented and shared posts). Therefore, we propose a novel approach that incorporates a user's reachability and also acceptability by other users. Consequently, we improve on graph metrics by including a dynamic acceptance score (integrating content quality with network structure) for ranking influencers in micro-blogs. Additionally, we analysed the topic clusters' structure and quality with empirical experiments and visualization.


2021

How can I choose an explainer?: An Application-grounded Evaluation of Post-hoc Explanations

Authors
Jesus, SM; Belém, CG; Balayan, V; Bento, J; Saleiro, P; Bizarro, P; Gama, J;

Publication
FAccT

Abstract
There have been several research works proposing new Explainable AI (XAI) methods designed to generate model explanations having specific properties, or desiderata, such as fidelity, robustness, or human-interpretability. However, explanations are seldom evaluated based on their true practical impact on decision-making tasks. Without that assessment, explanations might be chosen that, in fact, hurt the overall performance of the combined system of ML model + end-users. This study aims to bridge this gap by proposing XAI Test, an application-grounded evaluation methodology tailored to isolate the impact of providing the end-user with different levels of information. We conducted an experiment following XAI Test to evaluate three popular XAI methods - LIME, SHAP, and TreeInterpreter - on a real-world fraud detection task, with real data, a deployed ML model, and fraud analysts. During the experiment, we gradually increased the information provided to the fraud analysts in three stages: Data Only, i.e., just transaction data without access to the model score or explanations, Data + ML Model Score, and Data + ML Model Score + Explanations. Using strong statistical analysis, we show that, in general, these popular explainers have a worse impact than desired. Highlights of the conclusions include: i) showing Data Only results in the highest decision accuracy and the slowest decision time among all variants tested; ii) all the explainers improve accuracy over the Data + ML Model Score variant but still result in lower accuracy when compared with Data Only; iii) LIME was the least preferred by users, probably due to its substantially lower variability of explanations from case to case.


2022

An Algorithm Adaptation Method for Multi-Label Stream Classification using Self-Organizing Maps

Authors
Cerri, R; Costa Júnior, JD; Faria Paiva, ERd; da Gama, JMP;

Publication
2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA

Abstract
Multi-label stream classification is the task of classifying instances in two or more classes simultaneously, with instances flowing continuously at high speed. This task imposes difficult challenges, such as the detection of concept drifts, where the distributions of the instances in the stream change with time, and infinitely delayed labels, where the ground truth labels of the instances are never available to help update the classifiers. To solve such a task, the methods from the literature use the problem transformation approach, which divides the multi-label problem into different sub-problems, associating one classification model with each class. In this paper, we propose a method based on self-organizing maps that, unlike those in the literature, uses only one model to deal with all classes simultaneously. By using the algorithm adaptation approach, our proposal better considers label dependencies, improving the results over its counterparts. Experiments using different synthetic and real-world datasets showed that our proposal obtained the overall best performance when compared to different methods from the literature.


2019

Contextual One-Class Classification in Data Streams

Authors
Moulton, RH; Viktor, HL; Japkowicz, N; Gama, J;

Publication
CoRR

Abstract

2018

Dynamic Laplace: Efficient Centrality Measure for Weighted or Unweighted Evolving Networks

Authors
Cordeiro, M; Sarmento, RP; Brazdil, P; Gama, J;

Publication
CoRR

Abstract

2016

SimTensor: A synthetic tensor data generator

Authors
T, HF; Gama, J;

Publication
CoRR

Abstract