Publicacoes - INESC TEC

Publicações

Publicações por Paulo Jorge Azevedo

2008

A methodology for exploring association models

Autores
Jorge, A; Pocas, J; Azevedo, PJ;

Publicação
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Visualization in data mining is typically related to data exploration. In this chapter we present a methodology for the post processing and visualization of association rule models. One aim is to provide the user with a tool that enables the exploration of a large set of association rules. The method is inspired by the hypertext metaphor. The initial set of rules is dynamically divided into small comprehensible sets or pages, according to the interest of the user. From each set, the user can move to other sets by choosing one appropriate operator. The set of available operators transform sets of rules into sets of rules, allowing focusing on interesting regions of the rule space. Each set of rules can also be then seen with different graphical representations. The tool is web-based and dynamically generates SVG pages to represent graphics. Association rules are given in PMML format. © 2008 Springer-Verlag Berlin Heidelberg.

FecharLer Abstract

2005

An experiment with association rules and classification: Post-bagging and conviction

Autores
Jorge, AM; Azevedo, PJ;

Publicação
DISCOVERY SCIENCE, PROCEEDINGS

Abstract
In this paper we study a new technique we call post-bagging, which consists in resampling parts of a classification model rather then the data. We do this with a particular kind of model: large sets of classification association rules, and in combination with ordinary best rule and weighted voting approaches. We empirically evaluate the effects of the technique in terms of classification accuracy. We also discuss the predictive power of different metrics used for association rule mining, such as confidence, lift, conviction and chi(2). We conclude that, for the described experimental conditions, post-bagging improves classification results and that the best metric is conviction.

FecharLer Abstract

2004

Model-based collaborative filtering for team building support

Autores
Veloso, M; Jorge, A; Azevedo, PJ;

Publicação
ICEIS 2004 - Proceedings of the Sixth International Conference on Enterprise Information Systems

Abstract
In this paper we describe an application of recommender systems to team building in a company or organization. The recommender system uses a collaborative filtering model based approach. Recommender models are sets of association rules extracted from the activity log of employees assigned to projects or tasks. Recommendation is performed at two levels: first by recommending a single team element given a partially built team; and second by recommending changes to a completed team. The methodology is applied to a case study with real data. The results are evaluated through experimental tests and one survey to potential users.

FecharLer Abstract

2023

Subgroup mining for performance analysis of regression models

Autores
Pimentel, J; Azevedo, PJ; Torgo, L;

Publicação
EXPERT SYSTEMS

Abstract
Machine learning algorithms have shown several advantages compared to humans, namely in terms of the scale of data that can be analysed, delivering high speed and precision. However, it is not always possible to understand how algorithms work. As a result of the complexity of some algorithms, users started to feel the need to ask for explanations, boosting the relevance of Explainable Artificial Intelligence. This field aims to explain and interpret models with the use of specific analytical methods that usually analyse how their predicted values and/or errors behave. While prediction analysis is widely studied, performance analysis has limitations for regression models. This paper proposes a rule-based approach, Error Distribution Rules (EDRs), to uncover atypical error regions, while considering multivariate feature interactions without size restrictions. Extracting EDRs is a form of subgroup mining. EDRs are model agnostic and a drill-down technique to evaluate regression models, which consider multivariate interactions between predictors. EDRs uncover regions of the input space with deviating performance providing an interpretable description of these regions. They can be regarded as a complementary tool to the standard reporting of the expected average predictive performance. Moreover, by providing interpretable descriptions of these specific regions, EDRs allow end users to understand the dangers of using regression tools for some specific cases that fall on these regions, that is, they improve the accountability of models. The performance of several models from different problems was studied, showing that our proposal allows the analysis of many situations and direct model comparison. In order to facilitate the examination of rules, two visualization tools based on boxplots and density plots were implemented. A network visualization tool is also provided to rapidly check interactions of every feature condition. An additional tool is provided by using a grid of boxplots, where comparison between quartiles of every distribution with a reference is performed. Based on this comparison, an extrapolation of counterfactual examples to regression was also implemented. A set of examples is described, including a setting where regression models performance is compared in detail using EDRs. Specifically, the error difference between two models in a dataset is studied by deriving rules highlighting regions of the input space where model performance difference is unexpected. The application of visual tools is illustrated using EDRs examples derived from public available datasets. Also, case studies illustrating the specialization of subgroups, identification of counter factual subgroups and detecting unanticipated complex models are presented. This paper extends the state of the art by providing a method to derive explanations for model performance instead of explanations for model predictions.

FecharLer Abstract

2023

Social network analytics and visualization: Dynamic topic-based influence analysis in evolving micro-blogs

Autores
Tabassum, S; Gama, J; Azevedo, PJ; Cordeiro, M; Martins, C; Martins, A;

Publicação
EXPERT SYSTEMS

Abstract
Influence Analysis is one of the well-known areas of Social Network Analysis. However, discovering influencers from micro-blog networks based on topics has gained recent popularity due to its specificity. Besides, these data networks are massive, continuous and evolving. Therefore, to address the above challenges we propose a dynamic framework for topic modelling and identifying influencers in the same process. It incorporates dynamic sampling, community detection and network statistics over graph data stream from a social media activity management application. Further, we compare the graph measures against each other empirically and observe that there is no evidence of correlation between the sets of users having large number of friends and the users whose posts achieve high acceptance (i.e., highly liked, commented and shared posts). Therefore, we propose a novel approach that incorporates a user's reachability and also acceptability by other users. Consequently, we improve on graph metrics by including a dynamic acceptance score (integrating content quality with network structure) for ranking influencers in micro-blogs. Additionally, we analysed the topic clusters' structure and quality with empirical experiments and visualization.

FecharLer Abstract

2019

Preference rules for label ranking: Mining patterns in multi-target relations

Autores
de Sá, CR; Azevedo, PJ; Soares, C; Jorge, AM; Knobbe, AJ;

Publicação
CoRR

Abstract