Paulo Jorge Azevedo

I am an assistant professor in the Department of Informatics at the University of Minho and a member of HASLab. My research focuses on Machine Learning and Data Mining. Occasionally, I take part in Bioinformatics projects, e.g. the analysis of molecular dynamics simulations of protein denaturation. I hold a PhD in Computing from Imperial College (University of London), where I did research in logic programming. I have been working on association rules and their algorithms, and on types of patterns for representing and capturing distribution learning. I am also interested in social network analysis, graph mining, subgroup analysis (subgroup mining), and motif discovery in time series.

Details

Name: Paulo Jorge Azevedo
Position: Senior Researcher
Since: 1 November 2011
Nationality: Portugal
Centre: Laboratório de Software Confiável
Contacts: +351253604440, paulo.j.azevedo@inesctec.pt


Publications

2025

Meta Subspace Analysis: Understanding Model (Mis)behavior in the Metafeature Space

Authors
Soares, C; Azevedo, PJ; Cerqueira, V; Torgo, L;

Publication
DISCOVERY SCIENCE, DS 2025

Abstract
A subgroup discovery-based method has recently been proposed to understand the behavior of models in the (original) feature space. The subgroups identified represent areas of feature space where the model obtains better or worse predictive performance when compared to the average test performance. For instance, in the marketing domain, the approach extracts subgroups such as: in groups of customers with higher income and who are younger, the random forest achieves higher accuracy than on average. Here, we propose a complementary method, Meta Subspace Analysis (MSA). MSA uses metalearning to analyze these subgroups in the metafeature space. We use association rules to relate metafeatures of the feature space represented by the subgroups to the improvement or degradation of the performance of models. For instance, in the same domain, the approach extracts rules such as: when the class entropy decreases and the mutual information increases in the subgroup data, the random forest achieves lower accuracy. While the subgroups in the original feature space are useful for the end user and the data scientist developing the corresponding model, the meta-level rules provide a domain-independent perspective on the behavior of the model that is suitable for the same data scientist but also for ML researchers, to understand the behavior of algorithms. We illustrate the approach with the results of two well-known algorithms, naive Bayes and random forest, on the Adult dataset. The results confirm some expected behavior of algorithms. However, and most interestingly, some unexpected behaviors are also obtained, requiring additional investigation. In general, the empirical study demonstrates the usefulness of the approach to obtain additional knowledge about the behavior of models.


2023

Subgroup mining for performance analysis of regression models

Authors
Pimentel, J; Azevedo, PJ; Torgo, L;

Publication
EXPERT SYSTEMS

Abstract
Machine learning algorithms have shown several advantages compared to humans, namely in terms of the scale of data that can be analysed, delivering high speed and precision. However, it is not always possible to understand how algorithms work. As a result of the complexity of some algorithms, users started to feel the need to ask for explanations, boosting the relevance of Explainable Artificial Intelligence. This field aims to explain and interpret models with the use of specific analytical methods that usually analyse how their predicted values and/or errors behave. While prediction analysis is widely studied, performance analysis has limitations for regression models. This paper proposes a rule-based approach, Error Distribution Rules (EDRs), to uncover atypical error regions, while considering multivariate feature interactions without size restrictions. Extracting EDRs is a form of subgroup mining. EDRs are model agnostic and a drill-down technique to evaluate regression models, which consider multivariate interactions between predictors. EDRs uncover regions of the input space with deviating performance providing an interpretable description of these regions. They can be regarded as a complementary tool to the standard reporting of the expected average predictive performance. Moreover, by providing interpretable descriptions of these specific regions, EDRs allow end users to understand the dangers of using regression tools for some specific cases that fall on these regions, that is, they improve the accountability of models. The performance of several models from different problems was studied, showing that our proposal allows the analysis of many situations and direct model comparison. In order to facilitate the examination of rules, two visualization tools based on boxplots and density plots were implemented. A network visualization tool is also provided to rapidly check interactions of every feature condition. 
An additional tool is provided by using a grid of boxplots, where comparison between quartiles of every distribution with a reference is performed. Based on this comparison, an extrapolation of counterfactual examples to regression was also implemented. A set of examples is described, including a setting where regression model performance is compared in detail using EDRs. Specifically, the error difference between two models in a dataset is studied by deriving rules highlighting regions of the input space where model performance difference is unexpected. The application of visual tools is illustrated using EDR examples derived from publicly available datasets. Also, case studies illustrating the specialization of subgroups, identification of counterfactual subgroups and detection of unanticipated complex models are presented. This paper extends the state of the art by providing a method to derive explanations for model performance instead of explanations for model predictions.


2023

Social network analytics and visualization: Dynamic topic-based influence analysis in evolving micro-blogs

Authors
Tabassum, S; Gama, J; Azevedo, PJ; Cordeiro, M; Martins, C; Martins, A;

Publication
EXPERT SYSTEMS

Abstract
Influence Analysis is one of the well-known areas of Social Network Analysis. However, discovering influencers from micro-blog networks based on topics has gained recent popularity due to its specificity. Besides, these data networks are massive, continuous and evolving. Therefore, to address the above challenges we propose a dynamic framework for topic modelling and identifying influencers in the same process. It incorporates dynamic sampling, community detection and network statistics over a graph data stream from a social media activity management application. Further, we compare the graph measures against each other empirically and observe that there is no evidence of correlation between the sets of users having a large number of friends and the users whose posts achieve high acceptance (i.e., highly liked, commented and shared posts). Therefore, we propose a novel approach that incorporates a user's reachability and also acceptability by other users. Consequently, we improve on graph metrics by including a dynamic acceptance score (integrating content quality with network structure) for ranking influencers in micro-blogs. Additionally, we analysed the topic clusters' structure and quality with empirical experiments and visualization.


2020

Sequence Mining for Automatic Generation of Software Tests from GUI Event Traces

Authors
Oliveira, A; Freitas, R; Jorge, A; Amorim, V; Moniz, N; Paiva, ACR; Azevedo, PJ;

Publication
IDEAL (2)

Abstract
In today’s software industry, systems are constantly changing. To maintain their quality and to prevent failures at controlled costs is a challenge. One way to foster quality is through thorough and systematic testing. Therefore, the definition of adequate tests is crucial for saving time, cost and effort. This paper presents a framework that generates software test cases automatically based on user interaction data. We propose a data-driven software test generation solution that combines the use of frequent sequence mining and Markov chain modeling. We assess the quality of the generated test cases by empirically evaluating their coverage with respect to observed user interactions and code. We also measure the plausibility of the distribution of the events in the generated test sets using the Kullback-Leibler divergence.


2019

Preference rules for label ranking: Mining patterns in multi-target relations

Authors
de Sá, CR; Azevedo, PJ; Soares, C; Jorge, AM; Knobbe, AJ;

Publication
CoRR


SKORR

RUTE

SLSNA
