About

I am an Associate Professor at the School of Economics of the University of Porto, where I teach Statistics and Multivariate Data Analysis at undergraduate and post-graduate (Master, PhD) levels, and a member of the Artificial Intelligence and Decision Support Lab (LIAAD) of INESC-TEC. I hold a doctorate degree in Applied Mathematics from the University of Paris Dauphine (1991).

My current research focuses on the analysis of multidimensional complex data known as symbolic data, that is, data representing inherent variability in the form of intervals or distributions, for which I develop statistical approaches and multivariate analysis methodologies. More generally, I am interested in multivariate data analysis, with a particular focus on clustering methods.

Details

  • Name

    Paula Brito
  • Cluster

    Computer Science
  • Role

    Senior Researcher
  • Since

    1st January 2008
Publications

2021

Discriminant analysis of distributional data via fractional programming

Authors
Dias, S; Brito, P; Amaral, P;

Publication
European Journal of Operational Research

Abstract
We address classification of distributional data, where units are described by histogram or interval-valued variables. The proposed approach uses a linear discriminant function where distributions or intervals are represented by quantile functions, under specific assumptions. This discriminant function allows defining a score for each unit, in the form of a quantile function, which is used to classify the units into two a priori groups, using the Mallows distance. There is a diversity of application areas for the proposed linear discriminant method. In this work we classify the airline companies operating in NY airports based on air time and arrival/departure delays, using a full year of flights. © 2021 Elsevier B.V.
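
As a rough illustration of the Mallows distance mentioned in the abstract above, the sketch below compares two intervals through their quantile functions. It assumes values are uniformly distributed within each interval (an assumption made here for illustration; the abstract only says "under specific assumptions"), and the function names and example intervals are hypothetical. It shows the distance only, not the authors' discriminant method.

```python
import math


def interval_quantile(lower, upper, t):
    """Quantile function of an interval, assuming a uniform distribution on it."""
    return lower + (upper - lower) * t


def mallows_distance_closed_form(a, b):
    """Mallows (L2 Wasserstein) distance between two uniform intervals.

    With centre c and half-range r, the quantile function is c + r(2t - 1),
    so the squared distance reduces to (c_a - c_b)^2 + (r_a - r_b)^2 / 3.
    """
    center_a, half_range_a = (a[0] + a[1]) / 2, (a[1] - a[0]) / 2
    center_b, half_range_b = (b[0] + b[1]) / 2, (b[1] - b[0]) / 2
    squared = (center_a - center_b) ** 2 + (half_range_a - half_range_b) ** 2 / 3
    return math.sqrt(squared)


def mallows_distance_numeric(a, b, steps=1000):
    """Same distance, approximated by midpoint integration over t in [0, 1]."""
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        diff = interval_quantile(a[0], a[1], t) - interval_quantile(b[0], b[1], t)
        total += diff * diff
    return math.sqrt(total / steps)


# Hypothetical example intervals (lower bound, upper bound).
interval_a = (0.0, 2.0)
interval_b = (1.0, 5.0)

print(mallows_distance_closed_form(interval_a, interval_b))
print(mallows_distance_numeric(interval_a, interval_b))
```

The two functions should agree closely, which is a quick check that the closed form matches the integral definition over quantile functions.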

2021

A test to compare interval time series

Authors
Maharaj, EA; Brito, P; Teles, P;

Publication
International Journal of Approximate Reasoning

Abstract

2020

Clustering genomic words in human DNA using peaks and trends of distributions

Authors
Tavares, AH; Raymaekers, J; Rousseeuw, PJ; Brito, P; Afreixo, V;

Publication
Advances in Data Analysis and Classification

Abstract
In this work we seek clusters of genomic words in human DNA by studying their inter-word lag distributions. Due to the particularly spiked nature of these histograms, a clustering procedure is proposed that first decomposes each distribution into a baseline and a peak distribution. An outlier-robust fitting method is used to estimate the baseline distribution (the ‘trend’), and a sparse vector of detrended data captures the peak structure. A simulation study demonstrates the effectiveness of the clustering procedure in grouping distributions with similar peak behavior and/or baseline features. The procedure is applied to investigate similarities between the distribution patterns of genomic words of lengths 3 and 5 in the human genome. These experiments demonstrate the potential of the new method for identifying words with similar distance patterns. © 2019, The Author(s).

2020

New contributions for the comparison of community detection algorithms in attributed networks

Authors
Vieira, AR; Campos, P; Brito, P;

Publication
Journal of Complex Networks

Abstract
Community detection techniques use only the information about the network topology to find communities in networks. Similarly, classic clustering techniques for vector data consider only the information about the values of the attributes describing the objects to find clusters. In real-world networks, however, in addition to the information about the network topology, usually there is information about the attributes describing the vertices that can also be used to find communities. Using both the information about the network topology and about the attributes describing the vertices can improve the algorithms' results. Therefore, authors started investigating methods for community detection in attributed networks. In the past years, several methods were proposed to address this task, partitioning a graph into sub-graphs of vertices that are densely connected and similar in terms of their descriptions. This article focuses on the analysis and comparison of some of the proposed methods for community detection in attributed networks. For that purpose, several applications to both synthetic and real networks are conducted. Experiments are performed on both weighted and unweighted graphs. The objective is to establish which methods perform generally better according to the validation measures and to investigate their sensitivity to changes in the networks' structure and homogeneity.

2019

Clustering of interval time series

Authors
Maharaj, EA; Teles, P; Brito, P;

Publication
Statistics and Computing

Abstract
Interval time series occur when real intervals of some variable of interest are registered as an ordered sequence along time. We address the problem of clustering interval time series (ITS), for which different approaches are proposed. First, clustering is performed based on point-to-point comparisons. Time-domain and wavelet features also serve as clustering variables in alternative approaches. Furthermore, autocorrelation matrix functions, gathering the autocorrelation and cross-correlation functions of the ITS upper and lower bounds, may be compared using adequate distances (e.g. the Frobenius distance) and used for clustering ITS. An improved procedure to determine the autocorrelation function of ITS is proposed, which also serves as a basis for clustering. The different alternative approaches are explored and their performances compared for ITS simulated under different setups. An application to sea level daily ranges, observed at different locations in Australia, illustrates the proposed methods. © 2019, Springer Science+Business Media, LLC, part of Springer Nature.
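
One of the approaches named in the abstract above, comparing autocorrelation matrix functions with the Frobenius distance, can be sketched as follows. This is a minimal illustration under assumptions made here: each lag gets a 2x2 matrix of sample auto- and cross-correlations between the lower and upper bounds, and the distance sums squared entry differences over lags 1 to `max_lag`. The function names and toy series are hypothetical, and the paper's improved autocorrelation procedure is not reproduced.

```python
import math


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t] and y[t + lag]."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    cov = sum((x[t] - mean_x) * (y[t + lag] - mean_y) for t in range(n - lag)) / n
    return cov / math.sqrt(var_x * var_y)


def autocorrelation_matrices(lower, upper, max_lag):
    """One 2x2 matrix per lag: correlations among the lower and upper bounds."""
    bounds = (lower, upper)
    return [
        [[cross_correlation(a, b, lag) for b in bounds] for a in bounds]
        for lag in range(1, max_lag + 1)
    ]


def frobenius_distance(mats_a, mats_b):
    """Square root of the summed squared entry differences over all lags."""
    total = 0.0
    for mat_a, mat_b in zip(mats_a, mats_b):
        for row_a, row_b in zip(mat_a, mat_b):
            for value_a, value_b in zip(row_a, row_b):
                total += (value_a - value_b) ** 2
    return math.sqrt(total)


# Hypothetical toy interval time series (lower and upper bounds per time point).
its_a_lower = [1.0, 2.0, 1.5, 2.5, 2.0, 3.0, 2.5, 3.5]
its_a_upper = [2.0, 3.5, 2.5, 4.0, 3.0, 4.5, 3.5, 5.0]
its_b_lower = [3.0, 2.8, 2.5, 2.0, 1.8, 1.5, 1.0, 0.8]
its_b_upper = [4.0, 3.9, 3.2, 3.1, 2.5, 2.4, 1.8, 1.9]

mats_a = autocorrelation_matrices(its_a_lower, its_a_upper, max_lag=3)
mats_b = autocorrelation_matrices(its_b_lower, its_b_upper, max_lag=3)

print(frobenius_distance(mats_a, mats_b))
```

A matrix of such pairwise distances between series could then be passed to any standard distance-based clustering routine.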

Supervised theses

2020

Clusterwise Linear Regression for Interval Data - An Extension of Interval Distributional Model

Author
Nikhil Koppara Suresh

Institution
UP-FEP

2020

Interval-Weighted Networks: Community Detection and Centrality Measures

Author
Hélder Fernando Cerqueira Alves

Institution
UP-FCUP

2019

Analysis of inter genomic word distance distributions

Author
Ana Helena Marques de Pinho Tavares

Institution
UP-FCUP

2019

Community Detection in Attributed Networks: An Application to Socioeconomic Data from European Union

Author
Ana Rita Cordeiro Vieira

Institution
UP-FEP

2019

Blockmodeling: New Developments and Application to Social Networks

Author
Hélder Fernando Cerqueira Alves

Institution
UP-FCUP