2007
Authors
Brito, P;
Publication
ADVANCES IN DATA ANALYSIS
Abstract
In this paper we discuss some issues which arise when applying classical data analysis techniques to interval data, focusing on the notions of dispersion, association and linear combinations of interval variables. We present some methods that have been proposed for analysing this kind of data, namely for clustering, discriminant analysis, linear regression and interval time series analysis.
1995
Authors
BRITO, P;
Publication
ANNALS OF OPERATIONS RESEARCH
Abstract
We recall a formalism based on the notion of symbolic object (Diday [15], Brito and Diday [8]), which allows one to generalize the classical tabular model of Data Analysis. We study assertion objects, a particular class of symbolic objects which is endowed with a partial order and a quasi-order. Operations are then defined on symbolic objects. We study the property of completeness, already considered in Brito and Diday [8], which expresses the duality between extension and intension. We formalize this notion in the framework of the theory of Galois connections and study the order structure of complete assertion objects. We introduce the notion of c-connection as a pair of mappings (f, g) between two partially ordered sets which must fulfil given conditions. A complete assertion object is then defined as a fixed point of the composition f ∘ g; this mapping is called a "completeness operator" because it "completes" a given assertion object. The set of complete assertion objects forms a lattice, and we state how suprema and infima are obtained. Since the lattice structure is too complex to allow a clustering study of a data set, we have proposed a pyramidal clustering approach [8]. The symbolic pyramidal clustering method builds a pyramid bottom-up, each cluster being described by a complete assertion object whose extension is the cluster itself. We thus obtain an inheritance structure on the data set, which in turn leads to the generation of rules.
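The idea of a completeness operator as a fixed point of two composed mappings can be illustrated with a minimal sketch. It assumes a much simpler setting than the assertion objects of the paper: each object is described by a plain set of properties, `extension` plays the role of one mapping and `intension` the other. The toy data and all function names are hypothetical.

```python
# Hypothetical toy data: each object is described by a set of properties.
DESCRIPTIONS = {
    "a": {"small", "red"},
    "b": {"small", "red", "round"},
    "c": {"small", "blue"},
}
ALL_PROPERTIES = set().union(*DESCRIPTIONS.values())


def extension(properties):
    """Objects whose description contains every given property."""
    return {obj for obj, desc in DESCRIPTIONS.items() if properties <= desc}


def intension(objects):
    """Properties shared by all the given objects."""
    common = set(ALL_PROPERTIES)
    for obj in objects:
        common &= DESCRIPTIONS[obj]
    return common


def complete(properties):
    """Composition of the two mappings: the most specific description
    that has the same extension as the input description."""
    return intension(extension(properties))


completed = complete({"red"})
```

In this sketch a description is complete when `complete` leaves it unchanged; applying `complete` a second time never changes the result, and the extension is preserved.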
2012
Authors
Brito, P; Pedro Duarte Silva, APD;
Publication
JOURNAL OF APPLIED STATISTICS
Abstract
A parametric modelling for interval data is proposed, assuming a multivariate Normal or Skew-Normal distribution for the midpoints and log-ranges of the interval variables. The intrinsic nature of the interval variables leads to special structures of the variance-covariance matrix, which is represented by five different possible configurations. Maximum likelihood estimation for both models under all considered configurations is studied. The proposed modelling is then considered in the context of analysis of variance and multivariate analysis of variance testing. To assess the behaviour of the proposed methodology, a simulation study is performed. The results show that, for medium or large sample sizes, tests have good power and their true significance level approaches nominal levels when the constraints assumed for the model are respected; however, for small samples, sizes close to nominal levels cannot be guaranteed. Applications to Chinese meteorological data in three different regions and to credit card usage variables for different card designations illustrate the proposed methodology.
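The midpoint and log-range representation mentioned in the abstract can be sketched as follows. This is only an illustration of the change of variables, assuming each interval is given as a pair (lower, upper) with upper strictly greater than lower; the function names and sample values are hypothetical, and no distribution fitting is shown.

```python
import math


def to_midpoint_logrange(lower, upper):
    """Map an interval [lower, upper] to (midpoint, log-range)."""
    if upper <= lower:
        # The log-range is undefined for degenerate or reversed intervals.
        raise ValueError("log-range requires upper > lower")
    midpoint = (lower + upper) / 2.0
    log_range = math.log(upper - lower)
    return midpoint, log_range


def from_midpoint_logrange(midpoint, log_range):
    """Inverse map: recover the interval bounds."""
    half_range = math.exp(log_range) / 2.0
    return midpoint - half_range, midpoint + half_range


# Hypothetical interval observations, e.g. daily temperature ranges.
intervals = [(10.0, 14.0), (2.5, 3.5)]
features = [to_midpoint_logrange(lo, hi) for lo, hi in intervals]
```

Working with log-ranges rather than ranges keeps the transformed variable unbounded, which is what makes a Normal-type model plausible for it.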
2007
Authors
Santos, LD; Martins, I; Brito, P;
Publication
Applied Research in Quality of Life
Abstract
The evaluation of the urban quality of life has been an important aspect of research on the contemporary city and an increasingly important support for urban planning and management. As part of a project to monitor the quality of life in the city of Porto, a survey of the resident population was conducted in order to study the citizens' perceptions of their local quality of life and its evolution in recent years. The opinions of individuals on their level of satisfaction with various fields of the urban quality of life are systematised, as well as their integrated assessment. This analysis is complemented by a multivariate analysis that allows the grouping of the interviewees into large homogeneous groups and their social and economic characterisation. Based on the results achieved, we try to highlight the usefulness of the qualitative analysis of the quality of life to support the definition of urban policies. © 2007 Springer Science + Business Media BV/The International Society for Quality-of-Life Studies (ISQOLS).
2003
Authors
Brito, P; de Carvalho, FAT;
Publication
EXPLORATORY DATA ANALYSIS IN EMPIRICAL RESEARCH, PROCEEDINGS
Abstract
In previous work (Brito and De Carvalho (1999)) we have considered the presence of dependence rules between variables in the framework of a symbolic clustering method. In another paper, Brito (1998) has addressed the problem of clustering probabilistic data. The aim of this paper is to bring together the two issues, that is, to take into account dependence rules on probabilistic data. This is accomplished by introducing new generality measures with an appropriate generalization operator. This approach allows for the extension of a symbolic clustering method to constrained probabilistic data.
2024
Authors
Alves, H; Brito, P; Campos, P;
Publication
DATA MINING AND KNOWLEDGE DISCOVERY
Abstract
In this paper we introduce and develop the concept of interval-weighted networks (IWN), a novel approach in Social Network Analysis, where the edge weights are represented by closed intervals built from precise information and capturing intrinsic variability. We extend both Newman's modularity and modularity gain, as well as the Louvain algorithm, to IWN, considering a tabular representation of networks by contingency tables. We apply our methodology to two real-world IWN. The first is a commuter network in mainland Portugal, between the twenty-three NUTS 3 Regions (IWCN). The second focuses on annual merchandise trade between 28 European countries, from 2003 to 2015 (IWTN). The optimal partition of geographic locations (regions or countries) is developed and compared using two new approaches, designated as Classic Louvain and Hybrid Louvain, which allow taking into account the variability observed in the original network, thereby minimizing the loss of information present in the raw data. Our findings suggest the division of the twenty-three Portuguese regions into three main communities for the IWCN and into two or three country communities for the IWTN. However, we find different geographical partitions according to the community detection methodology used. This analysis can be useful in many real-world applications, since it takes into account that the weights may vary within the ranges, rather than being constant.
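As a rough illustration of why interval weights matter for community detection, the sketch below computes the classical Newman modularity of a fixed partition twice, once with the lower and once with the upper endpoint of each edge interval. This is not the interval-valued modularity developed in the paper; it is only the standard scalar formula applied at the endpoints, and the small network, its weights, and the function name are hypothetical.

```python
def modularity(weights, community):
    """Classical Newman modularity of an undirected weighted graph.

    weights:   dict {(i, j): w}, each undirected edge listed once, i != j.
    community: dict {node: community label}.
    """
    strength = {node: 0.0 for node in community}
    total = 0.0  # total edge weight m
    for (i, j), w in weights.items():
        strength[i] += w
        strength[j] += w
        total += w
    two_m = 2.0 * total

    # Observed weight inside communities (each edge counts in both directions).
    inside = sum(2.0 * w for (i, j), w in weights.items()
                 if community[i] == community[j])

    # Expected weight inside communities under the configuration model.
    strength_by_comm = {}
    for node, label in community.items():
        strength_by_comm[label] = strength_by_comm.get(label, 0.0) + strength[node]
    expected = sum(s * s for s in strength_by_comm.values()) / two_m

    return (inside - expected) / two_m


# Hypothetical interval-weighted edges: (lower, upper) per edge.
interval_weights = {("A", "B"): (4.0, 6.0), ("C", "D"): (3.0, 5.0), ("B", "C"): (1.0, 2.0)}
partition = {"A": 0, "B": 0, "C": 1, "D": 1}

q_lower = modularity({e: lo for e, (lo, hi) in interval_weights.items()}, partition)
q_upper = modularity({e: hi for e, (lo, hi) in interval_weights.items()}, partition)
```

The two scores differ for the same partition, which shows in miniature that collapsing each interval to a single number can change how a partition is rated.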