2006
Authors
de Carvalho, FDAT; Brito, P; Bock, HH;
Publication
COMPUTATIONAL STATISTICS
Abstract
This paper introduces a partitioning clustering method for objects described by interval data. It follows the dynamic clustering approach and uses an L2 distance. Particular emphasis is put on the standardization problem, for which we propose and investigate three standardization techniques for interval-type variables. Moreover, various tools for cluster interpretation are presented and illustrated with simulated and real-case data.
2006
Authors
Duarte Silva, APD; Brito, P;
Publication
COMPUTATIONAL STATISTICS
Abstract
This paper compares different approaches to the multivariate analysis of interval data, focusing on discriminant analysis. Three fundamental approaches are considered. The first approach assumes a uniform distribution in each observed interval, derives the corresponding measures of dispersion and association, and appropriately defines linear combinations of interval variables that maximize the usual discriminant criterion. The second approach expands the original data set into the set of all interval description vertices, and proceeds with a classical analysis of the expanded set. Finally, a third approach replaces each interval by a midpoint and range representation. The resulting representations, using intervals or single points, are discussed, and distance-based allocation rules are proposed. The three approaches are illustrated on a real data set.
2006
Authors
Brito, P; Noirhomme Fraiture, M;
Publication
INTELLIGENT DATA ANALYSIS
Abstract
2006
Authors
Silva, HB; Brito, P; da Costa, JP;
Publication
PATTERN RECOGNITION
Abstract
Applying graph theory to clustering, we propose a partitional clustering method and a clustering tendency index. The method requires no initial assumptions about the data set. The number of clusters and the partition that best fits the data set are selected according to the optimal clustering tendency index value.
2006
Authors
Ghani, R; Soares, C;
Publication
SIGKDD Explorations
Abstract
2006
Authors
Soares, C; Brazdil, PB;
Publication
Proceedings of the ACM Symposium on Applied Computing
Abstract
The Support Vector Machine (SVM) algorithm is sensitive to the choice of parameter settings, which makes it hard for non-experts to use. It has been shown that meta-learning can be used to support the selection of SVM parameter values. Previous approaches have used general statistical measures as meta-features. Here we propose a new set of meta-features that are based on the kernel matrix. We test them on the problem of setting the width of the Gaussian kernel for regression problems. We obtain significant improvements in comparison to earlier meta-learning results. We expect that, with better support in the selection of parameter values, SVM will become accessible to a wider range of users. Copyright 2006 ACM.