Publications by LIAAD

2006

Dynamic clustering for interval data based on L-2 distance

Authors
de Carvalho, FDAT; Brito, P; Bock, HH;

Publication
COMPUTATIONAL STATISTICS

Abstract
This paper introduces a partitioning clustering method for objects described by interval data. It follows the dynamic clustering approach and uses an L-2 distance. Particular emphasis is put on the standardization problem, for which we propose and investigate three standardization techniques for interval-type variables. Moreover, various tools for cluster interpretation are presented and illustrated with simulated and real-case data.

2006

Linear discriminant analysis for interval data

Authors
Duarte Silva, APD; Brito, P;

Publication
COMPUTATIONAL STATISTICS

Abstract
This paper compares different approaches to the multivariate analysis of interval data, focusing on discriminant analysis. Three fundamental approaches are considered. The first approach assumes a uniform distribution in each observed interval, derives the corresponding measures of dispersion and association, and appropriately defines linear combinations of interval variables that maximize the usual discriminant criterion. The second approach expands the original data set into the set of all interval description vertices, and proceeds with a classical analysis of the expanded set. Finally, a third approach replaces each interval by a midpoint and range representation. The resulting representations, using intervals or single points, are discussed, and distance-based allocation rules are proposed. The three approaches are illustrated on a real data set.
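As a rough illustration of the second and third representations mentioned in the abstract, the sketch below expands one interval-valued observation into its vertices and into midpoints and ranges. It is not taken from the paper: the function names `to_vertices` and `to_midpoint_range`, the tuple-based data layout, and the sample values are assumptions made only for this example.

```python
from itertools import product


def to_vertices(intervals):
    """Expand one observation, given as a list of (low, high) pairs,
    into all 2**p combinations of its interval bounds (hypothetical helper)."""
    return [tuple(vertex) for vertex in product(*intervals)]


def to_midpoint_range(intervals):
    """Replace each (low, high) interval by its midpoint and its range
    (hypothetical helper)."""
    midpoints = [(low + high) / 2 for low, high in intervals]
    ranges = [high - low for low, high in intervals]
    return midpoints, ranges


# Made-up observation described by two interval variables.
observation = [(1.0, 3.0), (10.0, 14.0)]

vertices = to_vertices(observation)
midpoints, ranges = to_midpoint_range(observation)

print(vertices)
print(midpoints, ranges)
```

With p interval variables each observation yields 2**p vertices, so the expanded data set grows quickly as variables are added, whereas the midpoint and range representation only doubles the number of columns.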

2006

Symbolic and spatial data analysis: Mining complex data structures

Authors
Brito, P; Noirhomme Fraiture, M;

Publication
INTELLIGENT DATA ANALYSIS

Abstract

2006

A partitional clustering algorithm validated by a clustering tendency index based on graph theory

Authors
Silva, HB; Brito, P; da Costa, JP;

Publication
PATTERN RECOGNITION

Abstract
Applying graph theory to clustering, we propose a partitional clustering method and a clustering tendency index. The method requires no initial assumptions about the data set. The number of clusters and the partition that best fits the data set are selected according to the optimal clustering tendency index value.

2006

Data mining for business applications: KDD-2006 workshop

Authors
Ghani, R; Soares, C;

Publication
SIGKDD Explorations

Abstract

2006

Selecting parameters of SVM using meta-learning and kernel matrix-based meta-features

Authors
Soares, C; Brazdil, PB;

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract
The Support Vector Machine (SVM) algorithm is sensitive to the choice of parameter settings, which makes it hard for non-experts to use. It has been shown that meta-learning can be used to support the selection of SVM parameter values. Previous approaches have used general statistical measures as meta-features. Here we propose a new set of meta-features that are based on the kernel matrix. We test them on the problem of setting the width of the Gaussian kernel for regression problems. We obtain significant improvements in comparison to earlier meta-learning results. We expect that, with better support in the selection of parameter values, SVM will become accessible to a wider range of users. Copyright 2006 ACM.
