Publications - INESC TEC

Publications

Publications by LIAAD

2006

Dynamic clustering for interval data based on L-2 distance

Authors
de Carvalho, FDAT; Brito, P; Bock, HH

Publication
COMPUTATIONAL STATISTICS

Abstract
This paper introduces a partitioning clustering method for objects described by interval data. It follows the dynamic clustering approach and uses an L-2 distance. Particular emphasis is put on the standardization problem where we propose and investigate three standardization techniques for interval-type variables. Moreover, various tools for cluster interpretation are presented and illustrated by simulated and real-case data.
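The alternating scheme described in this abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes the L-2 distance is the sum of squared differences between lower bounds and between upper bounds, assumes the cluster prototype is the vector of mean bounds, and omits the three standardization techniques, which the abstract does not specify. The names `l2_distance`, `prototype`, and `dynamic_clustering` are hypothetical.

```python
# Each object is a list of (lower, upper) intervals, one per variable.

def l2_distance(x, y):
    """Assumed squared L2 distance: squared differences of both bounds."""
    return sum((a_lo - b_lo) ** 2 + (a_hi - b_hi) ** 2
               for (a_lo, a_hi), (b_lo, b_hi) in zip(x, y))

def prototype(members):
    """Assumed prototype: mean lower and mean upper bound per variable."""
    n = len(members)
    p = len(members[0])
    return [(sum(m[j][0] for m in members) / n,
             sum(m[j][1] for m in members) / n) for j in range(p)]

def dynamic_clustering(data, init_prototypes, max_iter=100):
    """Alternate allocation and representation steps until labels stabilize."""
    protos = [list(p) for p in init_prototypes]
    labels = None
    for _ in range(max_iter):
        # Allocation step: assign each object to its nearest prototype.
        new_labels = [min(range(len(protos)),
                          key=lambda k: l2_distance(x, protos[k]))
                      for x in data]
        if new_labels == labels:
            break
        labels = new_labels
        # Representation step: recompute the prototype of each non-empty cluster.
        for k in range(len(protos)):
            members = [x for x, lab in zip(data, labels) if lab == k]
            if members:
                protos[k] = prototype(members)
    return labels, protos

# Toy data: four objects described by two interval variables.
data = [
    [(1.0, 2.0), (10.0, 12.0)],
    [(1.5, 2.5), (11.0, 13.0)],
    [(8.0, 9.0), (1.0, 2.0)],
    [(8.5, 9.5), (0.0, 1.0)],
]
labels, protos = dynamic_clustering(data, init_prototypes=[data[0], data[2]])
```

On this toy data the first two objects end up in one cluster and the last two in the other.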

2006

Linear discriminant analysis for interval data

Authors
Duarte Silva, APD; Brito, P

Publication
COMPUTATIONAL STATISTICS

Abstract
This paper compares different approaches to the multivariate analysis of interval data, focusing on discriminant analysis. Three fundamental approaches are considered. The first approach assumes a uniform distribution in each observed interval, derives the corresponding measures of dispersion and association, and appropriately defines linear combinations of interval variables that maximize the usual discriminant criterion. The second approach expands the original data set into the set of all interval description vertices, and proceeds with a classical analysis of the expanded set. Finally, a third approach replaces each interval by a midpoint and range representation. Resulting representations, using intervals or single points, are discussed and distance-based allocation rules are proposed. The three approaches are illustrated on a real data set.
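The three data representations named in this abstract can be illustrated with a small sketch. It covers only the representations, not the discriminant analysis itself. The function names are hypothetical, and taking "range" to mean the full width (upper minus lower) is an assumption. The mean and variance used for the first approach are the standard moments of a uniform distribution on an interval.

```python
from itertools import product

def uniform_moments(lo, hi):
    """Mean and variance of a uniform distribution on [lo, hi]."""
    return (lo + hi) / 2, (hi - lo) ** 2 / 12

def vertices(obs):
    """All 2^p vertices of the hyper-rectangle described by p intervals."""
    return [list(v) for v in product(*obs)]

def midpoint_range(obs):
    """Replace each interval by (midpoint, range), range = upper - lower (assumed)."""
    return [((lo + hi) / 2, hi - lo) for lo, hi in obs]

# One observation described by two interval variables.
obs = [(1.0, 3.0), (10.0, 14.0)]
moments = [uniform_moments(lo, hi) for lo, hi in obs]
corners = vertices(obs)
mid_rng = midpoint_range(obs)
```

An observation with p interval variables expands into 2^p vertices, so the second approach grows quickly with the number of variables.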

2006

Symbolic and spatial data analysis: Mining complex data structures

Authors
Brito, P; Noirhomme Fraiture, M

Publication
INTELLIGENT DATA ANALYSIS

2006

A partitional clustering algorithm validated by a clustering tendency index based on graph theory

Authors
Silva, HB; Brito, P; da Costa, JP

Publication
PATTERN RECOGNITION

Abstract
Applying graph theory to clustering, we propose a partitional clustering method and a clustering tendency index. The method requires no initial assumptions about the data set. The number of clusters and the partition that best fits the data set are selected according to the optimal clustering tendency index value.

2006

Data mining for business applications: KDD-2006 workshop

Authors
Ghani, R; Soares, C

Publication
SIGKDD Explorations

2006

Selecting parameters of SVM using meta-learning and kernel matrix-based meta-features

Authors
Soares, C; Brazdil, PB

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract
The Support Vector Machine (SVM) algorithm is sensitive to the choice of parameter settings, which makes it hard for non-experts to use. It has been shown that meta-learning can be used to support the selection of SVM parameter values. Previous approaches have used general statistical measures as meta-features. Here we propose a new set of meta-features that are based on the kernel matrix. We test them on the problem of setting the width of the Gaussian kernel for regression problems. We obtain significant improvements in comparison to earlier meta-learning results. We expect that with better support in the selection of parameter values, SVM becomes accessible to a wider range of users. Copyright 2006 ACM.
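To make the idea of kernel-matrix-based meta-features concrete, here is a minimal sketch. The abstract does not list the actual meta-features, so the two summary statistics below (mean and variance of the off-diagonal kernel values) are purely illustrative assumptions, as are the function names and the kernel parametrization exp(-||x - y||^2 / (2 * sigma^2)).

```python
import math

def gaussian_kernel_matrix(X, sigma):
    """Gaussian kernel matrix for rows of X, with an assumed width parameter sigma."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-sq_dist / (2 * sigma ** 2))
    return K

def kernel_meta_features(K):
    """Illustrative meta-features: mean and variance of off-diagonal entries."""
    n = len(K)
    off = [K[i][j] for i in range(n) for j in range(n) if i != j]
    mean = sum(off) / len(off)
    var = sum((v - mean) ** 2 for v in off) / len(off)
    return {"offdiag_mean": mean, "offdiag_var": var}

# Toy data set with one feature; compare a very wide and a very narrow kernel.
X = [[0.0], [1.0], [2.0]]
wide = kernel_meta_features(gaussian_kernel_matrix(X, sigma=100.0))
narrow = kernel_meta_features(gaussian_kernel_matrix(X, sigma=0.01))
```

A very wide kernel pushes the off-diagonal entries toward 1 and a very narrow one pushes them toward 0, which suggests why statistics of the kernel matrix might carry information about a suitable width for a given data set.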