Publications - INESC TEC

Publications

Publications by LIAAD

2010

Metalearning

Authors
Fürnkranz, J; Chan, PK; Craw, S; Sammut, C; Uther, W; Ratnaparkhi, A; Jin, X; Han, J; Yang, Y; Morik, K; Dorigo, M; Birattari, M; Stützle, T; Brazdil, P; Vilalta, R; Giraud-Carrier, C; Soares, C; Rissanen, J; Baxter, RA; Bruha, I; Baxter, RA; Webb, GI; Torgo, L; Banerjee, A; Shan, H; Ray, S; Tadepalli, P; Shoham, Y; Powers, R; Shoham, Y; Powers, R; Webb, GI; Ray, S; Scott, S; Blockeel, H; De Raedt, L;

Publication
Encyclopedia of Machine Learning

Abstract

2010

Combining meta-learning and search techniques to SVM parameter selection

Authors
Gomes, TAF; Prudencio, RBC; Soares, C; Rossi, ALD; Carvalho, A;

Publication
Proceedings - 2010 11th Brazilian Symposium on Neural Networks, SBRN 2010

Abstract
Support Vector Machines (SVMs) have achieved very good performance on a range of learning problems. However, the success of SVMs depends on an adequate choice of several parameters, including, for instance, the kernel and the regularization parameters. In the current work, we propose combining Meta-Learning and search techniques for the problem of SVM parameter selection. Given an input problem, Meta-Learning is used to recommend SVM parameters based on parameters that were successful in previous similar problems. The parameters returned by Meta-Learning are then used as initial search points for a search technique, which further explores the parameter space. In this combination, we expected the initial solutions provided by Meta-Learning to be located in good regions of the search space (i.e., closer to the optimum solutions). Hence, the search technique would need to evaluate fewer candidate search points to find an adequate solution. We implemented a prototype in which Particle Swarm Optimization (PSO) was used to select the values of two SVM parameters for regression problems. In the experiments, the proposed solution was compared to PSO with random initialization and obtained better average results on a set of 40 regression problems. © 2010 IEEE.
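
The warm-start idea in this abstract can be illustrated with a minimal sketch. It is not the authors' prototype: the objective `toy_validation_error` is a hypothetical stand-in for an SVM's validation error over two parameters, the points in `meta_suggestions` are invented values playing the role of Meta-Learning recommendations, and the PSO coefficients are common textbook defaults rather than values taken from the paper.

```python
import random


def pso(objective, bounds, init_points, n_particles=10, n_iters=30, seed=0,
        inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimise `objective` with a basic PSO; `init_points` seed the swarm."""
    rng = random.Random(seed)
    # Place the suggested points first, then fill the swarm uniformly at random.
    positions = [list(p) for p in init_points[:n_particles]]
    while len(positions) < n_particles:
        positions.append([rng.uniform(lo, hi) for lo, hi in bounds])
    velocities = [[0.0] * len(bounds) for _ in positions]
    personal_best = [list(p) for p in positions]
    personal_val = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=lambda i: personal_val[i])
    global_best = list(personal_best[best_idx])
    global_val = personal_val[best_idx]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                    + c_global * r2 * (global_best[d] - positions[i][d])
                )
                # Keep every particle inside the allowed parameter range.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            val = objective(positions[i])
            if val < personal_val[i]:
                personal_best[i], personal_val[i] = list(positions[i]), val
                if val < global_val:
                    global_best, global_val = list(positions[i]), val
    return global_best, global_val


def toy_validation_error(params):
    """Hypothetical error surface with its minimum at (1.0, -2.0)."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2


bounds = [(-5.0, 5.0), (-5.0, 5.0)]            # e.g. log-scaled parameter ranges
meta_suggestions = [[0.8, -1.7], [1.3, -2.4]]  # invented "recommended" start points

warm_best, warm_val = pso(toy_validation_error, bounds, meta_suggestions)
cold_best, cold_val = pso(toy_validation_error, bounds, [])  # random initialization
```

Because the global best is initialised from the seeded particles, the warm-started run can never end with a worse value than its best suggestion; whether it beats random initialization on a real problem depends on how good the recommendations are.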

2010

Using meta-learning to classify Traveling Salesman Problems

Authors
Kanda, J; Carvalho, A; Hruschka, E; Soares, C;

Publication
Proceedings - 2010 11th Brazilian Symposium on Neural Networks, SBRN 2010

Abstract
In this paper, a meta-learning approach is proposed to suggest the best optimization technique(s) for instances of the Traveling Salesman Problem. The problem is represented by a dataset where each example is associated with one of the instances. Thus, each example contains characteristics of an instance and is labeled with the name of the technique(s) that obtained the best solution for this instance. Since the best solution can be obtained by more than one technique, an example may have more than one label. Therefore, the meta-learning problem is addressed as a multi-label classification problem. Experiments with 535 instances of the problem were performed to evaluate the proposed approach, which produced promising results. © 2010 IEEE.
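
The labeling scheme described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the technique names and tour lengths are invented, and the `tolerance` parameter is a hypothetical addition for treating near-ties as ties.

```python
def best_technique_labels(tour_lengths, tolerance=0.0):
    """Return every technique whose tour length matches the best one found."""
    best = min(tour_lengths.values())
    return sorted(name for name, length in tour_lengths.items()
                  if length - best <= tolerance)


def to_indicator_rows(label_sets, techniques):
    """Encode label sets as 0/1 rows, one column per technique."""
    return [[1 if t in labels else 0 for t in techniques] for labels in label_sets]


# Hypothetical results: tour length obtained by each technique on each instance.
techniques = ["genetic_algorithm", "simulated_annealing", "tabu_search"]
results = [
    {"genetic_algorithm": 412.0, "simulated_annealing": 405.0, "tabu_search": 405.0},
    {"genetic_algorithm": 980.5, "simulated_annealing": 991.2, "tabu_search": 1002.7},
]

label_sets = [best_technique_labels(r) for r in results]
indicator_rows = to_indicator_rows(label_sets, techniques)
```

The first instance ends up with two labels because two techniques tie for the shortest tour, which is why the task is multi-label. The indicator rows are the target format that most multi-label learners expect alongside the instance characteristics.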

2010

A comprehensive comparison of ML algorithms for gene expression data classification

Authors
de Souza, BF; de Carvalho, ACPLF; Soares, C;

Publication
2010 International Joint Conference on Neural Networks (IJCNN 2010)

Abstract
Microarrays have become a fairly common tool for simultaneously inspecting the behavior of thousands of genes. Researchers have employed this technique to understand various biological phenomena. One straightforward use of the technology is identifying the class membership of tissue samples based on their gene expression profiles. This task has been handled by a number of computational methods. In this paper, we provide a comprehensive evaluation of 7 commonly used algorithms over 65 publicly available gene expression datasets. The study focuses on comparing the performance of the algorithms in an efficient and sound manner, to help prospective users choose the most adequate classification approach for their investigation goals.

2010

Empirical evaluation of ranking prediction methods for gene expression data classification

Authors
De Souza, BF; De Carvalho, ACPLF; Soares, C;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Recently, meta-learning techniques have been applied to the problem of algorithm recommendation for gene expression data classification. Because rankings are flexible, the advice provided to the user took that form: a ranking expresses a preference order over Machine Learning algorithms according to their expected relative performance. Thus, choosing how to learn accurate rankings arises as a key research issue. In this work, the authors empirically evaluated 2 general approaches for ranking prediction and extended them. The results obtained for 49 publicly available microarray datasets indicate that the extensions introduced were very beneficial to the quality of the predicted rankings. © 2010 Springer-Verlag.
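
The abstract does not name the two ranking prediction approaches, so the sketch below only illustrates what predicting and scoring a ranking can look like. It assumes a nearest-neighbour scheme that averages the ranks observed on the most similar datasets, and it scores the prediction with Spearman's rank correlation; the meta-features, algorithm names and ranks are all invented.

```python
import math


def knn_average_ranking(query, meta_features, rankings, k=2):
    """Predict a ranking by averaging the ranks of the k most similar datasets."""
    order = sorted(range(len(meta_features)),
                   key=lambda i: math.dist(query, meta_features[i]))
    neighbours = order[:k]
    average = {alg: sum(rankings[i][alg] for i in neighbours) / k
               for alg in rankings[0]}
    # Turn average ranks into positions 1..n; ties are broken by algorithm name.
    ordered = sorted(average, key=lambda alg: (average[alg], alg))
    return {alg: position + 1 for position, alg in enumerate(ordered)}


def spearman(ranking_a, ranking_b):
    """Spearman's rank correlation for two tie-free rankings of the same items."""
    n = len(ranking_a)
    squared = sum((ranking_a[alg] - ranking_b[alg]) ** 2 for alg in ranking_a)
    return 1 - 6 * squared / (n * (n ** 2 - 1))


# Hypothetical meta-data: two meta-features per dataset and the observed ranking.
meta_features = [[0.10, 0.20], [0.15, 0.25], [0.90, 0.80]]
rankings = [
    {"svm": 1, "knn": 2, "tree": 3},
    {"svm": 1, "knn": 3, "tree": 2},
    {"svm": 3, "knn": 1, "tree": 2},
]

predicted = knn_average_ranking([0.12, 0.22], meta_features, rankings, k=2)
agreement = spearman(predicted, {"svm": 1, "knn": 2, "tree": 3})
```

A correlation of 1 means the predicted and observed orders agree completely and -1 means they are reversed, which gives a single number for comparing ranking predictors across datasets.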

2010

Intelligent Document Routing as a First Step towards Workflow Automation: A Case Study Implemented in SQL

Authors
Soares, C; Calejo, M;

Publication
Leveraging Applications of Formal Methods, Verification, and Validation, Part I

Abstract
In large and complex organizations, developing workflow automation projects is hard. In some cases, an important first step in that direction is automating the routing of incoming documents. In this paper, we describe a project to develop a system for the initial routing of incoming letters to the right department within a large, public Portuguese institution. We followed a data mining approach, in which data representing previous routings were analyzed to obtain a model that can be used to route future documents. The approach was strongly influenced by limitations imposed by the customer: the available budget was small, and the solution had to be developed in SQL to facilitate integration with the existing system. The resulting system obtained satisfactory results. However, as in any Data Mining project, most of the effort was dedicated to activities other than modelling (e.g., data preparation), which means that there is still plenty of room for improvement.
