1998
Authors
Gama, J; Torgo, L; Soares, C;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE-IBERAMIA 98
Abstract
Discretization of continuous attributes is an important task for certain types of machine learning algorithms. Bayesian approaches, for instance, require assumptions about data distributions. Decision trees, on the other hand, require sorting operations to deal with continuous attributes, which largely increase learning times. This paper presents a new discretization method whose main characteristic is that it takes interdependencies between attributes into account. Detecting interdependencies can be seen as discovering redundant attributes, which means that our method performs attribute selection as a side effect of the discretization. Empirical evaluation on five benchmark datasets from the UCI repository, using C4.5 and a naive Bayes classifier, shows a consistent reduction in the number of features without loss of generalization accuracy.
2006
Authors
Ghani, R; Soares, C;
Publication
SIGKDD Explorations
Abstract
2011
Authors
Prudencio, RBC; Soares, C; Ludermir, TB;
Publication
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2011, PT II
Abstract
Several meta-learning approaches have been developed for the problem of algorithm selection. In this context, it is of central importance to collect a sufficient number of datasets to be used as meta-examples in order to provide reliable results. Recently, some proposals to generate datasets have addressed this issue with successful results. These proposals include datasetoids, a simple manipulation method for obtaining new datasets from existing ones. However, the increase in the number of datasets raises another issue: in order to generate meta-examples for training, it is necessary to estimate the performance of the algorithms on the datasets. This typically requires running all candidate algorithms on all datasets, which is computationally very expensive. One approach to this problem is the use of an active learning approach to meta-learning, termed active meta-learning. In this paper we investigate the combined use of an active meta-learning approach based on an uncertainty score and datasetoids. Based on our results, we conclude that our method achieves very good accuracy with as little as 10% to 20% of the meta-examples labeled.
2010
Authors
Utgoff, PE; Cussens, J; Kramer, S; Jain, S; Stephan, F; Raedt, LD; Todorovski, L; Flener, P; Schmid, U; Vilalta, R; Giraud-Carrier, C; Brazdil, P; Soares, C; Keogh, E; Smart, WD; Abbeel, P; Ng, AY;
Publication
Encyclopedia of Machine Learning
Abstract
2010
Authors
Fürnkranz, J; Chan, PK; Craw, S; Sammut, C; Uther, W; Ratnaparkhi, A; Jin, X; Han, J; Yang, Y; Morik, K; Dorigo, M; Birattari, M; Stützle, T; Brazdil, P; Vilalta, R; Giraud-Carrier, C; Soares, C; Rissanen, J; Baxter, RA; Bruha, I; Baxter, RA; Webb, GI; Torgo, L; Banerjee, A; Shan, H; Ray, S; Tadepalli, P; Shoham, Y; Powers, R; Shoham, Y; Powers, R; Webb, GI; Ray, S; Scott, S; Blockeel, H; De Raedt, L;
Publication
Encyclopedia of Machine Learning
Abstract
2002
Authors
Peng, YH; Flach, PA; Soares, C; Brazdil, P;
Publication
DISCOVERY SCIENCE, PROCEEDINGS
Abstract
This paper presents new measures, based on the induced decision tree, to characterise datasets for meta-learning in order to select appropriate learning algorithms. The main idea is to capture the characteristics of a dataset from the structural shape and size of the decision tree induced from it. In total, 15 measures are proposed to describe the structure of a decision tree. Their effectiveness is illustrated through extensive experiments, by comparison with the results obtained by existing data characterization techniques, including the data characteristics tool (DCT), which is the most widely used technique in meta-learning, and Landmarking, which is the most recently developed method.