
Publications by Salisu Mamman Abdulrahman

2015

Algorithm selection via meta-learning and sample-based active testing

Authors
Abdulrahman, SM; Brazdil, P; Van Rijn, JN; Vanschoren, J;

Publication
CEUR Workshop Proceedings

Abstract
Identifying the best machine learning algorithm for a given problem continues to be an active area of research. In this paper we present a new method which exploits both meta-level information acquired in past experiments and active testing, an algorithm selection strategy. Active testing attempts to iteratively identify an algorithm whose performance will most likely exceed the performance of previously tried algorithms. The novel method described in this paper uses tests on smaller data samples to rank the most promising candidates, thus optimizing the schedule of experiments to be carried out. The experimental results show that this approach leads to considerably faster algorithm selection.
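
The abstract describes the method only at a high level. Below is a minimal, hedged sketch of a sample-based active testing loop: a shortlist of challengers is formed from meta-level win statistics, cheap tests on a small data sample re-rank that shortlist, and only the chosen challenger is evaluated in full. The helper names (evaluate_on_sample, evaluate_full, meta_wins) and the shortlist size are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of sample-based active testing. The helpers and the
# meta_wins table are illustrative assumptions, not the authors' code.

def active_testing(candidates, evaluate_on_sample, evaluate_full, meta_wins):
    """Iteratively pick the challenger most likely to beat the current best.

    candidates         -- list of algorithm identifiers
    evaluate_on_sample -- callable(algo) -> accuracy on a small data sample
    evaluate_full      -- callable(algo) -> accuracy on the full dataset
    meta_wins          -- dict[(challenger, incumbent)] -> how often the
                          challenger beat the incumbent on prior datasets
    """
    best = candidates[0]
    best_score = evaluate_full(best)
    untried = set(candidates[1:])
    while untried:
        # Rank challengers by past wins against the incumbent, then use a
        # cheap test on a small sample to pick among the most promising ones.
        ranked = sorted(untried,
                        key=lambda a: meta_wins.get((a, best), 0),
                        reverse=True)
        shortlist = ranked[:3]  # shortlist size is an arbitrary choice here
        challenger = max(shortlist, key=evaluate_on_sample)
        untried.remove(challenger)
        score = evaluate_full(challenger)
        if score > best_score:
            best, best_score = challenger, score
    return best, best_score
```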

2015

Fast Algorithm Selection Using Learning Curves

Authors
van Rijn, JN; Abdulrahman, SM; Brazdil, P; Vanschoren, J;

Publication
Advances in Intelligent Data Analysis XIV

Abstract
One of the challenges in Machine Learning is to find a classifier and parameter settings that work well on a given dataset. Evaluating all possible combinations typically takes too much time, hence many solutions have been proposed that attempt to predict which classifiers are most promising to try. As the first recommended classifier is not always the correct choice, multiple recommendations should be made, making this a ranking problem rather than a classification problem. Even though this is a well-studied problem, there is currently no good way of evaluating such rankings. We advocate the use of Loss Time Curves, as used in the optimization literature. These visualize the amount of budget (time) needed to converge to an acceptable solution. We also investigate a method that utilizes the measured performances of classifiers on small samples of data to make such recommendations, and adapt it so that it works well in Loss Time space. Experimental results show that this method converges extremely fast to an acceptable solution.
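
To make the evaluation idea concrete, the sketch below shows one way a loss-time curve can be computed from classifiers tried in the order a ranking recommends: the loss of the best model found so far is recorded as a function of cumulative run time. The data layout and variable names are assumptions for illustration, not the paper's exact procedure.

```python
# Hedged sketch of a loss-time curve. The input layout (a list of
# (accuracy, runtime) pairs in recommended order) is assumed for illustration.

def loss_time_curve(ranked_results, best_possible_accuracy):
    """ranked_results: [(accuracy, runtime_seconds), ...] in recommended order."""
    curve = []        # list of (elapsed_time, loss) points
    elapsed = 0.0
    best_acc = 0.0
    for accuracy, runtime in ranked_results:
        elapsed += runtime
        best_acc = max(best_acc, accuracy)
        curve.append((elapsed, best_possible_accuracy - best_acc))
    return curve

# The area under this curve (often with a logarithmic time axis) summarises
# how quickly a ranking converges to an acceptable solution.
```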

2014

Measures for Combining Accuracy and Time for Meta-learning

Authors
Abdulrahman, S; Brazdil, P;

Publication
Proceedings of the International Workshop on Meta-learning and Algorithm Selection co-located with 21st European Conference on Artificial Intelligence, MetaSel@ECAI 2014, Prague, Czech Republic, August 19, 2014.

Abstract
The vast majority of studies in meta-learning use only a few performance measures when characterizing different machine learning algorithms. The measure Adjusted Ratio of Ratios (ARR) addresses the problem of how to evaluate the quality of a model based on accuracy and training time. Unfortunately, this measure suffers from a shortcoming that is described in this paper. A new solution is proposed and it is shown that the proposed function satisfies the criterion of monotonicity, unlike ARR.
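
For reference, below is a hedged reconstruction of the ARR measure as it is usually written in the metalearning literature; the notation (success rate SR, run time T, trade-off parameter AccD) and the exact form may differ from the paper and are stated here as assumptions.

```latex
% Hedged reconstruction of ARR for algorithms a_p, a_q on dataset d_i
% (SR = success rate / accuracy, T = run time, AccD = trade-off parameter):
\[
ARR^{d_i}_{a_p,a_q} \;=\;
\frac{SR^{d_i}_{a_p} \,/\, SR^{d_i}_{a_q}}
     {1 + AccD \cdot \log\!\left(T^{d_i}_{a_p} \,/\, T^{d_i}_{a_q}\right)}
\]
% The denominator is not monotone in the time ratio and can even pass
% through zero; this is the kind of shortcoming the paper addresses.
```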

2016

Effect of Incomplete Meta-dataset on Average Ranking Method

Authors
Abdulrahman, SM; Brazdil, P;

Publication
Proceedings of the 2016 Workshop on Automatic Machine Learning, AutoML 2016, co-located with 33rd International Conference on Machine Learning (ICML 2016), New York City, NY, USA, June 24, 2016

Abstract

2017

Combining Feature and Algorithm Hyperparameter Selection using some Metalearning Methods

Authors
Cachada, M; Abdulrahman, SM; Brazdil, P;

Publication
Proceedings of the International Workshop on Automatic Selection, Configuration and Composition of Machine Learning Algorithms co-located with the European Conference on Machine Learning & Principles and Practice of Knowledge Discovery in Databases, AutoML@PKDD/ECML 2017, Skopje, Macedonia, September 22, 2017.

Abstract
Machine learning users need methods that can help them identify algorithms or even workflows (combinations of algorithms with preprocessing tasks, possibly using hyperparameter configurations that differ from the defaults) that achieve the best possible performance. Our study was oriented towards average ranking (AR), an algorithm selection method that exploits meta-data obtained on prior datasets. We focused on extending the use of a variant, AR*, that takes A3R as the relevant metric (combining accuracy and run time). The extension is made at the level of the diversity of the portfolio of workflows made available to AR. Our aim was to establish whether feature selection and different hyperparameter configurations improve the process of identifying a good solution. To evaluate our proposal we carried out extensive experiments in a leave-one-out mode. The results show that AR* was able to select workflows that are likely to lead to good results, especially when the portfolio is diverse. We additionally compared AR* with Auto-WEKA, running with different time budgets. Our proposed method shows some advantage over Auto-WEKA, particularly when the time budgets are small.
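
As a rough illustration of the average ranking step mentioned above, the sketch below aggregates per-dataset rankings of workflows into a single recommended order; the meta-data layout is an assumption, and it does not reproduce the A3R-based AR* variant or the Auto-WEKA comparison.

```python
# Hedged sketch of average ranking (AR) over a portfolio of workflows: each
# prior dataset contributes a ranking, and the recommended order for a new
# dataset is the ascending order of mean rank. Data layout is assumed.

from statistics import mean

def average_ranking(per_dataset_scores):
    """per_dataset_scores: dict[dataset] -> dict[workflow] -> score
    (e.g. accuracy or A3R measured on that prior dataset)."""
    ranks = {}  # workflow -> list of ranks across prior datasets
    for scores in per_dataset_scores.values():
        ordered = sorted(scores, key=scores.get, reverse=True)
        for position, workflow in enumerate(ordered, start=1):
            ranks.setdefault(workflow, []).append(position)
    # Lower mean rank means the workflow is recommended earlier.
    return sorted(ranks, key=lambda w: mean(ranks[w]))
```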

2018

Speeding up algorithm selection using average ranking and active testing by introducing runtime

Authors
Abdulrahman, SM; Brazdil, P; van Rijn, JN; Vanschoren, J;

Publication
MACHINE LEARNING

Abstract
Algorithm selection methods can be sped up substantially by incorporating multi-objective measures that give preference to algorithms that are both promising and fast to evaluate. In this paper, we introduce such a measure, A3R, and incorporate it into two algorithm selection techniques: average ranking and active testing. Average ranking combines algorithm rankings observed on prior datasets to identify the best algorithms for a new dataset. The second method iteratively selects algorithms to be tested on the new dataset, learning from each new evaluation to intelligently select the next best candidate. We show how both methods can be upgraded to incorporate the multi-objective measure A3R, which combines accuracy and runtime. It is necessary to establish the correct balance between accuracy and runtime, as otherwise time will be wasted by conducting less informative tests. The correct balance can be set by an appropriate parameter setting within the function A3R that trades off accuracy and runtime. Our results demonstrate that the upgraded versions of average ranking and active testing lead to much better mean interval loss values than their accuracy-based counterparts.
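
The abstract refers to the A3R measure and its trade-off parameter. The minimal sketch below shows the commonly cited form of A3R (an accuracy ratio discounted by a run-time ratio raised to a small exponent P); the function name, argument names and default parameter value are illustrative assumptions rather than a verbatim transcription of the paper.

```python
# Hedged sketch of the A3R measure: the success-rate (accuracy) ratio of a
# candidate against a reference algorithm, divided by the run-time ratio
# raised to a parameter p that sets the accuracy/runtime trade-off.
# Names and the default value of p are illustrative assumptions.

def a3r(sr_candidate, sr_reference, t_candidate, t_reference, p=1/64):
    """Higher values favour algorithms that are both accurate and fast.

    sr_* -- success rates (accuracies) on the same dataset
    t_*  -- run times on the same dataset
    p    -- trade-off exponent; small values downweight run time
    """
    return (sr_candidate / sr_reference) / (t_candidate / t_reference) ** p

# Example: p close to 0 makes A3R behave like a pure accuracy ratio, while
# larger p increasingly penalises slow algorithms.
```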
