2016
Authors
Forte, AC; Brazdil, PB;
Publication
COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE (PROPOR 2016)
Abstract
We present a study on sentiment analysis of client comments transcribed by help-desk assistants at a Portuguese telecommunications company. We adopted a lexicon-based approach to determine the polarity of the sentiment of each comment, based on so-called opinion words. The task was by no means easy, as few tools are available for the Portuguese language. The initial results with off-the-shelf solutions were rather poor. This motivated us to carry out a number of enhancements, including enriching the given lexicon with domain-specific terms and formulating specific rules for negation and amplifiers. Automatic pruning of some of the lexicon terms led to a significant improvement in performance. As our final system achieved very good performance, our work should be of interest to others working on domain-specific solutions for languages where ready-made solutions are not available.
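As a rough illustration of a lexicon-based polarity scorer with negation and amplifier rules, a minimal Python sketch follows. The lexicon entries, negation words, amplifier weights, and the function name `polarity` are all hypothetical and chosen for illustration; they are not the lexicon or the rules described in the paper.

```python
# Hypothetical toy resources; NOT the lexicon or rules used in the paper.
LEXICON = {"bom": 1.0, "rápido": 1.0, "mau": -1.0, "lento": -1.0}
NEGATIONS = {"não", "nunca"}
AMPLIFIERS = {"muito": 2.0}


def polarity(text):
    """Return 'positive', 'negative' or 'neutral' for a comment.

    Assumed rules: a negation word flips the sign of the next opinion
    word, and an amplifier multiplies its weight. Tokenisation is a
    plain whitespace split, so punctuation is not handled.
    """
    score = 0.0
    negate = False
    boost = 1.0
    for token in text.lower().split():
        if token in NEGATIONS:
            negate = True
        elif token in AMPLIFIERS:
            boost *= AMPLIFIERS[token]
        elif token in LEXICON:
            value = LEXICON[token] * boost
            score += -value if negate else value
            # Reset the modifiers once they have been consumed.
            negate, boost = False, 1.0
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `polarity("não é muito bom")` applies both the amplifier and the negation to "bom" and yields a negative label under these assumed rules.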
2017
Authors
Brazdil, P; Vilalta, R; Giraud Carrier, CG; Soares, C;
Publication
Encyclopedia of Machine Learning and Data Mining
Abstract
In the area of machine learning / data mining, many diverse algorithms are available nowadays, and hence selecting the most suitable algorithm may be a challenge. This is aggravated by the fact that many algorithms require that certain parameters be set. If a wrong algorithm and/or parameter configuration is selected, substandard results may be obtained. The topic of metalearning aims to facilitate this task. Metalearning typically proceeds in two phases. First, a given set of algorithms A (e.g. classification algorithms) and datasets D is identified, and different pairs <ai, dj> from these two sets are chosen for testing. The dataset dj is described by certain meta-features, which together with the performance result of algorithm ai constitute a part of the metadata. In the second phase, the metadata is used to construct a model, usually again with recourse to machine learning methods. The model represents a generalization of various base-level experiments. The model can then be applied to a new dataset to recommend the most suitable algorithm or a ranking ordered by relative performance. This article provides more details about this area. It also discusses how the method can be combined with hyperparameter optimization and extended to sequences of operations (workflows). © Springer Science+Business Media New York 2011, 2017
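The two phases described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the metadata values, the two meta-features, the algorithm names, and the use of a k-nearest-neighbour meta-model are all hypothetical choices, not something prescribed by the article.

```python
import math

# Phase 1 (assumed already carried out): metadata from base-level experiments.
# Each record holds hypothetical meta-features of a dataset dj and the
# hypothetical accuracy of each algorithm ai on that dataset.
METADATA = [
    {"features": (2.0, 0.7), "performance": {"tree": 0.80, "knn": 0.85, "nb": 0.75}},
    {"features": (4.0, 1.7), "performance": {"tree": 0.90, "knn": 0.70, "nb": 0.80}},
    {"features": (2.3, 1.0), "performance": {"tree": 0.78, "knn": 0.88, "nb": 0.74}},
]


def recommend(new_features, k=2):
    """Phase 2: rank algorithms for a new dataset.

    The meta-model here is a simple k-nearest-neighbour lookup: find the
    k prior datasets whose meta-features are closest to the new dataset
    and rank the algorithms by their mean performance on those datasets.
    """
    neighbours = sorted(
        METADATA, key=lambda rec: math.dist(rec["features"], new_features)
    )[:k]
    algorithms = neighbours[0]["performance"].keys()
    mean_perf = {
        algo: sum(rec["performance"][algo] for rec in neighbours) / len(neighbours)
        for algo in algorithms
    }
    # Best algorithm first.
    return sorted(mean_perf, key=mean_perf.get, reverse=True)
```

Calling `recommend((2.1, 0.8), k=2)` returns a ranking of the three hypothetical algorithms based on the two most similar prior datasets.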
2015
Authors
Vanschoren, J; Brazdil, P; Carrier, CGG; Kotthoff, L;
Publication
MetaSel@PKDD/ECML
Abstract
2017
Authors
Brazdil, P; Vanschoren, J; Hutter, F; Hoos, H;
Publication
AutoML@PKDD/ECML
Abstract
2017
Authors
Souza Roza, R; Brazdil, P; Reis, JL; Cerdeira, A; Martins, P; Felgueiras, O;
Publication
Atas da Conferencia da Associacao Portuguesa de Sistemas de Informacao
Abstract
Combining information obtained from data mining techniques applied to physicochemical and organoleptic data made it possible to identify similarities between the wines of the nine sub-regions of the Demarcated Region of Vinho Verde. Through clustering techniques, four clusters were identified, each characterized by its centroid. The measure of information gain, together with supervised rule-based learning, was used to find the differentiating characteristics. This study made it possible to relate the characteristics of the wines of these sub-regions, which can improve decision making on the profiles of these wines.
2018
Authors
Abdulrahman, SM; Cachada, MV; Brazdil, P;
Publication
VIPIMAGE 2017
Abstract
Selecting appropriate classification algorithms for a given dataset is crucial and useful in practice, but it is also full of challenges. In order to maximize performance, users of machine learning algorithms need methods that can help them identify the most relevant features in datasets, select algorithms and determine their appropriate hyperparameter settings. In this paper, a method of recommending classification algorithms is proposed. It is based on the average ranking method, which combines algorithm rankings observed on prior datasets to identify the best algorithms for a new dataset. Our method uses a special case of a data mining workflow in which algorithm selection is preceded by a feature selection method (CFS).
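The core idea of average ranking can be sketched in Python as follows. The rankings, dataset names, and algorithm names are hypothetical, and the sketch covers only the rank aggregation step; it does not model the CFS feature selection step or any other detail of the paper's workflow.

```python
def average_ranking(rankings):
    """Order algorithms by their mean rank over prior datasets.

    `rankings` maps a dataset name to a dict {algorithm: rank}, where
    rank 1 is the best. Ties in mean rank are broken alphabetically,
    which is an arbitrary choice made for this sketch.
    """
    ranks_per_algo = {}
    for ranks in rankings.values():
        for algo, rank in ranks.items():
            ranks_per_algo.setdefault(algo, []).append(rank)
    mean_rank = {algo: sum(r) / len(r) for algo, r in ranks_per_algo.items()}
    return sorted(mean_rank, key=lambda algo: (mean_rank[algo], algo))


# Hypothetical rankings observed on three prior datasets.
PRIOR_RANKINGS = {
    "d1": {"svm": 1, "tree": 2, "nb": 3},
    "d2": {"svm": 2, "tree": 1, "nb": 3},
    "d3": {"svm": 1, "tree": 3, "nb": 2},
}

recommended_order = average_ranking(PRIOR_RANKINGS)
```

The resulting order would then be used as the sequence in which to try the algorithms on a new dataset.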