Publications

Publications by Pavel Brazdil

2022

Metalearning

Authors
Brazdil, P; van Rijn, JN; Soares, C; Vanschoren, J;

Publication
Cognitive Technologies

Abstract

2022

NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis

Authors
Muhammad, SH; Adelani, DI; Ruder, S; Ahmad, IS; Abdulmumin, I; Bello, BS; Choudhury, M; Emezue, CC; Abdullahi, SS; Aremu, A; Jorge, A; Brazdil, P;

Publication
LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION

Abstract
Sentiment analysis is one of the most widely studied applications in NLP, but most work focuses on languages with large amounts of data. We introduce the first large-scale human-annotated Twitter sentiment dataset for the four most widely spoken languages in Nigeria (Hausa, Igbo, Nigerian-Pidgin, and Yoruba), consisting of around 30,000 annotated tweets per language, including a significant fraction of code-mixed tweets. We propose text collection, filtering, processing, and labeling methods that enable us to create datasets for these low-resource languages. We evaluate a range of pre-trained models and transfer strategies on the dataset. We find that language-specific models and language-adaptive fine-tuning generally perform best. We release the datasets, trained models, sentiment lexicons, and code to incentivize research on sentiment analysis in under-represented languages.

2021

Extending General Sentiment Lexicon to Specific Domains in (Semi-)Automatic Manner

Authors
Brazdil P.; Silvano P.; Silva F.; Muhammad S.; Oliveira F.; Cordeiro J.; Leal A.;

Publication
CEUR Workshop Proceedings

Abstract
This paper describes an approach to the construction of a sentiment analysis system that uses both automatic and manual processes. The system includes a domain-specific sentiment lexicon, modifier patterns and rules that are used to derive the sentiment values of sentences in new texts. The lexicon, which includes single words (unigrams), is obtained automatically from the distribution of ratings for all words in the labelled training data. The sentiment values of phrases are derived from a manually built list of modifier patterns. Each pattern consists of a modifier and a focal element. The modifiers can be of different types, depending on whether the operation is intensification, downtoning or reversal. This approach was applied to texts on economics and finance in European Portuguese. In our view, this line of work deserves more attention in the community, as the system not only has reasonable performance, but also can provide understandable explanations to the user.
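The general idea in this abstract (a unigram lexicon learned from labelled ratings, plus manually listed modifier patterns) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: averaging ratings per word as the way to summarise the rating distribution, the function names, the English example words, and the numeric modifier factors are all hypothetical.

```python
from collections import defaultdict

def build_lexicon(labelled_texts):
    """Assumed scheme: a word's value is the mean rating of the texts containing it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, rating in labelled_texts:
        for word in set(text.lower().split()):
            totals[word] += rating
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}

# Hypothetical, manually built modifier list: modifier -> (operation, factor).
MODIFIERS = {
    "very": ("intensification", 1.5),
    "slightly": ("downtoning", 0.5),
    "not": ("reversal", -1.0),
}

def phrase_value(modifier, focal, lexicon):
    """Apply the modifier's operation to the lexicon value of the focal element."""
    _operation, factor = MODIFIERS[modifier]
    return lexicon.get(focal, 0.0) * factor

def sentence_value(sentence, lexicon):
    """Sum word values, treating 'modifier + next word' as one phrase."""
    tokens = sentence.lower().split()
    total, i = 0.0, 0
    while i < len(tokens):
        if tokens[i] in MODIFIERS and i + 1 < len(tokens):
            total += phrase_value(tokens[i], tokens[i + 1], lexicon)
            i += 2
        else:
            total += lexicon.get(tokens[i], 0.0)
            i += 1
    return total

# Toy labelled data (invented for this sketch): rating +1 positive, -1 negative.
training = [("good growth", 1.0), ("bad losses", -1.0), ("good results", 1.0)]
lexicon = build_lexicon(training)
```

With this toy lexicon, `sentence_value("not good", lexicon)` reverses the value of "good", while "very" and "slightly" scale it up and down. Because each sentence score is traceable to lexicon entries and named modifier operations, a sketch of this kind also hints at why such a system can offer understandable explanations.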

2025

Reducing algorithm configuration spaces for efficient search

Authors
Freitas, F; Brazdil, P; Soares, C;

Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS

Abstract
Many current AutoML platforms include a very large space of alternatives (the configuration space). This increases the probability of including the best one for any dataset but makes the task of identifying it for a new dataset more difficult. In this paper, we explore a method that can reduce a large configuration space to a significantly smaller one and so help to reduce the search time for the potentially best algorithm configuration, with limited risk of significant loss of predictive performance. We empirically validate the method with a large set of alternatives based on five ML algorithms with different sets of hyperparameters and one preprocessing method (feature selection). Our results show that it is possible to reduce the given search space by more than one order of magnitude, from a few thousand to a few hundred items. After reduction, the search for the best algorithm configuration is about one order of magnitude faster than on the original space without significant loss in predictive performance.

2022

Advances in Metalearning: ECML/PKDD Workshop on Meta-Knowledge Transfer

Authors
Brazdil, P; van Rijn, JN; Gouk, H; Mohr, F;

Publication
ECML/PKDD Workshop on Meta-Knowledge Transfer, 23 September 2022, Grenoble, France

Abstract

2022

ECML/PKDD Workshop on Meta-Knowledge Transfer, 23 September 2022, Grenoble, France

Authors
Brazdil, P; van Rijn, JN; Gouk, H; Mohr, F;

Publication
Meta-Knowledge Transfer @ ECML/PKDD

Abstract
