About

Pavel Brazdil is a founder of a strong Machine Learning / Data Mining group that has existed since 1988 and is now part of LIAAD INESC TEC (Laboratory of AI and Decision Support). He is Full Professor (Prof. Catedrático) at the Faculty of Economics (FEP) of the University of Porto, where he has taught courses on Information Systems, Data Mining and Text Mining. He has supervised 12 PhD students. Although he officially retired in mid-July 2015, he continues his R&D activities, including teaching in Master's and doctoral courses and supervising postgraduate students.

Details

  • Name

    Pavel Brazdil
  • Cluster

    Computer Science
  • Role

    Coordinating Researcher
  • Since

    01 January 2010
Publications

2023

AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages

Authors
Muhammad, SH; Abdulmumin, I; Ayele, AA; Ousidhoum, N; Adelani, DI; Yimam, SM; Ahmad, IS; Beloucif, M; Mohammad, S; Ruder, S; Hourrane, O; Brazdil, P; António Ali, FDM; David, D; Osei, S; Bello, BS; Ibrahim, F; Gwadabe, T; Rutunda, S; Belay, TD; Messelle, WB; Balcha, HB; Chala, SA; Gebremichael, HT; Opoku, B; Arthur, S;

Publication
CoRR

Abstract

2022

Detection of Loanwords in Angolan Portuguese: A Text Mining Approach

Authors
Muhongo, TS; Brazdil, PB; Silva, F;

Publication
INTELIGENCIA ARTIFICIAL-IBEROAMERICAN JOURNAL OF ARTIFICIAL INTELLIGENCE

Abstract
Angola is characterized by many different languages and social, cultural and political realities, which have had a marked effect on Angolan Portuguese (AP). Consequently, AP is characterized by diatopic variation. One of the marked effects is the loanwords imported from other Angolan languages. Our objective is to analyze different Angolan texts, analyze the lexical forms used and conduct a comparative study with European Portuguese, aiming at identifying the possible loanwords in Angolan Portuguese. This process was automated, as well as the identification of all loanwords' cotexts. In addition, we determine the lexical class of each loanword and the Angolan language of its origin. Most lexical loanwords come from Kimbundu, although AP also includes loanwords from some other Angolan languages. Our study serves as a basis for preparing an Angolan regionalism dictionary. We noticed that more than 700 identified loanwords do not figure in the existing dictionaries.

2022

Semi-Automatic Approaches for Exploiting Shifter Patterns in Domain-Specific Sentiment Analysis

Authors
Brazdil, P; Muhammad, SH; Oliveira, F; Cordeiro, J; Silva, F; Silvano, P; Leal, A;

Publication
MATHEMATICS

Abstract
This paper describes two different approaches to sentiment analysis. The first is a form of symbolic approach that exploits a sentiment lexicon together with a set of shifter patterns and rules. The sentiment lexicon includes single words (unigrams) and is developed automatically by exploiting labeled examples. The shifter patterns include intensification, attenuation/downtoning and inversion/reversal and are developed manually. The second approach exploits a deep neural network, which uses a pre-trained language model. Both approaches were applied to texts in the economics and finance domains from newspapers in European Portuguese. We show that the symbolic approach achieves virtually the same performance as the deep neural network. In addition, the symbolic approach provides understandable explanations, and the acquired knowledge can be communicated to others. We release the shifter patterns to motivate future research in this direction.

2022

Metalearning

Authors
Brazdil, P; van Rijn, JN; Soares, C; Vanschoren, J;

Publication
Cognitive Technologies

Abstract

2022

NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis

Authors
Muhammad, SH; Adelani, DI; Ruder, S; Ahmad, IS; Abdulmumin, I; Bello, BS; Choudhury, M; Emezue, CC; Abdullahi, SS; Aremu, A; Jorge, A; Brazdil, P;

Publication
LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION

Abstract
Sentiment analysis is one of the most widely studied applications in NLP, but most work focuses on languages with large amounts of data. We introduce the first large-scale human-annotated Twitter sentiment dataset for the four most widely spoken languages in Nigeria (Hausa, Igbo, Nigerian-Pidgin, and Yoruba), consisting of around 30,000 annotated tweets per language, including a significant fraction of code-mixed tweets. We propose text collection, filtering, processing, and labeling methods that enable us to create datasets for these low-resource languages. We evaluate a range of pre-trained models and transfer strategies on the dataset. We find that language-specific models and language-adaptive fine-tuning generally perform best. We release the datasets, trained models, sentiment lexicons, and code to incentivize research on sentiment analysis in under-represented languages.

Supervised theses

2017

Workflow Recommendation for Text Classification Problems

Author
Maria João Fernandes Ferreira

Institution
UP-FCUP

2017

Automatic Recommendation of Machine Learning Workflows

Author
Miguel Alexandre Viana Cachada

Institution
UP-FEP

2017

Improving Algorithm Selection Methods using Meta-Learning by Considering Accuracy and Run Time

Author
Salisu Mamman Abdulrahman

Institution
UP-FEP

2017

Identifying Affinity Groups of Researchers in FEP through the Application of Community Detection Algorithms

Author
André Martinez Candeias Lima

Institution
UP-FEP

2015

Development of a Support System for Workflow Design for Data Mining Problems that Exploits Meta-Learning

Author
Salisu Mamman Abdulrahman

Institution
UP-FEP