Publications - INESC TEC

Publications by Pavel Brazdil

2018

Incremental Sparse TFIDF & Incremental Similarity with Bipartite Graphs

Authors
Sarmento, RP; Brazdil, P

Publication
CoRR

2022

Contextualization for the Organization of Text Documents Streams

Authors
Sarmento, RP; Cardoso, DdO; Gama, J; Brazdil, P

Publication
CoRR

2018

Dynamic Laplace: Efficient Centrality Measure for Weighted or Unweighted Evolving Networks

Authors
Cordeiro, M; Sarmento, RP; Brazdil, P; Gama, J

Publication
CoRR

2023

Exploring the Reduction of Configuration Spaces of Workflows

Authors
Freitas, F; Brazdil, P; Soares, C

Publication
Discovery Science - 26th International Conference, DS 2023, Porto, Portugal, October 9-11, 2023, Proceedings

Abstract
Many current AutoML platforms include a very large space of alternatives (the configuration space), which makes it difficult to identify the best alternative for a given dataset. In this paper we explore a method that can reduce a large configuration space to a significantly smaller one and so help to reduce the search time for the potentially best workflow. We empirically validate the method on a set of workflows that include four ML algorithms (SVM, RF, LogR and LD) with different sets of hyperparameters. Our results show that it is possible to reduce the given space by more than one order of magnitude, from a few thousand to tens of workflows, while the risk that the best workflow is eliminated is nearly zero. The system after reduction is about one order of magnitude faster than the original one, but still maintains the same predictive accuracy and loss. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

2023

Combining Symbolic and Deep Learning Approaches for Sentiment Analysis

Authors
Muhammad, SH; Brazdil, P; Jorge, A

Publication
Compendium of Neurosymbolic Artificial Intelligence

Abstract
Deep learning approaches have become popular in sentiment analysis because of their competitive performance. The downside of these approaches is that they do not provide understandable explanations of how the sentiment values are calculated. Earlier approaches that used sentiment lexicons for sentiment analysis can do that, but their performance is lower than that of deep learning approaches. Therefore, it is natural to wonder whether the two approaches can be combined to exploit their advantages. In this chapter, we present a neuro-symbolic approach that combines symbolic and deep learning approaches for sentiment analysis tasks. The symbolic approach exploits a sentiment lexicon and shifter patterns, which cover the operations of inversion/reversal, intensification, and attenuation/downtoning. The deep learning approach uses a pre-trained language model (PLM) to construct the sentiment lexicon. Our experimental results show that the proposed approach leads to promising results, substantially better than the results of a pure lexicon-based approach. Although the results did not reach the level of the deep learning approach, a great advantage is that sentiment prediction can be accompanied by understandable explanations. For some users, it is very important to see how sentiment is derived, even if performance is a little lower. © 2023 The authors and IOS Press. All rights reserved.

2023

NLP-Crowdsourcing Hybrid Framework for Inter-Researcher Similarity Detection

Authors
Correia, A; Guimaraes, D; Paredes, H; Fonseca, B; Paulino, D; Trigo, L; Brazdil, P; Schneider, D; Grover, A; Jameel, S

Publication
IEEE Transactions on Human-Machine Systems

Abstract
Visualizing and examining the intellectual landscape and evolution of scientific communities to support collaboration is crucial for multiple research purposes. In some cases, measuring similarities and matching patterns between sets of research publications can help to identify people with similar interests for building research collaboration networks and university-industry linkages. This work assesses the feasibility of resolving ambiguous cases in similarity detection to determine authorship with natural language processing (NLP) techniques, so that crowdsourcing is applied only in instances that require human judgment. Using an NLP-crowdsourcing convergence strategy, we can reduce the costs of microtask crowdsourcing while saving time and maintaining disambiguation accuracy over large datasets. This article contributes a next-gen crowd-artificial intelligence framework that uses an ensemble of term frequency-inverse document frequency (TF-IDF) and bidirectional encoder representations from transformers (BERT) to obtain similarity rankings for pairs of scientific documents. A sequence of content-based similarity tasks was created using a crowd-powered interface for solving disambiguation problems. Our experimental results suggest that an adaptive NLP-crowdsourcing hybrid framework has advantages for inter-researcher similarity detection tasks where fully automatic algorithms provide unsatisfactory results, with the goal of helping researchers discover potential collaborators using data-driven approaches.
