Publications - INESC TEC

Publications

Publications by LIAAD

2023

Report on the 1st Workshop on Implicit Author Characterization from Texts for Search and Retrieval (IACT 2023) at SIGIR 2023

Authors
Litvak, M; Rabaev, I; Campos, R; Jorge, AM; Jatowt, A;

Publication
SIGIR Forum

Abstract

2023

Tweet2Story: Extracting Narratives from Twitter

Authors
Campos, V; Campos, R; Jorge, A;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I

Abstract
Topics discussed on social media platforms contain a disparate amount of information written in colloquial language, making it difficult to understand the narrative of the topic. In this paper, we take a step towards resolving this problem by proposing a framework that automatically extracts narratives from a document, such as a set of tweets. To this end, we propose a methodology that extracts information from the texts through a pipeline of tasks, such as co-reference resolution and the extraction of entity relations. The result of this process is embedded into an annotation file to be used by subsequent operations, such as visualization schemas. We named this framework Tweet2Story and measured its effectiveness under an evaluation schema that involved three different aspects: (i) as an Open Information Extraction (OpenIE) task, (ii) by comparing the narratives of manually annotated news articles linked to tweets about the same topic, and (iii) by qualitatively comparing the knowledge graphs produced from the narratives. The results show high precision and moderate recall, on par with other state-of-the-art OpenIE frameworks, and confirm that narratives can be extracted from small texts. Furthermore, we show that the narrative can be visualized in an easily understandable way.


2023

Event Extraction for Portuguese: A QA-Driven Approach Using ACE-2005

Authors
Cunha, LF; Campos, R; Jorge, A;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I

Abstract
Event extraction is an Information Retrieval task that commonly consists of identifying the central word for the event (trigger) and the event's arguments. This task has been extensively studied for English but lags behind for Portuguese, partly due to the lack of task-specific annotated corpora. This paper proposes a framework in which two separate BERT-based models were fine-tuned to identify and classify events in Portuguese documents. We decompose this task into two sub-tasks. First, we use a token classification model to detect event triggers. Second, to extract event arguments, we train a Question Answering model that queries the triggers about their corresponding event argument roles. Given the lack of event-annotated corpora in Portuguese, we translated the original version of the ACE-2005 dataset (a reference in the field) into Portuguese, producing a new corpus for Portuguese event extraction. To accomplish this, we developed an automatic translation pipeline. Our framework obtains F1 scores of 64.4 for trigger classification and 46.7 for argument classification, setting a new state-of-the-art reference for these tasks in Portuguese.


2023

Symbolic Versus Deep Learning Techniques for Explainable Sentiment Analysis

Authors
Muhammad, SH; Brazdil, P; Jorge, A;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I

Abstract
Deep learning approaches have become popular in many different areas, including sentiment analysis (SA), because of their competitive performance. However, the downside of these approaches is that they do not provide understandable explanations of how the sentiment values are calculated. In contrast, earlier approaches based on sentiment lexicons can provide such explanations, but their performance is normally not high. To leverage the strengths of both, we present a neuro-symbolic approach that combines deep learning (DL) and symbolic methods for SA tasks. The DL approach uses a pre-trained language model (PLM) to construct a sentiment lexicon. The symbolic approach exploits the constructed sentiment lexicon and manually constructed shifter patterns to determine the sentiment of a sentence. Our experimental results show that the proposed approach leads to promising results, with the additional advantage that sentiment predictions can be accompanied by understandable explanations.
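
Illustrative sketch (not the authors' implementation): the general idea of scoring a sentence with a sentiment lexicon plus shifter patterns, and returning an explanation alongside the score, might look like the following. The lexicon values, the negator and intensifier lists, the two-token look-back window, and the function name `score_sentence` are all assumptions made for this example; in the paper the lexicon is built with a PLM and the shifter patterns are manually constructed.

```python
# Toy resources: illustrative values only, not taken from the paper.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "slightly": 0.5}


def score_sentence(sentence):
    """Return (total score, list of human-readable explanation strings)."""
    tokens = [t.strip(".,!?") for t in sentence.lower().split()]
    total = 0.0
    explanation = []
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        value = LEXICON[tok]
        applied = []
        # Assumed shifter pattern: look at up to two preceding tokens.
        for prev in tokens[max(0, i - 2):i]:
            if prev in NEGATORS:
                value = -value          # negation flips polarity
                applied.append(prev)
            elif prev in INTENSIFIERS:
                value *= INTENSIFIERS[prev]  # intensifier rescales
                applied.append(prev)
        total += value
        explanation.append(
            f"{tok}: {LEXICON[tok]:+.1f} -> {value:+.1f} "
            f"(shifters: {applied or 'none'})"
        )
    return total, explanation


if __name__ == "__main__":
    score, why = score_sentence("The food was not very good.")
    print(score)
    for line in why:
        print(" ", line)
```

Each explanation line records which lexicon entry fired and which shifters modified it, which is the kind of traceability a purely neural classifier does not offer by default.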


2023

Combining Neighbor Models to Improve Predictions of Age of Onset of ATTRv Carriers

Authors
Pedroto, M; Jorge, A; Mendes Moreira, J; Coelho, T;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT II

Abstract
Transthyretin (TTR)-related familial amyloid polyneuropathy (ATTRv) is a life-threatening autosomal dominant disease, and the age of onset represents the moment when the first symptoms are felt. Accurately predicting the age of onset for a given patient is relevant for risk assessment and treatment management. In this work, we evaluate the impact on prediction error of combining prediction models obtained from neighboring time windows. We propose Symmetric (Sym) and Asymmetric (Asym) models, which represent two different averaging approaches. These are combined with a weighting mechanism to create four ensemble models: Symmetric (Sym), Symmetric-weighted (Sym-w), Asymmetric (Asym), and Asymmetric-weighted (Asym-w). These four ensemble models are then compared to the original approach, which is focused on individual regression base learners, namely: Baseline (BL), Decision Tree (DT), Elastic Net (EN), Lasso (LA), Linear Regression (LR), Random Forest (RF), Ridge (RI), Support Vector Regressor (SV) and XGBoost (XG). Our results show that aggregating predictions from neighbor models decreases the average mean absolute error obtained by each base learner. Overall, the best results are achieved by regression-based ensemble tree models as base learners.
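
Illustrative sketch (not the authors' implementation): the abstract does not define the Symmetric and Asymmetric schemes or the weighting mechanism, so the code below only shows one plausible reading of "averaging predictions from neighboring time windows". It assumes that "sym" averages the models on both sides of the current window, that "asym" uses only the current and earlier windows, and that the weighted variants use inverse-distance weights. The function name `combine_neighbors` and all parameters are hypothetical.

```python
def combine_neighbors(preds, center, radius=1, mode="sym", weighted=False):
    """Average the predictions of models trained on neighboring time windows.

    preds    -- predictions for one patient, one per window model, in window order
    center   -- index of the window whose prediction is being refined
    radius   -- how many neighboring windows to include on each side
    mode     -- "sym": windows on both sides; "asym": earlier windows only
                (both definitions are assumptions for this sketch)
    weighted -- if True, weight each model by 1 / (1 + distance to center)
    """
    if mode == "sym":
        lo, hi = center - radius, center + radius
    elif mode == "asym":
        lo, hi = center - radius, center
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    # Keep only windows that actually exist.
    idx = [i for i in range(lo, hi + 1) if 0 <= i < len(preds)]
    if weighted:
        weights = [1.0 / (1 + abs(i - center)) for i in idx]
    else:
        weights = [1.0] * len(idx)
    return sum(w * preds[i] for w, i in zip(weights, idx)) / sum(weights)


if __name__ == "__main__":
    window_preds = [40.0, 44.0, 48.0]  # made-up ages of onset, one per window
    print(combine_neighbors(window_preds, center=1, mode="sym"))
    print(combine_neighbors(window_preds, center=1, mode="asym"))
    print(combine_neighbors(window_preds, center=1, mode="sym", weighted=True))
```

The four combinations of `mode` and `weighted` correspond loosely to the Sym, Sym-w, Asym, and Asym-w names in the abstract, under the assumptions stated above.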


2023

Report on the 6th International Workshop on Narrative Extraction from Texts (Text2Story 2023) at ECIR 2023

Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Litvak, M; Cordeiro, JP; Rocha, C; Sousa, HO; Mansouri, B;

Publication
SIGIR Forum

Abstract
The Sixth International Workshop on Narrative Extraction from Texts (Text2Story'23) was held on April 2nd, 2023, in conjunction with the 45th European Conference on Information Retrieval (ECIR 2023) in Dublin, Ireland. Continuing the tradition of past years, the workshop was held as a hybrid event. Online participation was allowed using the Zoom platform. During the course of the day, more than 50 attendees had the opportunity to follow up and discuss the recent advances in topics related to representation, extraction, and generation of narratives. The workshop program included two invited keynotes and nineteen paper presentations. The proceedings of the workshop are available online. Date: 2 April 2023. Website: https://text2story23.inesctec.pt/.
