2024
Authors
Campos, R; Jorge, A; Jatowt, A; Bhatia, S; Litvak, M;
Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT V
Abstract
The Text2Story Workshop series, dedicated to Narrative Extraction from Texts, has been running successfully since 2018. Over the past six years, significant progress, largely propelled by Transformers and Large Language Models, has advanced our understanding of natural language text. Nevertheless, the representation, analysis, generation, and comprehensive identification of the different elements that compose a narrative structure remain a challenging objective. In its seventh edition, the workshop strives to consolidate a common platform and a multidisciplinary community for discussing and addressing various issues related to narrative extraction tasks. In particular, we aim to bring to the forefront the challenges involved in understanding narrative structures and integrating their representation into established frameworks, as well as into modern architectures (e.g., transformers) and AI-powered language models (e.g., ChatGPT), which are now common and form the backbone of almost every IR and NLP application. Text2Story encompasses sessions covering full research papers, work-in-progress, demos, resources, position and dissemination papers, along with keynote talks. Moreover, there is dedicated space for informal discussions on methods, challenges, and the future of research in this dynamic field.
2023
Authors
Mansouri, B; Campos, R; Jatowt, A;
Publication
COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023
Abstract
Timeline summarization (TLS) is a challenging research task that requires distilling extensive and intricate temporal data into a concise and easily comprehensible representation. This paper proposes a novel approach to timeline summarization using Abstract Meaning Representations (AMRs), graph representations of text in which nodes are semantic concepts and edges denote the relationships between them. With AMR, sentences with different wordings but similar semantics have similar representations. To exploit this feature for timeline summarization, we propose a two-step sentence selection method that leverages features extracted from both the AMRs and the text. First, an AMR is generated for each sentence. Sentences are then filtered by removing those with no named entities and keeping the ones with the highest number of named entities. In the second step, the sentences to appear in the timeline are selected based on two scores: the Inverse Document Frequency (IDF) of AMR nodes combined with the score obtained by applying a keyword extraction method to the text. Our experimental results on the TLS-Covid19 test collection demonstrate the potential of the proposed approach.
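The two-step selection described in the abstract can be sketched roughly as follows. This is a minimal, hypothetical illustration: the input format, field names (`entities`, `nodes`, `kw_score`), and the way the two scores are combined are assumptions, not the paper's actual implementation.

```python
import math
from collections import Counter

def select_timeline_sentences(sentences, top_k=2, min_entities=1):
    """Hedged sketch of the two-step selection.

    `sentences` is a list of dicts with hypothetical keys:
      'text'     - the sentence string
      'entities' - named entities found in the sentence
      'nodes'    - concept nodes of the sentence's AMR graph
      'kw_score' - score from a keyword-extraction method (higher = better)
    """
    # Step 1: remove sentences with no named entities and rank the
    # remainder by how many named entities they contain.
    filtered = [s for s in sentences if len(s["entities"]) >= min_entities]
    filtered.sort(key=lambda s: len(s["entities"]), reverse=True)

    # Step 2: score sentences by the IDF of their AMR nodes combined
    # with the keyword-extraction score (simple sum, an assumption).
    n_docs = len(filtered)
    df = Counter(node for s in filtered for node in set(s["nodes"]))

    def idf_score(s):
        return sum(math.log(n_docs / df[n]) for n in set(s["nodes"]))

    ranked = sorted(filtered, key=lambda s: idf_score(s) + s["kw_score"],
                    reverse=True)
    return [s["text"] for s in ranked[:top_k]]
```

In practice the AMR graphs would come from an AMR parser and the keyword scores from an extractor such as YAKE!; here both are taken as pre-computed inputs to keep the sketch self-contained.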
2022
Authors
Jatowt, A; Doucet, A; Campos, R;
Publication
WWW (Companion Volume)
Abstract
Time expressions embedded in text are important for many downstream tasks in NLP and IR. They have been utilized, for example, in timeline summarization, named entity recognition, temporal information retrieval, question answering, and other tasks. In this paper, we introduce a novel approach to analyzing the characteristics of time expressions in diachronic text collections. Based on a collection of news articles published over a 33-year time span, we investigate several aspects of time expressions, with a focus on their interplay with the publication dates of the containing documents. We utilize a graph-based representation of temporal expressions, representing each expression through its co-occurring named entities. The proposed approach yields several observations that could be utilized in automatic systems that rely on processing temporal signals embedded in text. It could also be of importance for professionals (e.g., historians) who wish to understand fluctuations in collective memories and collective expectations based on large-scale, diachronic document collections.
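The graph-based representation sketched in the abstract, with temporal expressions characterized by their co-occurring named entities, can be illustrated as a simple weighted co-occurrence structure. The input format below is an assumption made for illustration; the paper's actual graph construction may differ.

```python
from collections import Counter, defaultdict

def build_time_entity_graph(documents):
    """Represent each temporal expression through the named entities it
    co-occurs with, as weighted adjacency (co-occurrence counts).

    `documents` is a list of (time_expressions, named_entities) pairs,
    one pair per document -- an assumed, simplified input format.
    """
    graph = defaultdict(Counter)
    for time_exprs, entities in documents:
        for t in time_exprs:
            graph[t].update(entities)  # increment co-occurrence counts
    return graph
```

Comparing the entity profiles of two temporal expressions (e.g., via cosine similarity over their `Counter` vectors) is then one plausible way to study how references to different dates relate across a diachronic collection.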
2023
Authors
Litvak, M; Rabaev, I; Campos, R; Jorge, AM; Jatowt, A;
Publication
SIGIR Forum
Abstract
2023
Authors
Mansouri, B; Campos, R;
Publication
CoRR
Abstract
2023
Authors
Mansouri, B; Durgin, S; Franklin, S; Fletcher, S; Campos, R;
Publication
CLEF (Working Notes)
Abstract
This paper describes the participation of the Artificial Intelligence and Information Retrieval (AIIR) Lab from the University of Southern Maine and the Laboratory of Artificial Intelligence and Decision Support (LIAAD) from INESC TEC in the CLEF 2023 SimpleText lab. Three tasks are defined for SimpleText: (T1) What is in (or out)?, (T2) What is unclear?, and (T3) Rewrite this!. For Task 1, five runs were submitted using traditional Information Retrieval and Sentence-BERT models. For Task 2, three runs were submitted using the YAKE! and KBIR keyword extraction models. Finally, for Task 3, two models were deployed: one using OpenAI Davinci embeddings and the other combining two unsupervised simplification models.