2023
Autores
Mansouri, B; Campos, R; Jatowt, A;
Publicação
COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023
Abstract
Timeline summarization (TLS) is a challenging research task that requires researchers to distill extensive and intricate temporal data into a concise and easily comprehensible representation. This paper proposes a novel approach to timeline summarization using Abstract Meaning Representations (AMRs), a graphical representation of the text where the nodes are semantic concepts and the edges denote relationships between concepts. With AMR, sentences with different wordings, but similar semantics, have similar representations. To make use of this feature for timeline summarization, a two-step sentence selection method that leverages features extracted from both AMRs and the text is proposed. First, AMRs are generated for each sentence. Sentences are then filtered out by removing those with no named-entities and keeping the ones with the highest number of named-entities. In the next step, sentences to appear in the timeline are selected based on two scores: Inverse Document Frequency (IDF) of AMR nodes combined with the score obtained by applying a keyword extraction method to the text. Our experimental results on the TLS-Covid19 test collection demonstrate the potential of the proposed approach.
2022
Autores
Jatowt, A; Doucet, A; Campos, R;
Publicação
Companion of The Web Conference 2022, Virtual Event / Lyon, France, April 25 - 29, 2022
Abstract
Time expressions embedded in text are important for many downstream tasks in NLP and IR. They have been, for example, utilized for timeline summarization, named entity recognition, temporal information retrieval, question answering and others. In this paper, we introduce a novel analytical approach to analyzing characteristics of time expressions in diachronic text collections. Based on a collection of news articles published over a 33-years' long time span, we investigate several aspects of time expressions with a focus on their interplay with publication dates of containing documents. We utilize a graph-based representation of temporal expressions to represent them through their co-occurring named entities. The proposed approach results in several observations that could be utilized in automatic systems that rely on processing temporal signals embedded in text. It could be also of importance for professionals (e.g., historians) who wish to understand fluctuations in collective memories and collective expectations based on large-scale, diachronic document collections. © 2022 ACM.
2016
Autores
Alvarez, MM; Kruschwitz, U; Kazai, G; Hopfgartner, F; Corney, DPA; Campos, R; Albakour, D;
Publicação
SIGIR Forum
Abstract
2023
Autores
Litvak, M; Rabaev, I; Campos, R; Jorge, AM; Jatowt, A;
Publicação
SIGIR Forum
Abstract
2023
Autores
Mansouri, B; Campos, R;
Publicação
CoRR
Abstract
2023
Autores
Mansouri, B; Durgin, S; Franklin, S; Fletcher, S; Campos, R;
Publicação
Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), Thessaloniki, Greece, September 18th to 21st, 2023.
Abstract
This paper describes the participation of the Artificial Intelligence and Information Retrieval (AIIR) Lab from the University of Southern Maine and the Laboratory of Artificial Intelligence and Decision Support (LIAAD) lab from INESC TEC in the CLEF 2023 SimpleText lab. There are three tasks defined for SimpleText: (T1) What is in (or out)?, (T2) What is unclear?, and (T3) Rewrite this!. Five runs were submitted for Task 1 using traditional Information Retrieval, and Sentence-BERT models. For Task 2, three runs were submitted, using YAKE! and KBIR keyword extraction models. Finally, for Task 3, two models were deployed, one using OpenAI Davinci embeddings and the other combining two unsupervised simplification models.
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.