Publications

Publications by LIAAD

2023

AIIR and LIAAD Labs Systems for CLEF 2023 SimpleText

Authors
Mansouri, B; Durgin, S; Franklin, S; Fletcher, S; Campos, R;

Publication
CLEF (Working Notes)

Abstract
This paper describes the participation of the Artificial Intelligence and Information Retrieval (AIIR) Lab from the University of Southern Maine and the Laboratory of Artificial Intelligence and Decision Support (LIAAD) from INESC TEC in the CLEF 2023 SimpleText lab. There are three tasks defined for SimpleText: (T1) What is in (or out)?, (T2) What is unclear?, and (T3) Rewrite this!. Five runs were submitted for Task 1 using traditional Information Retrieval and Sentence-BERT models. For Task 2, three runs were submitted, using YAKE! and KBIR keyword extraction models. Finally, for Task 3, two models were deployed, one using OpenAI Davinci embeddings and the other combining two unsupervised simplification models.


2023

Contrastive Keyword Extraction from Versioned Documents

Authors
Eder, L; Campos, R; Jatowt, A;

Publication
PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023

Abstract
Versioned documents are common in many situations and play a vital part in numerous applications enabling an overview of the revisions made to a document or document collection. However, as documents increase in size, it gets difficult to summarize and comprehend all the changes made to versioned documents. In this paper, we propose a novel research problem of contrastive keyword extraction from versioned documents, and introduce an unsupervised approach that extracts keywords to reflect the key changes made to an earlier document version. In order to provide an easy-to-use comparison and summarization tool, an open-source demonstration is made available which can be found at https://contrastive-keyword-extraction.streamlit.app/.


2023

Preface

Authors
Litvak, M; Rabaev, I; Campos, R; Jorge, M; Jatowt, A;

Publication
CEUR Workshop Proceedings

Abstract
[No abstract available]


2023

Public News Archive: A Searchable Sub-archive to Portuguese Past News Articles

Authors
Campos, R; Correia, D; Jatowt, A;

Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III

Abstract
Over the past few decades, the amount of information generated turned the Web into the largest knowledge infrastructure existing to date. Web archives have been at the forefront of data preservation, preventing the loss of data significant to humankind. Different snapshots of the web are saved every day, enabling users to surf the past web and to travel through it over time. Despite these efforts, many people are not aware that the web is being preserved, often finding these infrastructures to be unattractive or difficult to use when compared to common search engines. In this paper, we take a step towards making use of this preserved information to develop Public Archive, an intuitive interface that enables end-users to search and analyze a large-scale collection of 67,242 past preserved news articles belonging to a Portuguese reference newspaper (Jornal Publico). The collection was obtained by scraping 10,976 versions of the homepage of the Jornal Publico preserved by the Portuguese web archive infrastructure (Arquivo.pt) during the period from 2010 to 2021. By doing this, we aim not only to take a stand on making use of this preserved information, but also to come up with an easy-to-follow solution, the Public Archive python package, which lays the groundwork to be used (with minor adaptations) by other news source providers interested in offering their readers access to past news articles.


2023

The selection of an optimal segmentation region in physiological signals

Authors
Oliveira, J; Carvalho, M; Nogueira, D; Coimbra, M;

Publication
INTERNATIONAL TRANSACTIONS IN OPERATIONAL RESEARCH

Abstract
Physiological signals are often corrupted by noisy sources. Usually, artificial intelligence algorithms analyze the whole signal, regardless of its varying quality. Instead, experienced cardiologists search for a high-quality signal segment, where more accurate conclusions can be drawn. We propose a methodology that simultaneously selects the optimal processing region of a physiological signal and determines its decoding into a state sequence of physiologically meaningful events. Our approach comprises two phases. First, the training of a neural network that then enables the estimation of the state probability distribution of a signal sample. Second, the use of the neural network output within an integer program. The latter models the problem of finding a time window by maximizing a likelihood function defined by the user. Our method was tested and validated in two types of signals, the phonocardiogram and the electrocardiogram. In phonocardiogram and electrocardiogram segmentation tasks, the system's sensitivity increased on average from 95.1% to 97.5% and from 78.9% to 83.8%, respectively, when compared to standard approaches found in the literature.
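
The window-selection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the integer program is replaced here by an exhaustive sliding-window search over fixed-length windows, and the likelihood (the sum of the log-probability of the most likely state at each sample) is an assumed stand-in for the user-defined function. The name `best_window` and the toy `probs` matrix are hypothetical.

```python
import math

def best_window(state_probs, window_len):
    """Return (start, score) of the fixed-length window with the highest score.

    state_probs: one probability distribution over states per signal sample
    (a stand-in for the neural network output described in the abstract).
    Score: sum over the window of log(max state probability), an assumed
    likelihood chosen only for illustration.
    """
    if not 0 < window_len <= len(state_probs):
        raise ValueError("window_len must be between 1 and len(state_probs)")
    # Per-sample log-confidence; the small floor avoids log(0).
    scores = [math.log(max(max(p), 1e-12)) for p in state_probs]
    # Slide the window one sample at a time, updating the running sum.
    current = sum(scores[:window_len])
    best_start, best_score = 0, current
    for start in range(1, len(scores) - window_len + 1):
        current += scores[start + window_len - 1] - scores[start - 1]
        if current > best_score:
            best_start, best_score = start, current
    return best_start, best_score

# Toy two-state example: samples 2 to 4 carry the most confident predictions.
probs = [
    [0.5, 0.5],
    [0.6, 0.4],
    [0.9, 0.1],
    [0.1, 0.9],
    [0.95, 0.05],
    [0.55, 0.45],
]
start, score = best_window(probs, 3)
```

On this toy input the search selects the window starting at sample index 2, the region where the assumed network output is least ambiguous.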


2023

Feature Importances as a Tool for Root Cause Analysis in Time-Series Events

Authors
Kuk, M; Bobek, S; Veloso, B; Rajaoarisoa, LH; Nalepa, GJ;

Publication
ICCS (5)

Abstract
In an industrial setting, predicting the remaining useful lifetime of equipment and systems is crucial for ensuring efficient operation, reducing downtime, and prolonging the life of costly assets. There are state-of-the-art machine learning methods supporting this task. However, in this paper, we argue that both efficiency and understandability can be improved by the use of explainable AI methods that analyze the importance of features used by the machine learning model. In the paper, we analyze the feature importance before a failure occurs to identify events in which an increase in importance can be observed and, based on that, indicate the attributes with the most influence on the failure. We demonstrate how the analysis of SHAP values near the occurrence of failures can help identify the specific features that led to the failure. This in turn can help in identifying the root cause of the problem and developing strategies to prevent future failures. Additionally, it can be used to identify areas where maintenance or replacement is needed to prevent failure and prolong the useful life of a system.
