Publications

Publications by Alípio Jorge

2020

Sequence Mining for Automatic Generation of Software Tests from GUI Event Traces

Authors
Oliveira, A; Freitas, R; Jorge, A; Amorim, V; Moniz, N; Paiva, ACR; Azevedo, PJ;

Publication
IDEAL (2)

Abstract
In today’s software industry, systems are constantly changing. To maintain their quality and to prevent failures at controlled costs is a challenge. One way to foster quality is through thorough and systematic testing. Therefore, the definition of adequate tests is crucial for saving time, cost and effort. This paper presents a framework that generates software test cases automatically based on user interaction data. We propose a data-driven software test generation solution that combines the use of frequent sequence mining and Markov chain modeling. We assess the quality of the generated test cases by empirically evaluating their coverage with respect to observed user interactions and code. We also measure the plausibility of the distribution of the events in the generated test sets using the Kullback-Leibler divergence.


2020

Preface

Authors
Jorge, AM; Campos, R; Jatowt, A; Aizawa, A;

Publication
CEUR Workshop Proceedings


2020

Proceedings of AI4Narratives - Workshop on Artificial Intelligence for Narratives in conjunction with the 29th International Joint Conference on Artificial Intelligence and the 17th Pacific Rim International Conference on Artificial Intelligence (IJCAI 2020), Yokohama, Japan, January 7th and 8th, 2021 (online event due to Covid-19 outbreak)

Authors
Jorge, AM; Campos, R; Jatowt, A; Aizawa, A;

Publication
AI4Narratives@IJCAI


2021

Time-Matters: Temporal Unfolding of Texts

Authors
Campos, R; Duque, J; Cândido, T; Mendes, J; Dias, G; Jorge, A; Nunes, C;

Publication
ECIR (2)

Abstract
Over the past few years, the amount of information generated, consumed and stored on the Web has grown exponentially, making it impossible for users to keep up to date. Temporal data representation can help in this process by giving documents a sense of organization. Timelines are a natural way to showcase this data, giving users the chance to get familiar with a topic in a shorter amount of time. Despite their importance, little is known about their use in the context of single documents. In this paper, we present Time-Matters, a novel system to automatically explore arbitrary texts through temporal narratives in an interactive fashion that allows users to get insights into the relevant temporal happenings of a story through multiple components, including temporal annotation, storylines or temporal clustering. In contrast to classical timeline multi-document summarization tasks, we focus on performing text summaries of single documents with a temporal lens. This approach may be of interest to a number of providers such as media outlets, for which automatically building a condensed overview of a text is an important issue.


2021

TLS-Covid19: A New Annotated Corpus for Timeline Summarization

Authors
Pasquali, A; Campos, R; Ribeiro, A; Santana, BS; Jorge, A; Jatowt, A;

Publication
ECIR (1)

Abstract
The rise of social media and the explosion of digital news in the web sphere have created new challenges to extract knowledge and make sense of published information. Automated timeline generation appears in this context as a promising answer to help users dealing with this information overload problem. Formally, Timeline Summarization (TLS) can be defined as a subtask of Multi-Document Summarization (MDS) conceived to highlight the most important information during the development of a story over time by summarizing long-lasting events in a timely ordered fashion. As opposed to traditional MDS, TLS has a limited number of publicly available datasets. In this paper, we propose TLS-Covid19 dataset, a novel corpus for the Portuguese and English languages. Our aim is to provide a new, larger and multi-lingual TLS annotated dataset that could foster timeline summarization evaluation research and, at the same time, enable the study of news coverage about the COVID-19 pandemic. TLS-Covid19 consists of 178 curated topics related to the COVID-19 outbreak, with associated news articles covering almost the entire year of 2020 and their respective reference timelines as gold-standard. As a final outcome, we conduct an experimental study on the proposed dataset over two extreme baseline methods. All the resources are publicly available at https://github.com/LIAAD/tls-covid19.


2021

The 4th International Workshop on Narrative Extraction from Texts: Text2Story 2021

Authors
Campos, R; Jorge, A; Jatowt, A; Bhatia, S; Finlayson, MA;

Publication
ECIR (2)

Abstract
Narrative extraction, understanding and visualization is currently a popular topic and an important tool for humans interested in achieving a deeper understanding of text. Information Retrieval (IR), Natural Language Processing (NLP) and Machine Learning (ML) already offer many instruments that aid the exploration of narrative elements in text and within unstructured data. Despite evident advances in the last couple of years the problem of automatically representing narratives in a structured form, beyond the conventional identification of common events, entities and their relationships, is yet to be solved. This workshop held virtually on April 1st, 2021 co-located with the 43rd European Conference on Information Retrieval (ECIR’21) aims at presenting and discussing current and future directions for IR, NLP, ML and other computational fields capable of improving the automatic understanding of narratives. It includes a session devoted to regular, short and demo papers, keynote talks and space for an informal discussion of the methods, of the challenges and of the future of the area.
