2023
Authors
Martins AC.; Correia, Flora; Bruno M P M Oliveira;
Publication
Abstract
2023
Authors
Nakamura, Ingrid; Oliveira, Andreia.; Warkentin, Sarah.; Bruno M P M Oliveira; Poínhos, Rui;
Publication
Abstract
2023
Authors
Nova, Lúcia; Poínhos, Rui; Bruno M P M Oliveira; Rocha, Ada; Afonso, Cláudia;
Publication
Abstract
2023
Authors
Lucas, A.; Sacchetti, Francisca; Silva, Sara; Poínhos, Rui; Bruno M P M Oliveira; Rocha, Ada; Afonso, Cláudia;
Publication
Abstract
2023
Authors
Castro Maria; Bruno M P M Oliveira; Afonso, Cláudia;
Publication
Abstract
2023
Authors
Campos, R; Correia, D; Jatowt, A;
Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III
Abstract
Over the past few decades, the amount of information generated has turned the Web into the largest knowledge infrastructure existing to date. Web archives have been at the forefront of data preservation, preventing the loss of data significant to humankind. Different snapshots of the web are saved every day, enabling users to surf the past web and to travel through it over time. Despite these efforts, many people are not aware that the web is being preserved, and often find these infrastructures unattractive or difficult to use when compared to common search engines. In this paper, we take a step towards making use of this preserved information by developing Public Archive, an intuitive interface that enables end-users to search and analyze a large-scale collection of 67,242 preserved past news articles from a Portuguese reference newspaper (Jornal Publico). The collection was obtained by scraping 10,976 versions of the homepage of Jornal Publico preserved by the Portuguese web archive infrastructure (Arquivo.pt) between 2010 and 2021. By doing this, we aim not only to take a stand on making use of this preserved information, but also to offer an easy-to-follow solution, the Public Archive Python package, which lays the groundwork to be used (with minor adaptations) by other news source providers interested in offering their readers access to past news articles.