Publications

Publications by HumanISE

2023

Optimization of Image Processing Algorithms for Character Recognition in Cultural Typewritten Documents

Authors
Dias, M; Lopes, CT;

Publication
ACM JOURNAL ON COMPUTING AND CULTURAL HERITAGE

Abstract
Linked data is used in various fields as a new way of structuring and connecting data. Cultural heritage institutions have been using linked data to improve archival descriptions and facilitate the discovery of information. Most archival records have digital representations of physical artifacts in the form of scanned images that are non-machine-readable. Optical Character Recognition (OCR) recognizes text in images and translates it into machine-encoded text. This article evaluates the impact of image processing methods and parameter tuning in OCR applied to typewritten cultural heritage documents. The approach uses a multi-objective problem formulation to minimize Levenshtein edit distance and maximize the number of words correctly identified with a non-dominated sorting genetic algorithm (NSGA-II) to tune the methods' parameters. Evaluation results show that parameterization by digital representation typology benefits the performance of image pre-processing algorithms in OCR. Furthermore, our findings suggest that employing image pre-processing algorithms in OCR might be more suitable for typologies where the text recognition task without pre-processing does not produce good results. In particular, Adaptive Thresholding, Bilateral Filter, and Opening are the best-performing algorithms for the theater plays' covers, letters, and overall dataset, respectively, and should be applied before OCR to improve its performance.

2023

Unveiling Archive Users: Understanding Their Characteristics and Motivations

Authors
Ponte, L; Koch, I; Lopes, CT;

Publication
LEVERAGING GENERATIVE INTELLIGENCE IN DIGITAL LIBRARIES: TOWARDS HUMAN-MACHINE COLLABORATION, ICADL 2023, PT II

Abstract
An institution must understand its users to provide quality services, and archives are no exception. Over the years, archives have adapted to the technological world, and their users have also changed. To understand archive users' characteristics and motivations, we conducted a study in the context of the Portuguese Archives. For this purpose, we analysed a survey and complemented this analysis with information gathered in interviews with archivists. Based on the most frequent reasons for visiting the archives, we defined six main archival profiles (genealogical research, historical research, legal purposes, academic work, institutional purposes and publication purposes), later characterised using the results of the previous analysis. For each profile, we created a persona for a more visual and realistic representation of users.

2023

Linking Theory and Practice of Digital Libraries: 27th International Conference on Theory and Practice of Digital Libraries, TPDL 2023, Zadar, Croatia, September 26-29, 2023, Proceedings

Authors
Alonso, O; Cousijn, H; Silvello, G; Marrero, M; Lopes, CT; Marchesin, S;

Publication
TPDL

Abstract

2023

Linking Theory and Practice of Digital Libraries

Authors
Alonso, O; Cousijn, H; Silvello, G; Marrero, M; Teixeira Lopes, C; Marchesin, S;

Publication
Lecture Notes in Computer Science

Abstract

2023

Chatbots Scenarios for Education

Authors
Virkus, S; Mamede, HS; Ramos Rocio, VJ; Dickel, J; Zubikova, O; Butkiene, R; Vaiciukynas, E; Ceponiene, L; Gudoniene, D;

Publication
Information and Software Technologies - 29th International Conference, ICIST 2023, Kaunas, Lithuania, October 12-14, 2023, Proceedings

Abstract
Educational chatbots are digital tools designed to assist learners in various educational settings. These chatbots use natural language processing (NLP) and machine learning algorithms to simulate human conversation and respond to user queries in a way that facilitates learning. They can be integrated into various educational platforms such as learning management systems, educational apps, and websites to provide learners with a personalized and interactive learning experience. Our paper discusses different chatbot scenarios for educational purposes and proposes four scenarios in total to address educational needs.

2023

New resource-constrained project scheduling instances for testing (meta-)heuristic scheduling algorithms

Authors
Coelho, J; Vanhoucke, M;

Publication
COMPUTERS & OPERATIONS RESEARCH

Abstract
The resource-constrained project scheduling problem (RCPSP) is a well-known scheduling problem that has attracted attention for several decades. Despite the rapid progress of exact and (meta-)heuristic procedures, the problem still cannot be solved to optimality for many problem instances of relatively small size. Due to the known complexity, many researchers have proposed fast and efficient meta-heuristic solution procedures that can solve the problem to near optimality. Despite the excellent results obtained in the last decades, little is known about why some heuristics perform better than others. However, if researchers better understood why some meta-heuristic procedures generate good solutions for some project instances while still falling short for others, this could lead to insights to improve these meta-heuristics, ultimately leading to stronger algorithms and better overall solution quality. In this study, a new hardness indicator is proposed to measure the difficulty of providing near-optimal solutions for meta-heuristic procedures. The new indicator is based on a new concept that uses the o-distance metric to describe the solution space of the problem instance, and relies on current knowledge for lower and upper bound calculations for problem instances from five known datasets in the literature. This new indicator, which will be called the o-D indicator, will be used not only to measure the hardness of existing project datasets, but also to generate a new benchmark dataset that can be used for future research purposes. The new dataset contains project instances with different values for the o-D indicator, and it will be shown that the value of the o-distance metric actually describes the difficulty of the project instances through two fast and efficient meta-heuristic procedures from the literature.
