2022
Authors
Rodrigues, J; Lopes, CT;
Publication
RESEARCH CHALLENGES IN INFORMATION SCIENCE
Abstract
Research data management (RDM) practices are critical for ensuring research success. Data can assume diverse formats, and data in image format have been understudied in RDM. To understand image management habits in research, we conducted semi-structured interviews with researchers from four research domains. Most researchers do not formally manage their images, nor do they develop RDM plans. They acknowledge that image management is not a topic discussed at project meetings. Instead, they tend to adopt individual practices, depending on the context and their own opinion, such as creating captions to describe the images and organizing and storing the images in specific locations. However, they see image management habits as necessary and admit that they will start to adopt them in a formal and collaborative way within their working group. These results provide valuable information on practical aspects of the use and production of images in research.
2022
Authors
Lopes, CT; Azevedo, D; Monteiro, JM;
Publication
2022 17TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI)
Abstract
A patient's ability to recall and retrieve health information contributes to better health management. HealthTalks was developed to address these issues by recording a summary of a medical appointment, uttered by the physician, and transcribing it. For each appointment, the user can also take free-text notes. Search engines have become a ubiquitous part of everyone's life and are expected in most applications. Here, we describe the development of a search engine for HealthTalks. The app's characteristics demand a lightweight, offline engine, which requires a specific solution rather than an existing library or service. Our approach combines SQLite's Full-Text Search 4 module, which includes ngram indexing, with traditional information retrieval techniques to rank the documents. For an initial evaluation of the search engine, we created a test collection with summaries of clinical appointments (our documents), information needs, search queries, and relevance assessments. Using this test collection, we assessed performance using NDCG@10, the rank position of the first totally relevant result, and query latency. Results are promising, with an average NDCG of 0.97. The median rank position of the first relevant result varies between 1.9 and 1.95, depending on the use of 4-gram character tokenization, an aspect that did not significantly affect the results. We expect this work to be useful for future developments of full-text search engines in mobile environments.
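The two ranking measures named in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the graded relevance scale (0 = not relevant, 1 = partially relevant, 2 = totally relevant), and the choice of the log2 discount are assumptions made here for illustration only.

```python
import math

def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the top-k results (log2 discount)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(ranked_gains, judged_gains=None, k=10):
    """NDCG@k for one query.

    ranked_gains: relevance grades of the results, in ranked order.
    judged_gains: grades of all judged documents for the query; if omitted,
    the ideal ranking is built from the ranked list itself (an assumption).
    """
    pool = ranked_gains if judged_gains is None else judged_gains
    ideal = dcg_at_k(sorted(pool, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0

def first_relevant_rank(ranked_gains, top_grade=2):
    """1-based rank of the first result with the top grade, or None.

    top_grade=2 is the hypothetical 'totally relevant' grade assumed here.
    """
    for rank, gain in enumerate(ranked_gains, start=1):
        if gain >= top_grade:
            return rank
    return None

# Example: grades of the results returned for one hypothetical query.
grades = [1, 2, 0, 2]
score = ndcg_at_k(grades)
rank = first_relevant_rank(grades)
```

Averaging `ndcg_at_k` over all queries in a test collection gives a collection-level figure comparable in kind to the average NDCG reported above; the exact gain and discount functions used in the paper are not stated in the abstract.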
2022
Authors
Rodrigues, J; Lopes, CT;
Publication
LINKING THEORY AND PRACTICE OF DIGITAL LIBRARIES (TPDL 2022)
Abstract
Research data management is an essential process in scientific research activities. It includes monitoring data from the moment it is created until it is deposited in a repository so that it can later be accessed and reused by others. Sharing and reuse are the last steps in this process. It is essential to ensure that the data stored in digital repositories is well preserved in the long term and that its adequate interpretation and future reuse are guaranteed. Following this debate, questions arise related to the interoperability of systems and the suitability of platforms. In this study, we examine how data management platforms can solve the problems associated with description, preservation, and access in digital media, making their usefulness evident. We identify some of the most relevant repository platforms in the scope of research data management, offering the scientific community an aggregated view of the various solutions and their main characteristics, with the aim of supporting a better understanding of these platforms and an appropriate choice among them.
2022
Authors
Dias, M; Lopes, CT;
Publication
Proceedings of the 26th International Conference on Theory and Practice of Digital Libraries - Workshops and Doctoral Consortium, Padua, Italy, September 20, 2022.
Abstract
Linked Data is used in various fields as a new way of structuring and connecting data. Cultural heritage institutions have been using linked data to improve archival descriptions and promote findability. The detail required in manual descriptions of cultural heritage objects can make the task taxing and time-consuming. Given this, in EPISA, a research project on this topic, we propose to use the contents of the digital representations associated with the objects to assist archivists in their description tasks. More specifically, we aim to extract from the digital representations information useful for an initial ontology population, which should then be validated or edited by the archivist. In an initial stage, we apply optical character recognition to convert the digital representation to a machine-readable format. We then use ontology-oriented programming to identify and instantiate ontology concepts using neural networks and contextual embeddings. © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)
2022
Authors
Rodrigues, J; Teixeira Lopes, C;
Publication
Journal of Library Metadata
Abstract
Research data management (RDM) involves people with different needs, specific scientific contexts, and diverse requirements. Description is a major challenge in the domain of RDM. Metadata plays an essential role: it allows the inclusion of information needed to interpret the data and enhances the reuse and preservation of data. The establishment of metadata models can facilitate the process of description and contribute to an improvement in the quality of metadata. For image data, the task is even more difficult, as there are no explicit recommendations to guide image management. In this work, we present a proposal for a metadata model for image description. To validate the model, we conducted a data description experiment in which eleven participants described images from their research projects using the proposed metadata model. The experiment shows that participants do not have formal practices for describing their imagery data. Yet, they provided valuable contributions and recommendations for the final definition of a metadata model for image description, which did not previously exist. We also developed controlled vocabularies for some descriptors. These vocabularies aim to improve the image description process, facilitate metadata model interpretation, and reduce the time and effort devoted to data description. © 2022 Joana Rodrigues and Carla Teixeira Lopes. Published with license by Taylor & Francis Group, LLC.
2022
Authors
Lopes, CT;
Publication
CoRR
Abstract