Publications

2023

NLP-Crowdsourcing Hybrid Framework for Inter-Researcher Similarity Detection

Authors
Correia, A; Guimaraes, D; Paredes, H; Fonseca, B; Paulino, D; Trigo, L; Brazdil, P; Schneider, D; Grover, A; Jameel, S;

Publication
IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS

Abstract
Visualizing and examining the intellectual landscape and evolution of scientific communities to support collaboration is crucial for multiple research purposes. In some cases, measuring similarities and matching patterns between sets of research publications can help to identify people with similar interests for building research collaboration networks and university-industry linkages. The premise of this work is to assess the feasibility of resolving ambiguous cases in similarity detection to determine authorship with natural language processing (NLP) techniques, so that crowdsourcing is applied only in instances that require human judgment. Using an NLP-crowdsourcing convergence strategy, we can reduce the costs of microtask crowdsourcing while saving time and maintaining disambiguation accuracy over large datasets. This article contributes a next-generation crowd-artificial intelligence framework that uses an ensemble of term frequency-inverse document frequency (TF-IDF) and bidirectional encoder representations from transformers (BERT) to obtain similarity rankings for pairs of scientific documents. A sequence of content-based similarity tasks was created using a crowd-powered interface for solving disambiguation problems. Our experimental results suggest that an adaptive NLP-crowdsourcing hybrid framework has advantages for inter-researcher similarity detection tasks where fully automatic algorithms provide unsatisfactory results, with the goal of helping researchers discover potential collaborators using data-driven approaches.

2023

Market integration analysis of heat recovery under the EMB3Rs platform: An industrial park case in Greece

Authors
Faria, AS; Soares, T; Goumas, G; Abotzios, A; Cunha, JM; Silva, M;

Publication
2023 OPEN SOURCE MODELLING AND SIMULATION OF ENERGY SYSTEMS, OSMSES

Abstract
This work presents a thorough study of a district heating scenario in a Greek industrial park. The work is supported by the EMB3Rs open-source platform, which allows a feasibility analysis of the system to be performed. In particular, this work explores the market module of the platform to provide a detailed market analysis of energy exchange within the Greek industrial park. The results demonstrate the effectiveness of the platform in simulating different market designs, such as centralized and decentralized ones, and make clear the potential benefit that the sources in the test case may achieve by engaging in a market framework. Different options for market clearing are considered in the study, for instance, including CO2 signals to reach carbon neutrality or community preferences to increase community autonomy. One can conclude that excess heat from existing sources is enough to cover the heat demand of other industries and facilities, leading to environmental benefits as well as a fairer allocation of financial profits.

2023

Temporal variability of gamma radiation and aerosol concentration over the North Atlantic ocean

Authors
Dias, N; Amaral, G; Almeida, C; Ferreira, A; Camilo, A; Silva, E; Barbosa, S;

Publication

Abstract
Gamma radiation measured over the ocean is mainly due to airborne radionuclides, as gamma emission by radon degassing from the ocean is negligible. Airborne gamma-emitting elements include radon progeny (Pb-214, Bi-214, Pb-210) and cosmogenic radionuclides such as Be-7. Radon progeny attaches readily to aerosols, thus the fate of gamma-emitting radon progeny, after its formation by radioactive decay from radon, is expected to be closely linked to that of aerosols.

Gamma radiation measurements over the Atlantic Ocean were made on board the ship-rigged sailing ship NRP Sagres in the framework of project SAIL (Space-Atmosphere-Ocean Interactions in the marine boundary Layer). The measurements were performed continuously with a NaI(Tl) scintillator counting all gamma rays from 475 keV to 3 MeV.

The counts from the sensor were recorded every second by a computer system whose time reference was corrected by a GNSS pulse-per-second (PPS) signal. The GNSS was also used to precisely position the ship. The measurements were performed over the Atlantic Ocean from January to May 2020, along the ship's round trip Lisboa – Cape Verde – Rio de Janeiro – Buenos Aires – Cape Town – Cape Verde – Lisboa.

The results show that the gamma radiation time series displays considerably higher counts and larger variability in January compared to the remaining period. Reanalysis data also indicate higher aerosol concentration. This work investigates in detail the association between the temporal evolution of the gamma radiation measurements obtained from the SAIL campaign over the Atlantic Ocean and co-located total aerosol concentration at 550 nm obtained every 3 hours from EAC4 (ECMWF Atmospheric Composition Reanalysis 4) data.

2023

A Biomedical Entity Extraction Pipeline for Oncology Health Records in Portuguese

Authors
Sousa, H; Pasquali, A; Jorge, A; Santos, CS; Lopes, MA;

Publication
38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023

Abstract
Textual health records of cancer patients are usually protracted and highly unstructured, making it very time-consuming for health professionals to get a complete overview of the patient's therapeutic course. As such limitations can lead to suboptimal and/or inefficient treatment procedures, healthcare providers would greatly benefit from a system that effectively summarizes the information in those records. With the advent of deep neural models, this objective has been partially attained for English clinical texts; however, the research community still lacks an effective solution for languages with limited resources. In this paper, we present the approach we developed to extract procedures, drugs, and diseases from oncology health records written in European Portuguese. This project was conducted in collaboration with the Portuguese Institute for Oncology which, besides holding over 10 years of duly protected medical records, also provided oncologist expertise throughout the development of the project. Since there is no annotated corpus for biomedical entity extraction in Portuguese, we also present the strategy we followed in annotating the corpus for the development of the models. The final models, which combined a neural architecture with entity linking, achieved F1 scores of 88.6, 95.0, and 55.8 per cent in the mention extraction of procedures, drugs, and diseases, respectively.

2023

Microservices Refactoring Tools - Paper Appendix

Authors
Fritzsch, J; Correia, FF; Bogner, J; Wagner, S;

Publication

Abstract

2023

Dynamic Management of Distributed Machine Learning Projects

Authors
Oliveira, F; Alves, A; Moço, H; Monteiro, J; Oliveira, O; Carneiro, D; Novais, P;

Publication
INTELLIGENT DISTRIBUTED COMPUTING XV, IDC 2022

Abstract
Given the new requirements of Machine Learning problems in recent years, especially regarding the volume, diversity and speed of data, new approaches are needed to deal with the associated challenges. In this paper we describe CEDEs, a distributed learning system that runs on top of a Hadoop cluster and takes advantage of blocks, replication and balancing. CEDEs trains models in a distributed manner following the principle of data locality, and is able to change parts of the model through an optimization module, thus allowing a model to evolve over time as the data changes. This paper describes its generic architecture, details the implementation of the first modules, and provides a first validation.
