Publications

Publications by LIAAD

2022

Privacy-Preserving Machine Learning in Life Insurance Risk Prediction

Authors
Pereira, K; Vinagre, J; Alonso, AN; Coelho, F; Carvalho, M;

Publication
Machine Learning and Principles and Practice of Knowledge Discovery in Databases - International Workshops of ECML PKDD 2022, Grenoble, France, September 19-23, 2022, Proceedings, Part II

Abstract
The application of machine learning to insurance risk prediction requires learning from sensitive data. This raises multiple ethical and legal issues, one of the most relevant being privacy. However, privacy-preserving methods can potentially hinder the predictive potential of machine learning models. In this paper, we present preliminary experiments with life insurance data using two privacy-preserving techniques: discretization and encryption. Our objective with this work is to assess the impact of such privacy preservation techniques on the accuracy of ML models. We instantiate the problem in three general but plausible use cases involving the prediction of insurance claims within a 1-year horizon. Our preliminary experiments suggest that discretization and encryption have negligible impact on the accuracy of ML models. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

2022

Detection of Loanwords in Angolan Portuguese: A Text Mining Approach

Authors
Muhongo, TS; Brazdil, PB; Silva, F;

Publication
INTELIGENCIA ARTIFICIAL-IBEROAMERICAN JOURNAL OF ARTIFICIAL INTELLIGENCE

Abstract
Angola is characterized by many different languages and social, cultural and political realities, which have had a marked effect on Angolan Portuguese (AP). Consequently, AP is characterized by diatopic variation. One of the marked effects is the loanwords imported from other Angolan languages. Our objective is to analyze different Angolan texts, examine the lexical forms used and conduct a comparative study with European Portuguese, aiming at identifying the possible loanwords in Angolan Portuguese. This process was automated, as was the identification of all loanwords' cotexts. In addition, we determine the lexical class of each loanword and the Angolan language of its origin. Most lexical loanwords come from Kimbundu, although AP includes loanwords from some other Angolan languages too. Our study serves as a basis for preparing an Angolan regionalism dictionary. We noticed that more than 700 identified loanwords do not figure in the existing dictionaries.

2022

Semi-Automatic Approaches for Exploiting Shifter Patterns in Domain-Specific Sentiment Analysis

Authors
Brazdil, P; Muhammad, SH; Oliveira, F; Cordeiro, J; Silva, F; Silvano, P; Leal, A;

Publication
MATHEMATICS

Abstract
This paper describes two different approaches to sentiment analysis. The first is a form of symbolic approach that exploits a sentiment lexicon together with a set of shifter patterns and rules. The sentiment lexicon includes single words (unigrams) and is developed automatically by exploiting labeled examples. The shifter patterns include intensification, attenuation/downtoning and inversion/reversal and are developed manually. The second approach exploits a deep neural network, which uses a pre-trained language model. Both approaches were applied to texts in the economics and finance domains from newspapers in European Portuguese. We show that the symbolic approach achieves virtually the same performance as the deep neural network. In addition, the symbolic approach provides understandable explanations, and the acquired knowledge can be communicated to others. We release the shifter patterns to motivate future research in this direction.

2022

Advances in Metalearning: ECML/PKDD Workshop on Meta-Knowledge Transfer

Authors
Brazdil, P; van Rijn, JN; Gouk, H; Mohr, F;

Publication
ECML/PKDD Workshop on Meta-Knowledge Transfer, 23 September 2022, Grenoble, France

Abstract

2022

ECML/PKDD Workshop on Meta-Knowledge Transfer, 23 September 2022, Grenoble, France

Authors
Brazdil, P; van Rijn, JN; Gouk, H; Mohr, F;

Publication
Meta-Knowledge Transfer @ ECML/PKDD

Abstract

2022

Assessing clinical applicability of COVID-19 detection in chest radiography with deep learning

Authors
Pedrosa, J; Aresta, G; Ferreira, C; Carvalho, C; Silva, J; Sousa, P; Ribeiro, L; Mendonca, AM; Campilho, A;

Publication
SCIENTIFIC REPORTS

Abstract
The coronavirus disease 2019 (COVID-19) pandemic has impacted healthcare systems across the world. Chest radiography (CXR) can be used as a complementary method for diagnosing/following COVID-19 patients. However, experience level and workload of technicians and radiologists may affect the decision process. Recent studies suggest that deep learning can be used to assess CXRs, providing an important second opinion for radiologists and technicians in the decision process, and super-human performance in detection of COVID-19 has been reported in multiple studies. In this study, the clinical applicability of deep learning systems for COVID-19 screening was assessed by testing the performance of deep learning systems for the detection of COVID-19. Specifically, four datasets were used: (1) a collection of multiple public datasets (284,793 CXRs); (2) the BIMCV dataset (16,631 CXRs); (3) COVIDGR (852 CXRs); and (4) a private dataset (6,361 CXRs). All datasets were collected retrospectively and consist of only frontal CXR views. A ResNet-18 was trained on each of the datasets for the detection of COVID-19. It is shown that a high dataset bias was present, leading to high performance in intradataset train-test scenarios (area under the curve > 0.98). However, significantly lower performance was obtained in interdataset train-test scenarios (area under the curve 0.55-0.84 on the collection of public datasets). A subset of the data was then assessed by radiologists for comparison to the automatic systems. Finetuning with radiologist annotations significantly increased performance across datasets (area under the curve 0.61-0.88) and improved the attention on clinical findings in positive COVID-19 CXRs. Nevertheless, tests on CXRs from different hospital services indicate that the screening performance of CXR and automatic systems is limited (area under the curve < 0.6 on emergency service CXRs). However, COVID-19 manifestations can be accurately detected when present, motivating the use of these tools for evaluating disease progression in mild to severe COVID-19 patients.
