Publications - INESC TEC

Publications by Miriam Seoane Santos

2018

Missing data imputation via denoising autoencoders: The untold story

Authors
Costa, AF; Santos, MS; Soares, JP; Abreu, PH

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Missing data is the lack of information in a dataset and, since it directly influences classification performance, neglecting it is not a valid option. Over the years, several studies have presented alternative imputation strategies to deal with the three missing data mechanisms: Missing Completely At Random, Missing At Random and Missing Not At Random. However, there are no studies regarding the influence of all three mechanisms on the latest high-performance Artificial Intelligence techniques, such as Deep Learning. The goal of this work is to perform a comparison study between state-of-the-art imputation techniques and a Stacked Denoising Autoencoders approach. To that end, the missing data mechanisms were synthetically generated in 6 different ways, 8 different imputation techniques were implemented, and 33 complete datasets from different open source repositories were selected. The results showed that Support Vector Machines imputation ensures the best classification performance, while Multiple Imputation by Chained Equations performs better in terms of imputation quality. © Springer Nature Switzerland AG 2018.
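The abstract does not say how the six generation schemes were built, so the following is only a minimal sketch of the three mechanisms it names, not the authors' procedure. The function `make_missing`, its parameters and the toy `rows` are hypothetical names chosen for illustration.

```python
import random

def make_missing(rows, target, mechanism, rate=0.3, driver=None, seed=0):
    """Return a copy of `rows` with None written into column `target`.

    MCAR: a fixed fraction of rows is chosen uniformly at random.
    MAR:  rows with the largest values in another, fully observed
          column (`driver`) lose their `target` value.
    MNAR: rows with the largest values of `target` itself lose it.
    These are illustrative assumptions, not the schemes used in the paper.
    """
    rng = random.Random(seed)
    out = [list(r) for r in rows]
    indices = range(len(rows))
    n_missing = round(rate * len(rows))
    if mechanism == "MCAR":
        chosen = rng.sample(indices, n_missing)
    elif mechanism == "MAR":
        chosen = sorted(indices, key=lambda i: rows[i][driver], reverse=True)[:n_missing]
    elif mechanism == "MNAR":
        chosen = sorted(indices, key=lambda i: rows[i][target], reverse=True)[:n_missing]
    else:
        raise ValueError("mechanism must be 'MCAR', 'MAR' or 'MNAR'")
    for i in chosen:
        out[i][target] = None
    return out

# Toy data: column 0 rises from 0 to 9, column 1 falls from 10 to 1.
rows = [[i, 10 - i] for i in range(10)]
mcar = make_missing(rows, target=0, mechanism="MCAR")
mar = make_missing(rows, target=0, mechanism="MAR", driver=1)
mnar = make_missing(rows, target=0, mechanism="MNAR")
```

In the MAR case the missingness depends only on the observed column 1, whereas in the MNAR case it depends on the values that end up hidden, which is what makes that mechanism harder to impute.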

2020

Bone scintigraphy and PET-CT: A necessary alliance for bone metastasis detection in breast cancer?

Authors
Santos, JC; Abreu, MH; Santos, MS; Duarte, H; Alpoim, T; Sousa, S; Abreu, PH

Publication
Journal of Clinical Oncology

Abstract
e13070
Background: Bone is one of the main sites of breast cancer metastasis. In locally advanced cases, this kind of disease spread can be staged with PET-CT in conjunction with bone scintigraphy. The purpose of this work is to compare the efficiency of PET-CT and bone scintigraphy in detecting bone metastasis.
Methods: Prospective analysis of locally advanced breast cancer patients treated in a Comprehensive Cancer Center between 2014 and 2019 who underwent PET-CT and bone scintigraphy during staging. The interval between the two exams could not exceed 2 months. Clinical and pathological characteristics of the disease were collected from electronic files, and the clinical imaging reports were considered independently to evaluate the ability of each imaging modality to identify bone disease. In cases of discrepancy, two independent nuclear physicians re-analyzed the images to validate the findings.
Results: We analyzed 204 cases. The majority had ductal carcinomas (72.5%), cT2/3 (70%), cN1/2 (61.8%) and G2/3 (94.6%), luminal B-like, HER2 positive disease (49.2%). In this cohort, bone metastasis was documented in 52 (25.5%) patients. PET-CT presented an accuracy of 97.0%, surpassing the 94.1% of bone scintigraphy. The latter failed to correctly detect bone metastasis in 11 (5.4%) patients and only outperformed PET-CT in 3 (1.5%) patients. The main difference between the two modalities was the non-detection of cranium lesions in PET-CT images.
Conclusions: PET-CT showed higher efficiency in bone metastasis detection than bone scintigraphy, probably because it detects lytic lesions. The non-detection of cranium lesions can be harmful, so modifications in image acquisition are required to improve the quality of PET-CT and avoid other exams in bone staging.
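As a quick consistency check, the patient counts quoted in the results can be converted back to percentages of the 204-case cohort. This is only a sketch: the helper `pct` and the dictionary labels are hypothetical, and the counts are the ones stated in the abstract.

```python
TOTAL_CASES = 204  # cohort size stated in the abstract

def pct(count, total=TOTAL_CASES):
    """Percentage of the cohort, rounded to one decimal place."""
    return round(100 * count / total, 1)

# (count, percentage as stated in the abstract)
reported = {
    "bone metastasis documented": (52, 25.5),
    "missed by bone scintigraphy": (11, 5.4),
    "bone scintigraphy outperformed PET-CT": (3, 1.5),
}

for label, (count, stated) in reported.items():
    print(f"{label}: {count}/{TOTAL_CASES} = {pct(count)}% (stated {stated}%)")
```

All three recomputed values agree with the stated percentages to one decimal place.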

2015

A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients

Authors
Santos, MS; Abreu, PH; Garcia Laencina, PJ; Simao, A; Carvalho, A

Publication
Journal of Biomedical Informatics

Abstract
Liver cancer is the sixth most frequently diagnosed cancer and, in particular, Hepatocellular Carcinoma (HCC) represents more than 90% of primary liver cancers. Clinicians assess each patient's treatment on the basis of evidence-based medicine, which may not always apply to a specific patient, given the biological variability among individuals. Over the years, and for the particular case of Hepatocellular Carcinoma, some research studies have developed strategies for assisting clinicians in decision making, using computational methods (e.g. machine learning techniques) to extract knowledge from clinical data. However, these studies have limitations that have not yet been addressed: some do not focus entirely on Hepatocellular Carcinoma patients, others have strict application boundaries, and none considers the heterogeneity between patients or the presence of missing data, a common drawback in healthcare contexts. In this work, a real, complex Hepatocellular Carcinoma database composed of heterogeneous clinical features is studied. We propose a new cluster-based oversampling approach, robust to small and imbalanced datasets, which accounts for the heterogeneity of patients with Hepatocellular Carcinoma. The preprocessing procedures of this work are based on data imputation using a distance metric appropriate for both heterogeneous and missing data (HEOM) and on clustering studies to assess the underlying patient groups in the studied dataset (K-means). The final approach is applied in order to diminish the impact of underlying patient profiles with reduced sizes on survival prediction. It is based on K-means clustering and the SMOTE algorithm to build a representative dataset, which is then used as the training set for different machine learning procedures (logistic regression and neural networks).
The results are evaluated in terms of survival prediction and compared across baseline approaches that do not consider clustering and/or oversampling using the Friedman rank test. Our proposed methodology coupled with neural networks outperformed all others, suggesting an improvement over the classical approaches currently used in Hepatocellular Carcinoma prediction models.
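Two of the building blocks named in this abstract can be sketched briefly. The first function follows the usual definition of HEOM (distance 1 when a value is missing, overlap for nominal features, range-normalised difference for numeric ones). The second is a simplified SMOTE-style interpolation applied inside a single cluster, using only the nearest neighbour instead of k neighbours. Both are illustrative assumptions and not the authors' implementation; the names `heom` and `oversample_cluster` and the toy data are hypothetical.

```python
import math
import random

def heom(a, b, is_nominal, spans):
    """Heterogeneous Euclidean-Overlap Metric between two samples.

    `is_nominal[i]` flags categorical features; `spans[i]` is the
    (max - min) range of numeric feature i (ignored for nominal ones).
    Missing values are represented by None.
    """
    total = 0.0
    for x, y, nominal, span in zip(a, b, is_nominal, spans):
        if x is None or y is None:
            d = 1.0                         # maximal distance if unknown
        elif nominal:
            d = 0.0 if x == y else 1.0      # overlap distance
        else:
            d = abs(x - y) / span if span else 0.0
        total += d * d
    return math.sqrt(total)

def oversample_cluster(samples, n_new, seed=0):
    """Create `n_new` synthetic numeric samples inside one cluster.

    Simplification of SMOTE: each new point lies on the segment between
    a random sample and its single nearest neighbour in the same cluster.
    Requires at least two complete numeric samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        neighbour = min((s for s in samples if s is not base),
                        key=lambda s: math.dist(base, s))
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Mixed sample: numeric, nominal, numeric with a missing value.
d = heom((1.0, "a", None), (3.0, "b", 5.0),
         is_nominal=(False, True, False), spans=(4.0, None, 10.0))

# A small cluster (labels assumed to come from a prior K-means step).
cluster = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
new_points = oversample_cluster(cluster, n_new=4)
```

Restricting the interpolation to samples of the same cluster keeps synthetic patients within one underlying patient profile, which is the intuition behind combining clustering with oversampling.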