Publications

Publications by Joana Vale Sousa

2022

Lung Segmentation in CT Images: A Residual U-Net Approach on a Cross-Cohort Dataset

Authors
Sousa, J; Pereira, T; Silva, F; Silva, MC; Vilares, AT; Cunha, A; Oliveira, HP;

Publication
APPLIED SCIENCES-BASEL

Abstract
Lung cancer is one of the most common causes of cancer-related mortality, and since the majority of cases are diagnosed when the tumor is in an advanced stage, the 5-year survival rate is dismally low. Nevertheless, the chances of survival can increase if the tumor is identified early on, which can be achieved through screening with computed tomography (CT). The clinical evaluation of CT images is a very time-consuming task and computer-aided diagnosis systems can help reduce this burden. The segmentation of the lungs is usually the first step taken in automatic image analysis models of the thorax. However, this task is very challenging since the lungs present high variability in shape and size. Moreover, the co-occurrence of other respiratory comorbidities alongside lung cancer is frequent, and each pathology can present its own scope of CT imaging appearances. This work investigated the development of a deep learning model, whose architecture consists of the combination of two structures, a U-Net and a ResNet34. The proposed model was designed on a cross-cohort dataset and it achieved a mean dice similarity coefficient (DSC) higher than 0.93 for the 4 different cohorts tested. The segmentation masks were qualitatively evaluated by two experienced radiologists to identify the main limitations of the developed model, despite the good overall performance obtained. The performance per pathology was assessed, and the results confirmed a small degradation for consolidation and pneumocystis pneumonia cases, with a DSC of 0.9015 +/- 0.2140 and 0.8750 +/- 0.1290, respectively. This work represents a relevant assessment of the lung segmentation model, taking into consideration the pathological cases that can be found in the clinical routine, since a global assessment could not detail the fragilities of the model.


2022

Towards Machine Learning-Aided Lung Cancer Clinical Routines: Approaches and Open Challenges

Authors
Silva, F; Pereira, T; Neves, I; Morgado, J; Freitas, C; Malafaia, M; Sousa, J; Fonseca, J; Negrao, E; de Lima, BF; da Silva, MC; Madureira, AJ; Ramos, I; Costa, JL; Hespanhol, V; Cunha, A; Oliveira, HP;

Publication
JOURNAL OF PERSONALIZED MEDICINE

Abstract
Advancements in the development of computer-aided decision (CAD) systems for clinical routines provide unquestionable benefits in connecting human medical expertise with machine intelligence, to achieve better quality healthcare. Considering the high incidence and mortality associated with lung cancer, there is a need for the most accurate clinical procedures; thus, the possibility of using artificial intelligence (AI) tools for decision support is becoming a closer reality. At any stage of the lung cancer clinical pathway, specific obstacles are identified and motivate the application of innovative AI solutions. This work provides a comprehensive review of the most recent research dedicated toward the development of CAD tools using computed tomography images for lung cancer-related tasks. We discuss the major challenges and provide critical perspectives on future directions. Although we focus on lung cancer in this review, we also provide a clearer definition of the path used to integrate AI in healthcare, emphasizing fundamental research points that are crucial for overcoming current barriers.


2022

The Influence of a Coherent Annotation and Synthetic Addition of Lung Nodules for Lung Segmentation in CT Scans

Authors
Sousa, J; Pereira, T; Neves, I; Silva, F; Oliveira, HP;

Publication
SENSORS

Abstract
Lung cancer is a highly prevalent pathology and a leading cause of cancer-related deaths. Most patients are diagnosed when the disease has manifested itself, which usually is a sign of lung cancer in an advanced stage and, as a consequence, the 5-year survival rates are low. To increase the chances of survival, improving the early cancer detection capacity is crucial, for which computed tomography (CT) scans play a key role. The manual evaluation of the CTs is a time-consuming task and computer-aided diagnosis (CAD) systems can help relieve that burden. The segmentation of the lung is one of the first steps in these systems, yet it is very challenging given the heterogeneity of lung diseases usually present and associated with cancer development. In our previous work, a segmentation model based on a ResNet34 and U-Net combination was developed on a cross-cohort dataset that yielded good segmentation masks for multiple pathological conditions but misclassified some of the lung nodules. The multiple datasets used for the model development originated from different annotation protocols, which generated inconsistencies for the learning process, and the annotations are usually not adequate for lung cancer studies since they did not comprise lung nodules. In addition, the initial datasets used for training presented a reduced number of nodules, which was shown not to be enough to allow the segmentation model to learn to include them as a lung part. In this work, an objective protocol for the lung mask's segmentation was defined and the previous annotations were carefully reviewed and corrected to create consistent and adequate ground-truth masks for the development of the segmentation model. Data augmentation with domain knowledge was used to create lung nodules in the cases used to train the model. The model developed achieved a Dice similarity coefficient (DSC) above 0.9350 for all test datasets and it showed an ability to cope not only with a variety of lung patterns, but also with the presence of lung nodules. This study shows the importance of using consistent annotations for the supervised learning process, which is a very time-consuming task, but one of great importance to healthcare applications. Due to the lack of massive datasets in the medical field, which consequently brings a lack of wide representativity, data augmentation with domain knowledge could represent a promising help to overcome this limitation for learning models development.


2025

Incrementally Learning to Segment the Lungs: Similarities and Differences Across Institutions

Authors
Sousa, JV; Oliveira, HP; Pereira, T;

Publication
2025 IEEE 25th International Conference on Bioinformatics and Bioengineering (BIBE)

Abstract

2025

Clinical Data-Driven Modeling of Disease-Specific Survival in Lung Cancer: Insights from the National Lung Screening Trial Dataset

Authors
Amaro, M; Sousa, JV; Gouveia, M; Oliveira, HP; Pereira, T;

Publication
Measurement and Evaluations in Cancer Care

Abstract

2025

Clinical Annotation and Medical Image Anonymization for AI Model Training in Lung Cancer Detection

Authors
Freire, AM; Rodrigues, EM; Sousa, JV; Gouveia, M; Ferreira-Santos, D; Pereira, T; Oliveira, HP; Sousa, P; Silva, AC; Fernandes, MS; Hespanhol, V; Araújo, J;

Publication
UNIVERSAL ACCESS IN HUMAN-COMPUTER INTERACTION, UAHCI 2025, PT I

Abstract
Lung cancer remains one of the most common and lethal forms of cancer, with approximately 1.8 million deaths annually, often diagnosed at advanced stages. Early detection is crucial, but it depends on physicians' accurate interpretation of computed tomography (CT) scans, a process susceptible to human limitations and variability. ByMe has developed a medical image annotation and anonymization tool designed to address these challenges through a human-centered approach. The tool enables physicians to seamlessly add structured attribute-based annotations (e.g., size, location, morphology) directly within their established workflows, ensuring intuitive interaction. Integrated with Picture Archiving and Communication Systems (PACS), the tool streamlines the annotation process and enhances usability by offering a dedicated worklist for retrospective and prospective case analysis. Robust anonymization features ensure compliance with privacy regulations such as the General Data Protection Regulation (GDPR), enabling secure dataset sharing for research and developing artificial intelligence (AI) models. Designed to empower AI integration, the tool not only facilitates the creation of high-quality datasets but also lays the foundation for incorporating AI-driven insights directly into clinical workflows. Focusing on usability, workflow integration, and privacy, this innovation bridges the gap between precision medicine and advanced technology. By providing the means to develop and train AI models for lung cancer detection, it holds the potential to significantly accelerate diagnosis as well as enhance its accuracy and consistency.
