2023
Authors
Silva, D; Agrotis, G; Tan, RB; Teixeira, LF; Silva, W;
Publication
International Conference on Machine Learning and Applications, ICMLA 2023, Jacksonville, FL, USA, December 15-17, 2023
Abstract
Deep Learning models are tremendously valuable in several prediction tasks, and their use in the medical field is spreading rapidly, especially in computer vision tasks that evaluate the content of X-rays, CTs or MRIs. These methods can save doctors a significant amount of time in patient diagnostics and help in treatment planning. However, these models are highly sensitive to confounders in the training data and generally suffer a performance hit when dealing with out-of-distribution data, which affects their reliability and scalability across different medical institutions. Deep Learning research on medical datasets may overlook essential details regarding the image acquisition procedure and the preprocessing steps. This work proposes a data-centric approach, exploring the potential of attention maps as a regularisation technique to improve robustness and generalisation. We use image metadata and explore self-attention maps and contrastive learning to promote feature space invariance to image disturbance. Experiments were conducted using publicly available chest X-ray datasets. Some datasets contained information about the windowing settings applied by the radiologist, which act as a source of variability. The proposed model was tested and outperformed the baseline on out-of-distribution data, serving as a proof of concept. © 2023 IEEE.
2023
Authors
Patrício, C; Teixeira, LF; Neves, JC;
Publication
CoRR
Abstract
2023
Authors
Magalhaes, SC; dos Santos, FN; Machado, P; Moreira, AP; Dias, J;
Publication
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
Abstract
Purpose: Visual perception enables robots to perceive the environment. Visual data is processed using computer vision algorithms that are usually time-expensive and require powerful devices to run in real-time, which is unfeasible for open-field robots with limited energy. This work benchmarks the performance of different heterogeneous platforms for real-time object detection. It covers three architectures: embedded GPU-Graphical Processing Units (such as NVIDIA Jetson Nano 2 GB and 4 GB, and NVIDIA Jetson TX2), TPU-Tensor Processing Unit (such as Coral Dev Board TPU), and DPU-Deep Learning Processor Unit (such as in AMD-Xilinx ZCU104 Development Board, and AMD-Xilinx Kria KV260 Starter Kit). Methods: The authors used RetinaNet ResNet-50 fine-tuned on the natural VineSet dataset. The trained model was then converted and compiled into target-specific hardware formats to improve execution efficiency. Conclusions and Results: The platforms were assessed in terms of performance on the evaluation metrics and efficiency (inference time). Graphical Processing Units (GPUs) were the slowest devices, running at 3 FPS to 5 FPS, and Field Programmable Gate Arrays (FPGAs) were the fastest devices, running at 14 FPS to 25 FPS. The efficiency of the Tensor Processing Unit (TPU) is irrelevant and similar to that of the NVIDIA Jetson TX2. TPU and GPU are the most power-efficient, consuming about 5 W. The performance differences in the evaluation metrics across devices are negligible, with an F1 of about 70 % and a mean Average Precision (mAP) of about 60 %.
2023
Authors
Romero, A; Carvalho, P; Corte-Real, L; Pereira, A;
Publication
JOURNAL OF IMAGING
Abstract
Gathering sufficiently representative data, such as those about human actions, shapes, and facial expressions, is costly and time-consuming, yet such data are required for training robust models. This has led to the creation of techniques such as transfer learning or data augmentation. However, these are often insufficient. To address this, we propose a semi-automated mechanism that allows the generation and editing of visual scenes with synthetic humans performing various actions, with features such as background modification and manual adjustments of the 3D avatars to allow users to create data with greater variability. We also propose a two-fold evaluation methodology for assessing the results obtained using our method: (i) applying an action classifier to the output data produced by the mechanism and (ii) generating masks of the avatars and the actors to compare them through segmentation. The avatars were robust to occlusion, and their actions were recognizable and accurate to their respective input actors. The results also showed that even though the action classifier concentrates on the pose and movement of the synthetic humans, it strongly depends on contextual information to precisely recognize the actions. Generating the avatars for complex activities also proved problematic for action recognition and for the clean and precise formation of the masks.
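The abstract does not say which measure is used to compare the avatar and actor masks. As an illustration only, a common choice for comparing two binary segmentation masks is intersection over union (IoU). The sketch below assumes masks are given as equally sized nested lists of 0/1 values; the function name `mask_iou` is hypothetical and not taken from the paper.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks (nested lists of 0/1).

    Assumption: IoU is used here only as an illustrative overlap measure;
    the paper's actual comparison metric is not stated in the abstract.
    """
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


avatar_mask = [[1, 1, 0],
               [0, 1, 0]]
actor_mask = [[1, 0, 0],
              [0, 1, 1]]
print(mask_iou(avatar_mask, actor_mask))
```

A value close to 1 would indicate that the avatar silhouette closely follows the actor silhouette; a value close to 0 would indicate little overlap.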
2023
Authors
Pereira, T; Cunha, A; Oliveira, HP;
Publication
APPLIED SCIENCES-BASEL
Abstract
2023
Authors
Ribeiro, G; Pereira, T; Silva, F; Sousa, J; Carvalho, DC; Dias, SC; Oliveira, HP;
Publication
APPLIED SCIENCES-BASEL
Abstract
Bone marrow edema (BME) is the term given to the abnormal fluid signal seen within the bone marrow on magnetic resonance imaging (MRI). It usually indicates the presence of underlying pathology and is associated with a myriad of conditions and causes. However, it can be misleading: in some cases it may be associated with normal changes in the bone, especially during the growth period of childhood, and objective methods for assessment are lacking. In this work, learning models for BME detection were developed. Transfer learning was used to overcome the size limitations of the dataset, and two different regions of interest (ROI) were defined and compared to evaluate their impact on the performance of the model: bone segmentation and intensity mask. The best model was obtained with the high-intensity masking technique, which achieved a balanced accuracy of 0.792 +/- 0.034. This study compares different models and data regularization techniques for BME detection and showed promising results, even in the most difficult age range: children and adolescents. The application of machine learning methods will help to decrease the dependence on clinicians, providing an initial stratification of patients based on the probability of edema presence and supporting their diagnostic decisions.
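For readers unfamiliar with the reported metric, balanced accuracy is the mean of the per-class recalls, which makes it less sensitive to class imbalance than plain accuracy. The following is a minimal sketch of that standard definition; the labels are made up for illustration and are not data from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Illustrative labels only: 1 = edema present, 0 = edema absent.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(balanced_accuracy(y_true, y_pred))
```

In this toy example the recall is 3/4 for the positive class and 1/2 for the negative class, so the balanced accuracy is 0.625, whereas plain accuracy would be 4/6.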