2023
Authors
Santos, C; Cunha, A; Coelho, P;
Publication
Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST
Abstract
Automatic Lip-Reading (ALR), also known as Visual Speech Recognition (VSR), is the technological process of extracting and recognizing speech content based solely on the visual recognition of the speaker's lip movements. Besides hearing-impaired people, people with normal hearing also resort to visual cues for word disambiguation whenever they are in a noisy environment. Due to the increasing interest in developing ALR systems, a considerable number of research articles are being published. This article selects, analyses, and summarizes the main papers from 2018 to early 2022, from traditional methods with handcrafted feature extraction algorithms to end-to-end deep-learning-based ALR, which takes full advantage of learning the best features and of the ever-growing publicly available databases. By providing a recent state-of-the-art overview, identifying trends, and presenting a conclusion on what is to be expected in future work, this article becomes an efficient way to stay up to date on the most relevant ALR techniques. © 2023, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
2023
Authors
Gonzalez, DG; Carias, J; Castilla, YC; Rodrigues, J; Adão, T; Jesus, R; Magalhães, LGM; de Sousa, VML; Carvalho, L; Almeida, R; Cunha, A;
Publication
Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST
Abstract
Cancer diagnosis is of major importance in the field of human medical pathology, wherein a cell division process known as mitosis constitutes a relevant biological pattern analyzed by professional experts, who assess its presence and number through visual observation of microscopic imagery. This is a time-consuming and exhausting task that can benefit from modern artificial intelligence approaches, namely those handling object detection through deep learning, among which YOLO stands out as one of the most successful and is therefore a good candidate for automatic mitosis detection. Considering that low sensitivity to rotation/flip variations is essential to ensure the robustness of deep mitosis detection, in this work we propose an offline augmentation procedure focused on rotation operations, to address the impact of lost/clipped mitoses induced by online augmentation. YOLOv4 and YOLOv5 were compared on an augmented test dataset with an exhaustive set of rotation angles to investigate their performance. YOLOv5 with a mixture of offline and online rotation augmentation methods presented the best F1-score results averaged over three runs. © 2023, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
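The offline augmentation idea above can be sketched as follows: rotated (and flipped) copies of each image are materialized before training, so no mitosis is lost to on-the-fly clipping. This is a minimal illustrative sketch restricted to right-angle rotations (the paper uses an exhaustive set of angles); the function name and parameters are assumptions, not the authors' code.

```python
import numpy as np

def offline_rotation_augment(image, angles=(0, 90, 180, 270), include_flips=True):
    """Generate rotated (and optionally flipped) copies of an image array.

    Copies are produced ahead of training (offline), in contrast to
    online augmentation applied on the fly during training.
    """
    variants = []
    for angle in angles:
        # np.rot90 handles right-angle rotations losslessly (no clipping)
        rotated = np.rot90(image, k=angle // 90)
        variants.append(rotated)
        if include_flips:
            variants.append(np.fliplr(rotated))
    return variants
```

Arbitrary angles would additionally require padding or reflection at the borders to avoid clipping objects near the image edge, which is precisely the loss the offline procedure aims to control.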
2023
Authors
Rezende, RF; Coelho, A; Fernandes, R; Camara, J; Neto, A; Cunha, A;
Publication
Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST
Abstract
Glaucoma is a disease that arises from increased intraocular pressure and leads to irreversible partial or total loss of vision. Due to the lack of symptoms, this disease often progresses to more advanced stages without being detected in the early phase. Glaucoma screening can be performed through visualization of the retina, using retinal images captured by medical equipment or by mobile devices with a lens attached to the camera. Deep learning can enhance and scale up mass glaucoma screening. In this study, a domain transfer learning technique is used for better weight initialization and for learning features more related to the problem. To this end, classic convolutional neural networks, such as ResNet50, are compared with Vision Transformers on high- and low-resolution images. The high-resolution retinal images are used to pre-train the network, and that knowledge is then used for detecting glaucoma in retinal images captured by mobile devices. The ResNet50 model reached the highest AUC values on the high-resolution dataset, being the most consistent model across all experiments. However, the Vision Transformer proved to be a promising technique, especially on low-resolution retinal images. © 2023, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
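The domain-transfer initialization described above amounts to copying the weights learned on the high-resolution domain into the target model while re-initializing only the task head. A minimal sketch, assuming checkpoints represented as plain dicts of layer-name to weights (a hypothetical stand-in for real ResNet50/ViT checkpoints; names and signature are illustrative):

```python
def transfer_weights(pretrained, target, head_prefixes=("classifier",)):
    """Initialize `target` from `pretrained`, keeping the target's own
    (freshly initialized) classification head.

    Both arguments are dicts mapping layer names to weight arrays,
    standing in for real model state dicts.
    """
    for name, weights in pretrained.items():
        # Skip the head: its shape/semantics are task-specific
        if any(name.startswith(p) for p in head_prefixes):
            continue
        target[name] = weights
    return target
```

The backbone thus starts from features learned on high-resolution fundus images, while the head is trained from scratch on the mobile-device images.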
2022
Authors
Esengönül, M; de Paiva, AC; Rodrigues, JMF; Cunha, A;
Publication
Wireless Mobile Communication and Healthcare - 11th EAI International Conference, MobiHealth 2022, Virtual Event, November 30 - December 2, 2022, Proceedings
Abstract
Diabetes has significant effects on the human body, one of which is an increase in blood pressure; when not diagnosed early, it can cause severe vision complications and even lead to blindness. Early screening is the key to overcoming such issues and can have a significant impact in rural areas and overcrowded regions. Mobile systems can help bring the technology to those in need. Transfer-learning-based deep learning algorithms combined with mobile retinal imaging systems can significantly reduce screening time and lower the burden on healthcare workers. In this paper, several efficiency factors of Diabetic Retinopathy (DR) detection systems based on Convolutional Neural Networks are tested and evaluated for mobile applications. Two main techniques are used to measure the efficiency of DL-based DR detection systems. The first evaluates the effect of changing the dataset while the base architecture of the DL model remains the same. The second measures the effect of varying the base architecture while the dataset remains unchanged. The results suggest that the inclusivity and size of the datasets significantly impact DR detection accuracy and sensitivity. Among the five chosen lightweight architectures, EfficientNet-based DR detection algorithms outperformed the other transfer learning models on the APTOS Blindness Detection dataset. © 2023, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
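The two evaluation protocols above reduce to sweeping one factor while holding the other fixed. A generic sketch, where `train_and_eval` is a hypothetical callback (not from the paper) that trains a model and returns a metric such as accuracy:

```python
from itertools import product

def run_experiments(architectures, datasets, train_and_eval):
    """Evaluate every (architecture, dataset) pair.

    Reading the results along a row (fixed architecture, varying dataset)
    reproduces protocol 1; along a column (fixed dataset, varying
    architecture) reproduces protocol 2.
    """
    results = {}
    for arch, dataset in product(architectures, datasets):
        results[(arch, dataset)] = train_and_eval(arch, dataset)
    return results
```

With five lightweight architectures and several datasets, the full grid lets both effects (dataset change and architecture change) be read off the same table of results.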
2022
Authors
Neto, A; Ferreira, S; Libânio, D; Ribeiro, MD; Coimbra, MT; Cunha, A;
Publication
Wireless Mobile Communication and Healthcare - 11th EAI International Conference, MobiHealth 2022, Virtual Event, November 30 - December 2, 2022, Proceedings
Abstract
Precancerous conditions such as intestinal metaplasia (IM) have a key role in gastric cancer development and can be detected during endoscopy. During upper gastrointestinal endoscopy (UGIE), misdiagnosis can occur due to technical and human factors or to the nature of the lesions, leading to a wrong diagnosis that can result in no surveillance/treatment and impair the prevention of gastric cancer. Deep learning systems show great potential in detecting precancerous gastric conditions and lesions from endoscopic images, thus aiding physicians in this task and resulting in higher detection rates and fewer operation errors. This study aims to develop deep learning algorithms capable of detecting IM in UGIE images, with a focus on model explainability and interpretability. In this work, white light and narrow-band imaging UGIE images collected at the Portuguese Institute of Oncology of Porto were used to train deep learning models for IM classification. Standard models such as ResNet50, VGG16 and InceptionV3 were compared with more recent algorithms that rely on attention mechanisms, namely the Vision Transformer (ViT), trained on 818 UGIE images (409 normal and 409 IM). All the models were trained using a 5-fold cross-validation technique, and an external dataset of 100 UGIE images (50 normal and 50 IM) was used for validation. Finally, explainability methods (Grad-CAM and attention rollout) were used for clearer and more interpretable results. The best-performing model was ResNet50, with a sensitivity of 0.75 (±0.05), an accuracy of 0.79 (±0.01), and a specificity of 0.82 (±0.04). This model obtained an AUC of 0.83 (±0.01); the standard deviation of 0.01 means that the iterations of the 5-fold cross-validation agree more closely in classifying the samples than those of the other models. The ViT model showed promising performance, reaching results similar to the remaining models.
© 2023, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
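The 5-fold cross-validation protocol used above can be sketched without any ML library: the sample indices are partitioned into five folds, and each fold in turn serves as the validation split while the rest trains the model. A minimal illustrative implementation (real pipelines would typically use a stratified split to preserve the normal/IM ratio per fold):

```python
def kfold_indices(n, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation
    over samples 0..n-1, using contiguous folds."""
    # Distribute any remainder across the first n % k folds
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size
```

Reporting the mean and standard deviation of a metric over the five folds is what yields figures such as an AUC of 0.83 (±0.01) in the abstract above.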
2023
Authors
Correia, T; Cunha, A; Coelho, P;
Publication
Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST
Abstract
Glaucoma is a severe disease that arises from increased intraocular pressure; it is asymptomatic in the initial stages and, due to its degenerative character, can lead to blindness. There is no available cure for it, and it is the second most common cause of blindness in the world. Regular visits to the ophthalmologist, with a precise diagnosis performed with professional equipment, are the best way to prevent or contain it. However, for some individuals or populations this can be difficult to accomplish due to several restrictions, such as low income, geographical adversities, and traveling restrictions (distance, lack of means of transportation, etc.). Logistically, relocating the professional equipment can also be expensive because of its dimensions, making it inviable to bring it to remote areas. As an alternative, some low-cost products available on the market cope with this need, namely the D-Eye lens, which can be attached to a smartphone and enables the capture of fundus images, its major drawback being lower image quality compared to professional equipment. Some techniques rely on video capture to perform summarization and build a full image with the desired features. In this context, the goal of this paper is to present a review of methods for video summarization and methods for glaucoma detection, combining both to indicate whether individuals present glaucoma symptoms, as a pre-screening approach. © 2023, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
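Video summarization of a D-Eye capture is often reduced to keyframe selection: keep only frames that differ enough from the last kept frame. A minimal stand-in sketch (not one of the reviewed methods; the threshold-based criterion and function name are assumptions) using mean absolute frame difference:

```python
import numpy as np

def select_keyframes(frames, threshold=10.0):
    """Keep the indices of frames whose mean absolute difference from the
    last kept frame exceeds `threshold` (naive frame-difference method)."""
    if not frames:
        return []
    keyframes = [0]  # always keep the first frame
    last = frames[0].astype(float)
    for i, frame in enumerate(frames[1:], start=1):
        current = frame.astype(float)
        if np.abs(current - last).mean() > threshold:
            keyframes.append(i)
            last = current
    return keyframes
```

The selected frames can then feed a stitching or mosaicking step to build a fuller fundus image, and the composite image can be passed to a glaucoma classifier as the pre-screening step described above.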