Publications

2026

Comparative Analysis of CNNs and Vision Transformers for Lesion Classification in Capsule Endoscopy

Authors
Tabosa, C; Salgado, M; Leite, D; Cunha, A

Publication
Procedia Computer Science

Abstract
Video capsule endoscopy (VCE) enables high-resolution visualisation of the small bowel but remains constrained by manual review of thousands of frames, which is time-consuming and error-prone under class imbalance. This study investigates deep learning for automatic multiclass lesion classification in VCE, comparing two convolutional networks (ResNet-50, EfficientNet-B3) with two Vision Transformers (Swin, DeiT) on the public Kvasir-Capsule dataset (47,161 images; 11 classes). The pipeline comprises standard preprocessing, class-aware augmentation and adaptive data augmentation, stratified data partitioning, hyperparameter optimisation with Optuna, and evaluation using accuracy, precision, recall, and F1-score. DeiT achieved the best overall performance (accuracy = 0.98; F1 = 0.96), with strong class-wise results in clinically salient categories (e.g., ulcer, fresh blood, angiectasia), indicating effective modelling of long-range dependencies and subtle patterns. We further assess computational feasibility by reporting training configuration and indicative inference time per image, supporting potential integration into assisted reading workflows. Limitations include reliance on a single public dataset, pronounced class imbalance, and the absence of prospective clinical validation, which may affect generalisability. These findings position Transformer-based models as promising candidates for VCE decision support, while underscoring the need for future work on (i) multicentric datasets and external validation, (ii) comprehensive statistical analysis with confidence intervals and robust baselines under imbalance, and (iii) prospective studies quantifying end-to-end impact on reading time and diagnostic safety. © 2025 The Authors. Published by Elsevier B.V.

2026

Enhancing IoMT Security by Using Benford's Law and Distance Functions

Authors
Fernandes, P; Ciardhuáin, SO; Antunes, M

Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS, IBPRIA 2025, PT I

Abstract
The increasing connectivity of Internet of Medical Things (IoMT) devices has accentuated their susceptibility to cyberattacks. The sensitive data they handle makes them prime targets for information theft and extortion, while outdated and insecure communication protocols further elevate security risks. This paper presents a lightweight and innovative approach that combines Benford's law with statistical distance functions to detect attacks in IoMT devices. The methodology uses Benford's law to analyze digit frequency and classify IoMT device traffic as benign or malicious, regardless of attack type. It employs distance-based statistical functions such as Jensen-Shannon divergence, Kullback-Leibler divergence, Pearson correlation, and the Kolmogorov test to detect anomalies. Experimental validation was conducted on the CIC-IoMT-2024 benchmark dataset, comprising 45 features and multiple attack types. The best performance was achieved with the Kolmogorov test (alpha = 0.01), particularly in DoS ICMP attacks, yielding a precision of 99.24%, a recall of 98.73%, an F1 score of 98.97%, and an accuracy of 97.81%. Jensen-Shannon divergence also performed robustly in detecting SYN-based attacks, demonstrating strong detection with minimal computational cost. These findings confirm that Benford's law, when combined with well-chosen statistical distances, offers a viable and efficient alternative to machine learning models for anomaly detection in constrained environments like IoMT.
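
As a rough illustration of the idea in this abstract, the sketch below compares the leading-digit distribution of a numeric traffic feature against Benford's law using Jensen-Shannon divergence. It is not the authors' implementation: the function names, the sample values, and the choice of base-2 logarithms are assumptions made for the example, and the abstract does not specify any decision threshold.

```python
import math
from collections import Counter

# Benford's law: expected probability of leading digit d, for d = 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def first_digit(x):
    """Leading non-zero digit of a non-zero number, read from scientific notation."""
    return int(f"{abs(x):.15e}"[0])

def digit_distribution(values):
    """Observed relative frequency of leading digits 1..9 (zeros are skipped)."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return [counts.get(d, 0) / len(digits) for d in range(1, 10)]

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2), bounded between 0 and 1."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical feature values (e.g. packet sizes); not taken from CIC-IoMT-2024.
sample = [1, 10, 100, 2, 25]
observed = digit_distribution(sample)
score = js_divergence(observed, BENFORD)
```

A larger `score` means the observed digits deviate more from Benford's law; how large a deviation should count as malicious would have to be calibrated on labelled traffic.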

2026

Comparative Evaluation of MoE and HMoE for Multiclass Classification in VCE Image Analysis

Authors
Costa, T; Castro, J; Salgado, M; Cunha, A

Publication
Procedia Computer Science

Abstract
Video Capsule Endoscopy (VCE) is a pivotal technology in modern gastroenterology, offering a non-invasive method to visualize the entire small bowel. However, the clinical application of VCE is hampered by the extensive review time required, as specialists must manually analyze thousands of images from each procedure. This process is not only laborious and costly but also prone to diagnostic errors due to fatigue, subtle abnormalities, and variability in interpretation across clinicians. To address this challenge, deep learning methods have been explored to automate VCE image analysis. However, most existing approaches rely on a single model architecture, which often fails to generalize across the broad visual diversity found in gastrointestinal imagery. This limitation becomes especially pronounced in multiclass classification tasks, where the ability to distinguish between visually similar tissues and lesions is essential. Ensemble-based methods such as Mixture of Experts (MoE) have shown promising results in general computer vision by leveraging multiple specialized models for improved robustness. However, no prior work has investigated MoE or Hierarchical MoE (HMoE) architectures for multiclass classification of VCE or endoscopic images more broadly. To explore this opportunity, we present a comparative framework evaluating three deep learning strategies for VCE image classification: individual models, flat MoE systems, and Hierarchical MoE architectures. Using a subset of the Kvasir-Capsule dataset, which contains 12 gastrointestinal tissue and lesion classes, we first train and evaluate four backbone models (InceptionNeXt, EfficientViT, ConvNeXtV2, and DeiT3) to establish a performance baseline. The two best-performing architectures, ConvNeXtV2 and DeiT3, are then used as expert backbones within both MoE and HMoE systems. In the MoE configuration, a gating network assigns dynamic per-image weights to multiple expert instances. In contrast, the HMoE configuration constructs a learned binary tree that routes samples based on class similarity through increasingly specialized branches. In the HMoE models, ConvNeXtV2 outperformed DeiT3 in accuracy, whereas DeiT3 showed superior routing accuracy. These results indicate that expert-driven ensemble methods not only outperform standalone models but also offer complementary advantages depending on architecture and routing strategy. This study provides new evidence for the clinical potential of MoE and HMoE frameworks in scalable, accurate VCE image analysis. © 2025 The Authors. Published by Elsevier B.V.
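
The flat MoE step described in this abstract (a gating network assigning per-image weights to experts) can be sketched generically as a softmax-weighted average of expert outputs. This is a minimal sketch of the general technique, not the paper's architecture: the gate scores and expert probabilities below are made-up numbers, and in a real system they would come from trained networks.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw gate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_predict(gate_scores, expert_probs):
    """Blend per-expert class probabilities using softmax gate weights.

    gate_scores:  one raw score per expert for the current image.
    expert_probs: one class-probability list per expert.
    """
    weights = softmax(gate_scores)
    n_classes = len(expert_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs))
        for c in range(n_classes)
    ]

# Hypothetical values for a single image with two experts and three classes.
gate_scores = [2.0, 0.0]
expert_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
mixed = moe_predict(gate_scores, expert_probs)
```

Because the gate weights sum to one and each expert outputs a probability distribution, the blended output is also a valid distribution over classes.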

2026

Predictors for decision-making in collaborative robots adoption: evidence from the Brazilian manufacturing industry

Authors
de Sousa, PR; Bronzo, M; Torres, NT Jr; Vivaldini, M; Simoes, AC; de Jesus, TS; Couto, G

Publication
OPERATIONS MANAGEMENT RESEARCH

Abstract
As collaborative robots increasingly redefine industrial automation, understanding the factors that drive their adoption is essential to operations management. This study examines the main drivers of collaborative robot adoption in the Brazilian manufacturing sector by combining theory-driven framing with a machine learning classification approach. A Random Forest classifier was developed to identify the strongest predictors of cobot adoption and to rank their relative importance. Data were collected from a sample of respondents (primarily managers and chief executive officers) representing 300 industrial companies. Grounded in the Technology-Organization-Environment (TOE) framework and complemented by Diffusion of Innovations (DoI) and Institutional (INT) perspectives, the analysis shows that technological advantages, namely space efficiency, cost reduction, and ease of integration, are critical drivers of adoption. Organizational factors, including proactive managerial involvement and alignment with an innovation-oriented culture, significantly increase the likelihood of collaborative robot uptake. The model demonstrated robust predictive performance and produced interpretable variable importance scores that confirm the relative influence of technological and managerial factors. These findings provide a structured lens for understanding and guiding managerial decision-making on cobot adoption and translate into practical recommendations for managers.

2026

An Optimized Multi-class Classification for Industrial Control Systems

Authors
Palma, A; Antunes, M; Alves, A

Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS, IBPRIA 2025, PT I

Abstract
Ensuring the security of Industrial Control Systems (ICS) is increasingly critical as connectivity and cyber threats grow. Traditional security measures often fail to detect evolving attacks, necessitating more effective solutions. This paper evaluates machine learning (ML) methods for ICS cybersecurity, using the ICS-Flow dataset and Optuna for hyperparameter tuning. The selected models, namely Random Forest (RF), AdaBoost, XGBoost, Deep Neural Networks, Artificial Neural Networks, ExtraTrees (ET), and Logistic Regression, are assessed using macro-averaged F1-score to handle class imbalance. Experimental results demonstrate that ensemble-based methods (RF, XGBoost, and ET) offer the highest overall detection performance, particularly in identifying commonly occurring attack types. However, minority classes, such as IP-Scan, remain difficult to detect accurately, indicating that hyperparameter tuning alone is insufficient to fully deal with imbalanced ICS data. These findings highlight the importance of complementary measures, such as focused feature selection, to enhance classification capabilities and protect industrial networks against a wider array of threats.
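
To show why a macro-averaged F1-score is a natural choice under class imbalance, the sketch below computes it from scratch on a toy example. The labels and predictions are invented for illustration and are not drawn from the ICS-Flow dataset; the convention of scoring a class with no true or predicted samples as 0 is an assumption of this sketch.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so every class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy data: a classifier that never predicts the minority class.
y_true = ["normal"] * 8 + ["ip-scan"] * 2
y_pred = ["normal"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

Here `accuracy` is 0.8 even though the minority class is missed entirely, while the macro F1 drops to about 0.44 because the zero F1 of the missed class carries the same weight as the majority class.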

2026

Automatic Optic Nerve Segmentation in Retinal Photographs for Glaucoma Detection Using Convolutional Neural Network

Authors
Machado, C; Pereira, P; Ferreira, M; Braz, G; Correia, N; Cunha, A

Publication
Procedia Computer Science

Abstract
Glaucoma is one of the leading causes of irreversible blindness worldwide, affecting millions of people, often silently and progressively. Early diagnosis is crucial to slow its progression, but it remains challenging due to the need for manual analysis of large volumes of retinal images by trained specialists. In this context, automatic detection systems based on deep learning offer a promising opportunity to facilitate and accelerate the diagnostic process, providing scalability and high accuracy. This work presents the development of an automatic method for optic disc and optic cup segmentation in retinal fundus photographs, aiming to support early glaucoma detection. The proposed methodology is based on convolutional neural networks (CNNs), specifically an enhanced U-Net architecture with a ResNet50 backbone, incorporating attention mechanisms and data augmentation strategies to improve segmentation accuracy. The model was trained and validated using the REFUGE dataset, which contains high-quality fundus images with manual annotations of the disc and cup regions. Experimental results demonstrate that the developed model achieved an average Dice coefficient of 0.937 for optic disc segmentation and 0.828 for optic cup segmentation. Analysis of the cup-to-disc ratio (CDR) yielded mean values of VCDR = 0.497 ± 0.059, ACDR = 0.252 ± 0.060, and mean CDR = 0.375 ± 0.058, with 55.0% of cases classified as low risk, 43.3% as moderate risk, and 1.7% as high risk for glaucoma. These results highlight the potential of the proposed method as an assistive tool for automated glaucoma screening. © 2025 The Authors. Published by Elsevier B.V.
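
The segmentation metrics mentioned in this abstract can be illustrated on small binary masks. The sketch below assumes that VCDR denotes the vertical cup-to-disc ratio (ratio of vertical extents) and ACDR the area cup-to-disc ratio (ratio of pixel areas); the abstract does not define them, and the masks are toy data rather than REFUGE images.

```python
def area(mask):
    """Number of foreground pixels in a binary mask (list of rows of 0/1)."""
    return sum(sum(row) for row in mask)

def dice(a, b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|); 1.0 if both masks are empty."""
    inter = sum(x and y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    total = area(a) + area(b)
    return 2 * inter / total if total else 1.0

def vertical_extent(mask):
    """Height in pixels between the first and last row containing foreground."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0

# Toy 4x4 masks: the disc fills the image, the cup is the central 2x2 block.
disc = [[1, 1, 1, 1] for _ in range(4)]
cup = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

vcdr = vertical_extent(cup) / vertical_extent(disc)  # assumed vertical CDR
acdr = area(cup) / area(disc)                        # assumed area CDR
overlap = dice(disc, cup)
```

In practice the Dice coefficient would compare a predicted mask against its manual annotation; comparing the disc with the cup here only serves to exercise the function.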
