2026
Authors
Tabosa, C; Salgado, M; Leite, D; Cunha, A
Publication
Procedia Computer Science
Abstract
Video capsule endoscopy (VCE) enables high-resolution visualisation of the small bowel but remains constrained by manual review of thousands of frames, which is time-consuming and error-prone under class imbalance. This study investigates deep learning for automatic multiclass lesion classification in VCE, comparing two convolutional networks (ResNet-50, EfficientNet-B3) with two Vision Transformers (Swin, DeiT) on the public Kvasir-Capsule dataset (47,161 images; 11 classes). The pipeline comprises standard preprocessing, class-aware augmentation and adaptive data augmentation, stratified data partitioning, hyperparameter optimisation with Optuna, and evaluation using accuracy, precision, recall, and F1-score. DeiT achieved the best overall performance (accuracy = 0.98; F1 = 0.96), with strong class-wise results in clinically salient categories (e.g., ulcer, fresh blood, angiectasia), indicating effective modelling of long-range dependencies and subtle patterns. We further assess computational feasibility by reporting training configuration and indicative inference time per image, supporting potential integration into assisted reading workflows. Limitations include reliance on a single public dataset, pronounced class imbalance, and the absence of prospective clinical validation, which may affect generalisability. These findings position Transformer-based models as promising candidates for VCE decision support, while underscoring the need for future work on (i) multicentric datasets and external validation, (ii) comprehensive statistical analysis with confidence intervals and robust baselines under imbalance, and (iii) prospective studies quantifying end-to-end impact on reading time and diagnostic safety. © 2025 The Authors. Published by Elsevier B.V.
2026
Authors
Fernandes, P; Ciardhuáin, SO; Antunes, M
Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS, IBPRIA 2025, PT I
Abstract
The increasing connectivity of Internet of Medical Things (IoMT) devices has accentuated their susceptibility to cyberattacks. The sensitive data they handle makes them prime targets for information theft and extortion, while outdated and insecure communication protocols further elevate security risks. This paper presents a lightweight and innovative approach that combines Benford's law with statistical distance functions to detect attacks in IoMT devices. The methodology uses Benford's law to analyze digit frequency and classify IoMT device traffic as benign or malicious, regardless of attack type. It employs distance-based statistical functions such as Jensen-Shannon divergence, Kullback-Leibler divergence, Pearson correlation, and the Kolmogorov test to detect anomalies. Experimental validation was conducted on the CIC-IoMT-2024 benchmark dataset, comprising 45 features and multiple attack types. The best performance was achieved with the Kolmogorov test (alpha = 0.01), particularly in DoS ICMP attacks, yielding a precision of 99.24%, a recall of 98.73%, an F1 score of 98.97%, and an accuracy of 97.81%. Jensen-Shannon divergence also performed robustly in detecting SYN-based attacks, demonstrating strong detection with minimal computational cost. These findings confirm that Benford's law, when combined with well-chosen statistical distances, offers a viable and efficient alternative to machine learning models for anomaly detection in constrained environments like IoMT.
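The core idea in the abstract above, comparing the observed leading-digit distribution of traffic values against Benford's law with a statistical distance, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, and the abstract does not say which features are used, how digits are extracted, or how decision thresholds are set.

```python
import math
from collections import Counter

# Benford's law: expected probability of leading digit d, for d = 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(x):
    """Return the leading non-zero digit of a non-zero number."""
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)


def first_digit_distribution(values):
    """Relative frequency of leading digits 1..9 (zeros are skipped)."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    n = len(digits)
    return [counts.get(d, 0) / n for d in range(1, 10)]


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence in bits; lies in [0, 1] with log base 2."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a == 0 contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def benford_distance(values):
    """Distance between the observed leading digits and Benford's law."""
    return jensen_shannon_divergence(first_digit_distribution(values), BENFORD)
```

A window of feature values would be flagged when `benford_distance` exceeds some cut-off. That cut-off is an assumption here and would have to be calibrated on labelled traffic.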
2026
Authors
Costa, T; Castro, J; Salgado, M; Cunha, A
Publication
Procedia Computer Science
Abstract
Video Capsule Endoscopy (VCE) is a pivotal technology in modern gastroenterology, offering a non-invasive method to visualize the entire small bowel. However, the clinical application of VCE is hampered by the extensive review time required, as specialists must manually analyze thousands of images from each procedure. This process is not only laborious and costly but also prone to diagnostic errors due to fatigue, subtle abnormalities, and variability in interpretation across clinicians. To address this challenge, deep learning methods have been explored to automate VCE image analysis. However, most existing approaches rely on a single model architecture, which often fails to generalize across the broad visual diversity found in gastrointestinal imagery. This limitation becomes especially pronounced in multiclass classification tasks, where the ability to distinguish between visually similar tissues and lesions is essential. Ensemble-based methods such as Mixture of Experts (MoE) have shown promising results in general computer vision by leveraging multiple specialized models for improved robustness. However, no prior work has investigated MoE or Hierarchical MoE (HMoE) architectures for multiclass classification of VCE or endoscopic images more broadly. To explore this opportunity, we present a comparative framework evaluating three deep learning strategies for VCE image classification: individual models, flat MoE systems, and Hierarchical MoE architectures. Using a subset of the Kvasir-Capsule dataset, which contains 12 gastrointestinal tissue and lesion classes, we first train and evaluate four backbone models (InceptionNeXt, EfficientViT, ConvNeXtV2, and DeiT3) to establish a performance baseline. The two best-performing architectures, ConvNeXtV2 and DeiT3, are then used as expert backbones within both MoE and HMoE systems. In the MoE configuration, a gating network assigns dynamic per-image weights to multiple expert instances. In contrast, the HMoE configuration constructs a learned binary tree that routes samples based on class similarity through increasingly specialized branches. In the HMoE models, ConvNeXtV2 outperformed DeiT3 in accuracy, whereas DeiT3 showed superior routing accuracy. These results indicate that expert-driven ensemble methods not only outperform standalone models but also offer complementary advantages depending on architecture and routing strategy. This study provides new evidence for the clinical potential of MoE and HMoE frameworks in scalable, accurate VCE image analysis. © 2025 The Authors. Published by Elsevier B.V.
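The flat MoE step described above, where a gate assigns per-image weights to several experts, can be illustrated with a small sketch. It assumes the common formulation in which gate scores pass through a softmax and the expert class probabilities are averaged with those weights. The abstract does not give the authors' gating design, so the function names and this formulation are assumptions, and the gate scores are passed in directly rather than produced by a trained network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_predict(expert_probs, gate_logits):
    """Weighted average of expert class probabilities for one image.

    expert_probs: one probability vector per expert (same class order).
    gate_logits:  one raw gate score per expert (hypothetical gate output).
    """
    weights = softmax(gate_logits)
    n_classes = len(expert_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs))
        for c in range(n_classes)
    ]


# Two experts, two classes; equal gate scores give a plain average.
combined = moe_predict([[0.9, 0.1], [0.2, 0.8]], [0.0, 0.0])
```

Because the gate weights sum to one and each expert outputs a probability vector, the combined output is again a valid probability vector.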
2026
Authors
Pilarski, L; Silva, T; Filipe, V; Pinto, T; Barroso, J; Oliveira, AS; Lima, J
Publication
Lecture Notes in Networks and Systems
Abstract
This article presents a real-time object detection and distance estimation system implemented on a low-cost platform. The system uses a Raspberry Pi 5 and two cameras in a stereoscopic configuration to capture pairs of images. Object detection is performed using YOLO neural networks and distance estimation is based on the disparity between the centers of the detected bounding boxes. The system is evaluated in terms of detection performance, inference speed and depth estimation accuracy. Three YOLO models (YOLOv8n, YOLO11n and YOLO12n) are tested at different resolutions. Among them, the YOLO11n with a resolution of 320×320 achieves the best balance between processing speed and detection quality in stereoscopic operation. The system has a low error in depth estimation at close range, with absolute errors of less than 1.2 cm up to 60 cm. At greater distances, accuracy is affected by the reduction in the size of the bounding box, which limits the reliability of the disparity. Possible improvements include using segmentation-based localization and optimizing the stereo configuration. The proposed system is suitable for short-range applications in controlled environments and serves as a basis for future improvements in embedded vision systems. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
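Distance estimation from the disparity between bounding-box centers, as described above, can be sketched with the standard rectified pinhole stereo relation Z = f · B / d. This is an illustrative sketch only: it assumes rectified cameras and a focal length expressed in pixels, and the function names, box format, and numeric values are hypothetical rather than taken from the paper.

```python
def bbox_center_x(box):
    """Horizontal center of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, _, x_max, _ = box
    return (x_min + x_max) / 2


def estimate_distance(left_box, right_box, focal_px, baseline_m):
    """Depth in meters from the disparity of matching box centers.

    Assumes rectified images, so the same object appears further to the
    left in the right image. Returns None if the disparity is not positive.
    """
    disparity = bbox_center_x(left_box) - bbox_center_x(right_box)
    if disparity <= 0:
        return None
    return focal_px * baseline_m / disparity


# Hypothetical detection of one object in the left and right images.
distance = estimate_distance(
    left_box=(300, 100, 340, 200),
    right_box=(260, 100, 300, 200),
    focal_px=800.0,   # assumed focal length in pixels
    baseline_m=0.06,  # assumed camera separation in meters
)
```

Since depth is inversely proportional to disparity, a one-pixel error in the box center shifts the estimate far more at long range than at short range, which is consistent with the loss of accuracy at greater distances reported in the abstract.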
2026
Authors
de Sousa, PR; Bronzo, M; Torres, NT Jr; Vivaldini, M; Simoes, AC; de Jesus, TS; Couto, G
Publication
OPERATIONS MANAGEMENT RESEARCH
Abstract
As collaborative robots increasingly redefine industrial automation, understanding the factors that drive their adoption is essential to operations management. This study examines the main drivers of collaborative robot adoption in the Brazilian manufacturing sector by combining theory-driven framing with a machine learning classification approach. A Random Forest classifier was developed to identify the strongest predictors of cobot adoption and to rank their relative importance. Data were collected from a sample of respondents, primarily managers and chief executive officers, representing 300 industrial companies. Grounded in the Technology-Organization-Environment (TOE) framework and complemented by Diffusion of Innovations (DoI) and Institutional (INT) perspectives, the analysis shows that technological advantages, namely space efficiency, cost reduction, and ease of integration, are critical drivers of adoption. Organizational factors, including proactive managerial involvement and alignment with an innovation-oriented culture, significantly increase the likelihood of collaborative robot uptake. The model demonstrated robust predictive performance and produced interpretable variable importance scores that confirm the relative influence of technological and managerial factors. These findings provide a structured lens for understanding and guiding managerial decision-making on cobot adoption and translate into practical recommendations for managers.
2026
Authors
Palma, A; Antunes, M; Alves, A
Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS, IBPRIA 2025, PT I
Abstract
Ensuring the security of Industrial Control Systems (ICS) is increasingly critical due to growing connectivity and cyber threats. Traditional security measures often fail to detect evolving attacks, necessitating more effective solutions. This paper evaluates machine learning (ML) methods for ICS cybersecurity, using the ICS-Flow dataset and Optuna for hyperparameter tuning. The selected models, namely Random Forest (RF), AdaBoost, XGBoost, Deep Neural Networks, Artificial Neural Networks, ExtraTrees (ET), and Logistic Regression, are assessed using the macro-averaged F1-score to handle class imbalance. Experimental results demonstrate that ensemble-based methods (RF, XGBoost, and ET) offer the highest overall detection performance, particularly in identifying commonly occurring attack types. However, minority classes, such as IP-Scan, remain difficult to detect accurately, indicating that hyperparameter tuning alone is insufficient to fully deal with imbalanced ICS data. These findings highlight the importance of complementary measures, such as focused feature selection, to enhance classification capabilities and protect industrial networks against a wider array of threats.
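The choice of macro-averaged F1 mentioned above can be illustrated with a short sketch showing why it exposes weak minority-class detection that plain accuracy hides. The labels and counts below are invented for illustration and do not come from the ICS-Flow dataset, and `macro_f1` is a hypothetical helper written out by hand rather than the evaluation code used in the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true positives scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented example: the classifier never predicts the minority class.
y_true = ["normal"] * 8 + ["ip-scan"] * 2
y_pred = ["normal"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

In this invented case accuracy is 0.8 while the macro F1 is about 0.44, because the missed minority class contributes an F1 of zero with the same weight as the majority class.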