Publications

Publications by Armando Sousa

2025

Balancing Speed and Accuracy: A Comparative Analysis of Segment Anything-Based Models for Robotic Indoor Semantic Mapping

Authors
Ferreira, BG; de Sousa, AJM; Reis, LP;

Publication
ICINCO (1)

Abstract
Semantic segmentation is a relevant process for creating the rich semantic maps required for indoor navigation by autonomous robots. While foundation models like the Segment Anything Model (SAM) have significantly advanced the field by enabling object segmentation without prior references, selecting an efficient variant for real-time robotics applications remains a challenge due to the trade-off between speed and accuracy. This paper evaluates three such variants (FastSAM, MobileSAM, and SAM 2), comparing their speed and accuracy to determine their suitability for semantic mapping tasks. The models were assessed on the Robot@VirtualHome dataset across 30 distinct scenes, with performance quantified using Frames Per Second (FPS), Precision, Recall, and an Over-Segmentation metric, which quantifies the fragmentation of an object into multiple masks, a failure mode that prevents high-quality semantic segmentation. The results reveal distinct performance profiles: FastSAM achieves the highest speed but exhibits poor precision and significant mask fragmentation. Conversely, SAM 2 provides the highest precision but is computationally intensive for real-time applications. MobileSAM emerges as the most balanced model, delivering high recall, good precision, and viable processing speed, with minimal over-segmentation. We conclude that MobileSAM offers the most effective trade-off between segmentation quality and efficiency, making it a good candidate for indoor semantic mapping in robotics.


2026

Wheeled-Robot Navigation in Harsh Environments Using Deep Reinforcement Learning-Systematic Literature Review and Taxonomy

Authors
Mohamed, EMF; de Sousa, AJM; Dos Santos, FN;

Publication
IEEE ACCESS

Abstract
Wheeled mobile robots are increasingly deployed in harsh environments where dense obstacles, traps, variable terrain, soil effects, tight energy budgets, and sensor noise often render classical navigation stacks insufficient. This paper presents a PRISMA-guided systematic review of recent work on Deep Reinforcement Learning (DRL) for wheeled ground-robot navigation in harsh environments and organizes the field via a practical six-dimensional taxonomy: environmental challenges, navigation architecture, observation modality, action strategy, action space, and learning algorithm. The taxonomy is refined through an iterative, evidence-grounded coding process on the included studies, and applied under a transparent coding protocol to support reproducible categorization. Across the literature, DRL appears both as a planner module and as an end-to-end policy (behavior) implementation tool. Regarding observation, mapless navigation based on LiDAR or cameras is prevalent. Actions are mostly predicted one time step ahead and are continuous. Actor-critic methods are prevalent, with PPO and SAC the most commonly used. The evaluation methodology remains largely simulation-based, with only limited sim-to-real protocols. Building on these findings, we use the taxonomy to identify common design choices for navigation in harsh terrains, propose minimum reporting practices to enable reproducible comparison, and propose research directions including energy-aware learning, improved robustness to sensor degradation, all-weather soil-vehicle interaction modeling, short-horizon look-ahead for stability and smoothness, and standardized tasks and metrics. The proposed taxonomy and guidelines, as well as the identified trends, are intended to help researchers and practitioners select methods that best suit their own objectives and constraints, thus hopefully accelerating progress from promising simulation results to dependable, field-ready autonomy.


2026

Fine-Tuning Lightweight LLMs With Human-Curated Data on Electrical Circuit Fundamentals for E-Learning

Authors
Rocha, A; Ferreira, J; Oliveira, P; Alves, M; Sousa, A;

Publication
COMPUTER APPLICATIONS IN ENGINEERING EDUCATION

Abstract
This study examines whether Parameter-Efficient Fine-Tuning (PEFT) of lightweight, free, and open-licensed Large Language Models (LLMs) can yield tutoring assistants for introductory circuit analysis methods, while fitting the students' needs. We analyzed 260 Electrical and Computer Engineering (ECE) exam responses to classify and quantify frequent student mistakes when applying the Loop Current Method (LCM). Only 28.5% solved the target problem without error, and most difficulties were conceptual (e.g., miscounting the number of independent Kirchhoff's Voltage Law (KVL) equations). Driven by this taxonomy, we assembled official course materials and curated a bilingual (Portuguese-English) pedagogical dataset. Using GPT-4o for distillation, we generated question-answer (QA) pairs for fine-tuning smaller models (Meta Llama 3.2 1B and 3.1 8B) via Quantized Low-Rank Adaptation (QLoRA) on a single commodity GPU, with an end-to-end pipeline completing in under 7 min. A blind study involving 77 first-year ECE students evaluated responses to previously unseen questions from both our tuned models and GPT-4.5, rating correctness, clarity, educational value, task coverage, and style. The 8B model scored within one point (5-point Likert) of the GPT-4.5 model, and both the 1B and 8B models were consistently preferred over their untuned baseline versions for clarity and task coverage. As a complementary cross-check, 12 senior higher-education professors independently evaluated model responses, largely corroborating the student-based rankings. These results provide evidence that carefully curated documents introducing electrical circuit theory, combined with smaller models optimized with PEFT, namely QLoRA, can be used in the construction of an always-available tutoring application. The proposed system features modest cost, runs on consumer-grade hardware, and paves the way for deployable front-end applications that do not involve possibly expensive, resource-hungry, remote machines.
