Publicacoes - INESC TEC

Publicações

2025

Machine Learning Models for Indoor Positioning Using Bluetooth RSSI and Video Data: A Case Study

Autores
Mamede, T; Silva, N; Marques, ERB; Lopes, LMB;

Publicação
SENSORS

Abstract
Indoor Positioning Systems (IPSs) are essential for applications requiring accurate location awareness in indoor environments. However, achieving high precision remains challenging due to signal interference and environmental variability. This study proposes a multimodal IPS that integrates Bluetooth Received Signal Strength Indicator (RSSI) measurements and video imagery using machine learning (ML) and ensemble learning techniques. The system was implemented and deployed in the Hall of Biodiversity at the Natural History and Science Museum of the University of Porto. The venue presented significant deployment issues, namely restrictions on beacon placement and lighting conditions. We trained independent ML models on RSSI and video datasets, and combined them through ensemble learning methods. The experimental results from test scenarios, which included simulated visitor trajectories, showed that ensemble models consistently outperformed the RSSI-based and video-based models. These findings demonstrate that the use of multimodal data can significantly improve IPS accuracy despite constraints such as multipath interference, low lighting, and limited beacon infrastructure. Overall, they highlight the potential of multimodal data for deployments in complex indoor environments.

FecharLer Abstract

2025

Automatic Generation of Loop Invariants in Dafny with Large Language Models

Autores
Faria, JP; Trigo, E; Abreu, R;

Publicação
FUNDAMENTALS OF SOFTWARE ENGINEERING, FSEN 2025

Abstract
Recent verification tools aim to make formal verification more accessible for software engineers by automating most of the verification process. However, the manual work and expertise required to write verification helper code, such as loop invariants and auxiliary lemmas and assertions, remains a barrier. This paper explores the use of Large Language Models (LLMs) to automate the generation of loop invariants for programs in Dafny. We tested the approach on a curated dataset of 100 programs in Dafny involving arrays, strings, and numeric types. Using a multimodel approach that combines GPT-4o and Claude 3.5 Sonnet, correct loop invariants (passing the Dafny verifier) were generated at the first attempt for 92% of the programs, and in at most five attempts for 95% of the programs. Additionally, we developed an extension to the Dafny plugin for Visual Studio Code to incorporate automatic loop invariant generation into the IDE. Our work stands out from related approaches by handling a broader class of problems and offering IDE integration.

FecharLer Abstract

2025

A new parametric information-gain criterion for tree-based machine learning algorithms

Autores
Costa, D; Costa, VV; Rocha, E;

Publicação
PEERJ COMPUTER SCIENCE

Abstract
Decision Trees (DTs) remain one of the most important algorithms in machine learning for their simplicity, interpretability, and often satisfactory performance. Furthermore, they are critical foundational components for more performant models such as Random Forests (RFs) and Gradient Boosted Trees. Central to DTs is the splitting process, where data is partitioned according to criteria traditionally based on information-theoretic measures such as Shannon entropy or Gini index. In this article, we propose a novel parametric entropy-based information gain criterion designed to generalize and extend classical entropic measures to improve classification performance in DTs and RFs. We introduce a five-parameter entropy formulation capable of replicating and extending known entropy measures. This new criterion was incorporated into DT and RF classifiers and evaluated on a collection of 18 benchmarking datasets, including both synthetic and real-world data retrieved from publicly available repositories. Performance was assessed using 5-fold cross-validation and optimized via Bayesian hyperparameter search, with weighted F1-score as the primary metric. Compared to splitting criteria based on existing entropy/purity measures (e.g., Gini, Shannon, R & eacute;nyi, and Tsallis), our method yielded statistically significant improvements in classification performance across most datasets. On multiclass and imbalanced datasets, such as the Wine Quality dataset, F1-score improvements exceeded 40% using RF algorithms. Bayesian signed-rank tests confirmed the robustness of our method, which never underperformed relative to standard approaches. The proposed entropy-based splitting criterion offers a flexible and effective alternative to classical information-theoretic measures, delivering improvements in classification performance.

FecharLer Abstract

2025

Geo-Indistinguishability

Autores
Mendes, R; Vilela, P;

Publicação
Encyclopedia of Cryptography, Security and Privacy, Third Edition

Abstract
[No abstract available]

FecharLer Abstract

2025

Understanding Squeeze-and-Excitation Layers for Medical Image Segmentation

Autores
Martins, ML; Coimbra, MT; Renna, F;

Publicação
EUSIPCO

Abstract
The U-Net is one of the most fundamental architectural advancements in the deep learning era. It is a crucial tool for image segmentation, especially for biomedical modalities. The research community seems to interpret the effectiveness of neural architectural search (such as the nn-U-Net) as evidence that architectural enhancements proposed since its debut are mostly unnecessary. We argue that there are still network-in-network primitives that can be leveraged to further enhance its performance, focusing on the squeeze-and-excitation (SE) pathway specifically in this paper. Specifically, we study its use of global descriptors, since it should be at odds with the spatial resolution required for dense-prediction tasks. It is theorized in the literature that performance is probably gained from some implicit ability of the learned excitations to filter supposedly uninformative channels during training. We explain this almost unreasonable success through an analysis of the empirical estimates of the excitation covariance matrix. Our analysis also directly contradicts the above conjecture - the most effective SE approach actually displayed the less extreme filtering behaviour, weighing all channels much closer to the mean (0.5). Our experiments are conducted in three diverse, staple biomedical modalities: dermoscopy, colonoscopy, and ultrasound.

FecharLer Abstract

2025

Radiogenomic Insights from a Portuguese Lung Cancer Cohort: Foundations for Predictive Modeling

Autores
Neves, I; Freitas, C; Lemos, C; Oliveira, HP; Hespanhol, V; França, M; Pereira, T;

Publicação
Measurement and Evaluations in Cancer Care

Abstract

136
4496