Publications

2025

A Framework Leveraging Large Language Models for Autonomous UAV Control in Flying Networks

Authors
Nunes, D; Amorim, R; Ribeiro, P; Coelho, A; Campos, R;

Publication
2025 IEEE INTERNATIONAL MEDITERRANEAN CONFERENCE ON COMMUNICATIONS AND NETWORKING, MEDITCOM

Abstract
This paper proposes FLUC, a modular framework that integrates open-source Large Language Models (LLMs) with Unmanned Aerial Vehicle (UAV) autopilot systems to enable autonomous control in Flying Networks (FNs). FLUC translates high-level natural language commands into executable UAV mission code, bridging the gap between operator intent and UAV behaviour. FLUC is evaluated using three open-source LLMs - Qwen 2.5, Gemma 2, and LLaMA 3.2 - across scenarios involving code generation and mission planning. Results show that Qwen 2.5 excels in multi-step reasoning, Gemma 2 balances accuracy and latency, and LLaMA 3.2 offers faster responses with lower logical coherence. A case study on energy-aware UAV positioning confirms FLUC's ability to interpret structured prompts and autonomously execute domain-specific logic, showing its effectiveness in real-time, mission-driven control.

2025

Deep Learning-Driven Integration of Multimodal Data for Material Property Predictions

Authors
Costa, V; Oliveira, JM; Ramos, P;

Publication
COMPUTATION

Abstract
Advancements in deep learning have revolutionized materials discovery by enabling predictive modeling of complex material properties. However, single-modal approaches often fail to capture the intricate interplay of compositional, structural, and morphological characteristics. This study introduces a novel multimodal deep learning framework for enhanced material property prediction, integrating textual (chemical compositions), tabular (structural descriptors), and image-based (2D crystal structure visualizations) modalities. Utilizing the Alexandria database, we construct a comprehensive multimodal dataset of 10,000 materials with symmetry-resolved crystallographic data. Specialized neural architectures, such as an FT-Transformer for tabular data, a Hugging Face ELECTRA-based model for text, and a TIMM-based MetaFormer for images, generate modality-specific embeddings, which are fused through a hybrid strategy into a unified latent space. The framework predicts seven critical material properties, including electronic (band gap, density of states), thermodynamic (formation energy, energy above hull, total energy), magnetic (magnetic moment per volume), and volumetric (volume per atom) features, many governed by crystallographic symmetry. Experimental results demonstrate that multimodal fusion significantly outperforms unimodal baselines. Notably, the bimodal integration of image and text data showed significant gains, reducing the Mean Absolute Error for band gap by approximately 22.7% and for volume per atom by 22.4% compared to the average across unimodal models. This combination also achieved a 28.4% reduction in Root Mean Squared Error for formation energy. The full trimodal model (tabular + images + text) yielded competitive, and in several cases the lowest, error metrics, particularly for band gap, magnetic moment per volume, and density of states per atom, confirming the value of integrating all three modalities. This scalable, modular framework advances materials informatics, offering a powerful tool for data-driven materials discovery and design.

2025

Fuzzy Logic Estimation of Coincidence Factors for EV Fleet Charging Infrastructure Planning in Residential Buildings

Authors
Carvalhosa, S; Ferreira, JR; Araújo, RE;

Publication
ENERGIES

Abstract
As electric vehicle (EV) adoption accelerates, residential buildings, particularly multi-dwelling structures, face increasing challenges to electrical infrastructure, notably due to conservative sizing practices for electrical feeders based on maximum simultaneous demand. Current sizing methods assume all EVs charge simultaneously at maximum capacity, resulting in unnecessarily oversized and costly electrical installations. This study proposes an optimized methodology to estimate accurate coincidence factors, leveraging simulations of EV user charging behaviors in multi-dwelling residential environments. Charging scenarios considering different fleet sizes (1 to 70 EVs) were simulated under two distinct charging premises: minimization of current allocation to achieve the desired battery state-of-charge and maximization of instantaneous power delivery. Results demonstrate significant deviations from conventional assumptions, with estimated coincidence factors decreasing non-linearly as fleet size increases. Specifically, applying the derived coincidence factors can reduce feeder section requirements by up to 86%, substantially lowering material costs. A fuzzy logic inference model is further developed to refine these estimates based on fleet characteristics and optimization preferences, providing a practical tool for infrastructure planners. The results were compared against other studies and real-life data. The proposed methodology thus contributes to more efficient, cost-effective design strategies for EV charging infrastructures in residential buildings.

2025

A Tripartite Framework for Immersive Music Production: Concepts and Methodologies

Authors
Barboza, JR; Bernardes, G; Magalhães, E;

Publication
2025 Immersive and 3D Audio: from Architecture to Automotive (I3DA)

Abstract
Music production has long been characterized by well-defined concepts and techniques. However, a notable gap exists in applying these established principles to music production within immersive media. This paper addresses this gap by examining post-production processes applied to three case studies, i.e., three songs with unique instrumental features and narratives. The primary objective is to facilitate an in-depth analysis of technical and artistic challenges in musical production for immersive media. From a detailed analysis of technical and artistic post-production decisions in the three case studies and a critical examination of theories and techniques from sound design and music production, we propose a framework with a tripartite mixing categorization for immersive media: Traditional Production, Expanded Traditional Production, and Nontraditional Production. These concepts expand music production methodologies in the context of immersive media, offering a framework for understanding the complexities of spatial audio. By exploring these interdisciplinary connections, we aim to enrich the discourse surrounding music production, rethinking its conceptual plane into more integrative media practices outside the core music production paradigm, thus contributing to developing innovative production methodologies. © 2025 IEEE.

2025

Automatic Visual Inspection for Industrial Application

Authors
Ribeiro, AG; Vilaça, L; Costa, C; da Costa, TS; Carvalho, PM;

Publication
JOURNAL OF IMAGING

Abstract
Quality control represents a critical function in industrial environments, ensuring that manufactured products meet strict standards and remain free from defects. In highly regulated sectors such as the pharmaceutical industry, traditional manual inspection methods remain widely used. However, these are time-consuming and prone to human error, and they lack the reliability required for large-scale operations, highlighting the urgent need for automated solutions. This need is especially pressing in industrial applications, where environments evolve and new defect types can arise unpredictably. This work proposes an automated visual defect detection system specifically designed for pharmaceutical bottles, with potential applicability in other manufacturing domains. Various methods were integrated to create robust tools capable of real-world deployment. A key strategy is the use of incremental learning, which enables machine learning models to incorporate new, unseen data without full retraining. Models can thus adapt to new defects as they appear and handle rare cases while maintaining stability and performance. The proposed solution incorporates a multi-view inspection setup to capture images from multiple angles, enhancing accuracy and robustness. Evaluations in real-world industrial conditions demonstrated high defect detection rates, confirming the effectiveness of the proposed approach.

2025

Histopathological Imaging Dataset for Oral Cancer Analysis: A Study with a Data Leakage Warning

Authors
Nogueira, DM; Gomes, EF;

Publication
Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2025 - Volume 1, Porto, Portugal, February 20-22, 2025.

Abstract
