2025
Authors
Guimarães, M; Carneiro, D; Soares, L; Ribeiro, M; Loureiro, G;
Publication
Advances in Information and Communication - Proceedings of the 2025 Future of Information and Communication Conference (FICC), Volume 1, Berlin, Germany, 27-28 April 2025.
Abstract
The interaction between humans and technology has always been a key determinant of adoption and efficiency. This is true whether the interaction is with hardware, software or data. In the particular case of Information Retrieval (IR), recent developments in Deep Learning and Natural Language Processing (NLP) techniques have opened the door to more natural and efficient means of IR, no longer based on keywords or similarity metrics but on a distributed representation of meaning. In this paper we propose an agent-based architecture to serve as an interface with industrial systems, in which agents are powered by specific Large Language Models (LLMs). Its main goal is to make the interaction with such systems (e.g. data sources, production systems, machines) natural, allowing users to execute complex tasks with simple prompts. To this end, the key aspects considered in the architecture are human-centricity and context-awareness. This paper provides a high-level description of this architecture, and then focuses on the development and evaluation of one of its key agents, responsible for information retrieval. For this purpose, we detail three application scenarios, and evaluate the ability of this agent to select the appropriate data sources to answer a specific prompt. Depending on the scenario and on the underlying model, results show an accuracy of up to 80%, indicating that the proposed agent can be used to autonomously select among several available data sources to answer a specific information need. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
2025
Authors
Martinez-Rodrigo, A; Pedrosa, J; Carneiro, D; Cavero-Redondo, I; Saz-Lara, A;
Publication
APPLIED SCIENCES-BASEL
Abstract
Arterial stiffness (AS) is a well-established predictor of cardiovascular events, including myocardial infarction and stroke. One of the most recognized methods for assessing AS is through arterial pulse wave velocity (aPWV), which provides valuable clinical insights into vascular health. However, its measurement typically requires specialized equipment, making it inaccessible in primary healthcare centers and low-resource settings. In this study, we developed and validated different machine learning models to estimate aPWV using common clinical markers routinely collected in standard medical examinations. Thus, we trained five regression models: Linear Regression, Polynomial Regression (PR), Gradient Boosting Regression, Support Vector Regression, and Neural Networks (NNs) on the EVasCu dataset, a cohort of apparently healthy individuals. A 10-fold cross-validation demonstrated that PR and NN achieved the highest predictive performance, effectively capturing nonlinear relationships in the data. External validation on two independent datasets, VascuNET (a healthy population) and ExIC-FEp (a cohort of cardiopathic patients), confirmed the robustness of PR and NN (R² > 0.90) across different vascular conditions. These results indicate that by using easily accessible clinical variables and AI-driven insights, it is possible to develop a cost-effective tool for aPWV estimation, enabling early cardiovascular risk stratification in underserved and rural areas where specialized AS measurement devices are unavailable.
2025
Authors
Peixoto, E; Torres, D; Carneiro, D; Silva, B; Marques, R;
Publication
BIG DATA AND COGNITIVE COMPUTING
Abstract
The rapid integration of Machine Learning (ML) in organizational practices has driven demand for substantial computational resources, incurring both high economic costs and environmental impact, particularly from energy consumption. This challenge is amplified in dynamic data environments, where ML models must be frequently retrained to adapt to evolving data patterns. To address this, more sustainable Machine Learning Operations (MLOps) pipelines are needed to reduce environmental impact while maintaining model accuracy. In this paper, we propose a model reuse approach based on data similarity metrics, which allows organizations to leverage previously trained models where applicable. We introduce a tailored set of meta-features to characterize data windows, enabling efficient similarity assessment between historical and new data. The effectiveness of the proposed method is validated across multiple ML tasks using the cosine and Bray-Curtis distance functions, evaluating both model reuse rates and the performance of reused models relative to newly trained alternatives. The results indicate that the proposed approach can reduce the frequency of model retraining by as much as 70% to 90% while maintaining or even improving predictive performance, contributing to more resource-efficient and sustainable MLOps practices.
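As a rough illustration of the similarity-based reuse idea summarized in this abstract, the sketch below computes the cosine and Bray-Curtis distances between meta-feature vectors and reuses the closest historical model when the distance falls under a threshold. This is not the authors' implementation: the function names, the `(model_id, meta_features)` history layout, the threshold value and the example vectors are all assumptions made for illustration only.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumed convention for zero vectors
    return 1.0 - dot / (norm_a * norm_b)


def bray_curtis_distance(a, b):
    """Bray-Curtis distance: sum|a_i - b_i| / sum|a_i + b_i|."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(abs(x + y) for x, y in zip(a, b))
    return num / den if den else 0.0


def pick_reusable_model(new_meta, history, distance, threshold):
    """Return the id of the closest historical model, or None.

    history is a list of (model_id, meta_features) pairs (hypothetical
    layout); None signals that a new model should be trained instead.
    """
    best_id, best_dist = None, float("inf")
    for model_id, meta in history:
        d = distance(new_meta, meta)
        if d < best_dist:
            best_id, best_dist = model_id, d
    return best_id if best_dist <= threshold else None


# Hypothetical meta-feature vectors for two previously trained models.
history = [("m1", [1.0, 2.0, 3.0]), ("m2", [10.0, 0.0, 1.0])]
reused = pick_reusable_model([1.0, 2.0, 3.1], history, bray_curtis_distance, 0.05)
```

In this sketch a lower threshold means fewer reuses and more retraining; how the paper actually sets that trade-off is not stated in the abstract.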
2016
Authors
Paulo Novais; Davide Carneiro;
Publication
Abstract
2024
Authors
Palumbo, G; Carneiro, D; Alves, V;
Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS
Abstract
The field of AI Ethics has recently gained considerable attention, yet much of the existing academic research lacks practical and objective contributions for the development of ethical AI systems. This systematic literature review aims to identify and map objective metrics documented in the literature between January 2018 and June 2023, specifically focusing on the ethical principles outlined in the Ethics Guidelines for Trustworthy AI. The review was based on 66 articles retrieved from the Scopus and Web of Science databases. The articles were categorized based on their alignment with seven ethical principles: Human Agency and Oversight; Technical Robustness and Safety; Privacy and Data Governance; Transparency; Diversity, Non-Discrimination and Fairness; Societal and Environmental Well-being; and Accountability. Of the identified articles, only a minority presented objective metrics to assess AI ethics, with the majority being purely theoretical works. Moreover, existing metrics concentrate primarily on Diversity, Non-Discrimination and Fairness, with a clear under-representation of the remaining principles. This lack of practical contributions makes it difficult for Data Scientists to devise systems that can be deemed Ethical, or to monitor the alignment of existing systems with current guidelines and legislation. With this work, we lay out the current panorama concerning objective metrics to quantify AI Ethics in Data Science and highlight the areas in which future developments are needed to align Data Science projects with the human values widely posited in the literature.