Publications

2025

A case study on phishing detection with a machine learning net

Authors
Bezerra, A; Pereira, I; Rebelo, MA; Coelho, D; de Oliveira, DA; Costa, JFP; Cruz, RPM;

Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS

Abstract
Phishing attacks aim to steal sensitive information and, unfortunately, are becoming a common practice on the web. Email phishing is one of the most common types of attacks on the web and can have a big impact on individuals and enterprises. There is still a gap in prevention when it comes to detecting phishing emails, as new attacks are usually not detected. The goal of this work was to develop a model capable of identifying phishing emails based on machine learning approaches. The work was performed in collaboration with E-goi, a multi-channel marketing automation company. The data consisted of emails collected from the E-goi servers in the electronic mail format. The task was a classification problem with unbalanced classes, with the minority class (phishing emails) accounting for less than 1% of all emails. Several models were evaluated after careful data selection and feature extraction based on the email content and the literature regarding these types of problems. Due to the imbalance present in the data, several sampling methods based on under-sampling techniques were tested to see their impact on the model's ability to detect phishing emails. The final model consisted of a neural network able to detect more than 80% of phishing emails without compromising the remaining emails sent by E-goi clients.

2025

Comparing Higher Education Rankings with Social Media Posting Strategies

Authors
Rocha, B; Figueira, A;

Publication
ASONAM (3)

Abstract
In the competitive landscape of higher education, institutions increasingly rely on international rankings to secure funding, attract talent, and enhance their global reputation. Concurrently, these institutions have expanded their presence on social media, utilizing sophisticated posting strategies not only to disseminate information but also to boost recognition and engagement. This study examines the relationship between the rankings of Higher Education Institutions (HEIs) and their social media posting strategies. We collected and analyzed tweets from 22 HEIs featured in a consolidated ranking system, focusing on various features of their social media posts. The analysis identified six distinct clusters of posting strategies. This paper categorizes the HEIs into these clusters and discusses the implications of differing social media strategies on their rankings. The findings suggest a nuanced interaction between social media engagement and the perceived prestige of HEIs. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.

2025

Uncertainties and Emerging Uses of Human-AI Medical Diagnosis in Collaborative Clinical Practice

Authors
Correia, A; Fonseca, B; Schneider, D; Chaves, R; Kärkkäinen, T;

Publication
ISMSIT 2025 - 9th International Symposium on Multidisciplinary Studies and Innovative Technologies, Proceedings

Abstract
This paper discusses some recent developments in collaborative healthcare research considering settings where human clinicians collaborate through or interact with artificial intelligence (AI)-enabled systems to enhance clinical diagnosis, treatment procedures, and decision-making practices. Through a detailed examination of the potential gaps, implications, and challenges for health professionals and patients, this work explores typical AI-based collaborative clinical workflows and infrastructures that involve tasks such as patient data analysis, medical imaging, and event prediction. A brief synopsis of published research reveals inherent sociotechnical barriers concerning interoperability, data scarcity, bias amplification, trust, and transparency. It also highlights risks related to inadequate model and interface design, the oversimplification of clinical processes (e.g., lack of shared situational awareness), institutional misalignment (e.g., cultural norms and practices shaping how clinicians coordinate their efforts and make decisions based on AI recommendations), and commercial data manipulation that threatens patient care. © 2025 IEEE.

2025

Estimating Biomass in Eucalyptus globulus and Pinus pinaster Forests Using UAV-Based LiDAR in Central and Northern Portugal

Authors
Ferreira, L; Sandim, ASD; Lopes, DA; Sousa, JJ; Lopes, DMM; Silva, MECM; Padua, L;

Publication
LAND

Abstract
Accurate biomass estimation is important for forest management and climate change mitigation. This study evaluates the potential of using LiDAR (Light Detection and Ranging) data, acquired through Unmanned Aerial Vehicles (UAVs), for estimating above-ground and total biomass in Eucalyptus globulus and Pinus pinaster stands in central and northern Portugal. The acquired LiDAR point clouds were processed to extract structural metrics such as canopy height, crown area, canopy density, and volume. A multistep variable selection procedure was applied to reduce collinearity and select the most informative predictors. Multiple linear regression (MLR) models were developed and validated using field inventory data. Random Forest (RF) models were also tested for E. globulus, enabling a comparative evaluation between parametric and machine learning regression models. The results show that the 25th height percentile, canopy cover density at two meters, and height variance yielded accurate biomass estimates for E. globulus, with coefficients of determination (R²) varying between 0.86 for MLR and 0.90 for RF. Although RF demonstrated a similar predictive performance, MLR presented advantages in terms of interpretability and computational efficiency. For P. pinaster, only MLR was applied due to the limited amount of field data, yet R² exceeded 0.80. Although absolute errors were higher for P. pinaster due to greater biomass variability, relative performance remained consistent across species. The results demonstrate the feasibility and efficiency of UAV LiDAR point cloud data for stand-level biomass estimation, providing simple and effective models for biomass estimation in these two species.

2025

Leveraging Large-language Models for Thematic Analysis of Children's Folk Lyrics: A comparative study of Iberian Traditions

Authors
Rodriguez, JF; Bernardes, G;

Publication
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON DIGITAL LIBRARIES FOR MUSICOLOGY, DLFM 2025

Abstract
Folk music and particularly children's folk songs serve as vital repositories of cultural identity, emotional expression, and social values. This study presents a computational thematic analysis of Portuguese and Spanish children's folk songs using the I-Folk corpus, comprising 800 annotated entries in the Music Encoding Initiative (MEI) format. Despite shared historical influences on the Iberian Peninsula, the lyrical content of each tradition reveals distinct thematic orientations. Through a methodological framework that combines traditional text pre-processing, frequency analysis, and semantic embedding using large language models (LLMs), we uncover cross-cultural similarities and divergences in content, form, and emotional register. Spanish lyrics focus primarily on caregiving, emotional development, and moral-religious motifs, while Portuguese songs emphasize performative rhythm, localized identity, and folkloric references. Our results highlight the need for tailored analytical strategies when working with children's repertoire and demonstrate the utility of LLMs in capturing culturally embedded patterns that are often obscured in conventional analyses. This work contributes to digital folklore scholarship, corpus-based ethnomusicology, and the preservation of underrepresented cultural expressions in computational humanities.

2025

Pricing Strategies for Local Transactions in Renewable Energy Communities Business Models

Authors
Sousa, J; Lucas, A; Villar, J;

Publication
2025 21ST INTERNATIONAL CONFERENCE ON THE EUROPEAN ENERGY MARKET, EEM

Abstract
The business models (BM) for renewable energy communities (REC) are often based on their promoters being the sole or primary investors in energy assets, such as photovoltaic panels (PV) and battery energy storage systems (BESS), operating these assets centrally, and selling the locally produced energy to the REC members. This research addresses the computation of fixed local energy prices that the REC developer may apply under the optimal operation of the energy assets to maximize its revenues, while guaranteeing that all REC members benefit from belonging to the REC. We do this from two perspectives, depending on who operates the storage systems: i) maximizing the investor's benefits and ii) minimizing the REC cost by maximizing its self-consumption, ensuring maximization of the energy sold by the REC promoter/investor. The optimization framework includes energy production and demand balance constraints, peak load limitations, and constraints coming from the Portuguese regulatory framework. It also considers the opportunity costs of the members for buying the energy deficit from the grid or selling the energy surplus to the grid.
