Publications

2026

Turning web data into official statistics: Classifying Portuguese retail products with NLP models

Authors
Machado, JDU; Veloso, B;

Publication
STATISTICAL JOURNAL OF THE IAOS

Abstract
The growing availability of online data creates new opportunities to improve the timeliness and detail of official statistics, particularly in domains such as price monitoring and inflation measurement. However, leveraging web-scraped data for official use requires alignment with standardized classification frameworks such as the European Classification of Individual Consumption According to Purpose (ECOICOP). We train two natural-language models, a lightweight convolutional neural network (CNN) and a fine-tuned BERTimbau transformer, to classify Portuguese food and beverage items into ECOICOP categories. Using 100,000 product titles scraped from six national supermarket sites and labeled via a human-in-the-loop workflow, the CNN reaches a macro-F1 of 92.19% with minimal computing cost, while the transformer attains 94.00%, the first such result for Portuguese. Both models are published on Hugging Face, enabling reproducible inference at scale while the source data remain confidential. The study delivers the first open-source Portuguese ECOICOP classifiers for food and beverage products, a replicable low-resource labeling workflow, and a benchmark of accuracy-speed trade-offs to guide researchers in similar tasks.
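
For readers unfamiliar with the macro-F1 metric reported above, the following is a minimal sketch of how it is commonly computed: the per-class F1 scores are averaged with equal weight, so rare categories count as much as frequent ones. The function name `macro_f1` and the labels "A", "B", "C" are hypothetical placeholders, not ECOICOP codes or the authors' evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative sketch)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Placeholder labels, assumed for illustration only.
y_true = ["A", "A", "B", "C"]
y_pred = ["A", "B", "B", "C"]
score = macro_f1(y_true, y_pred)
```

In this toy case classes "A" and "B" each score 2/3 and class "C" scores 1, so the macro average is 7/9.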

2026

Economic benchmarking of assisted pollination methods for kiwifruit flowers: Assessment of cost-effectiveness of robotic solution

Authors
Pinheiro, I; Moura, P; Rodrigues, L; Pacheco, AP; Teixeira, JG; Valente, LG; Cunha, M; Neves Dos Santos, FN;

Publication
Agricultural Systems

Abstract
In 2023, global kiwifruit production reached over 4.4 million tonnes, highlighting the crop's significant economic importance. However, achieving high yields depends on adequate pollination. In Actinidia species, pollen is transferred by insects from male to female flowers on separate plants. Natural pollination faces increasing challenges due to the decline in pollinator populations and climate variability, driving the adoption of assisted pollination methods. This study examines the Portuguese kiwifruit sector, one of the world's top 12 producers, using a novel mixed-methods approach that integrates both qualitative and quantitative analyses to assess the feasibility of robotic pollination. The qualitative study identifies the benefits and challenges of current methods and explores how robotic pollination could address these challenges. The quantitative analysis explores the cost-effectiveness and practicality of implementing robotic pollination as a product and as a service. Findings indicate that most farmers use handheld pollination devices but face pollen wastage and application timing challenges. Economic analysis establishes a break-even point of €685 per hectare for an annual single application, with a first robotic pollination solution of €17 146 becoming cost-effective for orchards of at least 3.5 hectares and a second robotic solution of €34 293 becoming cost-effective for orchards up to 7 hectares. A robotic pollination service priced at €685 per hectare per application presents a low-risk, viable alternative for growers. This study provides robust economic insights that support the adoption of robotic pollination technologies and inform decisions to enhance the productivity and sustainability of kiwifruit production through precise robotic-assisted pollination. © 2025 Elsevier B.V., All rights reserved.

2026

Cross-Lingual Information Retrieval in Tetun for Ad-Hoc Search

Authors
Araújo, A; de Jesus, G; Nunes, S;

Publication
Lecture Notes in Computer Science

Abstract
Developing information retrieval (IR) systems that enable access across multiple languages is crucial in multilingual contexts. In Timor-Leste, where Tetun, Portuguese, English, and Indonesian are official and working languages, no cross-lingual information retrieval (CLIR) solutions currently exist to support information access across these languages. This study addresses that gap by investigating CLIR approaches tailored to the linguistic landscape of Timor-Leste. Leveraging an existing monolingual Tetun document collection and ad-hoc text retrieval baselines, we explore the feasibility of CLIR for Tetun. Queries were manually translated into Portuguese, English, and Indonesian to create a multilingual query set. These were then automatically translated back into Tetun using Google Translate and several large language models, and used to retrieve documents in Tetun. Results show that Google Translate is the most reliable tool for Tetun CLIR overall, and the Hiemstra LM consistently outperforms BM25 and DFR BM25 in cross-lingual retrieval performance. However, overall effectiveness remains up to 26.95 percentage points lower than that of the monolingual baseline, underscoring the limitations of current translation tools and the challenges of developing an effective CLIR system for Tetun. Despite these challenges, this work establishes the first CLIR baseline for Tetun ad-hoc text retrieval, providing a foundation for future research in this under-resourced setting. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.

2026

Enhancing Medical Image Analysis: A Pipeline Combining Synthetic Image Generation and Super-Resolution

Authors
Sousa, P; Campai, D; Andrade, J; Pereira, P; Goncalves, T; Teixeira, LF; Pereira, T; Oliveira, HP;

Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS, IBPRIA 2025, PT II

Abstract
Cancer is a leading cause of mortality worldwide, with breast and lung cancer being the most prevalent globally. Early and accurate diagnosis is crucial for successful treatment, and medical imaging techniques play a pivotal role in achieving this. This paper proposes a novel pipeline that leverages generative artificial intelligence to enhance medical images by combining synthetic image generation and super-resolution techniques. The framework is validated in two medical use cases (breast and lung cancers), demonstrating its potential to improve the quality and quantity of medical imaging data, ultimately contributing to more precise and effective cancer diagnosis and treatment. Overall, although some limitations do exist, this paper achieved satisfactory results for an image size that is conducive to specialist analysis, and further expands upon this field's capabilities.

2026

User Behavior in Sports Search: Entity-Centric Query and Click Log Analysis

Authors
Damas, J; Nunes, S;

Publication
Lecture Notes in Computer Science

Abstract
Understanding user behavior in search systems is essential for improving retrieval effectiveness and user satisfaction. While prior research has extensively examined general-purpose web search engines, domain-specific contexts—such as sports information—remain comparatively underexplored. In this study, we analyze over 400,000 interaction log entries from a sports-oriented search engine collected over a two-week period. Our analysis combines classic query-level metrics (e.g., frequency distributions, query lengths) with a detailed examination of click behavior, including entropy-based intent variability and a custom query quality scoring model. Compared to established baselines from general and specialized search environments, we observe a high proportion of new and single-term queries, as well as a notable lack of representativeness among top queries. These findings reveal patterns shaped by the event-driven and entity-centric nature of sports content, offering actionable insights for the design of domain-specific retrieval systems. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
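
As a rough illustration of what an entropy-based measure of intent variability can look like, the sketch below computes the Shannon entropy of the click distribution for a single query: 0 bits when every click lands on the same result, and higher values when clicks spread across many results. This is the standard click-entropy formulation and is only assumed to resemble the measure used in the study; the function name `click_entropy` and the result identifiers are hypothetical.

```python
import math
from collections import Counter


def click_entropy(clicked_results):
    """Shannon entropy (in bits) of the clicked results for one query."""
    counts = Counter(clicked_results)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no clicks recorded, so no observable variability
    # Sum of p * log2(1 / p) over the distinct clicked results
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical click logs for two queries (identifiers are placeholders).
focused = click_entropy(["team_page"] * 4)
spread = click_entropy(["team_page", "player_page", "match_page", "news_page"])
```

Here `focused` is 0 bits, because all clicks agree, and `spread` is 2 bits, because four results are clicked equally often.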

2026

Abnormal Human Behaviour Detection Using Normalising Flows and Attention Mechanisms

Authors
Nogueira, AFR; Oliveira, HP; Teixeira, LF;

Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS, IBPRIA 2025, PT I

Abstract
The aim of this work is to explore normalising flows to detect anomalous behaviours, which is an essential task mainly for applications related to surveillance systems. To accomplish that, a series of ablation studies were performed by varying the parameters of the Spatio-Temporal Graph Normalising Flows (STG-NF) model [3] and combining it with attention mechanisms. Out of all these experiments, it was only possible to improve the state-of-the-art result for the UBnormal dataset by 3.4 percentage points (pp), for the Avenue by 4.7 pp and for the Avenue-HR by 3.2 pp. However, further research is still urgently needed to find a model that can give the best performance across different scenarios. The inaccuracies of the pose tracking and estimation algorithm seem to be the main factor limiting the models' performance. The code is available at https://github.com/AnaFilipaNogueira/Abnormal-Human-Behaviour-Detection-using-Normalising-Flows-and-Attention-Mechanisms.
