Publications

Publications by LIAAD

2024

Predicting macroeconomic indicators from online activity data: A review

Authors
Costa, EA; Silva, ME;

Publication
Statistical Journal of the IAOS

Abstract
Predictors of macroeconomic indicators rely primarily on traditional data sourced from National Statistical Offices. However, new data sources made available from recent technological advancements, namely data from online activities, have the potential to bring about fresh perspectives on monitoring economic activities and enhance the accuracy of forecasting. This paper reviews the literature on predicting macroeconomic indicators, such as the gross domestic product, unemployment rate, consumer price index or private consumption, based on online activity data sourced from Google Trends, Twitter (rebranded to X) and mobile devices. Based on a systematic search of publications indexed on the Web of Science and Scopus databases, the analysis of a final set of 56 publications covers the publication history of the data sources, the methods used to model the data and the predictive accuracy of information from such data sources. The paper also discusses the limitations and challenges of using online activity data for macroeconomic predictions. The review concludes that online activity data can be a valuable source of information for predicting macroeconomic indicators. However, one must consider certain limitations and challenges to improve the models' accuracy and reliability. © 2024 - IOS Press. All rights reserved.

2024

Document Level Event Extraction from Narratives

Authors
Cunha, LF;

Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT V

Abstract
One of the fundamental tasks in Information Extraction (IE) is Event Extraction (EE), an extensively studied and challenging task [13,15], which aims to identify and classify events from the text. This involves identifying the event's central word (trigger) and its participants (arguments) [1]. These elements capture the event semantics and structure, which have applications in various fields, including biomedical texts [42], cybersecurity [24], economics [12], literature [32], and history [33]. Structured knowledge derived from EE can also benefit other downstream tasks such as Question Answering [20,30], Natural Language Understanding [21], Knowledge Base Graphs [3,37], summarization [8,10,41] and recommendation systems [9,18]. Despite the existence of several English EE systems [2,22,25,26], they face limited portability to other languages [4] and most of them are designed for closed domains, posing difficulties in generalising. Furthermore, most current EE systems restrict their scope to the sentence level, assuming that all arguments are contained within the same sentence as their corresponding trigger. However, real-world scenarios often involve event arguments spanning multiple sentences, highlighting the need for document-level EE.

2024

Boosting English-Amharic machine translation using corpus augmentation and Transformer

Authors
Biadgligne, Y; Smaili, K;

Publication
Interciencia

Abstract
The Transformer-based neural machine translation (NMT) model has been very successful in recent years and has become the new mainstream method. However, using it for low-resourced languages requires large amounts of data and efficient model configuration (hyperparameter tuning) mechanisms. The scarcity of parallel texts is a bottleneck for high-quality (N)MT, especially for under-resourced languages like Amharic. This paper therefore presents an attempt to improve English-Amharic MT by introducing three different vanilla Transformer architectures with different hyperparameter values. To obtain additional training material, offline token-level corpus augmentation was applied to the previously collected English-Amharic parallel corpus. Compared to previous work on Amharic MT, the best of the three Transformer models has achieved state-of-the-art BLEU scores. We were able to achieve this result by employing corpus augmentation techniques and hyperparameter tuning.

2024

Data-Centric Federated Learning for Anomaly Detection in Smart Grids and Other Industrial Control Systems

Authors
Perdigao, D; Cruz, T; Simoes, P; Abreu, PH;

Publication
PROCEEDINGS OF 2024 IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM, NOMS 2024

Abstract
Energy smart grids and other modern industrial control system networks impose considerable security management challenges due to several factors: their broad geographic dispersion and capillarity, the constrained nature of many of the devices and network links that integrate them, and the fact that they are often fragmented across multiple domains, owned and managed by different entities which often have non-aligned or even competing interests. Given this scenario, we propose to improve federated learning-based anomaly detection for smart grids and other industrial control networks, using a federated data-centric methodology that attends to the balance and causality of the data, improving the representation of the different classes of anomalies in the ingested data, which directly impacts the classifier's performance. The proposed approach shows up to 33% performance improvements in terms of F1-score for attack classification, compared to the baseline federated approach (not attending to class imbalance and causality) on a broad range of industrial control systems traffic datasets.

2024

A Survey on Group Fairness in Federated Learning: Challenges, Taxonomy of Solutions and Directions for Future Research

Authors
Salazar, T; Araújo, H; Cano, A; Abreu, PH;

Publication
CoRR

Abstract

2024

A Perspective on the Missing at Random Problem: Synthetic Generation and Benchmark Analysis

Authors
Cabrera Sánchez, JF; Pereira, RC; Abreu, PH; Silva Ramírez, EL;

Publication
IEEE Access

Abstract
