Publications

Publications by LIAAD

2022

ZeroBERTo: Leveraging Zero-Shot Text Classification by Topic Modeling

Authors
Alcoforado, A; Ferraz, TP; Gerber, R; Bustos, E; Oliveira, AS; Veloso, BM; Siqueira, FL; Costa, AHR

Publication
COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROPOR 2022

Abstract
Traditional text classification approaches often require a good amount of labeled data, which is difficult to obtain, especially in restricted domains or less widespread languages. This lack of labeled data has led to the rise of low-resource methods, which assume low data availability in natural language processing. Among them, zero-shot learning stands out; it consists of learning a classifier without any previously labeled data. The best results reported with this approach use language models such as Transformers, but suffer from two problems: high execution time and inability to handle long texts as input. This paper proposes a new model, ZeroBERTo, which leverages an unsupervised clustering step to obtain a compressed data representation before the classification task. We show that ZeroBERTo has better performance for long inputs and shorter execution time, outperforming XLM-R by about 12% in F1 score on the FolhaUOL dataset.

2022

Personalised Combination of Multi-Source Data for User Profiling

Authors
Veloso, B; Leal, F; Malheiro, B

Publication
Lecture Notes in Networks and Systems

Abstract
Human interaction with intelligent systems, services, and devices generates large volumes of user-related data. This multi-source information can be used to build richer user profiles and improve personalization. Our goal is to combine multi-source data to create user profiles by assigning dynamic individual weights. This paper describes a multi-source user profiling methodology and illustrates its application with a film recommendation system. The contemplated data sources include (i) personal history, (ii) explicit preferences (ratings), and (iii) social activities (likes, comments, or shares). The MovieLens dataset was selected and adapted to assess our approach by comparing the standard and the proposed methodologies. In the standard approach, we calculate the best global weights to apply to the different profile sources and generate all user profiles accordingly. In the proposed approach, we determine, for each user, individual weights for the different profile sources. The approach proved to be an efficient solution to a complex problem by continuously updating the individual data source weights and improving the accuracy of the generated personalised multimedia recommendations. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
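The difference between global and per-user source weights described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `combine_profiles`, the genre scores, and the weight values are all hypothetical, and the sketch assumes a profile is a simple weighted average of per-source preference scores. How the paper actually learns and updates the weights is not covered here.

```python
from typing import Dict

# Hypothetical representation: a profile maps an item category to a score.
Profile = Dict[str, float]


def combine_profiles(sources: Dict[str, Profile],
                     weights: Dict[str, float]) -> Profile:
    """Weighted average of per-source profiles (weights are normalised)."""
    total = sum(weights[name] for name in sources)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined: Profile = {}
    for name, profile in sources.items():
        share = weights[name] / total  # normalised weight of this source
        for item, score in profile.items():
            combined[item] = combined.get(item, 0.0) + share * score
    return combined


# Illustrative data for one user (assumed values, not from the paper).
sources = {
    "history": {"drama": 0.8, "comedy": 0.2},   # (i) personal history
    "ratings": {"drama": 0.4, "comedy": 0.6},   # (ii) explicit preferences
    "social":  {"drama": 0.0, "comedy": 1.0},   # (iii) social activities
}

# Standard approach: one set of global weights shared by every user.
global_weights = {"history": 1.0, "ratings": 1.0, "social": 1.0}
# Proposed approach: weights chosen individually for this user.
user_weights = {"history": 3.0, "ratings": 1.0, "social": 0.0}

global_profile = combine_profiles(sources, global_weights)
user_profile = combine_profiles(sources, user_weights)
```

With these assumed numbers, the global weights favour comedy, while the individual weights, which trust this user's viewing history most, favour drama.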

2022

Simulation, modelling and classification of wiki contributors: Spotting the good, the bad, and the ugly

Authors
Garcia-Mendez, S; Leal, F; Malheiro, B; Burguillo-Rial, JC; Veloso, B; Chis, AE; Gonzalez-Velez, H

Publication
SIMULATION MODELLING PRACTICE AND THEORY

Abstract
Data crowdsourcing is a data acquisition process where groups of voluntary contributors feed platforms with highly relevant data ranging from news, comments, and media to knowledge and classifications. It typically processes user-generated data streams to provide and refine popular services such as wikis, collaborative maps, e-commerce sites, and social networks. Nevertheless, this modus operandi raises severe concerns regarding ill-intentioned data manipulation in adversarial environments. This paper presents a simulation, modelling, and classification approach to automatically identify human and non-human (bots) as well as benign and malign contributors by using data fabrication to balance classes within experimental data sets, data stream modelling to build and update contributor profiles and, finally, autonomic data stream classification. By employing WikiVoyage - a free worldwide wiki travel guide open to contribution from the general public - as a testbed, our approach proves to significantly boost the confidence and quality of the classifier by using a class-balanced data stream, comprising both real and synthetic data. Our empirical results show that the proposed method distinguishes between benign and malign bots as well as human contributors with a classification accuracy of up to 92%.

2022

Smart Contracts for the CloudAnchor Platform

Authors
Vasco, E; Veloso, B; Malheiro, B

Publication
Advances in Practical Applications of Agents, Multi-Agent Systems, and Complex Systems Simulation. The PAAMS Collection - 20th International Conference, PAAMS 2022, L'Aquila, Italy, July 13-15, 2022, Proceedings

Abstract
CloudAnchor is a multi-agent brokerage platform for the negotiation of Infrastructure as a Service cloud resources between Small and Medium Sized Enterprises, acting either as providers or consumers. This project entails the research, design, and implementation of a smart contract solution to permanently record and manage contractual and behavioural stakeholder data on a blockchain network. Smart contracts enable safe contract code execution, increasing trust between parties and ensuring the integrity and traceability of the chained contents. The defined smart contracts represent the inter-business trustworthiness and Service Level Agreements established within the platform. CloudAnchor interacts with the blockchain network through a dedicated Application Programming Interface, which coordinates and optimises the submission of transactions. The performed tests indicate the success of this integration: (i) the number and value of negotiated resources remain identical; and (ii) the run-time increases due to the inherent latency of the blockchain operation. Nonetheless, the introduced latency does not affect the brokerage performance, proving to be an appropriate solution for reliable partner selection and contractual enforcement between untrusted parties. This novel approach stores all brokerage strategic knowledge in a distributed, decentralised, and immutable database. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

2022

Interpretable Success Prediction in Higher Education Institutions Using Pedagogical Surveys

Authors
Leal, F; Veloso, B; Pereira, CS; Moreira, F; Durao, N; Silva, NJ

Publication
SUSTAINABILITY

Abstract
The indicators of student success at higher education institutions are continuously analysed to increase student enrolment in multiple scientific areas. Every semester, the students respond to a pedagogical survey that aims to collect their opinions of curricular units in terms of content and teaching methodologies. Using this information, we intend to anticipate success in higher-level courses and prevent dropouts. Specifically, this paper contributes an interpretable student classification method. The proposed solution relies on (i) a pedagogical survey to collect students' opinions; (ii) a statistical data analysis to validate the reliability of the survey; and (iii) machine learning algorithms to classify the success of a student. In addition, the proposed method includes an explainable mechanism to interpret the classifications and their main factors. This transparent pipeline was designed to have implications in both digital and sustainable education, impacting the three pillars of sustainability, i.e., economic, social, and environmental, where transparency is a cornerstone. The work was assessed with a dataset from a Portuguese higher-level institution, contemplating multiple courses from different departments. The most promising results were achieved with Random Forest, presenting 98% in accuracy and F-measure.

2022

Challenges of Data-Driven Decision Models: Implications for Developers and for Public Policy Decision-Makers

Authors
Teixeira, S; Rodrigues, JC; Veloso, B; Gama, J

Publication
Advances in Urban Design and Engineering

Abstract
