Publications

Publications by CRACS

2021

Can Fake News Detection Models Maintain the Performance through Time? A Longitudinal Evaluation of Twitter Publications

Authors
Guimaraes, N; Figueira, A; Torgo, L;

Publication
MATHEMATICS

Abstract
The negative impact of false information on social networks is rapidly growing. Current research on the topic has focused on the detection of fake news in a particular context or event (such as elections) or using data from a short period of time. Therefore, an evaluation of the current proposals in a long-term scenario where the topics discussed may change is lacking. In this work, we deviate from current approaches to the problem and instead focus on a longitudinal evaluation using social network publications spanning an 18-month period. We evaluate different combinations of features and supervised models in a long-term scenario where the training and testing data are ordered chronologically, and thus the robustness and stability of the models can be evaluated through time. We experimented with 3 different scenarios where the models are trained with 15-, 30-, and 60-day data periods. The results show that detection models trained with word-embedding features are the ones that perform better and are less likely to be affected by the change of topics (for example, the rise of COVID-19 conspiracy theories). Furthermore, the additional days of training data also increase the performance of the best feature/model combinations, although not very significantly (around 2%). The results presented in this paper build the foundations towards a more pragmatic approach to the evaluation of fake news detection models in social networks.
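The chronologically ordered evaluation described in this abstract can be illustrated with a sliding-window split. The sketch below is only an illustration, not the authors' protocol: the function name `chronological_splits`, the `test_days` parameter, and the choice to advance the window by one test period are all assumptions. The abstract only states that training periods of 15, 30, and 60 days were used.

```python
from datetime import date, timedelta


def chronological_splits(dates, train_days, test_days):
    """Yield (train_idx, test_idx) pairs where training data always precedes test data.

    `dates` holds one datetime.date per sample. The window layout (train window
    followed directly by a test window, then a shift of `test_days`) is an
    assumption made for illustration.
    """
    start, end = min(dates), max(dates)
    train_len = timedelta(days=train_days)
    test_len = timedelta(days=test_days)
    # Keep sliding while a full train + test window still fits in the data.
    while start + train_len + test_len <= end + timedelta(days=1):
        train_end = start + train_len
        test_end = train_end + test_len
        train_idx = [i for i, d in enumerate(dates) if start <= d < train_end]
        test_idx = [i for i, d in enumerate(dates) if train_end <= d < test_end]
        yield train_idx, test_idx
        start += test_len


# Hypothetical data: one sample per day for 40 days.
sample_dates = [date(2020, 1, 1) + timedelta(days=k) for k in range(40)]
splits = list(chronological_splits(sample_dates, train_days=15, test_days=5))
```

Each split can then be passed to any supervised model; because no test sample predates a training sample, a drop in performance across successive splits reflects topic drift rather than leakage.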

2021

An organized review of key factors for fake news detection

Authors
Guimarães, N; Figueira, A; Torgo, L;

Publication
CoRR

Abstract

2021

Analysing students' interaction sequences on Moodle to predict academic performance

Authors
Cunha, A; Figueira, Á;

Publication
CEUR Workshop Proceedings

Abstract
As e-Learning systems have become gradually prevalent, forcing a (sometimes needed) physical distance between lecturers and their students, new methods need to emerge to fill this enlarging gap. Educators need, more than ever, systems capable of warning them (and the students) of situations that might create future problems for the learning process. The capacity to give and get feedback is naturally the best way to overcome this problem. However, in e-learning contexts, with dozens or hundreds of students, the solution becomes less simple. In this work we propose a system capable of continuously giving feedback on the performance of the students based on the interaction sequences they undertake with the LMS. This work innovates in what concerns the sequences of activity accesses together with the computation of the duration of these online learning activities, which are then encoded and fed into machine learning algorithms. We used a longitudinal experiment from five academic years. From our set of classifiers, the Random Forest obtained the best results for preventing low grades, with an accuracy of nearly 87%.

2021

Time series analysis via network science: Concepts and algorithms

Authors
Silva, VF; Silva, ME; Ribeiro, P; Silva, F;

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
There is nowadays a constant flux of data being generated and collected in all types of real world systems. These data sets are often indexed by time, space, or both, requiring appropriate approaches to analyze the data. In univariate settings, time series analysis is a mature field. However, in multivariate contexts, time series analysis still presents many limitations. In order to address these issues, the last decade has brought approaches based on network science. These methods involve transforming an initial time series data set into one or more networks, which can be analyzed in depth to provide insight into the original time series. This review provides a comprehensive overview of existing mapping methods for transforming time series into networks for a wide audience of researchers and practitioners in machine learning, data mining, and time series. Our main contribution is a structured review of existing methodologies, identifying their main characteristics and their differences. We describe the main conceptual approaches, provide authoritative references and give insight into their advantages and limitations in a unified way and language. We first describe the case of univariate time series, which can be mapped to single layer networks, and we divide the current mappings based on the underlying concept: visibility, transition, and proximity. We then proceed with multivariate time series discussing both single layer and multiple layer approaches. Although still very recent, this research area has much potential and with this survey we intend to pave the way for future research on the topic. This article is categorized under: Fundamental Concepts of Data and Knowledge > Data Concepts; Fundamental Concepts of Data and Knowledge > Knowledge Representation
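As one concrete instance of a visibility-based mapping, the sketch below builds a natural visibility graph from a univariate series. It is a minimal illustration and not necessarily the formulation used in the review: two observations are linked when the straight line between them passes strictly above every observation in between. The function name and the brute-force strategy are assumptions chosen for clarity over speed.

```python
def natural_visibility_graph(series):
    """Return the edge set of the natural visibility graph of `series`.

    Nodes are the time indices 0..n-1. Nodes a < b are connected when every
    intermediate point c lies strictly below the line joining (a, y_a) and
    (b, y_b). This brute-force version checks every pair of points.
    """
    n = len(series)
    edges = set()
    for a in range(n):
        for b in range(a + 1, n):
            visible = all(
                series[c] < series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.add((a, b))
    return edges


# A valley in the middle keeps the two end points mutually visible.
valley = natural_visibility_graph([3, 1, 2])
# A peak in the middle blocks the line of sight between the end points.
peak = natural_visibility_graph([1, 3, 1])
```

The resulting edge set can be handed to any network analysis library, so that measures such as the degree distribution serve as features of the original series.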

2021

A Survey on Subgraph Counting: Concepts, Algorithms, and Applications to Network Motifs and Graphlets

Authors
Ribeiro, P; Paredes, P; Silva, MEP; Aparicio, D; Silva, F;

Publication
ACM COMPUTING SURVEYS

Abstract
Computing subgraph frequencies is a fundamental task that lies at the core of several network analysis methodologies, such as network motifs and graphlet-based metrics, which have been widely used to categorize and compare networks from multiple domains. Counting subgraphs is, however, computationally very expensive, and there has been a large body of work on efficient algorithms and strategies to make subgraph counting feasible for larger subgraphs and networks. This survey aims precisely to provide a comprehensive overview of the existing methods for subgraph counting. Our main contribution is a general and structured review of existing algorithms, classifying them on a set of key characteristics, highlighting their main similarities and differences. We identify and describe the main conceptual approaches, giving insight on their advantages and limitations, and we provide pointers to existing implementations. We initially focus on exact sequential algorithms, but we also do a thorough survey on approximate methodologies (with a trade-off between accuracy and execution time) and parallel strategies (that need to deal with an unbalanced search space).
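The simplest non-trivial case of exact sequential subgraph counting is counting triangles. The sketch below is an illustrative example only and is not taken from the survey: it assumes an undirected simple graph given as an edge list with comparable node labels, and the function name `count_triangles` is hypothetical.

```python
def count_triangles(edges):
    """Count triangles in an undirected graph given as (u, v) pairs.

    Each triangle {u, v, w} is counted once by enforcing u < v < w.
    Node labels are assumed to be mutually comparable (e.g. integers).
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    total = 0
    for u, neighbours in adjacency.items():
        for v in neighbours:
            if u < v:
                # Common neighbours of u and v close a triangle.
                total += sum(1 for w in neighbours & adjacency[v] if w > v)
    return total


# Complete graph on 4 nodes: every choice of 3 nodes forms a triangle.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
# One triangle with a pendant edge attached.
tailed = [(0, 1), (1, 2), (0, 2), (2, 3)]
```

Even this small case hints at the cost noted in the abstract: the work grows with the size of neighbourhood intersections, and larger subgraph patterns multiply the number of candidate node combinations to examine.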

2021

Energy-aware adaptive offloading of soft real-time jobs in mobile edge clouds

Authors
Silva, J; Marques, ERB; Lopes, LMB; Silva, F;

Publication
JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS

Abstract
We present a model for measuring the impact of offloading soft real-time jobs over multi-tier cloud infrastructures. The jobs originate in mobile devices and offloading strategies may choose to execute them locally, in neighbouring devices, in cloudlets or in infrastructure cloud servers. Within this specification, we put forward several such offloading strategies characterised by their differential use of the cloud tiers with the goal of optimizing execution time and/or energy consumption. We implement an instance of the model using Jay, a software framework for adaptive computation offloading in hybrid edge clouds. The framework is modular and allows the model and the offloading strategies to be seamlessly implemented while providing the tools to make informed runtime offloading decisions based on system feedback, namely through a built-in system profiler that gathers runtime information such as workload, energy consumption and available bandwidth for every participating device or server. The results show that offloading strategies sensitive to runtime conditions can effectively and dynamically adjust their offloading decisions to produce significant gains in terms of their target optimization functions, namely, execution time, energy consumption and fulfilment of job deadlines.
