Publications

Publications by LIAAD

2023

Estimating Instantaneous Vehicle Emissions

Authors
Andrade, T; Gama, J

Publication
38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023

Abstract
Road transportation emissions have increased in the last few decades and have been the primary source of pollutants in urban areas with ever-growing populations. In this context, it is important to have effective measures to monitor road emissions across a region. An emissions inventory that maps road emissions based on vehicle trips can help. In this work, we show that it is possible to use raw GPS data to estimate vehicle-related levels of pollution in a region. By transforming the data using feature engineering and calculating the vehicle-specific power (VSP), as well as the emissions of several specific pollutants using a microscopic emissions model, we show the areas with higher emission levels produced by a fleet of taxis in Porto, Portugal.
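As an illustration of the GPS-to-VSP step mentioned in the abstract, the following is a minimal sketch, not the authors' pipeline. It assumes the commonly cited light-duty VSP approximation (in kW per tonne), with speed in m/s, acceleration in m/s², and road grade as a fraction. The abstract does not state which formulation or coefficients were used, and the function names here are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def vsp(speed, accel, grade=0.0):
    """Vehicle-specific power (kW/tonne); coefficients are an assumed light-duty approximation."""
    return speed * (1.1 * accel + 9.81 * grade + 0.132) + 0.000302 * speed ** 3


def vsp_from_gps(points):
    """points: time-sorted list of (t_seconds, lat, lon) with distinct timestamps.

    Returns one VSP value per pair of consecutive segments (flat road assumed).
    """
    # Segment speeds, stamped with the time of the segment's end point.
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(points, points[1:]):
        speeds.append((t1, haversine_m(la0, lo0, la1, lo1) / (t1 - t0)))
    # Acceleration from consecutive segment speeds, then VSP.
    return [vsp(v1, (v1 - v0) / (t1 - t0)) for (t0, v0), (t1, v1) in zip(speeds, speeds[1:])]
```

A microscopic emissions model would then map each VSP value to per-pollutant emission rates. That mapping depends on the specific model and is not sketched here.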

2023

A DTW Approach for Complex Data: A Case Study with Network Data Streams

Authors
Silva, PR; Vinagre, J; Gama, J

Publication
38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023

Abstract
Dynamic Time Warping (DTW) is a robust method to measure the similarity between two sequences. This paper proposes a method based on DTW to analyse high-speed data streams. The central idea is to decompose the network traffic into sequences of histograms of packet sizes and then calculate the distance between pairs of such sequences using DTW with Kullback-Leibler (KL) distance. As a baseline, we also compute the Euclidean Distance between the sequences of histograms. Since our preliminary experiments indicate that the distance between two sequences falls within a different range of values for distinct types of streams, we then exploit this distance information for stream classification using a Random Forest. The approach was investigated using recent internet traffic data from a telecommunications company. To illustrate the application of our approach, we conducted a case study with encrypted Internet Protocol Television (IPTV) network traffic data. The goal was to use our DTW-based approach to detect the video codec used in the streams, as well as the IPTV channel. Results strongly suggest that the DTW distance value between the data streams is highly informative for such classification tasks.
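The core idea of the abstract, DTW over sequences of packet-size histograms with a KL-based local distance, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The smoothing constant `eps`, the function names, and the use of the asymmetric KL divergence KL(p || q) as the local cost are choices made here for the example.

```python
import math


def kl_distance(p, q, eps=1e-9):
    """Smoothed KL divergence KL(p || q) between two histograms of equal length."""
    p_total = sum(p) + eps * len(p)
    q_total = sum(q) + eps * len(q)
    d = 0.0
    for pi, qi in zip(p, q):
        pn = (pi + eps) / p_total  # normalise counts to probabilities
        qn = (qi + eps) / q_total
        d += pn * math.log(pn / qn)
    return d


def dtw(seq_a, seq_b, dist=kl_distance):
    """Classic O(n*m) DTW cost between two sequences of histograms."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because KL divergence is asymmetric, `dtw(a, b)` and `dtw(b, a)` can differ. A symmetrised variant could be passed through the `dist` parameter if that matters. The resulting distance values would then serve as features for a classifier such as a Random Forest, as the abstract describes.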

2023

Data Stream Analytics

Authors
Aguilar Ruiz, S; Bifet, A; Gama, J

Publication
Analytics

Abstract
[No abstract available]

2023

Towards federated learning: An overview of methods and applications

Authors
Silva, PR; Vinagre, J; Gama, J

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Federated learning (FL) is a collaborative, decentralized, privacy-preserving method to attack the challenges of storing data and data privacy. Artificial intelligence, machine learning, smart devices, and deep learning have strongly marked the last years. Two challenges arose in data science as a result. First, regulators protected data by creating the General Data Protection Regulation, under which organizations are not allowed to keep or transfer data without the owner's authorization. Second, the large volume of data generated in the era of big data makes keeping that data on a single server increasingly difficult. Therefore, the data is allocated to different locations or generated by devices, creating the need to build models or perform calculations without transferring data to a single location. The new term FL emerged as a sub-area of machine learning that aims to solve the challenge of making distributed models with privacy considerations. This survey starts by describing relevant concepts, definitions, and methods, followed by an in-depth investigation of federated model evaluation. Finally, we discuss three promising applications for further research: anomaly detection, distributed data streams, and graph representation. This article is categorized under: Technologies > Machine Learning; Technologies > Artificial Intelligence.

2023

WINTENDED: WINdowed TENsor decomposition for Densification Event Detection in time-evolving networks

Authors
Fernandes, S; Fanaee T, H; Gama, J; Tisljaric, L; Smuc, T

Publication
MACHINE LEARNING

Abstract
Densification events in time-evolving networks refer to instants in which the network density, that is, the number of edges, is substantially larger than in the remaining instants. These events can occur at a global level, involving the majority of the nodes in the network, or at a local level, involving only a subset of nodes. While global densification events affect the overall structure of the network, the same does not hold for local densification events, which may remain undetected by existing detection methods. To address this issue, we propose WINdowed TENsor decomposition for Densification Event Detection (WINTENDED) for the detection and characterization of both global and local densification events. Our method combines a sliding window decomposition with statistical tools to capture the local dynamics of the network and automatically find the irregular behaviours. According to our experimental evaluation, WINTENDED is able to spot global densification events at least as accurately as its competitors, while also being able to find local densification events, unlike its competitors.

2023

Social network analytics and visualization: Dynamic topic-based influence analysis in evolving micro-blogs

Authors
Tabassum, S; Gama, J; Azevedo, PJ; Cordeiro, M; Martins, C; Martins, A

Publication
EXPERT SYSTEMS

Abstract
Influence Analysis is one of the well-known areas of Social Network Analysis. However, discovering influencers from micro-blog networks based on topics has gained recent popularity due to its specificity. Besides, these data networks are massive, continuous and evolving. Therefore, to address the above challenges, we propose a dynamic framework for topic modelling and identifying influencers in the same process. It incorporates dynamic sampling, community detection and network statistics over a graph data stream from a social media activity management application. Further, we compare the graph measures against each other empirically and observe that there is no evidence of correlation between the set of users having a large number of friends and the set of users whose posts achieve high acceptance (i.e., highly liked, commented and shared posts). Therefore, we propose a novel approach that incorporates a user's reachability and also acceptability by other users. Consequently, we improve on graph metrics by including a dynamic acceptance score (integrating content quality with network structure) for ranking influencers in micro-blogs. Additionally, we analysed the topic clusters' structure and quality with empirical experiments and visualization.
