Publications

Publications by LIAAD

2020

Profiling high leverage points for detecting anomalous users in telecom data networks

Authors
Tabassum, S; Azad, MA; Gama, J;

Publication
ANNALS OF TELECOMMUNICATIONS

Abstract
Fraud in telephony causes huge revenue losses and poses a menace to both service providers and legitimate users. The problem is growing alongside advancing technologies, yet work in this area is hindered by limited data availability and the confidentiality of approaches. In this work, we address the problem of detecting different types of unsolicited users, from spammers to fraudsters, in a massive phone call network. Most malicious users in telecommunications share some characteristics, which can be defined by a set of features whose values are uncommon for normal users. We used graph-based metrics to detect profiles that lie significantly far from the common user profiles in a real data log with millions of users. To achieve this, we looked for the high leverage points in the 99.99th percentile, which identified a substantial number of users as extreme anomalous points. Furthermore, clustering these points helped distinguish malicious users efficiently and reduced the problem space significantly. The learned profiles of the detected users coincided with fraudulent behaviors.
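As a rough illustration of the leverage idea in the abstract above, the following Python sketch computes leverage scores (the diagonal of the hat matrix) for a single hypothetical feature, the number of calls made per user, and flags the points whose score exceeds the 99.99th percentile. The feature, the function names, and the one-feature design are assumptions made for illustration only; the paper relies on several graph-based metrics, and its exact procedure is not described in this listing.

```python
from statistics import mean, quantiles


def leverage_scores(values):
    """Hat-matrix diagonal for a one-feature design with an intercept.

    For this simple case the leverage of point i is
    1/n + (x_i - mean)^2 / sum_j (x_j - mean)^2.
    """
    n = len(values)
    centre = mean(values)
    spread = sum((v - centre) ** 2 for v in values)
    if spread == 0:
        # All points identical: every point has the same leverage.
        return [1.0 / n] * n
    return [1.0 / n + (v - centre) ** 2 / spread for v in values]


def high_leverage_indices(values, cuts=10_000):
    """Indices whose leverage lies above the last of `cuts` quantile cut points.

    With cuts=10_000 the last cut point is the 99.99th percentile.
    """
    scores = leverage_scores(values)
    threshold = quantiles(scores, n=cuts)[-1]
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical data: 9,999 ordinary users and one extremely active user.
calls_per_user = [5 + (i % 7) for i in range(9_999)] + [4_000]
flagged = high_leverage_indices(calls_per_user)
```

With several features, the same scores would come from the diagonal of X (XᵀX)⁻¹ Xᵀ, which needs a linear algebra library and is left out of this sketch.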

2020

Detecting Geographical Competitive Structure for POI Visit Dynamics

Authors
Fujii, T; Kumano, M; Gama, J; Kimura, M;

Publication
Complex Networks & Their Applications IX - Volume 2, Proceedings of the Ninth International Conference on Complex Networks and Their Applications, COMPLEX NETWORKS 2020, 1-3 December 2020, Madrid, Spain.

Abstract
We provide a framework for analyzing geographical influence networks that affect visit event sequences for a set of points of interest (POIs) in a city. Since mutually-exciting Hawkes processes can naturally model temporal event data and capture interactions between those events, previous work presented a probabilistic model based on Hawkes processes, called the CHP model, for finding cooperative structure among online items from their share event sequences. In this paper, based on Hawkes processes, we propose a novel probabilistic model, called the RH model, for detecting geographical competitive structure in the set of POIs, and present a method of inferring it from the POI visit event history. We mathematically derive an analytical approximation formula for predicting the popularity of each POI under the RH model, and also extend the CHP model to extract geographical cooperative structure. Using synthetic data, we first confirm the effectiveness of the inference method and the validity of the approximation formula. Using real data from Location-Based Social Networks (LBSNs), we demonstrate the significance of the RH model in terms of predicting future events, and uncover the latent geographical influence networks from the perspective of geographical competitive and cooperative structures. © 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.
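For readers unfamiliar with mutually-exciting Hawkes processes, the sketch below evaluates the textbook conditional intensity with an exponential kernel: a baseline rate for the target POI plus a decaying contribution from every earlier visit event. This is general background only and not the RH or CHP model from the paper, whose formulations are not given in this listing. The POI names, the parameter values, and the function name are hypothetical.

```python
import math


def intensity(target, t, events, baseline, alpha, beta):
    """Conditional intensity of `target` at time t.

    events   : list of (time, source) pairs observed so far
    baseline : dict mapping each node to its background rate
    alpha    : alpha[source][target] is the excitation that an event at
               `source` adds to `target`
    beta     : decay rate of the exponential kernel
    """
    rate = baseline[target]
    for time, source in events:
        if time < t:  # only strictly earlier events contribute
            rate += alpha[source][target] * math.exp(-beta * (t - time))
    return rate


# Hypothetical two-POI example: a visit to the cafe excites the museum.
baseline = {"cafe": 0.1, "museum": 0.2}
alpha = {
    "cafe": {"cafe": 0.0, "museum": 0.5},
    "museum": {"cafe": 0.0, "museum": 0.0},
}
events = [(0.0, "cafe")]
rate_after = intensity("museum", 1.0, events, baseline, alpha, beta=1.0)
rate_before = intensity("museum", 0.0, events, baseline, alpha, beta=1.0)
```

In this standard form every event can only raise the intensity of other nodes, which is why it suits cooperative structure; how competition is modeled is specific to the paper.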

2020

Discovering locations and habits from human mobility data

Authors
Andrade, T; Cancela, B; Gama, J;

Publication
ANNALS OF TELECOMMUNICATIONS

Abstract
Human mobility patterns are associated with many aspects of our lives. With the growing popularity and pervasiveness of smartphones and portable devices, the Internet of Things (IoT) is becoming a permanent part of our daily routines. Positioning technologies that serve these devices, such as the cellular antenna (GSM networks), global navigation satellite systems (GPS), and more recently the WiFi positioning system (WPS), provide large amounts of spatio-temporal data in a continuous way (data streams). To understand human behavior, detecting important places and the movements between them is a fundamental task. This work proposes a method for discovering user habits from mobility data without any a priori or external knowledge. Our approach extends a density-based clustering method for spatio-temporal data to identify the meaningful places that individuals visit. On top of that, a Gaussian mixture model (GMM) is applied to the movements between visits to automatically separate the trajectories according to key identifiers that may help describe a habit. By regrouping trajectories that look alike by day of the week, length, and starting hour, we discover the individual's habits. The proposed method is evaluated on three real-world datasets. One dataset contains high-density GPS data; the others use GSM mobile phone data with a 15-min sampling rate and Google Location History data with a variable sampling rate. The results show that the proposed pipeline is suitable for this task, as habits other than just going from home to work and vice versa were found. This method can be used to understand personal behavior and to create user profiles, revealing a panorama of human mobility patterns from raw mobility data.
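The place-detection step described above builds on density-based clustering. As a minimal sketch of that family of methods, here is plain DBSCAN over two-dimensional points, chosen only because it is a familiar density-based algorithm; the paper's spatio-temporal extension and its GMM step are not reproduced. The coordinates are hypothetical and assumed to be already projected to a planar system, so Euclidean distance is meaningful.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep growing
    return labels


# Hypothetical fixes: two dense places and one isolated point.
fixes = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
places = dbscan(fixes, eps=1.5, min_pts=3)
```

Each resulting cluster would correspond to a candidate place; the isolated fix is labeled as noise.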

2020

From mobility data to habits and common pathways

Authors
Andrade, T; Cancela, B; Gama, J;

Publication
EXPERT SYSTEMS

Abstract
Many aspects of our lives are associated with places and the activities we perform on a daily basis. Most of them are recurrent and require the individual to travel between regular places, such as work, school, or other important personal locations. To accomplish these recurrent daily activities, people tend to follow regular paths with similar temporal and spatial characteristics, especially because humans frequently look for uniformity to support their decisions and make their actions easier or even automatic. In this work, we propose a method for discovering common pathways across users' habits from human mobility data. Using a density-based clustering algorithm, we identify the locations the users visit most. We then apply a Gaussian mixture model over these places to automatically separate, among all traces, the trajectories that follow patterns, in order to discover representations of individuals' habits. Using the longest common sub-sequence algorithm, we search for the most similar trajectories within the set of users' habitual trips by considering the distance that pairs of users or habits share on the same path. The proposed method is evaluated on two real-world GPS datasets, and the results show that the approach is able to detect the most important places in a user's life, detect routine activities, and identify common routes between users with similar habits, paving the way for research on carpooling, recommendation, and prediction systems.
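The longest common sub-sequence step mentioned above can be illustrated with the standard dynamic-programming algorithm. The sketch below treats each trip as a sequence of hypothetical place labels; how the paper encodes trajectories and turns the result into a shared distance is not stated in this listing, so the encoding here is an assumption.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)  # results for the previous row of the table
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)  # extend the common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # skip x or skip y
        prev = curr
    return prev[-1]


# Hypothetical trips expressed as ordered place labels.
trip_a = ["home", "bakery", "station", "office"]
trip_b = ["home", "station", "gym", "office"]
shared = lcs_length(trip_a, trip_b)  # "home", "station", "office"
```

Keeping only two rows of the table keeps memory proportional to the shorter dimension, which matters when many pairs of trips are compared.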

2020

Multi-label Stream Classification with Self-Organizing Maps

Authors
Cerri, R; Costa Júnior, JD; Faria Paiva, ERd; da Gama, JMP;

Publication
CoRR

Abstract

2020

Gastric Microbiome Diversities in Gastric Cancer Patients from Europe and Asia Mimic the Human Population Structure and Are Partly Driven by Microbiome Quantitative Trait Loci

Authors
Cavadas, B; Camacho, R; Ferreira, JC; Ferreira, RM; Figueiredo, C; Brazma, A; Fonseca, NA; Pereira, L;

Publication
MICROORGANISMS

Abstract
The human gastrointestinal tract harbors approximately 100 trillion microorganisms, with microbial compositions that differ across geographic locations. In this work, we used RNASeq data from stomach samples of non-diseased individuals (164 of European ancestry) and gastric cancer patients (137 from Europe and Asia) from public databases. Although these data were intended to characterize human expression profiles, they allowed for a reliable inference of the microbiome composition, as confirmed by measures such as genus coverage, richness, and evenness. The microbiome diversity (weighted UniFrac distances) in gastric cancer mimics host diversity across the world, with European gastric microbiome profiles clustering together, distinct from Asian ones. Despite the confirmed loss of microbiome diversity from a healthy status to a cancer status, the structured profile was still recognizable in the disease condition. In concordance with the parallel host-bacteria population structure, we found 16 human loci (non-synonymous variants) in the European-descendant cohorts that were significantly associated with the abundance of specific genera. These microbiome quantitative trait loci display heterogeneity between population groups and are mainly linked to the immune system or to cellular features that may play a role in enabling microbe colonization and inflammation.
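The abstract refers to richness and evenness without naming the indices used. As a hedged illustration, the sketch below computes richness as the number of genera with nonzero counts, Shannon diversity, and Pielou's evenness (Shannon diversity divided by the logarithm of richness). The choice of these indices, the genus counts, and the function name are assumptions, not details taken from the paper.

```python
from math import log


def diversity_summary(counts):
    """Return (richness, Shannon diversity, Pielou evenness) for genus counts."""
    present = [c for c in counts.values() if c > 0]
    richness = len(present)
    total = sum(present)
    # Shannon diversity: sum of p * ln(1/p) over the observed genera.
    shannon = sum((c / total) * log(total / c) for c in present)
    # Evenness is undefined for fewer than two genera; report 0.0 in that case.
    evenness = shannon / log(richness) if richness > 1 else 0.0
    return richness, shannon, evenness


# Hypothetical read counts per genus for one sample.
sample = {"Helicobacter": 50, "Streptococcus": 50, "Prevotella": 0}
richness, shannon, evenness = diversity_summary(sample)
```

Two equally abundant genera give the maximum evenness of 1; a sample dominated by one genus would score close to 0.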
