2020
Authors
Gama, J; Pashami, S; Bifet, A; Mouchaweh, MS; Fröning, H; Pernkopf, F; Schiele, G; Blott, M;
Publication
IoT Streams/ITEM@PKDD/ECML
Abstract
2020
Authors
Veloso, B; Tabassum, S; Martins, C; Espanha, R; Azevedo, R; Gama, J;
Publication
ANNALS OF TELECOMMUNICATIONS
Abstract
The high asymmetry of international termination rates is fertile ground for fraud in telecom companies. International calls are priced higher than national ones, which attracts fraudsters. In this paper, we present a solution for a real problem called interconnect bypass fraud, more specifically, a newly identified distributed pattern that crosses different countries and keeps fraudsters from being tracked by almost all fraud detection techniques. This problem is one of the most significant in the telecommunication domain, and it exhibits abnormal behaviours such as bursts of calls from specific numbers. Based on this assumption, we propose the adoption of a new fast forgetting technique that works together with the Lossy Counting algorithm. We apply frequent set mining to capture distributed patterns from different countries. Our goal is to detect, as soon as possible, items with abnormal behaviours, e.g., bursts of calls, repetitions, mirrors, distributed behaviours and a small number of calls spread across a vast set of destination numbers. The results show that the application of different techniques improves the detection ratio and not only complements the techniques used by the telecom company but also improves the performance of the Lossy Counting algorithm in terms of run-time, memory usage and sensitivity in detecting abnormal behaviours. Additionally, the application of frequent set mining allows us to capture distributed fraud patterns.
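For readers unfamiliar with the Lossy Counting algorithm named in this abstract, the following is a minimal sketch of the standard, textbook version. It does not include the fast forgetting technique the paper proposes, and the function name `lossy_counting` and the parameter `epsilon` are illustrative choices, not taken from the paper.

```python
import math


def lossy_counting(stream, epsilon):
    """One-pass approximate frequency counts (standard Lossy Counting).

    Returns {item: (count, max_undercount)}. The true frequency of a kept
    item lies between count and count + max_undercount.
    """
    width = math.ceil(1 / epsilon)  # number of items per bucket
    entries = {}                    # item -> [count, delta]
    n = 0                           # items seen so far
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in entries:
            entries[item][0] += 1
        else:
            # delta bounds how many occurrences may have been missed
            entries[item] = [1, bucket - 1]
        if n % width == 0:
            # At each bucket boundary, drop items that are too infrequent
            entries = {k: v for k, v in entries.items() if v[0] + v[1] > bucket}
    return {k: tuple(v) for k, v in entries.items()}


# Hypothetical call log: caller "a" produces a burst, the rest call once
calls = ["a", "a", "a", "b", "a", "c", "a", "d"]
frequent = lossy_counting(calls, epsilon=0.25)
```

In this toy stream only the bursty caller survives pruning, which is the property that makes the algorithm attractive for spotting bursts of calls with bounded memory.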
2020
Authors
Tabassum, S; Azad, MA; Gama, J;
Publication
ANNALS OF TELECOMMUNICATIONS
Abstract
Fraud in telephony incurs huge revenue losses and is a menace to both service providers and legitimate users. This problem is growing alongside advancing technologies. Yet work in this area is hindered by the limited availability of data and the confidentiality of approaches. In this work, we deal with the problem of detecting different types of unsolicited users, from spammers to fraudsters, in a massive phone call network. Most malicious users in telecommunications share some characteristics. These characteristics can be defined by a set of features whose values are uncommon for normal users. We made use of graph-based metrics to detect profiles that are significantly far from the common user profiles in a real data log with millions of users. To achieve this, we looked for the high leverage points in the 99.99th percentile, which identified a substantial number of users as extreme anomalous points. Furthermore, clustering these points helped distinguish malicious users efficiently and reduced the problem space significantly. Convincingly, the learned profiles of these detected users coincided with fraudulent behaviors.
2020
Authors
Fujii, T; Kumano, M; Gama, J; Kimura, M;
Publication
Complex Networks & Their Applications IX - Volume 2, Proceedings of the Ninth International Conference on Complex Networks and Their Applications, COMPLEX NETWORKS 2020, 1-3 December 2020, Madrid, Spain.
Abstract
We provide a framework for analyzing geographical influence networks that have impacts on visit event sequences for a set of points of interest (POIs) in a city. Since mutually-exciting Hawkes processes can naturally model temporal event data and capture interactions between those events, previous work presented a probabilistic model based on Hawkes processes, called the CHP model, for finding cooperative structure among online items from their share event sequences. In this paper, based on Hawkes processes, we propose a novel probabilistic model, called the RH model, for detecting geographical competitive structure in the set of POIs, and present a method of inferring it from the POI visit event history. We mathematically derive an analytical approximation formula for predicting the popularity of each POI under the RH model, and also extend the CHP model so as to extract geographical cooperative structure. Using synthetic data, we first confirm the effectiveness of the inference method and the validity of the approximation formula. Using real data from Location-Based Social Networks (LBSNs), we demonstrate the significance of the RH model in terms of predicting future events, and uncover the latent geographical influence networks from the perspective of geographical competitive and cooperative structures.
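As background for the Hawkes processes this abstract builds on, here is a minimal sketch of the conditional intensity of a univariate Hawkes process with an exponential kernel. This is the generic textbook form, not the CHP or RH model from the paper; the names `hawkes_intensity`, `mu`, `alpha` and `beta` are assumptions chosen for illustration.

```python
import math


def hawkes_intensity(t, history, mu, alpha, beta):
    """Intensity lambda(t) = mu + sum over past events t_i < t of
    alpha * exp(-beta * (t - t_i)).

    mu    : baseline event rate
    alpha : jump in intensity caused by each past event (self-excitation)
    beta  : exponential decay rate of that excitation
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in history if t_i < t
    )
    return mu + excitation


# Hypothetical visit times at one POI (arbitrary time units)
visits = [1.0, 1.5, 4.0]
rate_soon_after = hawkes_intensity(1.6, visits, mu=0.5, alpha=0.8, beta=1.0)
rate_much_later = hawkes_intensity(10.0, visits, mu=0.5, alpha=0.8, beta=1.0)
```

Each past event raises the rate of future events, and the effect fades over time. A mutually-exciting variant lets events at one item change the intensity at another, which is what allows interaction structure to be inferred from event histories.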
2020
Authors
Andrade, T; Cancela, B; Gama, J;
Publication
ANNALS OF TELECOMMUNICATIONS
Abstract
Human mobility patterns are associated with many aspects of our life. With the increasing popularity and pervasiveness of smartphones and portable devices, the Internet of Things (IoT) is turning into a permanent part of our daily routines. Positioning technologies that serve these devices, such as the cellular antenna (GSM networks), global navigation satellite systems (GPS), and more recently the WiFi positioning system (WPS), provide large amounts of spatio-temporal data in a continuous way (data streams). In order to understand human behavior, the detection of important places and the movements between these places is a fundamental task. To that end, this work proposes a method for discovering user habits over mobility data without any a priori or external knowledge. Our approach extends a density-based clustering method for spatio-temporal data to identify meaningful places that individuals visit. On top of that, a Gaussian mixture model (GMM) is employed over movements between the visits to automatically separate the trajectories according to key identifiers that may help describe a habit. By regrouping trajectories that look alike by day of the week, length, and starting hour, we discover the individual's habits. The proposed method is evaluated on three real-world datasets. One dataset contains high-density GPS data and the others use GSM mobile phone data with a 15-min sampling rate and Google Location History data with a variable sampling rate. The results show that the proposed pipeline is suitable for this task, as habits other than just going from home to work and vice versa were found. This method can be used for understanding a person's behavior and creating their profile, revealing a panorama of human mobility patterns from raw mobility data.
2020
Authors
Andrade, T; Cancela, B; Gama, J;
Publication
EXPERT SYSTEMS
Abstract
Many aspects of our lives are associated with places and the activities we perform on a daily basis. Most of them are recurrent and require the individual to travel between regular places, such as going to work, school or other important personal locations. To accomplish these recurrent daily activities, people tend to follow regular paths with similar temporal and spatial characteristics, especially because humans frequently look for uniformity to support their decisions and make their actions easier or even automatic. In this work, we propose a method for discovering common pathways across users' habits from human mobility data. Using a density-based clustering algorithm, we identify the locations the users visit most. We then apply a Gaussian mixture model over these places to automatically separate, among all traces, the trajectories that follow patterns, in order to discover representations of individuals' habits. Using the longest common sub-sequence algorithm, we search for the trajectories that are most similar over the set of users' habit trips by considering the distance that pairs of users or habits share on the same path. The proposed method is evaluated on two real-world GPS datasets, and the results show that the approach is able to detect the most important places in a user's life, detect routine activities and identify common routes between users that have similar habits, paving the way for research techniques in carpooling, recommendation and prediction systems.
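To illustrate the longest common sub-sequence step mentioned in this abstract, the sketch below computes the LCS length of two trajectories with the standard dynamic-programming recurrence. It assumes each trajectory has already been reduced to a sequence of discrete place labels, which is a simplification; the paper works with the distance shared along a path, and the function name `lcs_length` and the sample labels are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b.

    Uses the classic recurrence, keeping only the previous row of the
    dynamic-programming table to save memory.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)   # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # skip x or skip y
        prev = curr
    return prev[-1]


# Hypothetical habit trips for two users, as ordered place labels
trip_u1 = ["home", "cafe", "gym", "work"]
trip_u2 = ["home", "gym", "mall", "work"]
shared = lcs_length(trip_u1, trip_u2)
```

A higher LCS length relative to the trip lengths suggests that two users follow much of the same route in the same order, which is the kind of signal a carpooling recommender could build on.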