2013
Authors
Vasco, D; Rodrigues, PP; Gama, J;
Publication
2013 IEEE 26TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS)
Abstract
Anomalies in data can cause many problems in data analysis. It is therefore necessary to improve data quality by detecting and eliminating errors and inconsistencies in the data, a process known as data cleaning [1]. Since detecting and correcting anomalies requires detailed domain knowledge, the involvement of domain experts is essential to the success of data cleaning. However, given the volume of data to be processed, the process should be as automatic as possible to minimize the time spent [1]. © 2013 IEEE.
2013
Authors
Almeida, E; Kosina, P; Gama, J;
Publication
Proceedings of the 28th Annual ACM Symposium on Applied Computing, SAC '13, Coimbra, Portugal, March 18-22, 2013
Abstract
Existing works suggest that random inputs and random features produce good results in classification. In this paper we study the problem of generating random rule sets from data streams. One of the most interpretable and flexible models for data stream mining prediction tasks is the Very Fast Decision Rules learner (VFDR). In this work we extend the VFDR algorithm using random rules from data streams. The proposed algorithm generates several sets of rules. Each rule set is associated with a set of Natt attributes. The proposed algorithm maintains all properties required when learning from stationary data streams: online and any-time classification, processing each example once. Copyright 2013 ACM.
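The random-attribute idea in this abstract can be illustrated with a minimal Python sketch. It only shows how each rule set might be tied to a random subset of Natt attributes and how per-set predictions could be combined. The function names, the fixed seed, and the majority-vote combination are assumptions made for illustration; the abstract does not state how the rule sets are aggregated, and the rule learner itself (VFDR) is not reproduced here.

```python
import random
from collections import Counter


def draw_attribute_subsets(attributes, n_rule_sets, n_att, seed=0):
    """Draw one random subset of n_att attributes per rule set (hypothetical helper)."""
    rng = random.Random(seed)
    return [sorted(rng.sample(attributes, n_att)) for _ in range(n_rule_sets)]


def project(example, subset):
    """Keep only the attributes that a given rule set is allowed to see."""
    return {attr: example[attr] for attr in subset}


def majority_vote(predictions):
    """Combine per-rule-set predictions; majority voting is an assumption."""
    return Counter(predictions).most_common(1)[0][0]


attributes = ["a1", "a2", "a3", "a4", "a5"]
subsets = draw_attribute_subsets(attributes, n_rule_sets=3, n_att=2)
example = {"a1": 0.3, "a2": 1.7, "a3": 0.0, "a4": 5.1, "a5": 2.2}
views = [project(example, s) for s in subsets]
```

Each view would be passed once, in arrival order, to the rule learner that owns the corresponding subset, which keeps the single-pass property mentioned in the abstract.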
2013
Authors
Gama, J; Kosina, P; Almeida, E;
Publication
DISCOVERY SCIENCE
Abstract
The presence of anomalies in data compromises data quality and can reduce the effectiveness of learning algorithms. Standard data mining methodologies treat data cleaning as a pre-processing step before the learning task. The problem of data cleaning is exacerbated when learning in the computational model of data streams. In this paper we present a streaming algorithm for learning classification rules that is able to detect contextual anomalies in the data. Contextual anomalies are surprising attribute values in the context defined by the conditional part of a rule. For each example we compute the degree of anomaliness based on the probability of the attribute values given the conditional part of the rule covering the example. Examples with a high degree of anomaliness are signaled to the user and are not used to train the classifier. The experimental evaluation on real-world data sets shows the ability to discover anomalous examples in the data. The main advantage of the proposed method is its ability to report the context and explain why the anomaly occurs.
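A minimal Python sketch of the contextual-anomaly idea follows. It keeps, for one rule, counts of the attribute values seen among the examples the rule has covered, scores a new example by how improbable its values are in that context, and withholds high-scoring examples from training. The class name, the Laplace smoothing, the averaging of per-attribute scores, and the threshold are all assumptions; the abstract does not give the exact formula.

```python
from collections import Counter, defaultdict


class RuleContext:
    """Value statistics for examples covered by one rule (hypothetical helper)."""

    def __init__(self):
        self.n = 0
        self.counts = defaultdict(Counter)  # attribute -> value -> count

    def update(self, example):
        """Add a covered example to the rule's statistics."""
        self.n += 1
        for attr, value in example.items():
            self.counts[attr][value] += 1

    def value_probability(self, attr, value):
        """Laplace-smoothed P(value | rule); the smoothing is an assumption."""
        seen = self.counts[attr]
        n_values = len(seen) + (0 if value in seen else 1)
        return (seen[value] + 1) / (self.n + n_values)

    def anomaliness(self, example):
        """Average of 1 - P(value | rule) over attributes (assumed aggregate)."""
        if not example:
            return 0.0
        scores = [1.0 - self.value_probability(a, v) for a, v in example.items()]
        return sum(scores) / len(scores)

    def process(self, example, threshold=0.4):
        """Return True and skip training if the example looks anomalous."""
        if self.n > 0 and self.anomaliness(example) > threshold:
            return True
        self.update(example)
        return False


ctx = RuleContext()
for _ in range(50):
    ctx.process({"color": "red", "size": "small"})
usual = ctx.anomaliness({"color": "red", "size": "small"})
odd = ctx.anomaliness({"color": "blue", "size": "small"})
flagged = ctx.process({"color": "blue", "size": "small"})
```

Because the score is computed per attribute, the attribute with the lowest probability can be reported alongside the rule's conditions, which is one way to give the kind of explanation the abstract describes.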
2013
Authors
Moreira Matias, L; Fernandes, R; Gama, J; Ferreira, M; Mendes Moreira, J; Damas, L;
Publication
CEUR Workshop Proceedings
Abstract
Rising fuel costs are making random cruising strategies for finding passengers unaffordable. This paper presents a recommendation model that suggests the most passenger-profitable urban area or taxi stand. The framework combines 1) the underlying historical patterns of passenger demand and 2) the current network status to decide which zone is the best to head to at each moment. The major contribution of this work is the way it combines well-known methods for learning from data streams (such as the historical GPS traces) to solve this particular problem. The results were promising: 395,361 of the 506,873 dispatched services (roughly 78%) were correctly predicted. The experiments also showed that a fleet equipped with this framework outperformed a fleet without it: its average waiting time to pick up a passenger was 5% lower than that of its competitor. © 2013 IJCAI.
2013
Authors
Sebastião, R; da Silva, MM; Rabiço, R; Gama, J; Mendonça, T;
Publication
Evolving Systems
Abstract
This paper presents a real-time algorithm for change detection in depth of anesthesia (DoA) signals. A Page-Hinkley test (PHT) with a forgetting mechanism (PHT-FM) was developed. The samples are weighted according to their "age" so that more importance is given to recent samples. This enables changes to be detected with less delay than if no forgetting factor were used. The performance of the PHT-FM was evaluated in a two-fold approach. First, the algorithm was run offline on DoA signals previously collected during general anesthesia, allowing the adjustment of the forgetting mechanism. Second, the PHT-FM was embedded in real-time software and its performance was validated online in the surgery room. This was done by asking the clinician to classify the changes in real time as true positives, false positives or false negatives. The results show that 69% of the changes were classified as true positives, 26% as false positives, and 5% as false negatives. The true positives were also synchronized with changes in the hypnotic or analgesic rates made by the clinician. This work has a high impact on clinical practice, since the PHT-FM alerts the clinician to changes in the anesthetic state of the patient, allowing more prompt action. The results encourage the inclusion of the proposed PHT-FM in a real-time decision support system for routine use in clinical practice. © 2012 Springer-Verlag.
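The following Python sketch shows one way a Page-Hinkley test can be combined with forgetting. It uses an exponential fading factor on both the running mean and the cumulative deviation, so older samples count less. This is an illustrative variant under stated assumptions, not the PHT-FM as published: the class name, the parameter values (`delta`, `threshold`, `alpha`), the exponential form of the weighting, and the restriction to upward changes are all choices made for this example.

```python
class PageHinkleyFading:
    """Page-Hinkley test for upward shifts with exponential forgetting (sketch)."""

    def __init__(self, delta=0.005, threshold=5.0, alpha=0.99):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm threshold (lambda in the usual notation)
        self.alpha = alpha          # fading factor in (0, 1]; 1 means no forgetting
        self.reset()

    def reset(self):
        self.weighted_sum = 0.0  # faded sum of samples
        self.weight = 0.0        # faded count of samples
        self.cum = 0.0           # faded cumulative deviation from the mean
        self.min_cum = 0.0       # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one sample; return True if a change is signaled."""
        self.weighted_sum = self.alpha * self.weighted_sum + x
        self.weight = self.alpha * self.weight + 1.0
        mean = self.weighted_sum / self.weight
        self.cum = self.alpha * self.cum + (x - mean - self.delta)
        self.min_cum = min(self.min_cum, self.cum)
        if self.cum - self.min_cum > self.threshold:
            self.reset()  # start monitoring afresh after an alarm
            return True
        return False


# Synthetic signal (not real DoA data): a level shift after 200 samples.
signal = [0.0] * 200 + [5.0] * 50
detector = PageHinkleyFading()
alarms = [i for i, x in enumerate(signal) if detector.update(x)]
```

Detecting decreases as well would need a mirrored statistic that tracks the maximum of the cumulative deviation; the fading factor would typically be tuned offline on recorded signals, as the abstract describes for the forgetting mechanism.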
2013
Authors
Oliveira, M; Gama, J;
Publication
EXPERT SYSTEMS
Abstract
Visualization of static social networks is a mature research field in information visualization. Conventional approaches rely on node-link diagrams that provide a representation of the network topology by representing nodes as points and links between them as lines. However, the increasing availability of longitudinal network data has spurred interest in visualization techniques that go beyond the static node-link representation of a network. In temporal settings, the focus is on the network dynamics at different levels of analysis (e.g. node, community and whole network). Yet developing visualizations that provide actionable insights into the different types of changes occurring in the network, and into their impact on both the neighbourhood and the overall network structure, is a challenging task. In such settings, traditional node-link representations can prove to be limited (Yi et al., 2010). Alternative methods, such as matrix graph representations, fail in tasks involving path finding (Ghoniem et al., 2005). This work attempts to overcome these issues by proposing a methodology for tracking the evolution of dynamic social networks, at both the node level and the community level, based on the concept of temporal trajectory. We use third-order tensors to represent evolving social networks, and we decompose them using a Tucker3 model. The two most representative components of this model define the 2D space onto which the trajectories of social entities are projected. To illustrate the proposed methodology, we conduct a case study using a set of temporal self-reported friendship networks.
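As a rough illustration of the tensor step, the NumPy sketch below stacks adjacency matrices into a third-order tensor (actors × actors × time), computes a truncated Tucker decomposition, and projects each actor at each time point onto the two leading components. Several things here are assumptions: the decomposition is obtained with a truncated higher-order SVD rather than whatever fitting procedure the paper uses, the way trajectories are formed from the second-mode factor is a guess at one plausible reading, and the random data merely stands in for the friendship networks.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: rows index the chosen mode, columns the other two."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def hosvd(tensor, ranks):
    """Truncated higher-order SVD, one common way to obtain a Tucker3 model."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = np.einsum("ijk,ia,jb,kc->abc", tensor, *factors)
    return core, factors


def trajectories(tensor, factor):
    """Project each actor's row in every time slice onto the 2D component space.

    Returns an array of shape (actors, time, 2); the projection rule is an assumption.
    """
    return np.einsum("ijt,jc->itc", tensor, factor[:, :2])


rng = np.random.default_rng(0)
# Hypothetical data: 6 actors observed at 4 time points, binary ties.
networks = (rng.random((6, 6, 4)) < 0.3).astype(float)
core, (a, b, c) = hosvd(networks, ranks=(2, 2, 2))
paths = trajectories(networks, b)
```

Plotting `paths[i, :, 0]` against `paths[i, :, 1]` for a given actor `i` would then trace that actor's movement through the component space over time.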