2010
Autores
Castro, N; Azevedo, P;
Publicação
Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010
Abstract
Time series motif discovery is an important problem with applications in a variety of areas that range from telecommunications to medicine. Several algorithms have been proposed to solve the problem. However, these algorithms heavily use expensive random disk accesses or assume the data can fit into main memory. They only consider motifs at a single resolution and are not suited to interactivity. In this work, we tackle the motif discovery problem as an approximate Top-K frequent subsequence discovery problem. We fully exploit state of the art iSAX representation multiresolution capability to obtain motifs at different resolutions. This property yields interactivity, allowing the user to navigate along the Top-K motifs structure. This permits a deeper understanding of the time series database. Further, we apply the Top-K space saving algorithm to our frequent subsequences approach. A scalable algorithm is obtained that is suitable for data stream like applications where small memory devices such as sensors are used. Our approach is scalable and disk-efficient since it only needs one single pass over the time series database. We provide empirical evidence of the validity of the algorithm in datasets from different areas that aim to represent practical applications. Copyright © by SIAM.
2010
Autores
Azevedo, PJ; Jorge, AM;
Publicação
DATA MINING AND KNOWLEDGE DISCOVERY
Abstract
The ensembling of classifiers tends to improve predictive accuracy. To obtain an ensemble with N classifiers, one typically needs to run N learning processes. In this paper we introduce and explore Model Jittering Ensembling, where one single model is perturbed in order to obtain variants that can be used as an ensemble. We use as base classifiers sets of classification association rules. The two methods of jittering ensembling we propose are Iterative Reordering Ensembling (IRE) and Post Bagging (PB). Both methods start by learning one rule set over a single run, and then produce multiple rule sets without relearning. Empirical results on 36 data sets are positive and show that both strategies tend to reduce error with respect to the single model association rule classifier. A bias-variance analysis reveals that while both IRE and PB are able to reduce the variance component of the error, IRE is particularly effective in reducing the bias component. We show that Model Jittering Ensembling can represent a very good speed-up w.r.t. multiple model learning ensembling. We also compare Model Jittering with various state of the art classifiers in terms of predictive accuracy and computational efficiency.
2010
Autores
Pacheco, H; Cunha, A;
Publicação
MATHEMATICS OF PROGRAM CONSTRUCTION, PROCEEDINGS
Abstract
Lenses are one the most popular approaches to define bidirectional transformations between data models. A bidirectional transformation with view-update, denoted a lens, encompasses the definition of a forward transformation projecting concrete models into abstract views, together with a backward transformation instructing how to translate an abstract view to an update over concrete models. In this paper we show that most of the standard point-free combinators can be lifted to lenses with suitable backward semantics, allowing us to use the point-free style to define powerful bidirectional transformations by composition. We also demonstrate how to define generic lenses over arbitrary inductive data types by lifting standard recursion patterns, like folds or unfolds. To exemplify the power of this approach, we "lensify" some standard functions over naturals and lists, which are tricky to define directly "by-hand" using explicit recursion.
2010
Autores
Muschevici, R; Clarke, D; Proença, J;
Publicação
Software Product Lines - 14th International Conference, SPLC 2010, Jeju Island, South Korea, September 13-17, 2010. Workshop Proceedings (Volume 2 : Workshops, Industrial Track, Doctoral Symposium, Demonstrations and Tools)
Abstract
2010
Autores
Clarke, D; Proença, J;
Publicação
Software Product Lines - 14th International Conference, SPLC 2010, Jeju Island, South Korea, September 13-17, 2010. Workshop Proceedings (Volume 2 : Workshops, Industrial Track, Doctoral Symposium, Demonstrations and Tools)
Abstract
2010
Autores
Debattista, K; Proenca, AJ; Santos, LP;
Publicação
COMPUTERS & GRAPHICS-UK
Abstract
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.