2002
Authors
Jorge, A; Pocas, J; Azevedo, P;
Publication
DISCOVERY SCIENCE, PROCEEDINGS
Abstract
Association rule engines typically output a very large set of rules. Although association rules are regarded as highly comprehensible and useful for data mining and decision support in fields such as marketing, retail, and demographics, lengthy outputs may discourage users from using the technique. In this paper we propose a post-processing methodology and tool for browsing and visualizing large sets of association rules. The method is based on a set of operators that transform sets of rules into sets of rules, allowing the user to focus on interesting regions of the rule space. Each set of rules can then be viewed with different graphical representations. The tool is web-based and uses SVG. Association rules are given in PMML.
2002
Authors
Jorge, A; Moyle, S; Voss, A;
Publication
COLLABORATIVE BUSINESS ECOSYSTEMS AND VIRTUAL ENTERPRISES
Abstract
The basic principles of a methodology for remote collaborative data mining are proposed. Starting from CRISP-DM, a general process model for carrying out data mining projects, we describe how the principles of knowledge sharing and ease of communication can be embedded in the data mining process. The aim is to allow data mining projects to be executed with the participation of multiple experts working from distant locations. All the participants in such a project can profit from the knowledge produced by others and share their knowledge online with the other participants. The produced knowledge (for example data transformations, working hypotheses, models, results of experiments) is also stored for future inspection and use, in pursuit of organizational learning. A prototypical implementation (RAMSYS) of the remote collaborative methodology is described with examples.
2002
Authors
Peng, YH; Flach, PA; Soares, C; Brazdil, P;
Publication
DISCOVERY SCIENCE, PROCEEDINGS
Abstract
This paper presents new measures, based on the induced decision tree, to characterise datasets for meta-learning in order to select appropriate learning algorithms. The main idea is to capture the characteristics of a dataset from the structural shape and size of the decision tree induced from it. In total, 15 measures are proposed to describe the structure of a decision tree. Their effectiveness is illustrated through extensive experiments, by comparison with the results obtained by existing data characterisation techniques, including the data characteristics tool (DCT), the most widely used technique in meta-learning, and Landmarking, the most recently developed method.
2002
Authors
Soares, C; Brazdil, P;
Publication
ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2002, PROCEEDINGS
Abstract
Cross-validation (CV) is the most accurate method available for algorithm recommendation but it is rather slow. We show that information about the past performance of algorithms can be used for the same purpose with a small loss in accuracy and significant savings in experimentation time. We use a meta-learning framework that combines a simple IBL algorithm with a ranking method. We show that results improve significantly when using a set of selected measures that represent data characteristics which allow algorithm performance to be predicted. Our results also indicate that the choice of ranking method has a smaller effect on the quality of recommendations. Finally, we present situations that illustrate the advantage of providing the recommendation as a ranking of the candidate algorithms, rather than as the single algorithm which is expected to perform best.
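The combination of an instance-based learner with a ranking method described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the use of average ranks as the ranking method, and the names `average_ranks` and `recommend` are all assumptions made for illustration, and the meta-features are assumed to be already normalised.

```python
import math


def average_ranks(rankings):
    """Aggregate rankings (dicts: algorithm -> rank) into average ranks.

    Average ranking is an assumed choice of ranking method here.
    """
    algorithms = rankings[0].keys()
    return {a: sum(r[a] for r in rankings) / len(rankings) for a in algorithms}


def recommend(meta_features, history, k=2):
    """Rank algorithms for a new dataset from its k nearest past datasets.

    history: list of (meta_features, ranks) pairs for past datasets, where
    ranks maps each algorithm name to its observed rank (1 = best).
    Returns algorithm names ordered from best to worst average rank.
    """
    def distance(x, y):
        # Plain Euclidean distance between meta-feature vectors (assumption).
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

    nearest = sorted(history, key=lambda h: distance(meta_features, h[0]))[:k]
    avg = average_ranks([ranks for _, ranks in nearest])
    # Ties in average rank are broken alphabetically for determinism.
    return sorted(avg, key=lambda a: (avg[a], a))


# Hypothetical past results: (meta-features, rank of each algorithm).
history = [
    ((0.0, 0.0), {"A": 1, "B": 2, "C": 3}),
    ((0.1, 0.0), {"A": 1, "B": 3, "C": 2}),
    ((5.0, 5.0), {"A": 3, "B": 2, "C": 1}),
]
ranking = recommend((4.9, 5.0), history, k=1)
```

Returning the whole ordered list, rather than only its first element, reflects the abstract's point that a ranking lets the user fall back on the next candidate when the top one is unsuitable.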
2002
Authors
Gama, J; Castillo, G;
Publication
ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2002, PROCEEDINGS
Abstract
Several researchers have studied the application of Machine Learning techniques to the task of user modeling. As most of them pointed out, this task requires learning algorithms that work on-line, incorporate new information incrementally, and can deal with concept drift. In this paper we present Adaptive Bayes, an extension to the well-known naive Bayes, one of the most commonly used learning algorithms for the task of user modeling. Adaptive Bayes is an incremental learning algorithm that can work on-line. We have evaluated Adaptive Bayes on both frameworks. Using a set of benchmark problems from the UCI repository [2], and using several evaluation statistics, all the adaptive systems show significant advantages in comparison with their non-adaptive versions.
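The incremental, on-line behaviour the abstract asks of a learner can be illustrated with a minimal sketch of a plain naive Bayes classifier over categorical attributes whose counts are updated one example at a time. This is not Adaptive Bayes itself: the adaptation step that distinguishes it is not described in the abstract and is not reproduced here. The class name `IncrementalNaiveBayes`, the Laplace smoothing parameter `alpha`, and the toy data are assumptions made for illustration.

```python
import math
from collections import defaultdict


class IncrementalNaiveBayes:
    """Naive Bayes over categorical attributes, updated one example at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing (assumption)
        self.class_counts = defaultdict(int)    # class -> number of examples
        self.feature_counts = defaultdict(int)  # (class, index, value) -> count
        self.values = defaultdict(set)          # index -> values seen so far

    def update(self, x, y):
        """Incorporate a single labelled example without retraining."""
        self.class_counts[y] += 1
        for i, v in enumerate(x):
            self.feature_counts[(y, i, v)] += 1
            self.values[i].add(v)

    def predict(self, x):
        """Return the most probable class, or None if nothing was seen yet."""
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / total)
            for i, v in enumerate(x):
                n_values = len(self.values[i] | {v})
                count = self.feature_counts.get((c, i, v), 0)
                score += math.log(
                    (count + self.alpha) / (n_c + self.alpha * n_values)
                )
            if score > best_score:
                best, best_score = c, score
        return best


# Hypothetical stream of examples, processed in arrival order.
model = IncrementalNaiveBayes()
stream = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "cool"), "yes"),
]
for x, y in stream:
    model.update(x, y)
```

Because the model stores only counts, each new example costs time proportional to the number of attributes, which is what makes this family of learners suitable for on-line use.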
2002
Authors
Gama, J;
Publication
Machine Learning, Proceedings of the Nineteenth International Conference (ICML 2002), University of New South Wales, Sydney, Australia, July 8-12, 2002
Abstract