2007
Authors
Ramos, R; Camacho, R;
Publication
ADVANCES IN DATA MINING: THEORETICAL ASPECTS AND APPLICATIONS, PROCEEDINGS
Abstract
A process of Knowledge Discovery in Databases (KDD) involving large amounts of data requires a considerable amount of computational power. The process may be done on a dedicated and expensive machinery or, for some tasks, one can use distributed computing techniques on a network of affordable machines. In either approach it is usual the user to specify the workflow of the sub-tasks composing the whole KDD process before execution starts. In this paper we propose a technique that we call Distributed Generative Data Mining. The generative feature of the technique is due to its capability of generating new sub-tasks of the Data Mining analysis process at execution time. The workflow of sub-tasks of the DM is, therefore, dynamic. To deploy the proposed technique we extended the Distributed Data Mining system HARVARD and adapted an Inductive Logic Programming system (IndLog) used in a Relational Data Ming task. As a proof-of-concept, the extended system was used to analyse an artificial dataset of a credit scoring problem with eighty million records.
2005
Authors
Talia, D; Kargupta, H; Valduriez, P; Camacho, R;
Publication
Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30 - September 2, 2005, Proceedings
Abstract
2003
Authors
Camacho, R;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE
Abstract
Inductive Logic Programming (ILP) is a promising technology for knowledge extraction applications. ILP has produced intelligible solutions for a wide variety of domains where it has been applied. The ILP lack of efficiency is, however, a major impediment for its scalability to applications requiring large amounts of data. In this paper we propose a set of techniques that improve ILP systems efficiency and make then more likely to scale up to applications of knowledge extraction from large datasets. We propose and evaluate the lazy evaluation of examples, to improve the efficiency of ILP systems. Lazy evaluation is essentially a way to avoid or postpone the evaluation of the generated hypotheses (coverage tests). The techniques were evaluated using the IndLog system on ILP datasets referenced in the literature. The proposals lead to substantial efficiency improvements and are generally applicable to any ILP system.
2004
Authors
Camacho, R; King, R; Srinivasan, A;
Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
1994
Authors
Michie, D; Camacho, R;
Publication
Machine Intelligence 13
Abstract
1995
Authors
Camacho, R; Michie, D;
Publication
AI Magazine
Abstract
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.