Publications

Publications by Rui Camacho

2007

Distributed generative data mining

Authors
Ramos, R; Camacho, R;

Publication
ADVANCES IN DATA MINING: THEORETICAL ASPECTS AND APPLICATIONS, PROCEEDINGS

Abstract
A process of Knowledge Discovery in Databases (KDD) involving large amounts of data requires a considerable amount of computational power. The process may be run on dedicated and expensive machinery or, for some tasks, one can use distributed computing techniques on a network of affordable machines. In either approach, the user usually specifies the workflow of the sub-tasks composing the whole KDD process before execution starts. In this paper we propose a technique that we call Distributed Generative Data Mining. The generative feature of the technique is due to its capability of generating new sub-tasks of the Data Mining (DM) analysis process at execution time. The workflow of DM sub-tasks is, therefore, dynamic. To deploy the proposed technique we extended the Distributed Data Mining system HARVARD and adapted an Inductive Logic Programming system (IndLog) used in a Relational Data Mining task. As a proof-of-concept, the extended system was used to analyse an artificial dataset of a credit scoring problem with eighty million records.

2005

Topic 5 - Parallel and Distributed Databases, Data Mining and Knowledge Discovery

Authors
Talia, D; Kargupta, H; Valduriez, P; Camacho, R;

Publication
Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30 - September 2, 2005, Proceedings

2003

Improving the efficiency of ILP systems

Authors
Camacho, R;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
Inductive Logic Programming (ILP) is a promising technology for knowledge extraction applications. ILP has produced intelligible solutions in a wide variety of domains where it has been applied. ILP's lack of efficiency is, however, a major impediment to its scalability to applications requiring large amounts of data. In this paper we propose a set of techniques that improve the efficiency of ILP systems and make them more likely to scale up to applications of knowledge extraction from large datasets. In particular, we propose and evaluate the lazy evaluation of examples. Lazy evaluation is essentially a way to avoid or postpone the evaluation of the generated hypotheses (coverage tests). The techniques were evaluated using the IndLog system on ILP datasets referenced in the literature. The proposals lead to substantial efficiency improvements and are generally applicable to any ILP system.

2004

Preface

Authors
Camacho, R; King, R; Srinivasan, A;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

1994

Building symbolic representations of intuitive real-time skills from performance data

Authors
Michie, D; Camacho, R;

Publication
Machine Intelligence 13

1995

Behavioral Cloning: A Correction

Authors
Camacho, R; Michie, D;

Publication
AI Magazine
