Publications

Publications by LIAAD

2005

On predicting protein secondary structure from their aminoacid sequences using Inductive Logic Programming

Authors
Magalhaes, A; Fonseca, NA;

Publication
2005 Portuguese Conference on Artificial Intelligence, Proceedings

Abstract
We address the problem of predicting the stability of secondary structure motifs of proteins given their linear sequence of residues. Our study is restricted to the prediction of helix structures. We have applied an Inductive Logic Programming (ILP) system to automatically synthesise the predictive rules. ILP systems are well known for being able to induce comprehensible models for data. Furthermore, the model's components are definitions provided by a domain expert, which makes the model more likely to help in understanding the underlying process that produced the data. Our methodology has two stages. First, the system induces a model (a set of rules) using just structural information and groupings of the residues, to avoid biases introduced by the domain expert. In the second stage, the residues' properties are used to make the induced rules chemically and biologically appealing. We claim that this methodology is also valuable for general Structure-Activity Relationship (SAR) problems.

2005

On applying tabling to inductive logic programming

Authors
Rocha, R; Fonseca, N; Costa, VS;

Publication
Machine Learning: ECML 2005, Proceedings

Abstract
Inductive Logic Programming (ILP) is an established subfield of Machine Learning. Nevertheless, it is recognized that efficiency and scalability are major obstacles to an increased usage of ILP systems in complex applications with large hypothesis spaces. In this work, we focus on improving the efficiency and scalability of ILP systems by exploring tabling mechanisms available in the underlying Logic Programming systems. Tabling is an implementation technique that improves the declarativeness and performance of Prolog systems by reusing answers to subgoals. To validate our approach, we ran the April ILP system in the YapTab Prolog tabling system using two well-known datasets. The results obtained show impressive gains without changing the accuracy and quality of the theories generated.

2005

Strategies to parallelize ILP systems

Authors
Fonseca, NA; Silva, F; Camacho, R;

Publication
Inductive Logic Programming, Proceedings

Abstract
It is well known among Inductive Logic Programming (ILP) practitioners that ILP systems usually take a long time to find valuable models (theories). The problem is especially critical for large datasets, preventing ILP systems from scaling up to larger applications. One approach to reducing the execution time has been the parallelization of ILP systems. In this paper we give an overview of the state of the art in parallel ILP implementations and present work on the evaluation of some major parallelization strategies for ILP. Conclusions about the applicability of each strategy are presented.

2005

Imitation networks and organizational survival in the Portuguese industry

Authors
Campos, P; Brazdil, P;

Publication
2005 Portuguese Conference on Artificial Intelligence, Proceedings

Abstract
This paper aims to evaluate the impact of imitation networks on organizations' survival rates within a Portuguese industrial cluster. We used a Multi-Agent framework to represent the industrial cluster, its firms and the rules underlying the imitation strategies. Several experiments were based on the density dependence model, where vital rates are related to the size of the population (population density). We have concluded that imitation seems to improve the vital dynamics of the population and that present information about a firm is enough to establish an imitation network.

2005

Predicting relative performance of classifiers from samples

Authors
Leite, R; Brazdil, P;

Publication
ICML 2005 - Proceedings of the 22nd International Conference on Machine Learning

Abstract
This paper is concerned with the problem of predicting the relative performance of classification algorithms. It focusses on methods that use results on small samples and discusses the shortcomings of previous approaches. A new variant is proposed that, like some previous approaches, exploits meta-learning. The method requires that experiments be conducted on a few samples. The information gathered is used to identify the nearest learning curve for which the sampling procedure was carried out fully. This in turn makes it possible to generate a prediction regarding the relative performance of algorithms. Experimental evaluation shows that the method competes well with previous approaches and provides a good and practical solution to this problem.

2005

Meta-Learning

Authors
Vilalta, R; Carrier, CGG; Brazdil, P;

Publication
The Data Mining and Knowledge Discovery Handbook.

Abstract