Publications

Publications by HASLab

2005

Slicing Functional Programs by Calculation

Authors
Rodrigues, NF; Barbosa, LS;

Publication
Beyond Program Slicing, 06.11. - 11.11.2005

2005

Detection of hydrophobic clusters in molecular dynamics protein unfolding simulations using association rules

Authors
Azevedo, PJ; Silva, CG; Rodrigues, JR; Loureiro Ferreira, N; Brito, RMM;

Publication
BIOLOGICAL AND MEDICAL DATA ANALYSIS, PROCEEDINGS

Abstract
One way of exploring the protein unfolding events associated with the development of amyloid diseases is through multiple Molecular Dynamics Protein Unfolding Simulations. Analysing the huge amount of data generated by these simulations is not a trivial task. In the present report, we demonstrate the use of Association Rules to analyse the variation profiles of the Solvent Accessible Surface Area of the 127 amino-acid residues of the protein Transthyretin across multiple simulations. This allowed us to identify a set of 28 hydrophobic residues forming a hydrophobic cluster that might be essential in the unfolding and folding processes of Transthyretin.

2005

Protein sequence classification through relevant sequence mining and Bayes Classifiers

Authors
Ferreira, PG; Azevedo, PJ;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS

Abstract
We tackle the problem of sequence classification using relevant subsequences found in a dataset of labelled protein sequences. A subsequence is relevant if it is frequent and has a minimal length. For each query sequence, a vector of features is obtained. The features consist of the number and average length of the relevant subsequences shared with each of the protein families. Classification is performed by combining these features in a Bayes Classifier. The combination of these characteristics results in a multi-class and multi-domain method that requires neither data transformation nor background knowledge. We illustrate the performance of our method using three collections of protein datasets. The tests showed that the method performs on par with state-of-the-art methods in protein classification.

2005

Protein sequence pattern mining with constraints

Authors
Ferreira, PG; Azevedo, PJ;

Publication
KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2005

Abstract
Considering the characteristics of biological sequence databases, which typically have a small alphabet, very long sequences and a relatively small size (several hundred sequences), we propose a new sequence mining algorithm (gIL). gIL was developed for linear sequence pattern mining and results from the combination of some of the most efficient techniques used in sequence and itemset mining. The algorithm is highly adaptable, allowing a smooth and direct introduction of various types of features into the mining process, namely the extraction of rigid and arbitrary gap patterns. Both breadth-first and depth-first traversals are possible. The experimental evaluation, on synthetic and real-life protein databases, has shown that our algorithm has superior performance to state-of-the-art algorithms. The use of constraints has also proved to be a very useful tool for specifying the patterns of interest to the user.

2005

A Hybrid Method for Discovering Distance-Enhanced Inter-Transactional Rules

Authors
Ferreira, PG; Alves, R; Azevedo, PJ; Belo, O;

Publication
Actas de las X Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2005), September 14-16, 2005, Granada, Spain

2005

CMB'05: Workshop on Computational Methods in Bioinformatics

Authors
Camacho, R; Alves, A; da Costa, JP; Azevedo, P;

Publication
2005 Portuguese Conference on Artificial Intelligence, Proceedings
