2005
Authors
Rodrigues, NF; Barbosa, LS;
Publication
Beyond Program Slicing, 06.11. - 11.11.2005
Abstract
2005
Authors
Azevedo, PJ; Silva, CG; Rodrigues, JR; Loureiro Ferreira, N; Brito, RMM;
Publication
BIOLOGICAL AND MEDICAL DATA ANALYSIS, PROCEEDINGS
Abstract
One way of exploring the protein unfolding events associated with the development of amyloid diseases is through multiple Molecular Dynamics protein unfolding simulations. Analysing the huge amount of data generated by these simulations is not a trivial task. In the present report, we demonstrate the use of Association Rules to analyse the variation profiles of the Solvent Accessible Surface Area of the 127 amino-acid residues of the protein Transthyretin across multiple simulations. This allowed us to identify a set of 28 hydrophobic residues forming a hydrophobic cluster that might be essential in the unfolding and folding processes of Transthyretin.
2005
Authors
Ferreira, PG; Azevedo, PJ;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS
Abstract
We tackle the problem of sequence classification using relevant subsequences found in a dataset of labelled protein sequences. A subsequence is relevant if it is frequent and has a minimal length. For each query sequence a vector of features is obtained. The features consist of the number and average length of the relevant subsequences shared with each of the protein families. Classification is performed by combining these features in a Bayes classifier. The combination of these characteristics results in a multi-class and multi-domain method that requires neither data transformation nor background knowledge. We illustrate the performance of our method using three collections of protein datasets. The tests showed that the method performs comparably to state-of-the-art methods in protein classification.
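The feature extraction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "subsequence" means a contiguous substring and that "frequent" means occurring in at least `min_support` sequences of a family. The function names, the thresholds, and the toy sequences are all hypothetical. The Bayes combination step is not detailed in the abstract, so it is left out.

```python
from collections import defaultdict


def relevant_substrings(seqs, min_support, min_len):
    """Substrings of length >= min_len found in at least min_support sequences.

    Assumption: "subsequence" is read as a contiguous substring.
    """
    counts = defaultdict(int)
    for s in seqs:
        seen = set()
        for i in range(len(s)):
            for j in range(i + min_len, len(s) + 1):
                seen.add(s[i:j])
        # Count each substring once per sequence (support, not occurrences).
        for sub in seen:
            counts[sub] += 1
    return {sub for sub, c in counts.items() if c >= min_support}


def feature_vector(query, families, min_support=2, min_len=3):
    """Per family: (number, average length) of relevant substrings in query."""
    features = {}
    for name, seqs in families.items():
        patterns = relevant_substrings(seqs, min_support, min_len)
        shared = [p for p in patterns if p in query]
        n = len(shared)
        avg = sum(len(p) for p in shared) / n if n else 0.0
        features[name] = (n, avg)
    return features


# Toy data, invented for illustration only.
families = {
    "A": ["MKVLA", "MKVLG", "TMKVL"],
    "B": ["GGSPQ", "AGSPQ"],
}
result = feature_vector("QMKVLT", families)
```

In this toy case the query shares three relevant substrings with family A (`MKV`, `KVL`, `MKVL`) and none with family B. A classifier would then combine these per-family pairs into a class decision.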
2005
Authors
Ferreira, PG; Azevedo, PJ;
Publication
KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2005
Abstract
Considering the characteristics of biological sequence databases, which typically have a small alphabet, very long sequences and a relatively small size (several hundred sequences), we propose a new sequence mining algorithm (gIL). gIL was developed for linear sequence pattern mining and results from the combination of some of the most efficient techniques used in sequence and itemset mining. The algorithm is highly adaptable, allowing a smooth and direct introduction of various types of features into the mining process, namely the extraction of rigid and arbitrary gap patterns. Both breadth-first and depth-first traversals are possible. The experimental evaluation, on synthetic and real-life protein databases, has shown that our algorithm outperforms state-of-the-art algorithms. The use of constraints has also proved to be a very useful tool for specifying the patterns of interest to the user.
2005
Authors
Ferreira, PG; Alves, R; Azevedo, PJ; Belo, O;
Publication
Actas de las X Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2005), September 14-16, 2005, Granada, Spain
Abstract
2005
Authors
Camacho, R; Alves, A; da Costa, JP; Azevedo, P;
Publication
2005 Portuguese Conference on Artificial Intelligence, Proceedings
Abstract