Publications - INESC TEC

Publications

Publications by LIAAD

2008

The number, age, sharing and relatedness of S-locus specificities in Prunus

Authors
Vieira, J; Fonseca, NA; Santos, RAM; Habu, T; Tao, R; Vieira, CP

Publication
GENETICS RESEARCH

Abstract
In gametophytic self-incompatibility systems, many specificities (different 'lock-and-key' combinations) are maintained by frequency-dependent selection for very long evolutionary times. In Solanaceae, trans-specific evolution (the observation that an allele from one species may be more closely related to an allele from another species than to others from the same species) has been taken as an argument for the very old age of specificities. In this work, by determining, for the first time, the age of extant Prunus species, we show that this reasoning cannot be applied to Prunoideae. Furthermore, since our sample size is large (all GenBank sequences of S-RNase, which encodes the female component, and of SFB, which encodes the male component), we were able to estimate the age of the oldest Prunus specificities. By doing so, we show that the lower variability levels at the Prunus S-locus, in comparison with Solanaceae, are due to the younger age of Prunus alleles, and not to a difference in silent mutation rates. We show that the ancestor of extant Prunus species harboured at least 102 specificities, in contrast to the maximum of 33 observed in extant Prunus species. Since the number of specificities that can be maintained in a population depends on the effective population size, this observation suggests a bottleneck in Prunus evolutionary history. Loss of specificities may have occurred during this event. Using only information on amino acid sites that determine specificity differences, and a simulation approach, we show that a model that assumes closely related specificities are not preferentially lost during evolution fails to predict the observed degree of specificity relatedness.


2008

Protein evolution of ANTP and PRD homeobox genes

Authors
Fonseca, NA; Vieira, CP; Holland, PWH; Vieira, J

Publication
BMC EVOLUTIONARY BIOLOGY

Abstract
Background: Although homeobox genes have been the subject of many studies, little is known about the main amino acid changes that occurred early in the evolution of genes belonging to different classes. Results: In this study, we report a method for the fast and efficient retrieval of sequences belonging to the ANTP (HOXL and NKL) and PRD classes. Furthermore, we look for diagnostic amino acid residues that can be used to distinguish HOXL, NKL and PRD genes. Conclusion: The reported protein features will facilitate the robust classification of homeobox genes from newly sequenced bilaterian genomes. Nevertheless, in non-bilaterian genomes our findings must be applied cautiously. In principle, as long as a good manually curated data set is available, the approach described here can be applied to non-bilaterian organisms as well. Our results help focus experimental studies on investigating the biochemical functions of key homeodomain residues in different gene classes.


2008

An S-RNase-based gametophytic self-incompatibility system evolved only once in eudicots

Authors
Vieira, J; Fonseca, NA; Vieira, CP

Publication
JOURNAL OF MOLECULAR EVOLUTION

Abstract
It has been argued that the common ancestor of about 75% of all dicots possessed an S-RNase-based gametophytic self-incompatibility (GSI) system. S-RNase genes should thus be found in most plant families showing GSI. The S-RNase gene (or a duplicate) may also acquire a new function, and thus genes belonging to the S-RNase lineage may also persist in plant families without GSI. Nevertheless, sequences that belong to the S-RNase lineage have been found in the Solanaceae, Scrophulariaceae, Rosaceae, Cucurbitaceae, and Fabaceae plant families only. Here we search for new sequences that may belong to the S-RNase lineage, using both a phylogenetic and a much faster and simpler amino acid pattern-based approach. We show that the two methods have an apparently similar false-negative rate of discovery (approximately 10%). The amino acid pattern-based approach produces about 15% false positives. Genes belonging to the S-RNase lineage are found in three new plant families, namely, the Rubiaceae, Euphorbiaceae, and Malvaceae. Acquisition of a new function by genes belonging to the S-RNase lineage is shown to be a frequent event. A putative S-RNase sequence is identified in Lotus, a plant genus for which molecular studies on GSI are lacking. The hypothesis of a single origin for S-RNase-based GSI (before the split of the Asteridae and Rosidae) is further supported by the finding of genes belonging to the S-RNase lineage in some of the oldest lineages of the Asteridae and Rosidae, and by Bayesian constrained tree analyses.


2008

Amino acid pairing at the N- and C-termini of helical segments in proteins

Authors
Fonseca, NA; Camacho, R; Magalhaes, AL

Publication
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS

Abstract
A systematic survey was carried out in an unbiased sample of 815 protein chains with a maximum of 20% homology selected from the Protein Data Bank, whose structures were solved at a resolution better than 1.6 Å and with an R-factor lower than 25%. A set of 5556 subsequences with α-helix or 3₁₀-helix motifs was extracted from the protein chains considered. Global and local propensities were then calculated for all possible amino acid pairs of the type (i, i + 1), (i, i + 2), (i, i + 3), and (i, i + 4), starting at the relevant helical positions N1, N2, N3, C3, C2, C1, and N-int (interior positions), and also at the first nonhelical positions in both termini of the helices, namely, N-cap and C-cap. The statistical analysis of the propensity values has shown that pairing is significantly dependent on the type of the amino acids and on the position of the pair. A few sequences of three and four amino acids were selected and their high prevalence in helices is outlined in this work. The Glu-Lys-Tyr-Pro sequence shows a peculiar distribution in proteins, which may suggest a relevant structural role in α-helices when Pro is located at the C-cap position. A bioinformatics tool was developed, which automatically and periodically updates the results and makes them available on a website.


2008

LogCHEM: Interactive Discriminative Mining of Chemical Structure

Authors
Costa, VS; Fonseca, NA; Camacho, R

Publication
2008 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, PROCEEDINGS

Abstract
One of the best-known successes of Inductive Logic Programming (ILP) is on Structure-Activity Relationship (SAR) problems. In such problems, ILP has proved several times to be capable of constructing expert-comprehensible models that help to explain the activity of chemical compounds based on their structure and properties. However, despite its successes on SAR problems, ILP has severe scalability problems that prevent its application on larger datasets. In this paper we present LogCHEM, an ILP-based tool for discriminative interactive mining of chemical fragments. LogCHEM tackles ILP's scalability issues in the context of SAR applications. We show that LogCHEM benefits from the flexibility of ILP both by its ability to quickly extend the original mining model, and by its ability to interface with external tools. Furthermore, we demonstrate that LogCHEM can be used to effectively mine large chemoinformatics datasets, namely, several datasets from EPA's DSSTox database and a dataset based on the DTP AIDS anti-viral screen.


2008

Induction as a search procedure

Authors
Konstantopoulos, S; Camacho, R; Fonseca, NA; Costa, VS

Publication
Artificial Intelligence for Advanced Problem Solving Techniques

Abstract
This chapter introduces inductive logic programming (ILP) from the perspective of search algorithms in computer science. It first briefly considers the version spaces approach to induction, and then focuses on inductive logic programming: from its formal definition and main techniques and strategies, to priors used to restrict the search space and optimized sequential, parallel, and stochastic algorithms. The authors hope that this presentation of the theory and applications of inductive logic programming will help the reader understand the theoretical underpinnings of ILP, and also provide a helpful overview of the state of the art in the domain. © 2008, IGI Global.
