Publications - INESC TEC

Publications by Ricardo Rocha

2006

RepeatAround: A software tool for finding and visualizing repeats in circular genomes and its application to a human mtDNA database

Authors
Goios, A; Meirinhos, J; Rocha, R; Lopes, R; Amorim, A; Pereira, L

Publication
MITOCHONDRION

Abstract
RepeatAround is a Windows-based software tool designed to find "direct repeats", "inverted repeats", "mirror repeats" and "complementary repeats", from 3 to 64 bp in length, in circular genomes. It processes input files extracted directly from the GenBank database and visualises the location of the repeats in the genomic structure, so that, for instance, in most mtDNAs the user can check whether the repeats are located in a coding or non-coding region (and, in the first case, in which gene), and how far apart the repeat pair(s) are. Besides the visual tool, it provides other outputs in a spreadsheet containing information on the number and location of the repeats, facilitating graphic analyses. Several genomes can be input simultaneously, for phylogenetic comparison purposes. Other capabilities of the software are the generation of random circular genomes, for statistical comparison of observed repeat distributions with their shuffled counterparts, and the search for specific motifs, allowing an easy confirmation of repeats flanking a newly detected rearrangement. As an example of the programme's applications, we analysed the distribution of Direct Repeats in a large human mtDNA database. Results showed that Direct Repeats, even the larger ones, are evenly distributed among the human mtDNA haplogroups, enabling us to state that, based only on the repetitive motifs, no haplogroup is particularly more or less prone to mtDNA macrodeletions.
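The four repeat types named in the abstract can be illustrated with a minimal Python sketch. It is not RepeatAround's code: the function names, the k-mer index and the repeat definitions used here (direct = same sequence, mirror = reversed, complementary = complemented, inverted = reverse complement) are assumptions made for illustration, and overlapping occurrences are not filtered out.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical mapping from repeat type to the transformation a partner must match.
TRANSFORMS = {
    "direct": lambda s: s,
    "mirror": lambda s: s[::-1],
    "complementary": lambda s: s.translate(COMPLEMENT),
    "inverted": lambda s: s[::-1].translate(COMPLEMENT),
}

def circular_kmers(genome, k):
    """Yield (start, k-mer) for every start position, wrapping past the end."""
    if not 0 < k <= len(genome):
        raise ValueError("k must be between 1 and the genome length")
    doubled = genome + genome[:k - 1]  # lets windows run across the origin
    for start in range(len(genome)):
        yield start, doubled[start:start + k]

def find_repeats(genome, k, kind="direct"):
    """Return (start_a, start_b, k-mer at start_a) for each repeat pair of length k."""
    transform = TRANSFORMS[kind]
    index = defaultdict(list)
    for start, kmer in circular_kmers(genome, k):
        index[kmer].append(start)
    pairs = []
    for start, kmer in circular_kmers(genome, k):
        for other in index.get(transform(kmer), []):
            if other > start:  # report each unordered pair once
                pairs.append((start, other, kmer))
    return pairs

def circular_distance(a, b, genome_length):
    """Shortest distance between two start positions on a circular genome."""
    return min((a - b) % genome_length, (b - a) % genome_length)
```

On the toy sequence `"ACGTTTACGA"` with `k = 3`, `find_repeats(..., "direct")` returns `[(0, 6, "ACG")]`, and the inverted search finds a pair whose second member (`AAC` at position 9) wraps across the origin, which is the case a linear scan would miss.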

2009

The Diversity Present in 5140 Human Mitochondrial Genomes

Authors
Pereira, L; Freitas, F; Fernandes, V; Pereira, JB; Costa, MD; Costa, S; Maximo, V; Macaulay, V; Rocha, R; Samuels, DC

Publication
AMERICAN JOURNAL OF HUMAN GENETICS

Abstract
We analyzed the current status (as of the end of August 2008) of human mitochondrial genomes deposited in GenBank, amounting to 5140 complete or coding-region sequences, in order to present an overall picture of the diversity present in the mitochondrial DNA of the global human population. To perform this task, we developed mtDNA-GeneSyn, a computer tool that identifies and exhaustively classifies the diversity present in large genetic data sets. The diversity observed in the 5140 human mitochondrial genomes was compared with all possible transitions and transversions from the standard human mitochondrial reference genome. This comparison showed that tRNA and rRNA secondary structures have a large effect in limiting the diversity of the human mitochondrial sequences, whereas for the protein-coding genes there is a bias toward less variation at the second codon positions. The analysis of the observed amino acid variations showed a tolerance of variations that convert between the amino acids V, I, A, M, and T. This defines a group of amino acids with similar chemical properties that can interconvert by a single transition.

2011

A Subterm-Based Global Trie for Tabled Evaluation of Logic Programs

Authors
Raimundo, J; Rocha, R

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
Tabling is an implementation technique that overcomes some limitations of traditional Prolog systems in dealing with redundant sub-computations and recursion. A critical component in the implementation of an efficient tabling system is the design of the table space. The most popular and successful data structure for representing tables is based on a two-level trie data structure, where one trie level stores the tabled subgoal calls and the other stores the computed answers. The Global Trie (GT) is an alternative table space organization designed to reduce the tables' memory usage, namely by storing terms in a global trie, thus preventing repeated representations of the same term in different trie data structures. In this paper, we propose an extension to the GT organization, named Global Trie for Subterms (GT-ST), where compound subterms in term arguments are represented as unique entries in the GT. Experimental results using the YapTab tabling system show that GT-ST support has the potential to achieve significant reductions in memory usage for programs with increasing compound subterms in term arguments, without compromising the execution time for other programs.
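The idea of storing each compound subterm once can be sketched in Python as follows. This is a simplified illustration, not the YapTab data structure: the `TrieNode` and `GlobalTrie` classes, the tuple encoding of terms (`("f", arg1, arg2)` for `f(arg1, arg2)`) and the omission of variables are all assumptions of the sketch.

```python
class TrieNode:
    """One trie node; `token` is an atom, a (functor, arity) pair, or a reference
    to the leaf node of a compound subterm stored elsewhere in the trie."""
    def __init__(self, token=None, parent=None):
        self.token = token
        self.parent = parent
        self.children = {}

class GlobalTrie:
    def __init__(self):
        self.root = TrieNode()
        self.node_count = 0

    def _child(self, node, token):
        child = node.children.get(token)
        if child is None:
            child = TrieNode(token, node)
            node.children[token] = child
            self.node_count += 1
        return child

    def insert(self, term):
        """Insert a term and return the leaf node that uniquely identifies it."""
        node = self.root
        if isinstance(term, tuple):
            functor, *args = term
            node = self._child(node, (functor, len(args)))
            for arg in args:
                if isinstance(arg, tuple):
                    # Compound subterm: store it once as its own entry and
                    # keep only a reference to its leaf in the enclosing term.
                    arg = self.insert(arg)
                node = self._child(node, arg)
        else:
            node = self._child(node, term)
        return node
```

With this sketch, inserting `f(g(1,2), a)` and then `h(g(1,2))` creates the path for `g(1,2)` only once; both enclosing terms point at the same leaf, so the saving grows with the size and frequency of the shared subterms.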

2010

Preprocessing Boolean Formulae for BDDs in a Probabilistic Context

Authors
Mantadelis, T; Rocha, R; Kimmig, A; Janssens, G

Publication
LOGICS IN ARTIFICIAL INTELLIGENCE, JELIA 2010

Abstract
Inference in many probabilistic logic systems is based on representing the proofs of a query as a DNF Boolean formula. Assessing the probability of such a formula is known to be a #P-hard task. In practice, a large DNF is given to a BDD software package to construct the corresponding BDD. The DNF has to be transformed into the input format of the package. This is the preprocessing step. In this paper we investigate and compare different preprocessing methods, including our new trie-based approach. Our experiments within the ProbLog system show that the behaviour of the methods changes according to the amount of sharing in the original DNF. The decomposition method is preferred when there is not much sharing in the DNF, whereas DNFs with sharing benefit from our trie-based method. While our methods are motivated and applied in the ProbLog context, our results are interesting for other applications that manipulate DNF Boolean formulae.
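A small worked example may help show why the probability of a DNF cannot simply be read off its proofs. The Python sketch below is a brute-force illustration, not the BDD-based method the abstract describes; the function name and the encoding of proofs as sets of independent probabilistic facts are assumptions.

```python
from itertools import product

def dnf_probability(proofs, prob):
    """Exact probability of a DNF by enumerating every truth assignment.

    `proofs` is a list of sets of fact names (each set is one conjunction) and
    `prob` maps each fact to its independent probability. The cost is
    exponential in the number of facts, which is why BDDs are used in practice.
    """
    variables = sorted({v for proof in proofs for v in proof})
    total = 0.0
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        if any(all(world[v] for v in proof) for proof in proofs):
            weight = 1.0
            for v in variables:
                weight *= prob[v] if world[v] else 1.0 - prob[v]
            total += weight
    return total
```

For the two proofs `{a, b}` and `{a, c}` with every fact at probability 0.5, the function returns 0.375, whereas naively summing the proof probabilities gives 0.25 + 0.25 = 0.5. The proofs share the fact `a` and are not mutually exclusive, which is the kind of sharing the preprocessing methods have to handle.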

2010

Retroactive Subsumption-Based Tabled Evaluation of Logic Programs

Authors
Cruz, F; Rocha, R

Publication
LOGICS IN ARTIFICIAL INTELLIGENCE, JELIA 2010

Abstract
Tabled evaluation is a recognized and powerful implementation technique that overcomes some limitations of traditional Prolog systems in dealing with recursion and redundant sub-computations. Tabling-based systems use call similarity to determine whether a tabled subgoal will produce its own answers or consume them from another subgoal. While call variance has been a very popular approach, call subsumption can yield superior time performance and space improvements as it allows greater reuse of answers. However, the call order of the subgoals can greatly affect the success and applicability of the call subsumption technique. In this work, we present an extension, named Retroactive Call Subsumption, that supports call subsumption by allowing full sharing of answers between subsumed/subsuming subgoals, independently of the order in which they are called. Our experiments using the YapTab tabling engine show considerable gains in evaluation time for some applications, at the expense of a very small overhead for the programs that cannot benefit from it.
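The difference between the two call-similarity tests can be sketched in Python. This is an illustrative check only, not YapTab's implementation: the term encoding (tuples for compound terms, capitalised strings for variables) and the function names are assumptions, and the two calls are assumed not to share variable names.

```python
def is_var(term):
    """Assumed convention: variables are strings starting with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()

def subsumes(general, specific, binding=None):
    """True if `specific` is an instance of `general` (one-way matching)."""
    binding = {} if binding is None else binding
    if is_var(general):
        if general in binding:
            return binding[general] == specific
        binding[general] = specific
        return True
    if isinstance(general, tuple):
        return (isinstance(specific, tuple)
                and len(general) == len(specific)
                and all(subsumes(g, s, binding) for g, s in zip(general, specific)))
    return general == specific

def is_variant(call_a, call_b):
    """Two calls are variants when each subsumes the other (equal up to renaming)."""
    return subsumes(call_a, call_b) and subsumes(call_b, call_a)
```

Here `path(a, Z)` is subsumed by `path(X, Y)` but is not a variant of it, so under call variance the two subgoals are evaluated separately, while under call subsumption the specific call can consume answers from the general one. The ordering problem the abstract mentions arises when `path(a, Z)` is called first and the more general `path(X, Y)` only appears later.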

2010

An Efficient Implementation of Linear Tabling Based on Dynamic Reordering of Alternatives

Authors
Areias, M; Rocha, R

Publication
PRACTICAL ASPECTS OF DECLARATIVE LANGUAGES, PROCEEDINGS

Abstract
Tabling is a resolution technique that overcomes some limitations of traditional Prolog systems in dealing with recursion and redundant sub-computations. We can distinguish two main categories of tabling mechanisms: suspension-based tabling and linear tabling. In suspension-based tabling, a tabled evaluation can be seen as a sequence of sub-computations that suspend and later resume. Linear tabling mechanisms maintain a single execution tree where tabled subgoals always extend the current computation without requiring suspension and resumption of sub-computations. In this work, we present a new and efficient implementation of linear tabling, built by extending an existing suspension-based implementation, the YapTab engine. Our design is based on dynamic reordering of alternatives, but it innovates by considering a strategy that schedules the re-evaluation of tabled calls in a similar manner to the suspension-based strategies of YapTab. Our implementation also shares the underlying execution environment and most of the data structures used to implement tabling in YapTab. We thus argue that all these common features allow us to make a first and fair comparison between suspension-based and linear tabling and, therefore, to better understand the advantages and weaknesses of each.