2007
Authors
Ong, IM; Topper, SE; Page, D; Costa, VS;
Publication
Inductive Logic Programming
Abstract
Determining the underlying regulatory mechanism of genetic networks is one of the central challenges of computational biology. Numerous methods have been developed and applied to the important but complex task of reverse engineering regulatory networks from high-throughput gene expression data. However, many challenges remain. In this paper, we are interested in learning rules that will reveal the causal genes for the expression variation from various relational data sources in addition to gene expression data. Following our previous work where we showed that time series gene expression data could potentially uncover causal effects, we describe an application of an inductive logic programming (ILP) system to the task of identifying important regulatory relationships from discretized time series gene expression data, protein-protein interaction, protein phosphorylation and transcription factor data about the organism. Specifically, we learn rules for predicting gene expression levels at the next time step based on the available relational data and then generalize the learned theory to visualize a pruned network of important interactions. We evaluate and present experimental results on microarray experiments from Gasch et al. on Saccharomyces cerevisiae.
2007
Authors
Costa, VS; Sagonas, K; Lopes, R;
Publication
Logic Programming, Proceedings
Abstract
As logic programming applications grow in size, Prolog systems need to efficiently access larger and larger data sets and the need for any- and multi-argument indexing becomes more and more profound. Static generation of multi-argument indexing is one alternative, but applications often rely on features that are inherently dynamic which makes static techniques inapplicable or inaccurate. Another alternative is to employ dynamic schemes for flexible demand-driven indexing of Prolog clauses. We propose such schemes and discuss issues that need to be addressed for their efficient implementation in the context of WAM-based Prolog systems. We have implemented demand-driven indexing in two different Prolog systems and have been able to obtain non-negligible performance speedups: from a few percent up to orders of magnitude. Given these results, we see very little reason for Prolog systems not to incorporate some form of dynamic indexing based on actual demand. In fact, we see demand-driven indexing as only the first step towards effective runtime optimization of Prolog programs.
2007
Authors
Bernardes, Juliana S.; Dávila, Alberto M.R.; Costa, Vitor Santos; Zaverucha, Gerson;
Publication
CoRR
Abstract
2007
Authors
Costa, VS;
Publication
Practical Aspects of Declarative Languages
Abstract
Declarative systems, such as logic programming, should be ideal to process large data sets efficiently. Unfortunately, the high-level nature of logic-based representations can cause inefficiencies, and may lead in some cases to unacceptable performance. We discuss how logic programming systems can accommodate large amounts of data in main memory. We use a number of real datasets to evaluate performance and discuss how a number of techniques can be used to improve memory scalability for such datasets.
2007
Authors
da Silva, AF; Costa, VS;
Publication
Logic Programming, Proceedings
Abstract
We propose dynamic compilation for Prolog, in the style of Just-In-Time compilers. Our approach adapts to the actual characteristics of the target program by (i) compiling only the parts of the program that are executed frequently, and (ii) adapting to actual call patterns. This allows aggressive optimization of the parts of the program that are really executed, and better informed heuristics to drive these optimizations. Our compiler does not need to support all features in the language, only what is deemed important to performance. Complex execution patterns, such as the ones caused by error handling, may be left to the interpreter. On the other hand, compilation is now part of the run-time, and thus incurs run-time overheads. We have implemented dynamic compilation for the YAP system. Our initial results suggest that dynamic compilation achieves very substantial performance improvements over the original interpreter, and that it can approach and even out-perform state-of-the-art native code systems. We believe that we have shown that dynamic compilation is worthwhile and fits naturally with Prolog execution.
2007
Authors
Bernardes, JS; Davila, AM; Costa, VS; Zaverucha, G;
Publication
BMC Bioinformatics
Abstract
Background: Remote homology detection is a challenging problem in Bioinformatics. Arguably, profile Hidden Markov Models (pHMMs) are one of the most successful approaches in addressing this important problem. pHMM packages present a relatively small computational cost, and perform particularly well at recognizing remote homologies. This raises the question of whether structural alignments could impact the performance of pHMMs trained from proteins in the Twilight Zone, as structural alignments are often more accurate than sequence alignments at identifying motifs and functional residues. Next, we assess the impact of using structural alignments in pHMM performance. Results: We used the SCOP database to perform our experiments. Structural alignments were obtained using the 3DCOFFEE and MAMMOTH-mult tools; sequence alignments were obtained using CLUSTALW, TCOFFEE, MAFFT and PROBCONS. We performed leave-one-family-out cross-validation over super-families. Performance was evaluated through ROC curves and a paired two-tailed t-test. Conclusion: We observed that pHMMs derived from structural alignments performed significantly better than pHMMs derived from sequence alignments in low-identity regions, mainly below 20%. We believe this is because structural alignment tools are better at focusing on the important patterns that are more often conserved through evolution, resulting in higher quality pHMMs. On the other hand, sensitivity of these tools is still quite low for these low-identity regions. Our results suggest a number of possible directions for improvements in this area.