Publications - INESC TEC

Publications

Publications by LIAAD

2014

AND Parallelism for ILP: The APIS System

Authors
Camacho, R; Ramos, R; Fonseca, NA;

Publication
INDUCTIVE LOGIC PROGRAMMING: 23RD INTERNATIONAL CONFERENCE

Abstract
Inductive Logic Programming (ILP) is a well-known approach to Multi-Relational Data Mining. ILP systems may take a long time to analyze the data, mainly because the search (hypothesis) spaces are often very large and the evaluation of each hypothesis, which involves theorem proving, may be quite time-consuming in some domains. To address these efficiency issues, we propose the APIS (And ParallelISm for ILP) system, which uses results from Logic Programming AND-parallelism. The approach enables the partition of the search space into sub-spaces of two kinds: sub-spaces where clause evaluation requires theorem proving, and sub-spaces where clause evaluation is performed quite efficiently without resorting to a theorem prover. We have also defined a new type of redundancy (Coverage-equivalent redundancy) that enables the pruning of significant parts of the search space. The new type of pruning, together with the partition of the hypothesis space, considerably improved the performance of the APIS system. An empirical evaluation of the APIS system on standard ILP data sets shows considerable speedups without a loss of accuracy in the models constructed.
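
The abstract does not define coverage-equivalent redundancy. As a rough illustration only, assuming it means that two clauses covering exactly the same examples are interchangeable, pruning could be sketched as below. The function name, the `covers` callback, and the toy clauses are all hypothetical and are not taken from APIS; in a real ILP system the coverage test itself is the expensive theorem-proving step.

```python
def prune_coverage_equivalent(clauses, covers, examples):
    """Keep the first clause seen for each distinct set of covered examples.

    Assumption: clauses with identical coverage are treated as redundant.
    """
    representatives = {}
    for clause in clauses:
        coverage = frozenset(e for e in examples if covers(clause, e))
        # setdefault keeps the earlier clause when the coverage was seen before
        representatives.setdefault(coverage, clause)
    return list(representatives.values())


# Hypothetical toy "clauses": named predicates over integers.
predicates = {
    "even": lambda x: x % 2 == 0,
    "divisible_by_2": lambda x: x % 2 == 0,  # same coverage as "even"
    "greater_than_3": lambda x: x > 3,
}
examples = [1, 2, 3, 4, 5, 6]

kept = prune_coverage_equivalent(
    clauses=list(predicates),
    covers=lambda name, e: predicates[name](e),
    examples=examples,
)
print(kept)
```

Here `divisible_by_2` is dropped because it covers the same examples as `even`, so only one of the two would be explored further.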


2014

Expression Atlas update-a database of gene and transcript expression from microarray- and sequencing-based functional genomics experiments

Authors
Petryszak, R; Burdett, T; Fiorelli, B; Fonseca, NA; Gonzalez Porta, M; Hastings, E; Huber, W; Jupp, S; Keays, M; Kryvych, N; McMurry, J; Marioni, JC; Malone, J; Megy, K; Rustici, G; Tang, AY; Taubert, J; Williams, E; Mannion, O; Parkinson, HE; Brazma, A;

Publication
NUCLEIC ACIDS RESEARCH

Abstract
Expression Atlas (http://www.ebi.ac.uk/gxa) is a value-added database providing information about gene, protein and splice variant expression in different cell types, organism parts, developmental stages, diseases and other biological and experimental conditions. The database consists of selected high-quality microarray and RNA-sequencing experiments from ArrayExpress that have been manually curated, annotated with Experimental Factor Ontology terms and processed using standardized microarray and RNA-sequencing analysis methods. The new version of Expression Atlas introduces the concept of 'baseline' expression, i.e. gene and splice variant abundance levels in healthy or untreated conditions, such as tissues or cell types. Differential gene expression data benefit from an in-depth curation of experimental intent, resulting in biologically meaningful 'contrasts', i.e. instances of differential pairwise comparisons between two sets of biological replicates. Other novel aspects of Expression Atlas are its strict quality control of raw experimental data, up-to-date RNA-sequencing analysis methods, expression data at the level of gene sets as well as genes, and a more powerful search interface designed to maximize the biological value provided to the user.


2014

RNA-Seq Gene Profiling - A Systematic Empirical Comparison

Authors
Fonseca, NA; Marioni, J; Brazma, A;

Publication
PLOS ONE

Abstract
Accurately quantifying gene expression levels is a key goal of experiments using RNA-sequencing to assay the transcriptome. This typically requires aligning the short reads generated to the genome or transcriptome before quantifying expression of pre-defined sets of genes. Differences in the alignment/quantification tools can have a major effect upon the expression levels found, with important consequences for biological interpretation. Here we address two main issues: do different analysis pipelines affect the gene expression levels inferred from RNA-seq data? And how close are the inferred expression levels to the "true" expression levels? We evaluate fifty gene profiling pipelines in experimental and simulated data sets with different characteristics (e.g., read length and sequencing depth). In the absence of knowledge of the "ground truth" in real RNA-seq data sets, we used simulated data to assess the differences between the "true" expression and those reconstructed by the analysis pipelines. Even though this approach does not take into account all known biases present in RNA-seq data, it still allows us to estimate the accuracy of the gene expression values inferred by different analysis pipelines. The results show that i) overall there is a high correlation between the expression levels inferred by the best pipelines and the true quantification values; ii) the error in the estimated gene expression values can vary considerably across genes; and iii) a small set of genes have expression estimates with consistently high error (across data sets and methods). Finally, although the mapping software is important, the quantification method makes a greater difference to the results.
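
The kind of comparison described in this abstract, correlating inferred expression with simulated "true" values and looking at per-gene error, can be illustrated with a minimal sketch. The abstract does not state which correlation or error measures the study used; Pearson correlation, relative error, the 0.5 threshold, and the toy gene values below are all assumptions chosen for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def relative_errors(true_expr, est_expr):
    """Per-gene relative error |est - true| / true, for genes with true > 0."""
    return {
        gene: abs(est_expr[gene] - true_value) / true_value
        for gene, true_value in true_expr.items()
        if true_value > 0
    }


# Hypothetical toy values, not data from the study.
true_expr = {"geneA": 100.0, "geneB": 50.0, "geneC": 10.0, "geneD": 200.0}
est_expr = {"geneA": 110.0, "geneB": 45.0, "geneC": 25.0, "geneD": 190.0}

genes = sorted(true_expr)
r = pearson([true_expr[g] for g in genes], [est_expr[g] for g in genes])
errors = relative_errors(true_expr, est_expr)

# Illustrative threshold: flag genes whose estimate is off by more than 50%.
high_error = sorted(g for g, e in errors.items() if e > 0.5)
print(f"correlation={r:.3f}, high-error genes={high_error}")
```

In this toy case the overall correlation is high even though one gene is badly misestimated, which mirrors the abstract's point that a good global correlation can hide large errors for individual genes.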


2014

Long-range enhancers regulating Myc expression are required for normal facial morphogenesis

Authors
Uslu, VV; Petretich, M; Ruf, S; Langenfeld, K; Fonseca, NA; Marioni, JC; Spitz, F;

Publication
NATURE GENETICS

Abstract
Cleft lip with or without cleft palate (CL/P) is one of the most common congenital malformations observed in humans, with 1 occurrence in every 500-1,000 births(1,2). A 640-kb noncoding interval at 8q24 has been associated with increased risk of non-syndromic CL/P in humans(3-5), but the genes and pathways involved in this genetic susceptibility have remained elusive. Using a large series of rearrangements engineered over the syntenic mouse region, we show that this interval contains very remote cis-acting enhancers that control Myc expression in the developing face. Deletion of this interval leads to mild alteration of facial morphology in mice and, sporadically, to CL/P. At the molecular level, we identify misexpression of several downstream genes, highlighting combined impact on the craniofacial developmental network and the general metabolic capacity of cells contributing to the future upper lip. This dual molecular etiology may account for the prominent influence of variants in the 8q24 region on human facial dysmorphologies.


2014

iRAP - an integrated RNA-seq Analysis Pipeline

Authors
Fonseca, NA; Petryszak, R; Marioni, J; Brazma, A;

Publication

Abstract
RNA-sequencing (RNA-Seq) has become the technology of choice for whole-transcriptome profiling. However, processing the millions of sequence reads generated requires considerable bioinformatics skills and computational resources. At each step of the processing pipeline many tools are available, each with specific advantages and disadvantages. While using a specific combination of tools might be desirable, integrating the different tools can be time-consuming, often due to specificities in the formats of input/output files required by the different programs. Here we present iRAP, an integrated RNA-seq analysis pipeline that allows users to select and apply their preferred combination of existing tools for mapping reads, quantifying expression and testing for differential expression. iRAP also includes multiple tools for gene set enrichment analysis and generates web-browsable reports of the results obtained in the different stages of the pipeline. Depending upon the application, iRAP can be used to quantify expression at the gene, exon or transcript level. iRAP is aimed at a broad group of users with basic bioinformatics training and requires little experience with the command line. Despite this, it also provides more advanced users with the ability to customise the options used by their chosen tools.


2014

Need and requirements elicitation for electronic access to patient's medication history in the emergency department

Authors
David, M; Rosa, F; Rodrigues, PP;

Publication
2014 IEEE 27TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS)

Abstract
Electronic access to patient's medication history (PMH) in the emergency department (ED) in Portugal is not widely granted, nor has the importance of such access been clearly assessed. Given the known association between poor PMH and medication errors, the goal of this study was to gather requirements for such a system, assessing physicians' opinions regarding the importance of having access to PMH in the ED. A questionnaire was sent to all Portuguese public hospitals which approved the study, and forwarded by email by the internal services of each hospital to ED physicians. Fourteen hospitals authorized the study, from which 83 ED physicians answered the questionnaire. PMH-related information considered most important focused on medication name and posology (> 90%) and date and dose of prescription (> 80%), but also date of dispensing of medications (> 40%). Other information such as allergies (99%) and adverse reactions (96%) was similarly considered important, and physicians agreed with the inclusion of nonprescription medications (85%) as well as homeopathic medicines (64%). Overall, access to PMH in the ED appears to be important and to benefit patient care. Given this, electronic access to PMH should be established in Portuguese EDs.
