Publications

Publications by Nuno Fonseca

2004

On avoiding redundancy in inductive logic programming

Authors
Fonseca, N; Costa, VS; Silva, F; Camacho, R;

Publication
INDUCTIVE LOGIC PROGRAMMING, PROCEEDINGS

Abstract
ILP systems induce first-order clausal theories performing a search through very large hypotheses spaces containing redundant hypotheses. The generation of redundant hypotheses may prevent the systems from finding good models and increases the time to induce them. In this paper we propose a classification of hypotheses redundancy and show how expert knowledge can be provided to an ILP system to avoid it. Experimental results show that the number of hypotheses generated and execution time are reduced when expert knowledge is used to avoid redundancy.

2005

Strategies to parallelize ILP systems

Authors
Fonseca, NA; Silva, F; Camacho, R;

Publication
INDUCTIVE LOGIC PROGRAMMING, PROCEEDINGS

Abstract
It is well known by Inductive Logic Programming (ILP) practitioners that ILP systems usually take a long time to find valuable models (theories). The problem is especially critical for large datasets, preventing ILP systems from scaling up to larger applications. One approach to reduce the execution time has been the parallelization of ILP systems. In this paper we overview the state of the art in parallel ILP implementations and present work on the evaluation of some major parallelization strategies for ILP. Conclusions about the applicability of each strategy are presented.

2017

Discovery and characterization of coding and non-coding driver mutations in more than 2,500 whole cancer genomes

Authors
Rheinbay, E; Nielsen, MM; Abascal, F; Tiao, G; Hornshøj, H; Hess, JM; Pedersen, RI; Feuerbach, L; Sabarinathan, R; Madsen, T; Kim, J; Mularoni, L; Shuai, S; Lanzós, A; Herrmann, C; Maruvka, YE; Shen, C; Amin, SB; Bertl, J; Dhingra, P; Diamanti, K; Gonzalez-Perez, A; Guo, Q; Haradhvala, NJ; Isaev, K; Juul, M; Komorowski, J; Kumar, S; Lee, D; Lochovsky, L; Liu, EM; Pich, O; Tamborero, D; Umer, HM; Uusküla-Reimand, L; Wadelius, C; Wadi, L; Zhang, J; Boroevich, KA; Carlevaro-Fita, J; Chakravarty, D; Chan, CW; Fonseca, NA; Hamilton, MP; Hong, C; Kahles, A; Kim, Y; Lehmann, K; Johnson, TA; Kahraman, A; Park, K; Saksena, G; Sieverling, L; Sinnott-Armstrong, NA; Campbell, PJ; Hobolth, A; Kellis, M; Lawrence, MS; Raphael, B; Rubin, MA; Sander, C; Stein, L; Stuart, J; Tsunoda, T; Wheeler, DA; Johnson, R; Reimand, J; Gerstein, MB; Khurana, E; López-Bigas, N; Martincorena, I; Pedersen, JS; Getz, G;

Abstract
Discovery of cancer drivers has traditionally focused on the identification of protein-coding genes. Here we present a comprehensive analysis of putative cancer driver mutations in both protein-coding and non-coding genomic regions across >2,500 whole cancer genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. We developed a statistically rigorous strategy for combining significance levels from multiple driver discovery methods and demonstrate that the integrated results overcome limitations of individual methods. We combined this strategy with careful filtering and applied it to protein-coding genes, promoters, untranslated regions (UTRs), distal enhancers and non-coding RNAs. These analyses redefine the landscape of non-coding driver mutations in cancer genomes, confirming a few previously reported elements and raising doubts about others, while identifying novel candidate elements across 27 cancer types. Novel recurrent events were found in the promoters or 5’UTRs of TP53, RFTN1, RNF34, and MTG2, in the 3’UTRs of NFKBIZ and TOB1, and in the non-coding RNA RMRP. We provide evidence that the previously reported non-coding RNAs NEAT1 and MALAT1 may be subject to a localized mutational process. Perhaps the most striking finding is the relative paucity of point mutations driving cancer in non-coding genes and regulatory elements. Though we have limited power to discover infrequent non-coding drivers in individual cohorts, combined analysis of promoters of known cancer genes shows little excess of mutations beyond TERT.

2017

Genomic basis for RNA alterations revealed by whole-genome analyses of 27 cancer types

Authors
Calabrese, C; Davidson, NR; Fonseca, NA; He, Y; Kahles, A; Lehmann, K; Liu, F; Shiraishi, Y; Soulette, CM; Urban, L; Demircioglu, D; Greger, L; Li, S; Liu, D; Perry, MD; Xiang, L; Zhang, F; Zhang, J; Bailey, P; Erkek, S; Hoadley, KA; Hou, Y; Kilpinen, H; Korbel, JO; Marin, MG; Markowski, J; Nandi, T; Pan-Hammarström, Q; Pedamallu, CS; Siebert, R; Stark, SG; Su, H; Tan, P; Waszak, SM; Yung, C; Zhu, S; Awadalla, P; Creighton, CJ; Meyerson, M; Ouellette, BF; Wu, K; Yang, H; Brazma, A; Brooks, AN; Göke, J; Rätsch, G; Schwarz, RF; Stegle, O; Zhang, Z;

Abstract
We present the most comprehensive catalogue of cancer-associated gene alterations through characterization of tumor transcriptomes from 1,188 donors of the Pan-Cancer Analysis of Whole Genomes project. Using matched whole-genome sequencing data, we attributed RNA alterations to germline and somatic DNA alterations, revealing likely genetic mechanisms. We identified 444 associations of gene expression with somatic non-coding single-nucleotide variants. We found 1,872 splicing alterations associated with somatic mutation in intronic regions, including novel exonization events associated with Alu elements. Somatic copy number alterations were the major driver of total gene and allele-specific expression (ASE) variation. Additionally, 82% of gene fusions had structural variant support, including 75 of a novel class called “bridged” fusions, in which a third genomic location bridged two different genes. Globally, we observe transcriptomic alteration signatures that differ between cancer types and have associations with DNA mutational signatures. Given this unique dataset of RNA alterations, we also identified 1,012 genes significantly altered through both DNA and RNA mechanisms. Our study represents an extensive catalog of RNA alterations and reveals new insights into the heterogeneous molecular mechanisms of cancer gene alterations.

2017

A Pan-Cancer Transcriptome Analysis Reveals Pervasive Regulation through Tumor-Associated Alternative Promoters

Authors
Demircioglu, D; Kindermans, M; Nandi, T; Cukuroglu, E; Calabrese, C; Fonseca, NA; Kahles, A; Lehmann, K; Stegle, O; Brazma, A; Brooks, AN; Rätsch, G; Tan, P; Göke, J;

Abstract
Most human protein-coding genes are regulated by multiple, distinct promoters, suggesting that the choice of promoter is as important as its level of transcriptional activity. While the role of promoters as driver elements in cancer has been recognized, the contribution of alternative promoters to regulation of the cancer transcriptome remains largely unexplored. Here we infer active promoters using RNA-Seq data from 1,188 cancer samples with matched whole genome sequencing data. We find that alternative promoters are a major contributor to context-specific regulation of isoform expression and that alternative promoters are frequently deregulated in cancer, affecting known cancer genes and novel candidates. Our study suggests that a highly dynamic landscape of active promoters shapes the cancer transcriptome, opening many opportunities to further explore the interplay of regulatory mechanism and noncoding somatic mutations with transcriptional aberrations in cancer.

2017

Large-Scale Uniform Analysis of Cancer Whole Genomes in Multiple Computing Environments

Authors
Yung, CK; O’Connor, BD; Yakneen, S; Zhang, J; Ellrott, K; Kleinheinz, K; Miyoshi, N; Raine, KM; Royo, R; Saksena, GB; Schlesner, M; Shorser, SI; Vazquez, M; Weischenfeldt, J; Yuen, D; Butler, AP; Davis-Dusenbery, BN; Eils, R; Ferretti, V; Grossman, RL; Harismendy, O; Kim, Y; Nakagawa, H; Newhouse, SJ; Torrents, D; Stein, LD; Rodriguez, JB; Boroevich, KA; Boyce, R; Brooks, AN; Buchanan, A; Buchhalter, I; Byrne, NJ; Cafferkey, A; Campbell, PJ; Chen, Z; Cho, S; Choi, W; Clapham, P; De La Vega, FM; Demeulemeester, J; Dow, MT; Dursi, LJ; Eils, J; Farcas, C; Favero, F; Fayzullaev, N; Flicek, P; Fonseca, NA; Gelpi, JL; Getz, G; Gibson, B; Heinold, MC; Hess, JM; Hofmann, O; Hong, JH; Hudson, TJ; Huebschmann, D; Hutter, B; Hutter, CM; Imoto, S; Ivkovic, S; Jeon, S; Jiao, W; Jung, J; Kabbe, R; Kahles, A; Kerssemakers, J; Kim, H; Kim, H; Kim, J; Korbel, JO; Koscher, M; Koures, A; Kovacevic, M; Lawerenz, C; Leshchiner, I; Livitz, DG; Mihaiescu, GL; Mijalkovic, S; Lazic, AM; Miyano, S; Nahal, HK; Nastic, M; Nicholson, J; Ocana, D; Ohi, K; Ohno-Machado, L; Omberg, L; Francis Ouellette, B; Paramasivam, N; Perry, MD; Pihl, TD; Prinz, M; Puiggròs, M; Radovic, P; Rheinbay, E; Rosenberg, MW; Short, C; Sofia, HJ; Spring, J; Struck, AJ; Tiao, G; Tijanic, N; Loo, PV; Vicente, D; Wala, JA; Wang, Z; Werner, J; Williams, A; Woo, Y; Wright, AJ; Xiang, Q;

Abstract
The International Cancer Genome Consortium (ICGC)’s Pan-Cancer Analysis of Whole Genomes (PCAWG) project aimed to categorize somatic and germline variations in both coding and non-coding regions in over 2,800 cancer patients. To provide this dataset to the research working groups for downstream analysis, the PCAWG Technical Working Group marshalled ~800TB of sequencing data from distributed geographical locations; developed portable software for uniform alignment, variant calling, artifact filtering and variant merging; performed the analysis in a geographically and technologically disparate collection of compute environments; and disseminated high-quality validated consensus variants to the working groups. The PCAWG dataset has been mirrored to multiple repositories and can be located using the ICGC Data Portal. The PCAWG workflows are also available as Docker images through Dockstore, enabling researchers to replicate our analysis on their own data.
