About

Pedro G. Ferreira graduated in Systems and Informatics Engineering (2002) and completed a PhD in Artificial Intelligence at the University of Minho (2007). He was a Postdoctoral Fellow at the Center for Genomic Regulation, Barcelona (2008-2012) and at the University of Geneva (2012-2014). He has participated in several major international consortia, including ICGC-CLL, ENCODE, GEUVADIS and GTEx. He is currently an Assistant Professor at the Department of Computer Science, Faculty of Sciences of the University of Porto, and a researcher at INESC TEC-LIAAD and i3S/Ipatimup. His main research focus is genomic data science; in particular, he is interested in unraveling the role of genomics in human health and disease. He has been involved in several bioinformatics start-ups.

Details

  • Name

    Pedro Gabriel Ferreira
  • Cluster

    Computer Science
  • Role

    Senior Researcher
  • Since

    20 September 2018
Publications

2023

A systematic evaluation of deep learning methods for the prediction of drug synergy in cancer

Authors
Baptista, D; Ferreira, PG; Rocha, M;

Publication
PLOS COMPUTATIONAL BIOLOGY

Abstract
Author summary: Cancer therapies often fail because tumor cells become resistant to treatment. One way to overcome resistance is by treating patients with a combination of two or more drugs. Some combinations may be more effective than when considering individual drug effects, a phenomenon called drug synergy. Computational drug synergy prediction methods can help to identify new, clinically relevant drug combinations. In this study, we developed several deep learning models for drug synergy prediction. We examined the effect of using different types of deep learning architectures, and different ways of representing drugs and cancer cell lines. We explored the use of biological prior knowledge to select relevant cell line features, and also tested data-driven feature reduction methods. We tested both precomputed drug features and deep learning methods that can directly learn features from raw representations of molecules. We also evaluated whether including genomic features, in addition to gene expression data, improves the predictive performance of the models. Through these experiments, we were able to identify strategies that will help guide the development of new deep learning models for drug synergy prediction in the future.

One of the main obstacles to the successful treatment of cancer is the phenomenon of drug resistance. A common strategy to overcome resistance is the use of combination therapies. However, the space of possibilities is huge and efficient search strategies are required. Machine Learning (ML) can be a useful tool for the discovery of novel, clinically relevant anti-cancer drug combinations. In particular, deep learning (DL) has become a popular choice for modeling drug combination effects. Here, we set out to examine the impact of different methodological choices on the performance of multimodal DL-based drug synergy prediction methods, including the use of different input data types, preprocessing steps and model architectures. Focusing on the NCI ALMANAC dataset, we found that feature selection based on prior biological knowledge has a positive impact: limiting gene expression data to cancer or drug response-specific genes improved performance. Drug features appeared to be more predictive of drug response, with a 41% increase in coefficient of determination (R²) and 26% increase in Spearman correlation relative to a baseline model that used only cell line and drug identifiers. Molecular fingerprint-based drug representations performed slightly better than learned representations: ECFP4 fingerprints increased R² by 5.3% and Spearman correlation by 2.8% relative to the best learned representations. In general, fully connected feature-encoding subnetworks outperformed other architectures. DL outperformed other ML methods by more than 35% (R²) and 14% (Spearman). Additionally, an ensemble combining the top DL and ML models improved performance by about 6.5% (R²) and 4% (Spearman). Using a state-of-the-art interpretability method, we showed that DL models can learn to associate drug and cell line features with drug response in a biologically meaningful way. The strategies explored in this study will help to improve the development of computational methods for the rational design of effective drug combinations for cancer therapy.
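
As a rough illustration of the two metrics quoted in this abstract, the minimal Python sketch below computes the coefficient of determination (R²), the Spearman correlation, and a percentage increase over a baseline. It is not the authors' code: the function names and the toy values are hypothetical, and it assumes the reported increases are relative (new minus baseline, divided by baseline), which the abstract does not state explicitly.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation (undefined if either input is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def relative_increase(new, baseline):
    """Percentage increase of `new` over `baseline` (assumed definition)."""
    return 100.0 * (new - baseline) / baseline


if __name__ == "__main__":
    # Toy values for illustration only; these are not results from the paper.
    observed = [0.1, 0.4, 0.35, 0.8]
    predicted = [0.2, 0.3, 0.45, 0.7]
    print(r_squared(observed, predicted))
    print(spearman(observed, predicted))
    print(relative_increase(0.6, 0.5))
```

In practice, library implementations such as scikit-learn's `r2_score` and SciPy's `spearmanr` would normally be preferred over hand-written versions like these.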

2022

Scalable transcriptomics analysis with Dask: applications in data science and machine learning

Authors
Moreno, M; Vilaca, R; Ferreira, PG;

Publication
BMC BIOINFORMATICS

Abstract
Background: Gene expression studies are an important tool in biological and biomedical research. The signal carried in expression profiles helps derive signatures for the prediction, diagnosis and prognosis of different diseases. Data science and specifically machine learning have many applications in gene expression analysis. However, as the dimensionality of genomics datasets grows, scalable solutions become necessary. Methods: In this paper we review the main steps and bottlenecks in machine learning pipelines, as well as the main concepts behind scalable data science including those of concurrent and parallel programming. We discuss the benefits of the Dask framework and how it can be integrated with the Python scientific environment to perform data analysis in computational biology and bioinformatics. Results: This review illustrates the role of Dask for boosting data science applications in different case studies. Detailed documentation and code on these procedures is made available at https://github.com/martaccmoreno/gexp-ml-dask. Conclusion: By showing when and how Dask can be used in transcriptomics analysis, this review will serve as an entry point to help genomic data scientists develop more scalable data analysis procedures.

2021

Deep learning for drug response prediction in cancer

Authors
Baptista, D; Ferreira, PG; Rocha, M;

Publication
BRIEFINGS IN BIOINFORMATICS

Abstract
Predicting the sensitivity of tumors to specific anti-cancer treatments is a challenge of paramount importance for precision medicine. Machine learning (ML) algorithms can be trained on high-throughput screening data to develop models that are able to predict the response of cancer cell lines and patients to novel drugs or drug combinations. Deep learning (DL) refers to a distinct class of ML algorithms that have achieved top-level performance in a variety of fields, including drug discovery. These types of models have unique characteristics that may make them more suitable for the complex task of modeling drug response based on both biological and chemical data, but the application of DL to drug response prediction has been unexplored until very recently. The few studies that have been published have shown promising results, and the use of DL for drug response prediction is beginning to attract greater interest from researchers in the field. In this article, we critically review recently published studies that have employed DL methods to predict drug response in cancer cell lines. We also provide a brief description of DL and the main types of architectures that have been used in these studies. Additionally, we present a selection of publicly available drug screening data resources that can be used to develop drug response prediction models. Finally, we also address the limitations of these approaches and provide a discussion on possible paths for further improvement.

2021

Population-scale tissue transcriptomics maps long non-coding RNAs to complex disease

Authors
de Goede, OM; Nachun, DC; Ferraro, NM; Gloudemans, MJ; Rao, AS; Smail, C; Eulalio, TY; Aguet, F; Ng, B; Xu, J; Barbeira, AN; Castel, SE; Kim-Hellmuth, S; Park, Y; Scott, AJ; Strober, BJ; Brown, CD; Wen, X; Hall, IM; Battle, A; Lappalainen, T; Im, HK; Ardlie, KG; Mostafavi, S; Quertermous, T; Kirkegaard, K; Montgomery, SB; Anand, S; Gabriel, S; Getz, GA; Graubert, A; Hadley, K; Handsaker, RE; Huang, KH; Li, X; MacArthur, DG; Meier, SR; Nedzel, JL; Nguyen, DT; Segrè, AV; Todres, E; Balliu, B; Bonazzola, R; Brown, A; Conrad, DF; Cotter, DJ; Cox, N; Das, S; Dermitzakis, ET; Einson, J; Engelhardt, BE; Eskin, E; Flynn, ED; Fresard, L; Gamazon, ER; Garrido-Martín, D; Gay, NR; Guigó, R; Hamel, AR; He, Y; Hoffman, PJ; Hormozdiari, F; Hou, L; Jo, B; Kasela, S; Kashin, S; Kellis, M; Kwong, A; Li, X; Liang, Y; Mangul, S; Mohammadi, P; Muñoz-Aguirre, M; Nobel, AB; Oliva, M; Park, Y; Parsana, P; Reverter, F; Rouhana, JM; Sabatti, C; Saha, A; Stephens, M; Stranger, BE; Teran, NA; Viñuela, A; Wang, G; Wright, F; Wucher, V; Zou, Y; Ferreira, PG; Li, G; Melé, M; Yeger-Lotem, E; Bradbury, D; Krubit, T; McLean, JA; Qi, L; Robinson, K; Roche, NV; Smith, AM; Tabor, DE; Undale, A; Bridge, J; Brigham, LE; Foster, BA; Gillard, BM; Hasz, R; Hunter, M; Johns, C; Johnson, M; Karasik, E; Kopen, G; Leinweber, WF; McDonald, A; Moser, MT; Myer, K; Ramsey, KD; Roe, B; Shad, S; Thomas, JA; Walters, G; Washington, M; Wheeler, J; Jewell, SD; Rohrer, DC; Valley, DR; Davis, DA; Mash, DC; Barcus, ME; Branton, PA; Sobin, L; Barker, LK; Gardiner, HM; Mosavel, M; Siminoff, LA; Flicek, P; Haeussler, M; Juettemann, T; Kent, WJ; Lee, CM; Powell, CC; Rosenbloom, KR; Ruffier, M; Sheppard, D; Taylor, K; Trevanion, SJ; Zerbino, DR; Abell, NS; Akey, J; Chen, L; Demanelis, K; Doherty, JA; Feinberg, AP; Hansen, KD; Hickey, PF; Jasmine, F; Jiang, L; Kaul, R; Kibriya, MG; Li, JB; Li, Q; Lin, S; Linder, SE; Pierce, BL; Rizzardi, LF; Skol, AD; Smith, KS; Snyder, M; Stamatoyannopoulos, J; Tang, H; Wang, M; Carithers, LJ; 
Guan, P; Koester, SE; Little, AR; Moore, HM; Nierras, CR; Rao, AK; Vaught, JB; Volpi, S;

Publication
Cell

Abstract
Long non-coding RNA (lncRNA) genes have well-established and important impacts on molecular and cellular functions. However, among the thousands of lncRNA genes, it is still a major challenge to identify the subset with disease or trait relevance. To systematically characterize these lncRNA genes, we used Genotype Tissue Expression (GTEx) project v8 genetic and multi-tissue transcriptomic data to profile the expression, genetic regulation, cellular contexts, and trait associations of 14,100 lncRNA genes across 49 tissues for 101 distinct complex genetic traits. Using these approaches, we identified 1,432 lncRNA gene-trait associations, 800 of which were not explained by stronger effects of neighboring protein-coding genes. This included associations between lncRNA quantitative trait loci and inflammatory bowel disease, type 1 and type 2 diabetes, and coronary artery disease, as well as rare variant associations to body mass index.

2021

Solve-RD: systematic pan-European data sharing and collaborative analysis to solve rare diseases

Authors
Zurek, B; Ellwanger, K; Vissers, LELM; Schüle, R; Synofzik, M; Töpf, A; de Voer, RM; Laurie, S; Matalonga, L; Gilissen, C; Ossowski, S; ’t Hoen, PAC; Vitobello, A; Schulze Hentrich, JM; Riess, O; Brunner, HG; Brookes, AJ; Rath, A; Bonne, G; Gumus, G; Verloes, A; Hoogerbrugge, N; Evangelista, T; Harmuth, T; Swertz, M; Spalding, D; Hoischen, A; Beltran, S; Graessner, H; Haack, TB; Zurek, B; Ellwanger, K; Demidov, G; Sturm, M; Kessler, C; Wayand, M; Wilke, C; Traschütz, A; Schöls, L; Hengel, H; Heutink, P; Brunner, H; Scheffer, H; Steyaert, W; Sablauskas, K; de Voer, RM; Kamsteeg, E; van de Warrenburg, B; van Os, N; te Paske, I; Janssen, E; de Boer, E; Steehouwer, M; Yaldiz, B; Kleefstra, T; Veal, C; Gibson, S; Wadsley, M; Mehtarizadeh, M; Riaz, U; Warren, G; Dizjikan, FY; Shorter, T; Straub, V; Bettolo, CM; Specht, S; Clayton Smith, J; Banka, S; Alexander, E; Jackson, A; Faivre, L; Thauvin, C; Vitobello, A; Denommé Pichon, A; Duffourd, Y; Tisserant, E; Bruel, A; Peyron, C; Pélissier, A; Beltran, S; Gut, IG; Laurie, S; Piscia, D; Matalonga, L; Papakonstantinou, A; Bullich, G; Corvo, A; Garcia, C; Fernandez Callejo, M; Hernández, C; Picó, D; Paramonov, I; Lochmüller, H; Gumus, G; Bros Facer, V; Hanauer, M; Olry, A; Lagorce, D; Havrylenko, S; Izem, K; Rigour, F; Stevanin, G; Durr, A; Davoine, C; Guillot Noel, L; Heinzmann, A; Coarelli, G; Allamand, V; Nelson, I; Yaou, RB; Metay, C; Eymard, B; Cohen, E; Atalaia, A; Stojkovic, T; Macek, M; Turnovec, M; Thomasová, D; Kremliková, RP; Franková, V; Havlovicová, M; Kremlik, V; Parkinson, H; Keane, T; Senf, A; Robinson, P; Danis, D; Robert, G; Costa, A; Patch, C; Hanna, M; Houlden, H; Reilly, M; Vandrovcova, J; Muntoni, F; Zaharieva, I; Sarkozy, A; Timmerman, V; Baets, J; Van de Vondel, L; Beijer, D; de Jonghe, P; Nigro, V; Banfi, S; Torella, A; Musacchia, F; Piluso, G; Ferlini, A; Selvatici, R; Rossi, R; Neri, M; Aretz, S; Spier, I; Sommer, AK; Peters, S; Oliveira, C; Pelaez, JG; Matos, AR; José, CS; Ferreira, M; Gullo, I; 
Fernandes, S; Garrido, L; Ferreira, P; Carneiro, F; Swertz, MA; Johansson, L; van der Velde, JK; van der Vries, G; Neerincx, PB; Roelofs Prins, D; Köhler, S; Metcalfe, A; Verloes, A; Drunat, S; Rooryck, C; Trimouille, A; Castello, R; Morleo, M; Pinelli, M; Varavallo, A; De la Paz, MP; Sánchez, EB; Martín, EL; Delgado, BM; de la Rosa, FJAG; Ciolfi, A; Dallapiccola, B; Pizzi, S; Radio, FC; Tartaglia, M; Renieri, A; Benetti, E; Balicza, P; Molnar, MJ; Maver, A; Peterlin, B; Münchau, A; Lohmann, K; Herzog, R; Pauly, M; Macaya, A; Marcé Grau, A; Osorio, AN; de Benito, DN; Lochmüller, H; Thompson, R; Polavarapu, K; Beeson, D; Cossins, J; Cruz, PMR; Hackman, P; Johari, M; Savarese, M; Udd, B; Horvath, R; Capella, G; Valle, L; Holinski Feder, E; Laner, A; Steinke Lange, V; Schröck, E; Rump, A;

Publication
EUROPEAN JOURNAL OF HUMAN GENETICS

Abstract
For the first time in Europe, hundreds of rare disease (RD) experts team up to actively share and jointly analyse existing patient data. Solve-RD is a Horizon 2020-supported EU flagship project bringing together >300 clinicians, scientists, and patient representatives of 51 sites from 15 countries. Solve-RD is built upon a core group of four European Reference Networks (ERNs; ERN-ITHACA, ERN-RND, ERN-Euro NMD, ERN-GENTURIS) which annually see more than 270,000 RD patients with respective pathologies. The main ambition is to solve unsolved rare diseases for which a molecular cause is not yet known. This is achieved through an innovative clinical research environment that introduces novel ways to organise expertise and data. Two major approaches are being pursued: (i) massive data re-analysis of >19,000 unsolved rare disease patients and (ii) novel combined -omics approaches. The minimum requirement to be eligible for the analysis activities is an inconclusive exome that can be shared with controlled access. The first preliminary data re-analysis has already diagnosed 255 cases from 8,393 exome/genome datasets. This unprecedented degree of collaboration focused on sharing of data and expertise shall identify many new disease genes and enable diagnosis of many so far undiagnosed patients from all over Europe.

Supervised theses

2022

New RNA signatures of therapy evasion in cancer

Author
Ana Filipa Pacheco Fonseca Lopes de Mendonça

Institution
UP-FCUP

2022

Understanding the impact of Mycobacterium tuberculosis complex diversity in tuberculosis

Author
Ana Raquel Maceiras de Oliveira

Institution
UP-FCUP

2022

When pathways cross other pathways, that themselves, have met other pathways: crabs responses to cocktails of contaminants

Author
Sara Sousa Moreira

Institution
UP-FCUP

2022

Modelling and Predicting Acute Ischaemic Stroke Outcomes

Author
Tiago Filipe dos Santos

Institution
UP-FCUP

2022

Transcriptomics-based prediction of human phenotypes using scalable and secure machine learning approaches

Author
Marta Carolina Cabral Moreno

Institution
UP-FCUP