Publications

Publications by Pedro Gabriel Ferreira

2025

AdhesionScore: A Prognostic Predictor of Breast Cancer Patients Based on a Cell Adhesion-Associated Gene Signature

Authors
Esquível, C; Ribeiro, R; Ribeiro, AS; Ferreira, PG; Paredes, J;

Publication
CANCERS

Abstract
Background: Aberrant or lost cell adhesion drives invasion and metastasis, key hallmarks of cancer progression. In this work, we hypothesized that a gene signature related to cell adhesion could predict breast cancer prognosis. Methods: Highly variant genes were tested for association with overall survival using Cox regression. Adhesion-related genes were identified through gene ontology analysis, and multivariate Cox regression with AIC selection defined the prognostic signature. The AdhesionScore was then calculated as a weighted sum of gene expression, with risk stratification assessed by Kaplan-Meier and log-rank tests. Results: We found that the AdhesionScore was a significant independent predictor of poor survival in three large independent datasets, as it provided a robust stratification of patient prognosis in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) (HR: 2.65; 95% CI: 2.33-3.0, p = 2.34 × 10^-51), The Cancer Genome Atlas (TCGA) (HR: 3.46; 95% CI: 2.35-5.09, p = 3.50 × 10^-10), and the GSE96058 (HR: 2.83; 95% CI: 2.20-3.65, p = 6.29 × 10^-16) datasets. The 5-year risk of death in the high-risk group was 32.41% for METABRIC, 27.8% for TCGA, and 17.54% for GSE96058. Consistently, HER2-enriched and triple-negative breast carcinoma (TNBC) cases showed higher AdhesionScores than luminal subtypes, indicating an association with aggressive tumor biology. Conclusions: We have developed, for the first time, a molecular signature based on cell adhesion, together with an associated AdhesionScore that robustly predicts patient prognosis in invasive breast cancer across independent cohorts, highlighting its potential clinical utility for patient risk stratification.
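The abstract describes the AdhesionScore as a weighted sum of gene expression followed by risk stratification. A minimal sketch of that idea is shown here. The gene names, weights, and expression values are hypothetical placeholders, and the median cutoff for the high/low split is an assumption, since the abstract does not state the threshold or the fitted coefficients.

```python
from statistics import median


def adhesion_score(expression, weights):
    """Weighted sum of expression values over the signature genes."""
    return sum(weights[gene] * expression[gene] for gene in weights)


def stratify(scores):
    """Split patients into risk groups at the median score (assumed cutoff)."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical per-gene weights (e.g. Cox coefficients) and expression values.
weights = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 6.0, "GENE_B": 2.0},
    "P3": {"GENE_A": 4.0, "GENE_B": 0.0},
}

scores = {pid: adhesion_score(expr, weights) for pid, expr in patients.items()}
groups = stratify(scores)
print(scores)
print(groups)
```

In a real analysis the resulting groups would then be compared with Kaplan-Meier curves and a log-rank test, as the abstract states.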

2025

Exploiting Trusted Execution Environments and Distributed Computation for Genomic Association Tests

Authors
Brito C.V.; Ferreira P.G.; Paulo J.T.;

Publication
IEEE Journal of Biomedical and Health Informatics

Abstract
Breakthroughs in sequencing technologies led to an exponential growth of genomic data, providing novel biological insights and therapeutic applications. However, analyzing large amounts of sensitive data raises key data privacy concerns, specifically when the information is outsourced to untrusted third-party infrastructures for data storage and processing (e.g., cloud computing). We introduce Gyosa, a secure and privacy-preserving distributed genomic analysis solution. By leveraging trusted execution environments (TEEs), Gyosa allows users to confidentially delegate their GWAS analysis to untrusted infrastructures. Gyosa implements a computation partitioning scheme that reduces the computation done inside the TEEs while safeguarding the users' genomic data privacy. By integrating this security scheme in Glow, Gyosa provides a secure and distributed environment that facilitates diverse GWAS studies. The experimental evaluation validates the applicability and scalability of Gyosa, reinforcing its ability to provide enhanced security guarantees.

2023

The landscape of expression and alternative splicing variation across human traits

Authors
García Pérez, R; Ramirez, JM; Ripoll Cladellas, A; Chazarra Gil, R; Oliveros, W; Soldatkina, O; Bosio, M; Rognon, PJ; Capella Gutierrez, S; Calvo, M; Reverter, F; Guigó, R; Aguet, F; Ferreira, PG; Ardlie, KG; Melé, M;

Publication
Cell Genomics

Abstract
Understanding the consequences of individual transcriptome variation is fundamental to deciphering human biology and disease. We implement a statistical framework to quantify the contributions of 21 individual traits as drivers of gene expression and alternative splicing variation across 46 human tissues and 781 individuals from the Genotype-Tissue Expression project. We demonstrate that ancestry, sex, age, and BMI make additive and tissue-specific contributions to expression variability, whereas interactions are rare. Variation in splicing is dominated by ancestry and is under genetic control in most tissues, with ribosomal proteins showing a strong enrichment of tissue-shared splicing events. Our analyses reveal a systemic contribution of types 1 and 2 diabetes to tissue transcriptome variation with the strongest signal in the nerve, where histopathology image analysis identifies novel genes related to diabetic neuropathy. Our multi-tissue and multi-trait approach provides an extensive characterization of the main drivers of human transcriptome variation in health and disease. © 2022 The Authors

2024

APAtizer: a tool for alternative polyadenylation analysis of RNA-Seq data

Authors
Sousa, B; Bessa, M; de Mendonca, FL; Ferreira, PG; Moreira, A; Pereira-Castro, I;

Publication
BIOINFORMATICS

Abstract
APAtizer is a tool designed to analyze alternative polyadenylation events on RNA-sequencing data. The tool handles different file formats, including BAM, htseq, and DaPars bedGraph files. It provides a user-friendly interface that allows users to generate informative visualizations, including Volcano plots, heatmaps, and gene lists. These outputs allow the user to retrieve useful biological insights such as the occurrence of polyadenylation events when comparing two biological conditions. In addition, it can perform differential gene expression, gene ontology analysis, visualization of Venn diagram intersections, and correlation analysis.

2024

Integration of multi-modal datasets to estimate human aging

Authors
Ribeiro, R; Moraes, A; Moreno, M; Ferreira, PG;

Publication
MACHINE LEARNING

Abstract
Aging involves complex biological processes leading to the decline of living organisms. As population lifespan increases worldwide, the importance of identifying factors underlying healthy aging has become critical. Integration of multi-modal datasets is a powerful approach for the analysis of complex biological systems, with the potential to uncover novel aging biomarkers. In this study, we leveraged publicly available epigenomic, transcriptomic and telomere length data along with histological images from the Genotype-Tissue Expression project to build tissue-specific regression models for age prediction. Using data from two tissues, lung and ovary, we aimed to compare model performance across data modalities, as well as to assess the improvement resulting from integrating multiple data types. Our results demonstrate that methylation outperformed the other data modalities, with a mean absolute error of 3.36 and 4.36 in the test sets for lung and ovary, respectively. These models achieved lower error rates when compared with established state-of-the-art tissue-agnostic methylation models, emphasizing the importance of a tissue-specific approach. Additionally, this work has shown how the application of Hierarchical Image Pyramid Transformers for feature extraction significantly enhances age modeling using histological images. Finally, we evaluated the benefits of integrating multiple data modalities into a single model. Combining methylation data with other data modalities only marginally improved performance, likely due to the limited number of available samples. Combining gene expression with histological features yielded more accurate age predictions compared with the individual performance of these data types. Given these results, this study shows how machine learning applications can be extended to multi-modal aging research. Code used is available at https://github.com/zroger49/multi_modal_age_prediction.
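The abstract reports model performance as mean absolute error (MAE) on the test sets. As a reminder of what that metric measures, here is a small sketch with made-up ages; the values are illustrative only and are not taken from the study or its repository.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages."""
    if not true_ages or len(true_ages) != len(predicted_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


# Hypothetical chronological ages and model predictions (in years).
true_ages = [30, 45, 60, 72]
predicted_ages = [33, 41, 62, 69]

mae = mean_absolute_error(true_ages, predicted_ages)
print(mae)  # absolute errors are 3, 4, 2, 3, so the mean is 3.0
```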

2025

The molecular impact of cigarette smoking resembles aging across tissues

Authors
Ramirez, JM; Ribeiro, R; Soldatkina, O; Moraes, A; García-Pérez, R; Ferreira, PG; Melé, M;

Publication
GENOME MEDICINE

Abstract
Background: Tobacco smoke is the main cause of preventable mortality worldwide. Smoking increases the risk of developing many diseases and has been proposed as an aging accelerator. Yet, the molecular mechanisms driving smoking-related health decline and aging acceleration in most tissues remain unexplored. Methods: Here, we use data from the Genotype-Tissue Expression Project (GTEx) to perform a characterization of the effect of cigarette smoking across human tissues. We perform a multi-tissue analysis across 46 human tissues. Our multi-omics characterization includes analysis of gene expression, alternative splicing, DNA methylation, and histological alterations. We further analyze ex-smoker samples to assess the reversibility of these molecular alterations upon smoking cessation. Results: We show that smoking impacts tissue architecture and triggers systemic inflammation. We find that in many tissues, the effects of smoking significantly overlap those of aging. Specifically, both age and smoking upregulate inflammatory genes and drive hypomethylation at enhancers (odds ratio (OR) = 2). In addition, we observe widespread smoking-driven hypermethylation at target regions of the Polycomb repressive complex (OR = 2), which is a well-known aging effect. Smoking-induced epigenetic changes overlap causal aging CpGs, suggesting that these methylation changes may directly mediate the aging acceleration observed in smokers. Finally, we find that smoking effects that are shared with aging are more persistent over time. Conclusion: Overall, our multi-tissue and multi-omic analysis of the effects of cigarette smoking provides an extensive characterization of the impact of tobacco smoke across tissues and unravels the molecular mechanisms driving smoking-induced tissue homeostasis decline and aging acceleration.
