2024
Authors
Brito, C; Ferreira, P; Paulo, J;
Publication
Abstract
2024
Authors
Ribeiro, R; Moraes, A; Moreno, M; Ferreira, PG;
Publication
MACHINE LEARNING
Abstract
Aging involves complex biological processes leading to the decline of living organisms. As population lifespan increases worldwide, the importance of identifying factors underlying healthy aging has become critical. Integration of multi-modal datasets is a powerful approach for the analysis of complex biological systems, with the potential to uncover novel aging biomarkers. In this study, we leveraged publicly available epigenomic, transcriptomic and telomere length data along with histological images from the Genotype-Tissue Expression project to build tissue-specific regression models for age prediction. Using data from two tissues, lung and ovary, we aimed to compare model performance across data modalities, as well as to assess the improvement resulting from integrating multiple data types. Our results demonstrate that methylation outperformed the other data modalities, with a mean absolute error of 3.36 and 4.36 in the test sets for lung and ovary, respectively. These models achieved lower error rates when compared with established state-of-the-art tissue-agnostic methylation models, emphasizing the importance of a tissue-specific approach. Additionally, this work has shown how the application of Hierarchical Image Pyramid Transformers for feature extraction significantly enhances age modeling using histological images. Finally, we evaluated the benefits of integrating multiple data modalities into a single model. Combining methylation data with other data modalities only marginally improved performance, likely due to the limited number of available samples. Combining gene expression with histological features yielded more accurate age predictions compared with the individual performance of these data types. Given these results, this study shows how machine learning applications can be extended to multi-modal aging research. The code used is available at https://github.com/zroger49/multi_modal_age_prediction.
2024
Authors
Ramirez, JM; Ribeiro, R; Soldatkina, O; Moraes, A; García-Pérez, R; Ferreira, PG; Melé, M;
Publication
Abstract
2024
Authors
Sousa, B; Bessa, M; de Mendonca, FL; Ferreira, PG; Moreira, A; Pereira-Castro, I;
Publication
BIOINFORMATICS
Abstract
APAtizer is a tool designed to analyze alternative polyadenylation events on RNA-sequencing data. The tool handles different file formats, including BAM, htseq, and DaPars bedGraph files. It provides a user-friendly interface that allows users to generate informative visualizations, including Volcano plots, heatmaps, and gene lists. These outputs allow the user to retrieve useful biological insights such as the occurrence of polyadenylation events when comparing two biological conditions. In addition, it can perform differential gene expression, gene ontology analysis, visualization of Venn diagram intersections, and correlation analysis.
2023
Authors
García Pérez, R; Ramirez, JM; Ripoll Cladellas, A; Chazarra Gil, R; Oliveros, W; Soldatkina, O; Bosio, M; Rognon, PJ; Capella Gutierrez, S; Calvo, M; Reverter, F; Guigó, R; Aguet, F; Ferreira, PG; Ardlie, KG; Melé, M;
Publication
Cell Genomics
Abstract
Understanding the consequences of individual transcriptome variation is fundamental to deciphering human biology and disease. We implement a statistical framework to quantify the contributions of 21 individual traits as drivers of gene expression and alternative splicing variation across 46 human tissues and 781 individuals from the Genotype-Tissue Expression project. We demonstrate that ancestry, sex, age, and BMI make additive and tissue-specific contributions to expression variability, whereas interactions are rare. Variation in splicing is dominated by ancestry and is under genetic control in most tissues, with ribosomal proteins showing a strong enrichment of tissue-shared splicing events. Our analyses reveal a systemic contribution of types 1 and 2 diabetes to tissue transcriptome variation with the strongest signal in the nerve, where histopathology image analysis identifies novel genes related to diabetic neuropathy. Our multi-tissue and multi-trait approach provides an extensive characterization of the main drivers of human transcriptome variation in health and disease. © 2022 The Authors