Cookies Policy
The website need some cookies and similar means to function. If you permit us, we will use those means to collect data on your visits for aggregated statistics to improve our service. Find out More
Accept Reject
  • Menu
Publications

Publications by João Tiago Paulo

2025

Exploiting Trusted Execution Environments and Distributed Computation for Genomic Association Tests

Authors
Brito C.V.; Ferreira P.G.; Paulo J.T.;

Publication
IEEE Journal of Biomedical and Health Informatics

Abstract
Breakthroughs in sequencing technologies led to an exponential growth of genomic data, providing novel biological insights and therapeutic applications. However, analyzing large amounts of sensitive data raises key data privacy concerns, specifically when the information is outsourced to untrusted third-party infrastructures for data storage and processing (e.g., cloud computing). We introduce Gyosa, a secure and privacy-preserving distributed genomic analysis solution. By leveraging trusted execution environments (TEEs), Gyosa allows users to confidentially delegate their GWAS analysis to untrusted infrastructures. Gyosa implements a computation partitioning scheme that reduces the computation done inside the TEEs while safeguarding the users' genomic data privacy. By integrating this security scheme in Glow, Gyosa provides a secure and distributed environment that facilitates diverse GWAS studies. The experimental evaluation validates the applicability and scalability of Gyosa, reinforcing its ability to provide enhanced security guarantees.

2024

Can Current SDS Controllers Scale To Modern HPC Infrastructures?

Authors
Miranda, M; Tanimura, Y; Haga, J; Ruhela, A; Harrell, SL; Cazes, J; Macedo, R; Pereira, J; Paulo, J;

Publication
PROCEEDINGS OF SC24-W: WORKSHOPS OF THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS

Abstract
Modern supercomputers host numerous jobs that compete for shared storage resources, causing I/O interference and performance degradation. Solutions based on software-defined storage (SDS) emerged to address this issue by coordinating the storage environment through the enforcement of QoS policies. However, these often fail to consider the scale of modern HPC infrastructures. In this work, we explore the advantages and shortcomings of state-of-the-art SDS solutions and highlight the scale of current production clusters and their rising trends. Furthermore, we conduct the first experimental study that sheds new insights into the performance and scalability of flat and hierarchical SDS control plane designs. Our results, using the Frontera supercomputer, show that a flat design with a single controller can scale up to 2,500 nodes with an average control cycle latency of 41 ms, while hierarchical designs can handle up to 10,000 nodes with an average latency ranging between 69 and 103 ms.

2025

Promoting sustainable and personalized travel behaviors while preserving data privacy

Authors
Brito C.; Pina N.; Esteves T.; Vitorino R.; Cunha I.; Paulo J.;

Publication
Transportation Engineering

Abstract
Cities worldwide have agreed on ambitious goals regarding carbon neutrality. To do so, policymakers seek ways to foster smarter and cleaner transportation solutions. However, citizens lack awareness of their carbon footprint and of greener mobility alternatives such as public transports. With this, three main challenges emerge: (i) increase users’ awareness regarding their carbon footprint, (ii) provide personalized recommendations and incentives for using sustainable transportation alternatives and, (iii) guarantee that any personal data collected from the user is kept private. This paper addresses these challenges by proposing a new methodology. Created under the FranchetAI project, the methodology combines federated Artificial Intelligence (AI) and Greenhouse Gas (GHG) estimation models to calculate the carbon footprint of users when choosing different transportation modes (e.g., foot, car, bus). Through a mobile application that keeps the privacy of users’ personal information, the project aims at providing detailed reports to inform citizens about their impact on the environment, and an incentive program to promote the usage of more sustainable mobility alternatives.

2024

When Amnesia Strikes: Understanding and Reproducing Data Loss Bugs with Fault Injection

Authors
Ramos, M; Azevedo, J; Kingsbury, K; Pereira, J; Esteves, T; Macedo, R; Paulo, J;

Publication
PROCEEDINGS OF THE VLDB ENDOWMENT

Abstract
We present LAZYFS, a new fault injection tool that simplifies the debugging and reproduction of complex data durability bugs experienced by databases, key-value stores, and other data-centric systems in crashes. Our tool simulates persistence properties of POSIX file systems (e.g., operations ordering and atomicity) and enables users to inject lost and torn write faults with a precise and controlled approach. Further, it provides profiling information about the system's operations flow and persisted data, enabling users to better understand the root cause of errors. We use LAZYFS to study seven important systems: PostgreSQL, etcd, Zookeeper, Redis, LevelDB, PebblesDB, and Lightning Network. Our fault injection campaign shows that LAZYFS automates and facilitates the reproduction of five known bug reports containing manual and complex reproducibility steps. Further, it aids in understanding and reproducing seven ambiguous bugs reported by users. Finally, LAZYFS is used to find eight new bugs, which lead to data loss, corruption, and unavailability.

2023

CRIBA: A Tool for Comprehensive Analysis of Cryptographic Ransomware's I/O Behavior

Authors
Esteves, T; Pereira, B; Oliveira, RP; Marco, J; Paulo, J;

Publication
2023 42ND INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS, SRDS 2023

Abstract
Cryptographic ransomware attacks are constantly evolving by obfuscating their distinctive features (e.g., I/O patterns) to bypass detection mechanisms and to run unnoticed at infected servers. Thus, efficiently exploring the I/O behavior of ransomware families is crucial so that security analysts and engineers can better understand these and, with such knowledge, enhance existing detection methods. In this paper, we propose CRIBA, an open-source framework that simplifies the exploration, analysis, and comparison of I/O patterns for Linux cryptographic ransomware. Our solution combines the collection of comprehensive information about system calls issued by ransomware samples, with a customizable and automated analysis and visualization pipeline, including tailored correlation algorithms and visualizations. Our study, including 5 Linux ransomware families, shows that CRIBA provides comprehensive insights about the I/O patterns of these attacks while aiding in exploring common and differentiating traits across families.

2015

Dependable decentralized storage management for cloud computing

Authors
Paulo, J;

Publication

Abstract

  • 6
  • 8