Publications

Publications by Ricardo Gonçalves Macedo

2017

A Practical Framework for Privacy-Preserving NoSQL Databases

Authors
Macedo, R; Paulo, J; Pontes, R; Portela, B; Oliveira, T; Matos, M; Oliveira, R;

Publication
2017 IEEE 36TH INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS (SRDS)

Abstract
Cloud infrastructures provide database services as cost-efficient and scalable solutions for storing and processing large amounts of data. To maximize performance, these services require users to trust sensitive information to the cloud provider, which raises privacy and legal concerns. This represents a major obstacle to the adoption of the cloud computing paradigm. Recent work addressed this issue by extending databases to compute over encrypted data. However, these approaches usually support a single and strict combination of cryptographic techniques invariably making them application specific. To assess and broaden the applicability of cryptographic techniques in secure cloud storage and processing, these techniques need to be thoroughly evaluated in a modular and configurable database environment. This is even more noticeable for NoSQL data stores where data privacy is still mostly overlooked. In this paper, we present a generic NoSQL framework and a set of libraries supporting data processing cryptographic techniques that can be used with existing NoSQL engines and composed to meet the privacy and performance requirements of different applications. This is achieved through a modular and extensible design that enables data processing over multiple cryptographic techniques applied on the same database. For each technique, we provide an overview of its security model, along with an extensive set of experiments. The framework is evaluated with the YCSB benchmark, where we assess the practicality and performance tradeoffs for different combinations of cryptographic techniques. The results for a set of macro experiments show that the average overhead in NoSQL operations performance is below 15%, when comparing our system with a baseline database without privacy guarantees.

2019

TRUSTFS: An SGX-enabled Stackable File System Framework

Authors
Esteves, T; Macedo, R; Faria, A; Portela, B; Paulo, J; Pereira, J; Harnik, D;

Publication
2019 38TH INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS WORKSHOPS (SRDSW 2019)

Abstract
Data confidentiality in cloud services is commonly ensured by encrypting information before uploading it. However, this approach limits the use of content-aware functionalities, such as deduplication and compression. Although this issue has been addressed individually for some of these functionalities, no unified framework for building secure storage systems exists that can leverage such operations over encrypted data. We present TRUSTFS, a programmable and modular stackable file system framework for implementing secure content-aware storage functionalities over hardware-assisted trusted execution environments. This framework extends the original SAFEFS architecture to provide the isolated execution guarantees of Intel SGX. We demonstrate its usability by implementing an SGX-enabled stackable file system prototype while a preliminary evaluation shows that it incurs reasonable performance overhead when compared to conventional storage systems. Finally, we highlight open research challenges that must be further pursued in order for TRUSTFS to be fully adequate for building production-ready secure storage solutions.

2019

A Case for Dynamically Programmable Storage Background Tasks

Authors
Macedo, R; Faria, A; Paulo, J; Pereira, J;

Publication
2019 38TH INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS WORKSHOPS (SRDSW 2019)

Abstract
Modern storage infrastructures feature long and complicated I/O paths composed of several layers, each employing their own optimizations to serve varied applications with fluctuating requirements. However, as these layers do not have global infrastructure visibility, they are unable to optimally tune their behavior to achieve maximum performance. Background storage tasks, in particular, can rapidly overload shared resources, but are executed either periodically or whenever a certain threshold is hit regardless of the overall load on the system. In this paper, we argue that to achieve optimal holistic performance, these tasks should be dynamically programmable and handled by a controller with global visibility. To support this argument, we evaluate the impact on performance of compaction and checkpointing in the context of HBase and PostgreSQL. We find that these tasks can respectively increase 99th percentile latencies by 955.2% and 61.9%. We also identify future research directions to achieve programmable background tasks.

2020

A Survey and Classification of Software-Defined Storage Systems

Authors
Macedo, R; Paulo, J; Pereira, J; Bessani, A;

Publication
ACM COMPUTING SURVEYS

Abstract
The exponential growth of digital information is imposing increasing scale and efficiency demands on modern storage infrastructures. As infrastructure complexity increases, so does the difficulty in ensuring quality of service, maintainability, and resource fairness, raising unprecedented performance, scalability, and programmability challenges. Software-Defined Storage (SDS) addresses these challenges by cleanly disentangling control and data flows, easing management, and improving control functionality of conventional storage systems. Despite its momentum in the research community, many aspects of the paradigm are still unclear, undefined, and unexplored, leading to misunderstandings that hamper the research and development of novel SDS technologies. In this article, we present an in-depth study of SDS systems, providing a thorough description and categorization of each plane of functionality. Further, we propose a taxonomy and classification of existing SDS solutions according to different criteria. Finally, we provide key insights about the paradigm and discuss potential future research directions for the field.

2021

BDUS: implementing block devices in user space

Authors
Faria, A; Macedo, R; Pereira, J; Paulo, J;

Publication
SYSTOR '21: The 14th ACM International Systems and Storage Conference, Haifa, Israel, June 14-16, 2021.

Abstract

2021

MONARCH: Hierarchical Storage Management for Deep Learning Frameworks

Authors
Dantas, M; Leitao, D; Correia, C; Macedo, R; Xu, WJ; Paulo, J;

Publication
2021 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER 2021)

Abstract
Due to convenience and usability, many deep learning (DL) jobs resort to the available shared parallel file system (PFS) for storing and accessing training data when running in HPC environments. Under such a scenario, however, where multiple I/O-intensive applications operate concurrently, the PFS can quickly get saturated with simultaneous storage requests and become a critical performance bottleneck, leading to throughput variability and performance loss. We present MONARCH, a framework-agnostic middleware for hierarchical storage management. This solution leverages the existing storage tiers present at modern supercomputers (e.g., compute node's local storage, PFS) to improve DL training performance and alleviate the current I/O pressure of the shared PFS. We validate the applicability of our approach by developing and integrating an early prototype with the TensorFlow DL framework. Results show that MONARCH can reduce I/O operations submitted to the shared PFS by up to 45%, decreasing training time by 24% and 12%, for I/O-intensive models, namely LeNet and AlexNet.
