Ricardo Gonçalves Macedo


Ricardo Macedo is currently a Researcher at INESC TEC. He obtained his PhD degree in 2023 under the MAP-i Doctoral Programme in Computer Science from the Universities of Minho, Aveiro and Porto, with the thesis “User-level Software-Defined Storage Data Planes”. His research focuses mainly on storage and operating systems, with an emphasis on designing new building blocks that meet the performance, reliability, and energy consumption requirements of modern, large-scale I/O infrastructures, including key-value stores, kernel-bypass storage stacks, and disaggregated I/O resources. For more information, see his personal web page at https://rgmacedo.github.io/.



Details

Name: Ricardo Gonçalves Macedo
Role: Assistant Researcher
Since: 1 December 2016
Nationality: Portugal
Centre: High-Assurance Software
Contacts: +351 253 604 440, ricardo.g.macedo@inesctec.pt


Publications


2026

MinatoLoader: Accelerating Machine Learning Training Through Efficient Data Preprocessing

Authors
Nouaji, R; Bitchebe, S; Macedo, R; Balmau, O

Publication
EuroSys

Abstract
Machine learning (ML) frameworks, such as PyTorch and TensorFlow, rely on data loaders to preprocess data before feeding it to accelerators. When preprocessing is inefficiently pipelined, GPUs can remain idle for long periods of time, leading to substantial training delays. For example, PyTorch’s default data loaders can cause up to 76% GPU idleness. A key bottleneck is the variability in preprocessing time across samples within the same dataset. Existing data loaders are oblivious to this variability, treating all samples uniformly. In this case, a single slow sample can stall the entire batch, causing head-of-line blocking. We present MinatoLoader, a general-purpose data loader for PyTorch that accelerates training and improves GPU utilization in single-server, multi-GPU settings. It continuously prepares data in the background and constructs batches by prioritizing fast-to-process samples, while slower samples are processed in parallel. Experiments conducted on NVIDIA V100 and A100 GPUs show that MinatoLoader accelerates training by up to 7.5× (3.6× on average) over PyTorch DataLoader and Pecan, and up to 3× (2.2× on average) over DALI. It also increases average GPU utilization from 46% with PyTorch to 90%, while preserving model accuracy and enabling faster convergence.


2026

Holpaca: Holistic and Adaptable Cache Management for Shared Environments

Authors
Peixoto, JP; González, A; Bhimani, J; Rangaswami, R; Brito, C; Paulo, J; Macedo, R

Publication
ICPE

Abstract
Modern data-intensive systems rely on in-memory caching to achieve high throughput and low latency. CacheLib, Meta's general-purpose caching engine, provides high performance and flexibility for building specialized caches for a variety of applications. However, despite its wide adoption in large-scale infrastructures, CacheLib's data management mechanisms exhibit inefficiencies in shared environments. In particular, its static and uncoordinated memory allocation leads to fragmented resource usage, unfair memory distribution, and degraded performance across tenants and instances. We present Holpaca, a general-purpose caching middleware that enables holistic and adaptable orchestration of shared caching environments. Holpaca introduces a shim data layer co-located with each cache instance and a centralized orchestrator with system-wide visibility, enabling global memory management and per-tenant QoS policies. Using production traces from Twitter, results show that, by continuously readjusting memory allocations based on workload dynamics, Holpaca achieves up to 3× higher throughput in multi-tenant settings and a 2.2× improvement in multi-instance settings over CacheLib's rigid built-in mechanisms.


2026

Idiosyncrasies of Programmable Caching Engines

Authors
Peixoto, JP; González, A; Bhimani, J; Rangaswami, R; Brito, C; Paulo, J; Macedo, R

Publication
CoRR


2025

Keigo: Co-designing Log-Structured Merge Key-Value Stores with a Non-Volatile, Concurrency-aware Storage Hierarchy (Extended Version)

Authors
Adão, R; Wu, Z; Zhou, C; Balmau, O; Paulo, J; Macedo, R

Publication
CoRR


2025

KEIGO: Co-designing Log-Structured Merge Key-Value Stores with a Non-Volatile, Concurrency-aware Storage Hierarchy

Authors
Adão, R; Wu, ZJ; Zhou, CJ; Balmau, O; Paulo, J; Macedo, R

Publication
Proceedings of the VLDB Endowment

Abstract
We present Keigo, a concurrency- and workload-aware storage middleware that enhances the performance of log-structured merge key-value stores (LSM KVS) when they are deployed on a hierarchy of storage devices. The key observation behind Keigo is that there is no one-size-fits-all placement of data across the storage hierarchy that optimizes for all workloads. Hence, to leverage the benefits of combining different storage devices, Keigo places files across different devices based on their parallelism, I/O bandwidth, and capacity. We introduce three techniques: concurrency-aware data placement, persistent read-only caching, and context-based I/O differentiation. Keigo is portable across different LSMs, is adaptable to dynamic workloads, and does not require extensive profiling. Our system enables established production KVS such as RocksDB, LevelDB, and Speedb to benefit from heterogeneous storage setups. We evaluate Keigo using synthetic and realistic workloads, showing that it improves the throughput of production-grade LSMs by up to 4x for write-heavy and 18x for read-heavy workloads when compared to general-purpose storage systems and specialized LSM KVS.



CDMS

ATE

EuroCC2

BCDSM
