Cookies Policy
The website need some cookies and similar means to function. If you permit us, we will use those means to collect data on your visits for aggregated statistics to improve our service. Find out More
Accept Reject
  • Menu
About

About

My research interests cover data management in eventual consistent settings, distributed data aggregation and causality tracking. In the last years I have collaborated with my co-authors in the development of data summary mechanisms such as Scalable Bloom Filters, causality tracking for dynamic settings with Interval Tree Clocks and Dotted Version Vectors and in predictable eventual consistency with Conflict-Free Replicated Data Types. My recent work has been applied in the Riak distributed database and in Akka distributed data, and is running in production systems serving millions of users worldwide.

Interest
Topics
Details

Details

  • Name

    Carlos Baquero
  • Role

    Area Manager
  • Since

    01st November 2011
003
Publications

2026

ConflictSync: Bandwidth Efficient Synchronization of Divergent State

Authors
Baquero, C; Gomes, PS; Rodrigues, MB;

Publication
PaPoC@EuroSys

Abstract
State-based Conflict-Free Replicated Data Types (CRDTs) are widely used in distributed systems to ensure high availability without coordination. However, their naive synchronization strategy, transmitting the full state, incurs high communication costs. In this paper, we: (1) propose ConflictSync, a digest-driven synchronization algorithm, which reduces total data transfer by up to 18× compared to full-state transmissions; (2) formulate state-based CRDT synchronization as set reconciliation over irredundant join decompositions; (3) generalize Rateless Set Reconciliation for variable-sized elements, at the cost of an additional communication step; (4) introduce a new generic set reconciliation solution, integrating Bloom Filters with rateless IBLTs; (5) experimentally evaluate the novel synchronization strategies. © 2026 Copyright held by the owner/author(s).

2026

Bounding Byzantine Impact in Open CRDT Systems

Authors
Baquero, C; Maia, F; Dantas, A; Anta, AF; Frey, D; Sánchez, C; Albouy, T;

Publication
PaPoC@EuroSys

Abstract
Conflict-free Replicated Data Types (CRDTs) enable available and eventually consistent data replication without coordination, making them well suited for open and partition-prone environments. Recent work has shown that CRDTs can be extended to tolerate Byzantine faults by ensuring that replicas eventually agree on the validity of operations, even in permis-sionless settings. However, validity alone does not prevent a Byzantine participant from inflicting unbounded damage by issuing large volumes of adversarial yet well-formed updates. For example, when editing text, an attacker can easily delete prior text. In this paper, we study how to bound the impact of Byzantine behavior in open CRDT systems. We introduce bounded Byzantine CRDTs, a rate-limiting framework for CRDTs in which each update carries an associated cost that limits the influence of adversarial operations relative to the resources they expend. Overall, this work bridges the gap between Byzantine-Tolerant CRDTs and resource-bounded adversarial models, providing a principled foundation for deploying CRDTs in fully open, adversarial environments. © 2026 Copyright held by the owner/author(s).

2025

ConflictSync: Bandwidth Efficient Synchronization of Divergent State

Authors
Gomes, PS; Rodrigues, MB; Baquero, C;

Publication
CoRR

Abstract

2025

Distributed Generalized Linear Models: A Privacy-Preserving Approach

Authors
Tinoco, D; Menezes, R; Baquero, C;

Publication
COMPUTATIONAL STATISTICS

Abstract
This paper presents a novel approach to classical linear regression, enabling accurate model computation from data streams or in a distributed setting while preserving data privacy in federated environments. We extend this framework to generalized linear models (GLMs), ensuring scalability and adaptability to diverse data distributions while maintaining privacy-preserving properties. To assess the effectiveness of our approach, we conduct numerical studies on both simulated and real datasets, comparing our method with conventional maximum likelihood estimation for GLMs using iteratively reweighted least squares. Our results demonstrate the advantages of the proposed method in distributed and federated settings.

2025

CRDT-Based Game State Synchronization in Peer-to-Peer VR

Authors
Dantas, A; Baquero, C;

Publication
PROCEEDINGS OF THE 12TH WORKSHOP ON PRINCIPLES AND PRACTICE OF CONSISTENCY FOR DISTRIBUTED DATA, PAPOC 2025

Abstract
Virtual presence demands ultra-low latency, a factor that centralized architectures, by their nature, cannot minimize. Local peer-to-peer architectures offer a compelling alternative, but also pose unique challenges in terms of network infrastructure. This paper introduces a prototype leveraging Conflict-Free Replicated Data Types (CRDTs) to enable real-time collaboration in a shared virtual environment. Using this prototype, we investigate latency, synchronization, and the challenges of decentralized coordination in dynamic non-Byzantine contexts. We aim to question prevailing assumptions about decentralized architectures and explore the practical potential of P2P in advancing virtual presence. This work challenges the constraints of mediated networks and highlights the potential of decentralized architectures to redefine collaboration and interaction in digital spaces.