
Publications by Carlos Baquero

2014

Making Operation-Based CRDTs Operation-Based

Authors
Baquero, C; Almeida, PS; Shoker, A;

Publication
DISTRIBUTED APPLICATIONS AND INTEROPERABLE SYSTEMS (DAIS 2014)

Abstract
Conflict-free Replicated Datatypes (CRDTs) are usually classified as either state-based or operation-based. However, the standard definition of op-based CRDTs is very encompassing, even allowing sending the full state, which blurs the distinction. We introduce pure op-based CRDTs, which can only send operations to other replicas, drawing a clear distinction from state-based ones. Datatypes with commutative operations can be trivially implemented as pure op-based CRDTs using standard reliable causal delivery. We propose an extended API - tagged reliable causal broadcast - that provides causality information upon delivery, and show how it can be used to also implement other datatypes having non-commutative operations, through the use of a PO-Log - a partially ordered log of operations - inside the datatype. A semantically-based PO-Log compaction framework, using both causality and what we denote by causal stability, allows obtaining very compact replica state for pure op-based CRDTs, while also benefiting from small message sizes.
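
As an illustration of the pure op-based approach described in this abstract, the sketch below is an assumption-laden Python rendering (not the paper's framework) of an add-wins set whose only state is a PO-Log of timestamped operations. It assumes a tagged reliable causal broadcast layer that calls effect with a vector-clock timestamp for every delivered operation; the leq helper and the stable callback are illustrative names, not APIs from the paper.

```python
class AWSet:
    def __init__(self):
        self.polog = []   # PO-Log: list of (timestamp, op, element) entries

    def effect(self, t, op, elem):
        """Called by tagged causal delivery for local and remote operations."""
        # Add-wins redundancy: any entry for the same element whose timestamp
        # is causally dominated by the incoming operation becomes obsolete.
        self.polog = [(s, o, e) for (s, o, e) in self.polog
                      if e != elem or not leq(s, t)]
        # A remove leaves no entry of its own once it has pruned older adds.
        if op == "add":
            self.polog.append((t, op, elem))

    def stable(self, t):
        """Causal-stability hook: once `t` is in the causal past of every
        replica, its timestamp carries no ordering information and can be
        discarded, compacting the replica state."""
        self.polog = [(None if s == t else s, o, e) for (s, o, e) in self.polog]

    def value(self):
        return {e for (_, o, e) in self.polog if o == "add"}


def leq(s, t):
    """Vector-clock comparison; None marks an already-stabilized timestamp,
    which is causally before any live timestamp."""
    if s is None:
        return True
    return all(s.get(k, 0) <= t.get(k, 0) for k in s)
```

For example, effect({"r1": 1}, "add", "x") followed by effect({"r1": 1, "r2": 1}, "rmv", "x") yields an empty set, because the remove causally dominates the add and prunes it; a remove concurrent with an add leaves the add in place, giving add-wins semantics.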

2017

As Secure as Possible Eventual Consistency

Authors
Shoker, A; Yactine, H; Baquero, C;

Publication
PROCEEDINGS OF THE 3RD INTERNATIONAL WORKSHOP ON PRINCIPLES AND PRACTICE OF CONSISTENCY FOR DISTRIBUTED DATA (PAPOC 17)

Abstract
Eventual consistency (EC) is a relaxed data consistency model that, driven by the CAP theorem, trades prompt consistency for high availability. Although this model has proven promising and has been widely adopted by industry, the state of the art only assumes that replicas can crash and recover. However, a Byzantine replica (i.e., arbitrary or malicious) can hamper the eventual convergence of replicas to a globally consistent state, thus compromising the entire service. Classical BFT state machine replication protocols cannot solve this problem due to the blocking nature of consensus, something at odds with the availability via replica divergence in the EC model. In this work-in-progress paper, we introduce a new secure, highly available protocol for the EC model that assumes a fraction of replicas and any client can be Byzantine. To respect the essence of EC, the protocol gives priority to high availability, and thus Byzantine detection is performed off the critical path on a consistent data offset. The paper concisely explains the protocol and discusses its feasibility. We aim to present a more comprehensive and empirical study in the future.
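
The fragment below is only a generic illustration of the "detection off the critical path" idea, not the protocol from the paper: updates are applied immediately to preserve availability, and a background thread audits them afterwards using whatever check (signatures, cross-replica digests) the caller supplies. All names are hypothetical.

```python
import queue
import threading

class OptimisticReplica:
    """Hypothetical sketch: apply updates immediately, audit them later."""

    def __init__(self, apply_update, audit_update):
        self.apply_update = apply_update      # mutates the local EC state
        self.audit_update = audit_update      # e.g. signature/digest check
        self._pending = queue.Queue()
        threading.Thread(target=self._audit_loop, daemon=True).start()

    def deliver(self, update):
        # Fast path: availability first, no blocking on Byzantine checks.
        self.apply_update(update)
        self._pending.put(update)

    def _audit_loop(self):
        # Slow path: verification runs in the background, off the critical
        # path, and only flags (or later repairs) suspect updates.
        while True:
            update = self._pending.get()
            if not self.audit_update(update):
                print("suspect update flagged for repair:", update)
```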

2017

DottedDB: Anti-Entropy without Merkle Trees, Deletes without Tombstones

Authors
Goncalves, R; Almeida, PS; Baquero, C; Fonte, V;

Publication
2017 IEEE 36TH INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS (SRDS)

Abstract
To achieve high availability in the face of network partitions, many distributed databases adopt eventual consistency, allow temporary conflicts due to concurrent writes, and use some form of per-key logical clock to detect and resolve such conflicts. Furthermore, nodes synchronize periodically to ensure replica convergence in a process called anti-entropy, normally using Merkle Trees. We present the design of DottedDB, a Dynamo-like key-value store, which uses a novel node-wide logical clock framework, overcoming three fundamental limitations of the state of the art: (1) minimize the metadata per key necessary to track causality, avoiding its growth even in the face of node churn; (2) correctly and durably delete keys, with no need for tombstones; (3) offer a lightweight anti-entropy mechanism to converge replicated data, avoiding the need for Merkle Trees. We evaluate DottedDB against MerkleDB, an otherwise identical database, but using per-key logical clocks and Merkle Trees for anti-entropy, to precisely measure the impact of the novel approach. Results show that: causality metadata per object always converges rapidly to only one id-counter pair; distributed deletes are correctly achieved without global coordination and with constant metadata; divergent nodes are synchronized faster, with a smaller memory footprint and less communication overhead than using Merkle Trees.
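
To make the node-wide clock idea concrete, here is a minimal, assumed Python sketch (not DottedDB's implementation): each write is tagged with a single dot (node id, counter), per-key causality metadata is just the dots of its current versions, and anti-entropy compares node-wide clocks to find which dots a peer is missing, with no Merkle Trees. The structures and names below are illustrative assumptions.

```python
class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.counter = 0
        self.node_clock = {}        # peer id -> highest counter seen from it
        self.dot_to_key = {}        # (node id, counter) -> key, for anti-entropy
        self.store = {}             # key -> {dot: value} (concurrent versions)

    def put(self, key, value):
        self.counter += 1
        dot = (self.id, self.counter)
        # Per-key causality metadata is a single id-counter pair per version.
        self.store[key] = {dot: value}
        self.dot_to_key[dot] = key
        self.node_clock[self.id] = self.counter
        return dot

    def missing_dots_for(self, peer_clock):
        """Anti-entropy: locally written dots the peer has not seen yet,
        computed by comparing node-wide clocks instead of Merkle Trees."""
        seen = peer_clock.get(self.id, 0)
        return [(self.id, c) for c in range(seen + 1, self.counter + 1)]

    def sync_to(self, peer):
        for dot in self.missing_dots_for(peer.node_clock):
            key = self.dot_to_key[dot]
            peer.receive(key, dot, self.store[key][dot])

    def receive(self, key, dot, value):
        self.store.setdefault(key, {})[dot] = value
        self.dot_to_key[dot] = key
        node, counter = dot
        self.node_clock[node] = max(self.node_clock.get(node, 0), counter)
```

For example, after a.put("k", 1), calling a.sync_to(b) ships only the dots that b's node-wide clock lacks; a real system would additionally prune obsolete concurrent versions and handle deletes, which this sketch omits.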

2016

Life Beyond Distributed Transactions on the Edge

Authors
Shoker, A; Kassam, Z; Almeida, PS; Baquero, C;

Publication
Proceedings of the 1st Workshop on Middleware for Edge Clouds & Cloudlets, Trento, Italy, December 12-16, 2016

Abstract
Edge/Fog Computing is an extension to the Cloud Computing model, primarily proposed to pull some of the load on cloud data centers towards the edge of the network, i.e., closer to the clients. Despite being a promising model, the foundations to adopt and fully exploit the edge model are yet to be clear, and thus new ideas are continuously advocated. In his paper "Life beyond Distributed Transactions: An Apostate's Opinion", Pat Helland proposed his vision to build "almost infinite" scale future applications, demonstrating why Distributed Transactions are not very practical under scale. His approach models the application's data state as independent "entities" with separate serialization scopes, thus allowing efficient local transactions within an entity, but precluding transactions involving different entities. Accessing remote data (which is assumed rare) can be done through separate channels in a more message-oriented manner. In this paper, we recall Helland's vision in the aforementioned paper, explaining how his model fits the Edge Computing model regarding scalability, applications, and assumptions, and discussing the potential challenges leveraged. © 2016 ACM.
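
The sketch below is a loose Python rendering, under our own naming, of the programming model the paper recalls from Helland: each entity is its own serialization scope, atomic updates happen only inside one entity, and cross-entity interaction is message-oriented rather than transactional.

```python
import threading
from collections import deque

class Entity:
    """Illustrative entity: independent state with its own serialization scope."""

    def __init__(self, key):
        self.key = key
        self.state = {}
        self.inbox = deque()
        self._lock = threading.Lock()   # serialization scope = this entity only

    def transact(self, fn):
        """Atomic update confined to this entity's own state."""
        with self._lock:
            return fn(self.state)

    def send(self, other, message):
        """Cross-entity access is message-oriented, never transactional."""
        other.inbox.append((self.key, message))

    def process_inbox(self):
        """Handle incoming messages, each as a local transaction."""
        while self.inbox:
            sender, message = self.inbox.popleft()
            self.transact(lambda state:
                          state.setdefault("log", []).append((sender, message)))
```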

2016

Why Logical Clocks Are Easy

Authors
Baquero, C; Preguica, N;

Publication
COMMUNICATIONS OF THE ACM

Abstract

2017

Aggregation Protocols in Light of Reliable Communication

Authors
Kassam, Z; Shoker, A; Almeida, PS; Baquero, C;

Publication
2017 IEEE 16TH INTERNATIONAL SYMPOSIUM ON NETWORK COMPUTING AND APPLICATIONS (NCA)

Abstract
Aggregation protocols allow for distributed lightweight computations deployed on ad-hoc networks in a peer-to-peer fashion. Due to reliance on wireless technology, the communication medium is often hostile, which makes such protocols susceptible to correctness and performance issues. In this paper, we study the behavior of aggregation protocols when subject to communication failures: message loss, duplication, and network partitions. We show that resolving communication failures at the communication layer, through a simple reliable communication layer, reduces the overhead of using alternative fault tolerance techniques at upper layers, and also preserves the original accuracy and simplicity of protocols. Our empirical study shows that trade-offs exist across various aggregation protocols and that there is no one-size-fits-all protocol.
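
The kind of thin reliable-communication layer the study argues for could look like the following assumed sketch (not the paper's implementation): per-neighbour sequence numbers suppress duplicates and unacknowledged messages are retransmitted, so an aggregation protocol running on top (e.g. push-sum averaging) no longer has to tolerate loss or duplication itself; partitions still only delay, rather than corrupt, delivery.

```python
class ReliableLink:
    """Illustrative reliability layer beneath an aggregation protocol."""

    def __init__(self, send_raw):
        self.send_raw = send_raw    # unreliable send: (dst, payload) -> None
        self.next_seq = {}          # dst -> next sequence number to use
        self.unacked = {}           # (dst, seq) -> payload, kept for retransmit
        self.delivered = set()      # (src, seq) already passed up, for dedup

    def send(self, dst, payload):
        seq = self.next_seq.get(dst, 0)
        self.next_seq[dst] = seq + 1
        self.unacked[(dst, seq)] = payload
        self.send_raw(dst, ("DATA", seq, payload))

    def retransmit(self):
        # Called periodically; copes with loss and transient partitions.
        for (dst, seq), payload in self.unacked.items():
            self.send_raw(dst, ("DATA", seq, payload))

    def on_receive(self, src, msg, deliver):
        kind, seq, payload = msg
        if kind == "ACK":
            self.unacked.pop((src, seq), None)
        elif (src, seq) not in self.delivered:
            # First copy: acknowledge and hand up to the aggregation protocol.
            self.delivered.add((src, seq))
            self.send_raw(src, ("ACK", seq, None))
            deliver(src, payload)
        else:
            # Duplicate copy: re-acknowledge but do not deliver again.
            self.send_raw(src, ("ACK", seq, None))
```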
