
Publications by Paulo Sérgio Almeida

2016

Life Beyond Distributed Transactions on the Edge

Authors
Shoker, A; Kassam, Z; Almeida, PS; Baquero, C;

Publication
Proceedings of the 1st Workshop on Middleware for Edge Clouds & Cloudlets, Trento, Italy, December 12-16, 2016

Abstract
Edge/Fog Computing is an extension to the Cloud Computing model, primarily proposed to pull some of the load on cloud data centers towards the edge of the network, i.e., closer to the clients. Despite being a promising model, the foundations to adopt and fully exploit the edge model are yet to be clear, and thus new ideas are continuously advocated. In his paper "Life beyond Distributed Transactions: An Apostate's Opinion", Pat Helland proposed his vision to build "almost infinite" scale future applications, demonstrating why Distributed Transactions are not very practical under scale. His approach models the application's data state as independent "entities" with separate serialization scopes, thus allowing efficient local transactions within an entity, but precluding transactions involving different entities. Accessing remote data (which is assumed rare) can be done through separate channels in a more message-oriented manner. In this paper, we recall Helland's vision, explaining how his model fits the Edge Computing model regarding scalability, applications, and assumptions, and discussing the potential challenges raised. © 2016 ACM.
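As an illustration of the entity model described in the abstract, the following is a minimal sketch, with hypothetical names (Entity, local_transaction, the cart/stock example), of how each entity forms its own serialization scope on an edge node: updates within one entity are a local atomic step, while cross-entity interaction goes through an asynchronous message channel rather than a distributed transaction.

```python
# Hypothetical sketch of Helland-style "entities" on an edge node:
# each entity is an independent serialization scope; cross-entity
# communication is message-oriented, never transactional.
import threading
from queue import Queue

class Entity:
    def __init__(self, name):
        self.name = name
        self.state = {}
        self._lock = threading.Lock()   # serialization scope of this entity only
        self.inbox = Queue()            # channel for messages from other entities

    def local_transaction(self, update_fn):
        # Atomic only with respect to this entity's state.
        with self._lock:
            update_fn(self.state)

    def send(self, other, message):
        # Cross-entity access: asynchronous message, no shared transaction.
        other.inbox.put((self.name, message))

# Usage: two entities, possibly on different edge nodes.
cart = Entity("cart")
stock = Entity("stock")
cart.local_transaction(lambda s: s.update(items=["book"]))
cart.send(stock, {"reserve": "book"})
sender, msg = stock.inbox.get()
stock.local_transaction(lambda s: s.update(reserved=msg["reserve"]))
```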

2017

Aggregation Protocols in Light of Reliable Communication

Authors
Kassam, Z; Shoker, A; Almeida, PS; Baquero, C;

Publication
2017 IEEE 16TH INTERNATIONAL SYMPOSIUM ON NETWORK COMPUTING AND APPLICATIONS (NCA)

Abstract
Aggregation protocols allow for lightweight distributed computations deployed on ad-hoc networks in a peer-to-peer fashion. Due to the reliance on wireless technology, the communication medium is often hostile, which makes such protocols susceptible to correctness and performance issues. In this paper, we study the behavior of aggregation protocols when subject to communication failures: message loss, duplication, and network partitions. We show that resolving communication failures at the communication layer, through a simple reliable communication layer, reduces the overhead of using alternative fault-tolerance techniques at upper layers, and also preserves the original accuracy and simplicity of the protocols. The empirical study we conducted shows that trade-offs exist across the various aggregation protocols, and there is no one-size-fits-all protocol.
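To make the idea concrete, here is a minimal sketch of the kind of thin reliable-communication layer the abstract argues for, using per-link sequence numbers for duplicate suppression and retransmission of unacknowledged messages. The class and method names are assumptions for illustration, not the paper's code.

```python
# Hypothetical sketch of a thin reliable-communication layer: per-link
# sequence numbers give duplicate suppression, and unacknowledged
# messages are retransmitted, so an aggregation protocol on top sees
# each partial value exactly once per link.
class ReliableLink:
    def __init__(self):
        self.next_seq = 0          # sender side: next sequence number
        self.unacked = {}          # sender side: seq -> payload awaiting ack
        self.delivered = set()     # receiver side: seqs already delivered

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        return seq, payload        # the frame that goes on the wire

    def pending_retransmissions(self):
        # Anything not yet acknowledged is re-sent (handles message loss).
        return list(self.unacked.items())

    def on_receive(self, seq, payload, deliver):
        if seq not in self.delivered:      # drop duplicates
            self.delivered.add(seq)
            deliver(payload)
        return ("ack", seq)                # always re-acknowledge

    def on_ack(self, seq):
        self.unacked.pop(seq, None)

# Usage (loopback, for illustration): one duplicated and one "lost" frame.
link = ReliableLink()
received = []
f1 = link.send({"partial_sum": 3.5})
link.on_receive(*f1, deliver=received.append)
link.on_receive(*f1, deliver=received.append)   # duplicate: ignored
f2 = link.send({"partial_sum": 1.0})            # pretend this frame is lost
for seq, payload in link.pending_retransmissions():
    _, acked_seq = link.on_receive(seq, payload, deliver=received.append)
    link.on_ack(acked_seq)
print(received)   # [{'partial_sum': 3.5}, {'partial_sum': 1.0}]
```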

2014

Scalable and Accurate Causality Tracking for Eventually Consistent Stores

Authors
Almeida, PS; Baquero, C; Goncalves, R; Preguica, N; Fonte, V;

Publication
DISTRIBUTED APPLICATIONS AND INTEROPERABLE SYSTEMS (DAIS 2014)

Abstract
In cloud computing environments, data storage systems often rely on optimistic replication to provide good performance and availability even in the presence of failures or network partitions. In this scenario, it is important to be able to accurately and efficiently identify updates executed concurrently. Current approaches to causality tracking in optimistic replication have problems with concurrent updates: they either (1) do not scale, as they require replicas to maintain information that grows linearly with the number of writes or unique clients; or (2) lose information about causality, either by removing entries from client-id based version vectors or by using server-id based version vectors, which causes false conflicts. We propose a new logical clock mechanism and a logical clock framework that together support a traditional key-value store API, while capturing causality in an accurate and scalable way, avoiding false conflicts. It maintains concise information per data replica, only linear in the number of replica servers, and allows data replicas to be compared and merged in time linear in the number of replica servers and versions.
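The causality question at the core of the abstract (are two versions ordered or concurrent?) can be illustrated with the classic version-vector comparison shown below. This is the baseline the paper improves on, not the paper's own clock mechanism, and the function names are illustrative.

```python
# Illustrative version-vector causality check (the baseline mechanism;
# the paper's clock avoids the false conflicts discussed in the abstract).
def dominates(a, b):
    # a dominates b if a has seen every event b has seen.
    return all(a.get(k, 0) >= v for k, v in b.items())

def compare(a, b):
    if dominates(a, b) and dominates(b, a):
        return "equal"
    if dominates(a, b):
        return "a-after-b"      # b happened-before a
    if dominates(b, a):
        return "b-after-a"      # a happened-before b
    return "concurrent"         # neither saw the other's updates: conflict

def merge(a, b):
    # Pointwise maximum: the least clock that dominates both.
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

# Usage: two replicas r1 and r2 that updated independently.
v1 = {"r1": 2, "r2": 1}
v2 = {"r1": 1, "r2": 3}
print(compare(v1, v2))   # concurrent -> the store must keep both versions
print(merge(v1, v2))     # {'r1': 2, 'r2': 3}
```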

2015

A Survey of Distributed Data Aggregation Algorithms

Authors
Jesus, P; Baquero, C; Almeida, PS;

Publication
IEEE COMMUNICATIONS SURVEYS AND TUTORIALS

Abstract
Distributed data aggregation is an important task, allowing the decentralized determination of meaningful global properties, which can then be used to direct the execution of other applications. The resulting values are derived by the distributed computation of functions like COUNT, SUM, and AVERAGE. Some application examples deal with the determination of the network size, total storage capacity, average load, majorities and many others. In the last decade, many different approaches have been proposed, with different trade-offs in terms of accuracy, reliability, message and time complexity. Due to the considerable amount and variety of aggregation algorithms, it can be difficult and time consuming to determine which techniques will be more appropriate to use in specific settings, justifying the existence of a survey to aid in this task. This work reviews the state of the art on distributed data aggregation algorithms, providing three main contributions. First, it formally defines the concept of aggregation, characterizing the different types of aggregation functions. Second, it succinctly describes the main aggregation techniques, organizing them in a taxonomy. Finally, it provides some guidelines toward the selection and use of the most relevant techniques, summarizing their principal characteristics.
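For illustration, the sketch below shows one family of algorithms covered by such surveys: gossip-based averaging in the push-sum style, where each node repeatedly shares half of a (sum, weight) pair with a random neighbour and the local ratio converges to the global average. The topology and code are assumptions for illustration, not taken from the survey.

```python
# Minimal push-sum style averaging sketch (illustrative only): every node
# halves its (sum, weight) pair each round and pushes one half to a random
# neighbour; sum/weight at each node converges to the global average.
import random

def push_sum_round(values, weights, neighbours):
    new_v = {n: 0.0 for n in values}
    new_w = {n: 0.0 for n in values}
    for n in values:
        target = random.choice(neighbours[n])
        for dest in (n, target):              # keep half, push half
            new_v[dest] += values[n] / 2
            new_w[dest] += weights[n] / 2
    return new_v, new_w

# Usage: 4 nodes on a ring, estimating the average of their local readings.
neighbours = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
values = {0: 10.0, 1: 20.0, 2: 30.0, 3: 40.0}
weights = {n: 1.0 for n in values}
for _ in range(50):
    values, weights = push_sum_round(values, weights, neighbours)
estimates = {n: values[n] / weights[n] for n in values}
print(estimates)   # each estimate approaches the true average, 25.0
```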

2017

The Single-Writer Principle in CRDT Composition

Authors
Enes, V; Almeida, PS; Baquero, C;

Publication
Proceedings of the Workshop on Programming Models and Languages for Distributed Computing, Barcelona, Spain, June 20, 2017

Abstract
Multi-master replication in a distributed system setting allows each node holding a replica to update and query the local replica, and to disseminate updates to other nodes. Obtaining high availability typically entails allowing replicas to diverge and requires a background mechanism for re-establishing consistency. Conflict-free Replicated Data Types (CRDTs) extend standard sequential data types with appropriate merge functions, and can often be composed together to create more complex ones. In this work we add a generic CRDT composition approach that explores the single-writer principle. By carefully controlling which part of the composition can be updated by each replica, we can derive efficient designs that cover new use cases. After introducing the new construction we exemplify some uses, including how to emulate a simple Doodle functionality for selecting a common meeting schedule among different participants. © 2017 Association for Computing Machinery.
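A minimal sketch of the single-writer idea applied to the Doodle example mentioned in the abstract: the composed state maps each participant to an entry that only that participant ever writes, so merging entries never has to resolve concurrent writes to the same key. The DoodlePoll class and its methods are hypothetical, not the paper's construction.

```python
# Hypothetical single-writer composition for the Doodle example: the shared
# state maps each participant to a (version, slots) pair that only that
# participant ever writes, so merge never sees conflicting writes to a key.
class DoodlePoll:
    def __init__(self, me):
        self.me = me
        self.entries = {}            # participant -> (version, frozenset(slots))

    def choose(self, slots):
        # Single-writer: a replica only updates its own entry.
        version, _ = self.entries.get(self.me, (0, frozenset()))
        self.entries[self.me] = (version + 1, frozenset(slots))

    def merge(self, other):
        # Per entry, the higher version wins; each entry has a single writer,
        # so there is never a concurrent write to the same key.
        for who, (ver, slots) in other.entries.items():
            if ver > self.entries.get(who, (0, frozenset()))[0]:
                self.entries[who] = (ver, slots)

    def common_slots(self):
        chosen = [slots for _, slots in self.entries.values()]
        return frozenset.intersection(*chosen) if chosen else frozenset()

# Usage: two participants converge on the common slot.
alice, bob = DoodlePoll("alice"), DoodlePoll("bob")
alice.choose({"Mon 10h", "Tue 14h"})
bob.choose({"Tue 14h", "Wed 9h"})
alice.merge(bob); bob.merge(alice)
print(alice.common_slots())   # frozenset({'Tue 14h'})
```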

2016

The problem with embedded CRDT counters and a solution

Authors
Baquero, C; Almeida, PS; Lerche, C;

Publication
PROCEEDINGS OF THE 2ND WORKSHOP ON THE PRINCIPLES AND PRACTICE OF CONSISTENCY FOR DISTRIBUTED DATA, PAPOC 2016

Abstract
Conflict-free Replicated Data Types (CRDTs) can simplify the design of deterministic eventual consistency. Of the several CRDTs that have been deployed in production systems, counters were among the first. Counters are apparently simple, with a straightforward inc/dec/read API, but can require complex implementations, and several variants have been specified and coded. Unlike sets and registers, which can be adapted to operate inside maps, current counter approaches exhibit anomalies when embedded in maps. Here, we illustrate the anomaly and propose a solution, based on a new counter model and implementation.
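For context, the sketch below shows a conventional state-based PN-counter, the style of counter the abstract refers to: per-replica increment and decrement totals merged by pointwise maximum. The anomaly discussed in the paper appears when such a counter is embedded in a map whose entries can be removed and re-added; the paper's proposed counter model is not reproduced here, and the class below is an illustrative assumption.

```python
# Illustrative state-based PN-counter: each replica tracks its own
# increments and decrements, and merge takes the pointwise maximum.
# (The paper's point is that this style of counter misbehaves when
# embedded in a CRDT map with removable entries; its proposed counter
# model is not shown here.)
class PNCounter:
    def __init__(self, replica_id):
        self.id = replica_id
        self.incs = {}   # replica -> total increments issued there
        self.decs = {}   # replica -> total decrements issued there

    def inc(self, n=1):
        self.incs[self.id] = self.incs.get(self.id, 0) + n

    def dec(self, n=1):
        self.decs[self.id] = self.decs.get(self.id, 0) + n

    def read(self):
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other):
        for r, v in other.incs.items():
            self.incs[r] = max(self.incs.get(r, 0), v)
        for r, v in other.decs.items():
            self.decs[r] = max(self.decs.get(r, 0), v)

# Usage: concurrent updates on two replicas converge after merging.
a, b = PNCounter("a"), PNCounter("b")
a.inc(3); b.dec(1)
a.merge(b); b.merge(a)
print(a.read(), b.read())   # 2 2
```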
