Publications - INESC TEC

Publications

Publications by HASLab

2015

Exactly-Once Quantity Transfer

Authors
Shoker, A; Almeida, PS; Baquero, C;

Publication
2015 IEEE 34th Symposium on Reliable Distributed Systems Workshop (SRDSW)

Abstract
Strongly consistent systems supporting distributed transactions can be prone to high latency and do not tolerate partitions. The present trend of using weaker forms of consistency, to achieve high availability, poses notable challenges in writing applications due to the lack of linearizability, e.g., to ensure global invariants, or perform mutator operations on a distributed datatype. This paper addresses a specific problem: the exactly-once transfer of a "quantity" from one node to another on an unreliable network (coping with message duplication, loss, or reordering) and without any form of global synchronization. This allows preserving a global property (the sum of quantities remains unchanged) without requiring global linearizability and only through using pairwise interactions between nodes, therefore allowing partitions in the system. We present the novel quantity-transfer algorithm while focusing on a specific use-case: a redistribution protocol to keep the quantities in a set of nodes balanced; in particular, averaging a shared real number across nodes. Since this is a work in progress, we briefly discuss the correctness of the protocol, and we leave potential extensions and empirical evaluations for future work.
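
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the paper's protocol. It assumes each transfer carries a per-sender sequence number, the receiver remembers which numbers it has already applied, and the sender may resend a message freely. The names `Node`, `start_transfer`, `receive`, and `receive_ack` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node holding a quantity (illustrative, not the paper's design)."""
    name: str
    quantity: float
    next_seq: dict = field(default_factory=dict)  # peer -> next sequence number
    pending: dict = field(default_factory=dict)   # (peer, seq) -> amount not yet acked
    applied: dict = field(default_factory=dict)   # peer -> set of seqs already applied

    def start_transfer(self, peer, amount):
        """Deduct locally and return a message that is safe to resend."""
        if amount <= 0 or amount > self.quantity:
            raise ValueError("invalid transfer amount")
        seq = self.next_seq.get(peer, 0)
        self.next_seq[peer] = seq + 1
        self.quantity -= amount
        self.pending[(peer, seq)] = amount
        return (self.name, seq, amount)

    def receive(self, msg):
        """Apply a transfer at most once; duplicates and reordering are harmless."""
        sender, seq, amount = msg
        seen = self.applied.setdefault(sender, set())
        if seq not in seen:
            seen.add(seq)
            self.quantity += amount
        return (self.name, seq)  # acknowledgement, also safe to duplicate

    def receive_ack(self, ack):
        """Stop retransmitting once the receiver has confirmed the transfer."""
        peer, seq = ack
        self.pending.pop((peer, seq), None)


a, b = Node("a", 10.0), Node("b", 0.0)
first = a.start_transfer("b", 4.0)
second = a.start_transfer("b", 1.0)
b.receive(second)          # arrives out of order
ack = b.receive(first)
b.receive(first)           # duplicate delivery is ignored
a.receive_ack(ack)
```

In this sketch the sum of the two quantities plus any amount still pending stays constant, using only pairwise messages. The `applied` sets grow without bound here; a practical protocol would need a way to discard that metadata, which this sketch does not attempt.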


2015

Flow updating: Fault-tolerant aggregation for dynamic networks

Authors
Jesus, P; Baquero, C; Almeida, PS;

Publication
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING

Abstract
Data aggregation is a fundamental building block of modern distributed systems. Averaging-based approaches, commonly designated gossip-based, are an important class of aggregation algorithms as they allow all nodes to produce a result, converge to any required accuracy, and work independently of the network topology. However, existing approaches exhibit many dependability issues when used in faulty and dynamic environments. This paper describes and evaluates a fault-tolerant distributed aggregation technique, Flow Updating, which overcomes the problems in previous averaging approaches and is able to operate on faulty dynamic networks. Experimental results show that this novel approach outperforms previous averaging algorithms; it self-adapts to churn and input value changes without requiring any periodic restart, supports node crashes and high levels of message loss, and works in asynchronous networks. Realistic concerns, such as the use of unreliable failure detectors and asynchrony, have been taken into account in evaluating Flow Updating, targeting its application to realistic environments.
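
The abstract names the technique without giving its steps. The sketch below is one simplified reading of a flow-based averaging scheme, under these assumptions: each node keeps its input value fixed, stores one flow and one last-known estimate per neighbor, and sends the full flow value (not an increment) in every message, so a lost or repeated message cannot corrupt the state. The simulation is synchronous and lossless, and `FlowNode` and `run` are hypothetical names.

```python
class FlowNode:
    """One node in a simplified, synchronous flow-based averaging sketch."""

    def __init__(self, value, neighbors):
        self.value = value                           # local input, never overwritten
        self.flows = {j: 0.0 for j in neighbors}     # flow towards each neighbor
        self.estimates = {j: 0.0 for j in neighbors} # last estimate heard from each neighbor

    def estimate(self):
        # The local estimate is derived from the input value and the flows.
        return self.value - sum(self.flows.values())

    def receive(self, sender, flow, est):
        # Overwrite state with the symmetric flow; idempotent under duplication.
        self.flows[sender] = -flow
        self.estimates[sender] = est

    def tick(self):
        # Average the local estimate with the neighbors' last-known estimates,
        # then adjust each flow so that neighbor would reach that average.
        avg = (self.estimate() + sum(self.estimates.values())) / (len(self.flows) + 1)
        out = {}
        for j in self.flows:
            self.flows[j] += avg - self.estimates[j]
            self.estimates[j] = avg
            out[j] = (self.flows[j], avg)
        return out


def run(values, edges, rounds):
    """Run the sketch on an undirected graph and return the final estimates."""
    neighbors = {i: [] for i in range(len(values))}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    nodes = [FlowNode(v, neighbors[i]) for i, v in enumerate(values)]
    for _ in range(rounds):
        outboxes = [node.tick() for node in nodes]
        for i, out in enumerate(outboxes):
            for j, (flow, est) in out.items():
                nodes[j].receive(i, flow, est)
    return [node.estimate() for node in nodes]


result = run([0.0, 10.0], [(0, 1)], rounds=3)  # both nodes settle on 5.0
```

Because the input value is never modified and flows are overwritten rather than accumulated from messages, dropping a message in this sketch only delays the update instead of losing "mass", which is the property the abstract contrasts with earlier averaging schemes.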


2015

Efficient State-Based CRDTs by Delta-Mutation

Authors
Almeida, PS; Shoker, A; Baquero, C;

Publication
NETYS

Abstract
CRDTs are distributed data types that make eventual consistency of a distributed object possible and non-ad-hoc. Specifically, state-based CRDTs ensure convergence by disseminating the entire state, which may be large, and merging it into other replicas, whereas operation-based CRDTs disseminate operations (i.e., small states), assuming an exactly-once reliable dissemination layer. We introduce Delta State Conflict-Free Replicated Datatypes (δ-CRDT) that can achieve the best of both worlds: small messages with an incremental nature, disseminated over unreliable communication channels. This is achieved by defining δ-mutators to return a delta-state, typically much smaller than the full state, that is joined to both local and remote states. We introduce the δ-CRDT framework, and we explain it by establishing a correspondence to current state-based CRDTs. In addition, we present an anti-entropy algorithm that ensures causal consistency, and two δ-CRDT specifications of well-known replicated datatypes.
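
As an illustration of the delta-state idea, here is a minimal sketch of a grow-only counter, assuming the state is a map from replica id to count and the join is a pointwise maximum. The function names `inc_delta`, `join`, and `value` are hypothetical, and the abstract does not say which two datatypes the paper specifies.

```python
def inc_delta(state, replica):
    """Delta-mutator: return only the changed entry instead of the full state."""
    return {replica: state.get(replica, 0) + 1}


def join(a, b):
    """Join two states (or a state and a delta) by pointwise maximum."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def value(state):
    """The counter value is the sum over all replicas."""
    return sum(state.values())


# Replica "a" increments twice, joining each delta into its own state.
x = {}
d1 = inc_delta(x, "a")
x = join(x, d1)
d2 = inc_delta(x, "a")
x = join(x, d2)

# Replica "b" receives the deltas duplicated and out of order.
y = {"b": 3}
for delta in (d2, d2, d1):
    y = join(y, delta)
```

Since the join is idempotent, commutative, and associative, the duplicated and reordered deltas leave `y` in the same state as a single in-order delivery would, while each message carries one entry rather than the whole map.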


2015

A Survey of Distributed Data Aggregation Algorithms

Authors
Jesus, P; Baquero, C; Almeida, PS;

Publication
IEEE COMMUNICATIONS SURVEYS AND TUTORIALS

Abstract
Distributed data aggregation is an important task, allowing the decentralized determination of meaningful global properties, which can then be used to direct the execution of other applications. The resulting values are derived by the distributed computation of functions like COUNT, SUM, and AVERAGE. Some application examples deal with the determination of the network size, total storage capacity, average load, majorities, and many others. In the last decade, many different approaches have been proposed, with different trade-offs in terms of accuracy, reliability, and message and time complexity. Due to the considerable number and variety of aggregation algorithms, it can be difficult and time-consuming to determine which techniques are most appropriate in specific settings, justifying the existence of a survey to aid in this task. This work reviews the state of the art in distributed data aggregation algorithms, providing three main contributions. First, it formally defines the concept of aggregation, characterizing the different types of aggregation functions. Second, it succinctly describes the main aggregation techniques, organizing them in a taxonomy. Finally, it provides some guidelines toward the selection and use of the most relevant techniques, summarizing their principal characteristics.
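
One way to see why functions such as COUNT, SUM, and AVERAGE behave differently is the small sketch below. It is an illustration only, not the survey's formal definition: it assumes each node summarizes its local values as a (sum, count) pair, which can be merged in any order, whereas local averages cannot simply be averaged again. The names `summarize`, `merge`, and `average` are hypothetical.

```python
from functools import reduce


def summarize(values):
    """Local summary: SUM and COUNT of the values held by one node."""
    return (sum(values), len(values))


def merge(a, b):
    """Combine two summaries; the order of merging does not matter."""
    return (a[0] + b[0], a[1] + b[1])


def average(summary):
    """AVERAGE is derived from the merged summary at the end."""
    total, count = summary
    return total / count


parts = [[1, 2, 3], [4, 5], [6]]  # values held by three nodes
combined = reduce(merge, (summarize(p) for p in parts))
global_average = average(combined)

# Averaging the per-node averages gives a different, incorrect result
# because the nodes hold different numbers of values.
naive = sum(average(summarize(p)) for p in parts) / len(parts)
```

Here `combined` carries both the global SUM and the global COUNT, and the global AVERAGE follows from it.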


2015

Proceedings of the First Workshop on Principles and Practice of Consistency for Distributed Data, PaPoC@EuroSys 2015, Bordeaux, France, April 21, 2015

Authors
Baquero, C; Serafini, M;

Publication
PaPoC@EuroSys

Abstract

2015

Embedding, Evolution, and Validation of Model-Driven Spreadsheets

Authors
Cunha, J; Fernandes, JP; Mendes, J; Saraiva, J;

Publication
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING

Abstract
This paper proposes and validates a model-driven software engineering technique for spreadsheets. The technique that we envision builds on the embedding of spreadsheet models under a widely used spreadsheet system. This means that we enable the creation and evolution of spreadsheet models under a spreadsheet system. More precisely, we embed ClassSheets, a visual language with a syntax similar to the one offered by common spreadsheets, that was created with the aim of specifying spreadsheets. Our embedding allows models and their conforming instances to be developed under the same environment. In practice, this convenient environment enhances evolution steps at the model level while the corresponding instance is automatically co-evolved. Finally, we have designed and conducted an empirical study with human users in order to assess our technique in production environments. The results of this study are promising and suggest that productivity gains are realizable under our model-driven spreadsheet development setting.
