Cookies Policy
We use cookies to improve our site and your experience. By continuing to browse our site you accept our cookie policy. Find out More
Close
  • Menu
About

About

Paulo Sérgio Almeida is an Assistant Professor at the Department of Informatics at University of Minho, and a researcher at HASLab / INESC TEC. He obtained a MSc degree from University of Porto in 1994 and a PhD degree in Computer Science from Imperial College London in 1998. His research activities have been in the area of distributed systems, namely in causality tracking mechanisms, eventually consistent non-relational databases, fault-tolerant distributed aggregation algorithms, bloom filters, and distributed algorithms in graphs. In recent years the main focus of research has been on Conflict-free Replicated Data Types.

Interest
Topics
Details

Details

  • Name

    Paulo Sérgio Almeida
  • Cluster

    Computer Science
  • Role

    Senior Researcher
  • Since

    01st November 2011
002
Publications

2017

Scalable eventually consistent counters over unreliable networks

Authors
Almeida, PS; Baquero, C;

Publication
Distributed Computing

Abstract

2015

Flow updating: Fault-tolerant aggregation for dynamic networks

Authors
Jesus, P; Baquero, C; Almeida, PS;

Publication
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING

Abstract
Data aggregation is a fundamental building block of modern distributed systems. Averaging based approaches, commonly designated gossip-based, are an important class of aggregation algorithms as they allow all nodes to produce a result, converge to any required accuracy, and work independently from the network topology. However, existing approaches exhibit many dependability issues when used in faulty and dynamic environments. This paper describes and evaluates a fault tolerant distributed aggregation technique, Flow Updating, which overcomes the problems in previous averaging approaches and is able to operate on faulty dynamic networks. Experimental results show that this novel approach outperforms previous averaging algorithms; it self-adapts to churn and input value changes without requiring any periodic restart, supporting node crashes and high levels of message loss, and works in asynchronous networks. Realistic concerns have been taken into account in evaluating Flow Updating, like the use of unreliable failure detectors and asynchrony, targeting its application to realistic environments.

2015

Efficient State-Based CRDTs by Delta-Mutation

Authors
Almeida, PS; Shoker, A; Baquero, C;

Publication
Networked Systems - Third International Conference, NETYS 2015, Agadir, Morocco, May 13-15, 2015, Revised Selected Papers

Abstract
CRDTs are distributed data types that make eventual consistency of a distributed object possible and non ad-hoc. Specifically, state-based CRDTs ensure convergence through disseminating the entire state, that may be large, and merging it to other replicas; whereas operation-based CRDTs disseminate operations (i.e., small states) assuming an exactly-once reliable dissemination layer. We introduce Delta State Conflict-Free Replicated Datatypes (d-CRDT) that can achieve the best of both worlds: small messages with an incremental nature, disseminated over unreliable communication channels. This is achieved by defining d-mutators to return a delta-state, typically with a much smaller size than the full state, that is joined to both: local and remote states. We introduce the d-CRDT framework, and we explain it through establishing a correspondence to current state-based CRDTs. In addition, we present an anti-entropy algorithm that ensures causal consistency, and two d-CRDT specifications of well-known replicated datatypes. © Springer International Publishing Switzerland 2015.

2015

A Survey of Distributed Data Aggregation Algorithms

Authors
Jesus, P; Baquero, C; Almeida, PS;

Publication
IEEE COMMUNICATIONS SURVEYS AND TUTORIALS

Abstract
Distributed data aggregation is an important task, allowing the decentralized determination of meaningful global properties, which can then be used to direct the execution of other applications. The resulting values are derived by the distributed computation of functions like COUNT, SUM, and AVERAGE. Some application examples deal with the determination of the network size, total storage capacity, average load, majorities and many others. In the last decade, many different approaches have been proposed, with different trade-offs in terms of accuracy, reliability, message and time complexity. Due to the considerable amount and variety of aggregation algorithms, it can be difficult and time consuming to determine which techniques will be more appropriate to use in specific settings, justifying the existence of a survey to aid in this task. This work reviews the state of the art on distributed data aggregation algorithms, providing three main contributions. First, it formally defines the concept of aggregation, characterizing the different types of aggregation functions. Second, it succinctly describes the main aggregation techniques, organizing them in a taxonomy. Finally, it provides some guidelines toward the selection and use of the most relevant techniques, summarizing their principal characteristics.

2014

Making Operation-Based CRDTs Operation-Based

Authors
Baquero, C; Almeida, PS; Shoker, A;

Publication
DISTRIBUTED APPLICATIONS AND INTEROPERABLE SYSTEMS (DAIS 2014)

Abstract
Conflict-free Replicated Datatypes (CRDT) are usually classified as either state-based or operation-based. However, the standard definition of op-based CRDTs is very encompassing, allowing even sending the full-state, blurring the distinction. We introduce pure op-based CRDTs, that can only send operations to other replicas, drawing a clear distinction from state-based ones. Datatypes with commutative operations can be trivially implemented as pure op-based CRDTs using standard reliable causal delivery. We propose an extended API - tagged reliable causal broadcast - that provides causality information upon delivery, and show how it can be used to also implement other datatypes having non-commutative operations, through the use of a PO-Log - a partially ordered log of operations - inside the datatype. A semanticallybased PO-Log compaction framework, using both causality and what we denote by causal stability, allows obtaining very compact replica state for pure op-based CRDTs, while also benefiting from small message sizes.

2011

Fast Distributed Estimation of Sums and Network Sizes

Authors
Carlos Baquero; Paulo Sérgio Almeida; Raquel Menezes

Publication
IEEE Transactions on Parallel and Distributed Systems - IEEE Transactions on Parallel and Distributed Systems

Abstract

Supervised
thesis

2016

Beyond Distributed Transactions through Exactly-Once Exchanges

Author
Ziad Kassam

Institution
UM

2015

Eventually Consistent Distributed Data Structure Servers

Author
Ricardo Jorge Tomé Gonçalves

Institution
UM