Publications

Publications by Paulo Sérgio Almeida

2010

Dependability in Aggregation by Averaging

Authors
Jesus, Paulo; Baquero, Carlos; Almeida, Paulo Sérgio;

Publication
CoRR

Abstract

2010

Dotted Version Vectors: Logical Clocks for Optimistic Replication

Authors
Preguiça, Nuno M.; Baquero, Carlos; Almeida, Paulo Sérgio; Fonte, Victor; Gonçalves, Ricardo;

Publication
CoRR

Abstract

2000

Panasync: dependency tracking among file copies

Authors
Almeida, PS; Baquero, C; Fonte, V;

Publication
Proceedings of the ACM SIGOPS European Workshop, Kolding, Denmark, September 17-20, 2000

Abstract

2012

Brief announcement: Efficient causality tracking in distributed storage systems with dotted version vectors

Authors
Preguiça, N; Baquero, C; Almeida, PS; Fonte, V; Gonçalves, R;

Publication
Proceedings of the Annual ACM Symposium on Principles of Distributed Computing

Abstract
Version vectors (VV) are used pervasively to track dependencies between replica versions in multi-version distributed storage systems. In these systems, VV tend to have a dual functionality: identify a version and encode causal dependencies. In this paper, we show that by maintaining the identifier of the version separate from the causal past, it is possible to verify causality in constant time (instead of O(n) for VV) and to precisely track causality with information with size bounded by the degree of replication, and not by the number of concurrent writers.


2023

A Case for Partitioned Bloom Filters

Authors
Almeida, PS;

Publication
IEEE Transactions on Computers

Abstract
In a partitioned Bloom filter (PBF) the bit vector is split into disjoint parts, one per hash function. Contrary to hardware designs, where they prevail, software implementations mostly ignore PBFs, considering them worse than standard Bloom filters (SBF), due to the slightly larger false positive rate (FPR). In this paper, by performing an in-depth analysis, first we show that the FPR advantage of SBFs is smaller than thought; more importantly, by deriving the per-element FPR, we show that SBFs have weak spots in the domain: elements that test as false positives much more frequently than expected. This is relevant in scenarios where an element is tested against many filters. Moreover, SBFs are prone to exhibit extremely weak spots if naive double hashing is used, something occurring in mainstream libraries. PBFs exhibit a uniform distribution of the FPR over the domain, with no weak spots, even using naive double hashing. Finally, we survey scenarios beyond set membership testing, identifying many advantages of having disjoint parts, in designs using SIMD techniques, for filter size reduction, test of set disjointness, and duplicate detection in streams. PBFs are better, and should replace SBFs, in general-purpose libraries and as the base for novel designs.
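
To make the structure described in the abstract concrete, the following is a minimal sketch of a partitioned Bloom filter, not the paper's implementation. The class name, the parameters k and part_bits, and the use of salted BLAKE2b as the k hash functions are all assumptions chosen for illustration. The only property taken from the abstract is that the bit vector is split into disjoint parts, one per hash function, so each element sets exactly one bit in each part.

```python
import hashlib


class PartitionedBloomFilter:
    """Illustrative PBF: k disjoint parts, one bit set per part per element.

    Names and hashing scheme are hypothetical, chosen only for this sketch.
    """

    def __init__(self, k: int, part_bits: int) -> None:
        self.k = k                  # number of hash functions (= number of parts)
        self.part_bits = part_bits  # bits in each part
        # One independent bit array per hash function.
        self.parts = [bytearray((part_bits + 7) // 8) for _ in range(k)]

    def _positions(self, item: str):
        """Yield (part index, bit index within that part) for an item."""
        data = item.encode("utf-8")
        for i in range(self.k):
            # Assumption: a differently salted BLAKE2b digest per part
            # stands in for k independent hash functions.
            digest = hashlib.blake2b(
                data, digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield i, int.from_bytes(digest, "little") % self.part_bits

    def add(self, item: str) -> None:
        for part, bit in self._positions(item):
            self.parts[part][bit // 8] |= 1 << (bit % 8)

    def contains(self, item: str) -> bool:
        # May return a false positive, never a false negative.
        return all(
            self.parts[part][bit // 8] & (1 << (bit % 8))
            for part, bit in self._positions(item)
        )


pbf = PartitionedBloomFilter(k=4, part_bits=64)
pbf.add("alice")
print(pbf.contains("alice"))
```

In a standard Bloom filter, by contrast, all k hash functions would index into one shared bit array, so two hash functions can land on the same bit for a given element; the disjoint parts rule that out by construction.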


2020

Age-Partitioned Bloom Filters

Authors
Shtul, A; Baquero, C; Almeida, PS;

Publication
CoRR

Abstract