Publications

Publications by Francisco Almeida Maia

2013

Slicing as a Distributed Systems Primitive

Authors
Maia, F; Matos, M; Oliveira, R; Riviere, E;

Publication
2013 Sixth Latin-American Symposium on Dependable Computing (LADC)

Abstract
Large-scale distributed systems appear as the major infrastructures for supporting planet-scale services. These systems call for appropriate management mechanisms and protocols. Slicing is an example of an autonomous, fully decentralized protocol suitable for large-scale environments. It aims at organizing the system into groups of nodes, called slices, according to an application-specific criterion, where the size of each slice is relative to the size of the full system. This allows assigning a certain fraction of nodes to different tasks, according to their capabilities. Although useful, current slicing techniques lack some features of considerable practical importance. This paper proposes a slicing protocol that builds on existing solutions and addresses some of their frailties. We present novel solutions to deal with non-uniform slices and to perform online and dynamic slice schema reconfiguration. Moreover, we describe how to provision a slice-local Peer Sampling Service for upper protocol layers and how to enhance slicing protocols with the capability of slicing over more than one attribute. Slicing is presented as a complete, dependable and integrated distributed systems primitive for large-scale systems.


2014

Workload-aware table splitting for NoSQL

Authors
Cruz, F; Maia, F; Oliveira, R; Vilaca, R;

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract
Massive-scale data stores, which exhibit highly desirable scalability and availability properties, are becoming pivotal systems in today's infrastructures. Scalability achieved by these data stores is anchored on data independence; there is no clear relationship between data, and atomic inter-node operations are not a concern. This assumption about data allows aggressive data partitioning. In particular, data tables are horizontally partitioned and spread across nodes for load balancing. However, in current versions of these data stores, partitioning is either a manual process or automated but simply based on table size. We argue that size-based partitioning does not lead to acceptable load balancing as it ignores data access patterns, namely data hotspots. Moreover, manual data partitioning is cumbersome and typically infeasible in large-scale scenarios. In this paper we propose an automated table splitting mechanism that takes into account the system workload. We evaluate this mechanism, showing that it is simple, non-intrusive and effective. Copyright 2014 ACM.


2017

Data Management and Privacy in a World of Data Wealth

Authors
Maia, F;

Publication
13th European Dependable Computing Conference, EDCC 2017, Geneva, Switzerland, September 4-8, 2017

Abstract

2014

Autonomous Multi-dimensional Slicing for Large-Scale Distributed Systems

Authors
Pasquet, M; Maia, F; Riviere, E; Schiavoni, V;

Publication
Distributed Applications and Interoperable Systems (DAIS 2014)

Abstract
Slicing is a distributed systems primitive that allows a large set of nodes to be partitioned autonomously based on node-local attributes. Slicing is decisive for automatically provisioning system resources for different services, based on their requirements or importance. One of the main limitations of existing slicing protocols is that only single-dimension attributes are considered for partitioning. In practical settings, it is often necessary to consider best compromises for an ensemble of metrics. In this paper we propose an extension of the slicing primitive that allows multi-attribute distributed systems slicing. Our protocol employs a gossip-based approach that does not require centralized knowledge and allows self-organization. It leverages the notion of domination between nodes, forming a partial order between multi-dimensional points, in a similar way to SkyLine queries for databases. We evaluate and demonstrate the interest of our approach using large-scale simulations.


2018

Proceedings of the 1st Workshop on Privacy by Design in Distributed Systems, P2DS@EuroSys 2018, Porto, Portugal, April 23, 2018

Authors
Maia, F; Mercier, H; Brito, A;

Publication
P2DS@EuroSys

Abstract

2018

Totally Ordered Replication for Massive Scale Key-Value Stores

Authors
Ribeiro, J; Machado, N; Maia, F; Matos, M;

Publication
Distributed Applications and Interoperable Systems - 18th IFIP WG 6.1 International Conference, DAIS 2018, Held as Part of the 13th International Federated Conference on Distributed Computing Techniques, DisCoTec 2018, Madrid, Spain, June 18-21, 2018, Proceedings

Abstract
Scalability is one of the most relevant features of today’s data management systems. In order to achieve high scalability and availability, recent distributed key-value stores refrain from costly replica coordination when processing requests. However, these systems typically do not perform well under churn. In this paper, we propose DataFlagons, a large-scale key-value store that integrates epidemic dissemination with a probabilistic total order broadcast algorithm. By ensuring that all replicas process requests in the same order, DataFlagons provides probabilistic strong data consistency while achieving high scalability and robustness under churn. © 2018, IFIP International Federation for Information Processing.
