
About

I am a researcher at HASLab and professor at the U. Minho. My research focuses on dependable distributed systems. I am interested mainly in data management, including database replication and SQL processing over NoSQL systems, and in  group communication, including consensus and gossip-based protocols for large-scale systems. I am also interested in tools for testing, evaluating, and monitoring dependable distributed systems.

Details

  • Name

    José Orlando Pereira
  • Cluster

    Computer Science
  • Role

    Senior Researcher
  • Since

    1st November 2011
Publications

2021

BDUS: implementing block devices in user space

Authors
Faria, A; Macedo, R; Pereira, J; Paulo, J;

Publication
SYSTOR '21: The 14th ACM International Systems and Storage Conference, Haifa, Israel, June 14-16, 2021.

Abstract

2021

Detailed Black-Box Monitoring of Distributed Systems

Authors
Neves, F; Vilaca, R; Pereira, J;

Publication
Applied Computing Review

Abstract
Modern containerized distributed systems, such as big data storage and processing stacks or microservice-based applications, are inherently hard to monitor and optimize, as resource usage does not directly match hardware resources due to multiple virtualization layers. For instance, although inter-application traffic is an important factor, as it directly indicates how components interact, it has not been possible to accurately monitor it in an application-independent way and without severe overhead, thus putting it out of reach of cloud platforms. In this paper we present an efficient black-box monitoring approach for gathering detailed structural information of collaborating processes in a distributed system that can be queried for various purposes, as it includes information about processes, containers, and hosts, as well as resource usage and amount of data exchanged. The key to achieving high detail and low overhead without custom application instrumentation is to use a kernel-aided event-driven strategy. We validate a prototype implementation by applying it to multi-platform microservice deployments, evaluate its performance with micro-benchmarks, and demonstrate its usefulness for container placement in a distributed data storage and processing stack (i.e., Cassandra and Spark).

2021

Horus: Non-Intrusive Causal Analysis of Distributed Systems Logs

Authors
Neves, F; Machado, N; Vilaca, R; Pereira, J;

Publication
51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2021)

Abstract
Logs are still the primary resource for debugging distributed systems executions. Complexity and heterogeneity of modern distributed systems, however, make log analysis extremely challenging. First, due to the sheer amount of messages, in which the execution paths of distinct system components appear interleaved. Second, due to unsynchronized physical clocks, simply ordering the log messages by timestamp does not suffice to obtain a causal trace of the execution. To address these issues, we present Horus, a system that enables the refinement of distributed system logs in a causally-consistent and scalable fashion. Horus leverages kernel-level probing to capture events for tracking causality between application-level logs from multiple sources. The events are then encoded as a directed acyclic graph and stored in a graph database, thus allowing the use of rich query languages to reason about runtime behavior. Our case study with TrainTicket, a ticket booking application with 40+ microservices, shows that Horus surpasses current widely-adopted log analysis systems in pinpointing the root cause of anomalies in distributed executions. Also, we show that Horus builds a causally-consistent log of a distributed execution with much higher performance (up to 3 orders of magnitude) and scalability than prior state-of-the-art solutions. Finally, we show that Horus' approach to query causality is up to 30 times faster than graph database built-in traversal algorithms.

2021

Totally-Ordered Prefix Parallel Snapshot Isolation

Authors
Faria, N; Pereira, J;

Publication
PaPoC@EuroSys 2021, 8th Workshop on Principles and Practice of Consistency for Distributed Data, Online Event, United Kingdom, April 26, 2021

Abstract
Distributed data management systems have increasingly been using variants of Snapshot Isolation (SI) as their transactional isolation criterion, as it combines strong ACID guarantees with non-blocking reads and scalability. However, most existing proposals are limited by the performance of update propagation and stability detection, in particular when execution and storage are disaggregated. In this paper, we propose TOPSI, an approach providing a restricted form of Parallel Snapshot Isolation (PSI) that allows partially ordering recent transactions to avoid waiting for remote updates or using a stale snapshot. Moreover, it has the interesting property of making a prefix of history in all sites converge to a common total order. This allows versions to be represented by a single scalar timestamp for certification and storage in a shared store. We demonstrate the impact on throughput and abort rate with a proof-of-concept implementation and the industry-standard TPC-C benchmark.

2020

Black-box inter-application traffic monitoring for adaptive container placement

Authors
Neves, F; Vilaca, R; Pereira, J;

Publication
Proceedings of the 35th Annual ACM Symposium on Applied Computing

Abstract

Supervised theses

2020

Holistic performance and scalability analysis for large scale distributed systems

Author
Francisco Nuno Teixeira Neves

Institution
UP-FCUP

2020

High performance data processing

Author
Nuno Filipe Pinto Faria

Institution
UM

2019

Holistic performance and scalability analysis for large scale distributed systems

Author
Francisco Nuno Teixeira Neves

Institution
UP-FCUP

2019

Towards a Dependable and Decentralized Software-Defined Storage Architecture

Author
Ricardo Gonçalves Macedo

Institution
UP-FCUP

2019

Adaptive consensus for the blockchain

Author
Ricardo António Gonçalves Pereira

Institution
UM