About

Nuno Machado is a postdoctoral researcher at the High-Assurance Software Laboratory (HASLab) of the University of Minho and INESC-TEC. His research focuses on the design of resilient and scalable distributed systems for the analysis of massive amounts of data. He also works on, and is interested in, privacy-preserving solutions for cloud computing and Internet-of-Things scenarios.

Nuno received his PhD in Information Systems and Computer Engineering from Instituto Superior Técnico (University of Lisbon), under the supervision of Luís Rodrigues. During his PhD, he worked on automated debugging techniques for concurrent applications that deterministically reproduce concurrency bugs and isolate their root causes.

In the summer of 2014, Nuno interned at Microsoft Research (Redmond), where he worked with Brandon Lucia on concurrency debugging.

Details

  • Name

    Nuno Almeida Machado
  • Cluster

    Informatics
  • Position

    External Collaborating Researcher
  • Since

    13 July 2016
Publications

2018

CoopREP: Cooperative record and replay of concurrency bugs

Authors
Machado, N; Romano, P; Rodrigues, L;

Publication
Software Testing, Verification and Reliability

2018

Falcon: A Practical Log-Based Analysis Tool for Distributed Systems

Authors
Neves, F; Machado, N; Pereira, J;

Publication
48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2018, Luxembourg City, Luxembourg, June 25-28, 2018

Abstract
Programmers and support engineers typically rely on log data to narrow down the root cause of unexpected behaviors in dependable distributed systems. Unfortunately, the inherently distributed nature and complexity of such executions often lead to multiple independent logs, scattered across different physical machines, with thousands or millions of entries poorly correlated in terms of event causality. This renders log-based debugging a tedious, time-consuming, and potentially inconclusive task. We present Falcon, a tool aimed at making log-based analysis of distributed systems practical and effective. Falcon's modular architecture, designed as an extensible pipeline, allows it to seamlessly combine several distinct logging sources and generate a coherent space-time diagram of distributed executions. To preserve event causality, even in the presence of logs collected from independent unsynchronized machines, Falcon introduces a novel happens-before symbolic formulation and relies on an off-the-shelf constraint solver to obtain a coherent event schedule. Our case study with the popular distributed coordination service Apache ZooKeeper shows that Falcon eases the log-based analysis of complex distributed protocols and is helpful in bridging the gap between protocol design and implementation.
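
As a minimal sketch of the happens-before symbolic formulation described above (assuming z3 as the off-the-shelf solver, which the abstract does not name, and a toy two-node log): each event gets an integer logical-time variable, program order and send-before-receive causality become inequality constraints, and any satisfying assignment yields a causally coherent global schedule.

```python
# Toy happens-before formulation: not Falcon's code, just the idea it names.
from z3 import Int, Solver, Distinct, sat

events = ["n1.send_m", "n1.log_a", "n2.recv_m", "n2.log_b"]
t = {e: Int(f"t_{e}") for e in events}   # one logical-time variable per event

s = Solver()
s.add(Distinct(*t.values()))             # every event gets a distinct slot
s.add(t["n1.send_m"] < t["n1.log_a"])    # program order on node n1
s.add(t["n2.recv_m"] < t["n2.log_b"])    # program order on node n2
s.add(t["n1.send_m"] < t["n2.recv_m"])   # message causality: send before receive

if s.check() == sat:
    m = s.model()
    # Sort events by their solved logical times to get one coherent schedule.
    print(sorted(events, key=lambda e: m[t[e]].as_long()))
```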

2018

Totally Ordered Replication for Massive Scale Key-Value Stores

Authors
Ribeiro, J; Machado, N; Maia, F; Matos, M;

Publication
Distributed Applications and Interoperable Systems - 18th IFIP WG 6.1 International Conference, DAIS 2018, Held as Part of the 13th International Federated Conference on Distributed Computing Techniques, DisCoTec 2018, Madrid, Spain, June 18-21, 2018, Proceedings

2016

BUZZPSS: A Dependable and Adaptive Peer Sampling Service

Authors
Machado, N; Maia, F; Matos, M; Oliveira, R;

Publication
2016 Seventh Latin-American Symposium on Dependable Computing (LADC)

Abstract
A distributed system is often built on top of an overlay network. Overlay networks enable network topology transparency while, at the same time, they can be designed to provide efficient data dissemination, load balancing, and even fault tolerance. They are constructed by defining logical links between nodes, creating a node graph. In practice, this is materialized by a Peer Sampling Service (PSS) that provides references to other nodes to communicate with. Depending on the configuration of the PSS, the characteristics of the overlay can be adjusted to cope with application requirements and performance concerns. Unfortunately, overlay efficiency comes at the expense of dependability. To overcome this, one often deploys an application overlay focused on efficiency, along with a safety-net overlay to ensure dependability. However, this approach results in significant resource waste, since safety-net overlays are seldom used. In this paper, we focus on safety-net overlay networks and propose an adaptable mechanism to minimize resource usage while maintaining dependability guarantees. In detail, we consider a random overlay network, known to be highly dependable, and propose BUZZPSS, a new Peer Sampling Service that is able to autonomously fine-tune its resource consumption according to the observed system stability. When the system is stable and connectivity is not at risk, BUZZPSS autonomously changes its behavior to save resources. At the same time, it is able to detect system instability and act accordingly to guarantee that the overlay remains operational. Through an experimental evaluation, we show that BUZZPSS is able to autonomously adapt to the system stability levels, consuming up to 6x fewer resources than a static approach.
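
As an illustrative sketch of the adaptation loop the abstract describes (the class name, churn metric, and thresholds below are assumptions for illustration, not BUZZPSS's actual protocol): the service gossips aggressively while the view is churning and backs off when the overlay looks stable.

```python
# Hypothetical adaptive peer sampling loop; names and thresholds are made up.
class AdaptivePSS:
    """Gossips often under churn, backs off when the overlay is stable."""

    def __init__(self, view, min_period=1.0, max_period=30.0):
        self.view = set(view)         # current partial view of the overlay
        self.prev_view = set(view)    # view at the previous adaptation step
        self.min_period = min_period  # fastest allowed gossip period (seconds)
        self.max_period = max_period  # slowest allowed gossip period (seconds)
        self.period = max_period      # start frugal: assume stability

    def churn(self):
        # Fraction of peers that entered or left the view since last round.
        changed = self.view ^ self.prev_view
        return len(changed) / max(len(self.view | self.prev_view), 1)

    def adapt(self, threshold=0.2):
        if self.churn() > threshold:
            # Instability detected: gossip more to keep the overlay connected.
            self.period = max(self.period / 2, self.min_period)
        else:
            # Stable system: slow down to save bandwidth and CPU.
            self.period = min(self.period * 1.5, self.max_period)
        self.prev_view = set(self.view)
```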

2016

Production-guided Concurrency Debugging

Authors
Machado, N; Lucia, B; Rodrigues, L;

Publication
ACM SIGPLAN Notices

Abstract
Concurrency bugs that stem from schedule-dependent branches are hard to understand and debug, because their root causes imply not only different event orderings, but also changes in the control flow between failing and non-failing executions. We present Cortex: a system that helps expose and understand concurrency bugs that result from schedule-dependent branches, without relying on information from failing executions. Cortex preemptively exposes failing executions by perturbing the order of events and the control-flow behavior in non-failing schedules from production runs of a program. By leveraging this information from production runs, Cortex synthesizes executions to guide the search for failing schedules. Production-guided search helps cope with the large execution search space by targeting failing executions that are similar to observed non-failing executions. Evaluation on popular benchmarks shows that Cortex is able to expose failing schedules with only a few perturbations to non-failing executions, and takes a practical amount of time.
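
A rough illustration of production-guided search, under simplified assumptions (the `fails` oracle below stands in for Cortex's execution synthesis and replay machinery, and is hypothetical): starting from a non-failing schedule observed in production, flip adjacent events from different threads and test each nearby schedule for the failure.

```python
# Simplified neighbor search over schedules; not Cortex's actual algorithm.
def perturbations(schedule):
    """Yield schedules one cross-thread reordering away from the observed one."""
    # schedule: list of (thread_id, event) pairs from a non-failing run
    for i in range(len(schedule) - 1):
        a, b = schedule[i], schedule[i + 1]
        if a[0] != b[0]:  # only reorder adjacent events from different threads
            yield schedule[:i] + [b, a] + schedule[i + 2:]

def find_failing_schedule(observed, fails):
    # fails(s) -> True if replaying schedule s triggers the bug (hypothetical oracle)
    for candidate in perturbations(observed):
        if fails(candidate):
            return candidate  # a failing schedule close to a production run
    return None
```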