
About

Born in Guimarães, Portugal, in 1992. He obtained his Bachelor's degree (Licenciatura) in Informatics Engineering at the Universidade do Minho. At the same institution, he pursued a Master's degree in Informatics Engineering, specializing in Distributed Systems and Application Engineering.

During the first year of the Master's in Informatics Engineering, he began working at HASLab, an integrated research centre of INESC TEC based at the Universidade do Minho. During this period, he completed his master's thesis, titled "Análise de desempenho e otimização do Apache HBase para dados relacionais" ("Performance analysis and optimization of Apache HBase for relational data"), which evaluated the performance of a NoSQL database for well-structured data.

He is currently a PhD student in the MAP-i Doctoral Programme in Computer Science. His main research interests are performance and scalability analysis.


Details

  • Name

    Francisco Teixeira Neves
  • Cluster

    Computer Science
  • Role

    Research Assistant
  • Since

    01 March 2014
Publications

2018

Falcon: A Practical Log-Based Analysis Tool for Distributed Systems

Authors
Neves, F; Machado, N; Pereira, J;

Publication
48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2018, Luxembourg City, Luxembourg, June 25-28, 2018

Abstract
Programmers and support engineers typically rely on log data to narrow down the root cause of unexpected behaviors in dependable distributed systems. Unfortunately, the inherently distributed nature and complexity of such distributed executions often lead to multiple independent logs, scattered across different physical machines, with thousands or millions of entries poorly correlated in terms of event causality. This renders log-based debugging a tedious, time-consuming, and potentially inconclusive task. We present Falcon, a tool aimed at making log-based analysis of distributed systems practical and effective. Falcon's modular architecture, designed as an extensible pipeline, allows it to seamlessly combine several distinct logging sources and generate a coherent space-time diagram of distributed executions. To preserve event causality, even in the presence of logs collected from independent unsynchronized machines, Falcon introduces a novel happens-before symbolic formulation and relies on an off-the-shelf constraint solver to obtain a coherent event schedule. Our case study with the popular distributed coordination service Apache Zookeeper shows that Falcon eases the log-based analysis of complex distributed protocols and is helpful in bridging the gap between protocol design and implementation. © 2018 IEEE.
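
As a rough illustration of the ordering problem this abstract describes, the sketch below merges per-node logs into one schedule that respects happens-before. It is not Falcon's symbolic formulation and does not use a constraint solver: a plain topological sort from Python's standard `graphlib` module stands in for both. The function name `causal_schedule`, the event identifiers, and the input shapes are all hypothetical.

```python
from graphlib import TopologicalSorter


def causal_schedule(logs, messages):
    """Return one event order consistent with happens-before.

    logs: dict mapping a node name to its events in local log order
          (hypothetical input shape, chosen for this sketch).
    messages: list of (send_event, receive_event) pairs.
    Raises graphlib.CycleError if the constraints contradict each other.
    """
    sorter = TopologicalSorter()
    for events in logs.values():
        for event in events:
            sorter.add(event)  # register events that have no constraints
        # Program order: each event follows the previous one on its node.
        for earlier, later in zip(events, events[1:]):
            sorter.add(later, earlier)
    # Message order: a receive follows its matching send.
    for send, receive in messages:
        sorter.add(receive, send)
    return list(sorter.static_order())


if __name__ == "__main__":
    logs = {"A": ["a1", "a2"], "B": ["b1", "b2"]}
    messages = [("a2", "b1")]  # node A sends at a2, node B receives at b1
    print(causal_schedule(logs, messages))
```

Because no timestamps are compared, the result does not depend on the machines having synchronized clocks; only log order and send/receive pairs constrain the schedule.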

2017

DDFlasks: Deduplicated Very Large Scale Data Store

Authors
Maia, F; Paulo, J; Coelho, F; Neves, F; Pereira, J; Oliveira, R;

Publication
Distributed Applications and Interoperable Systems - 17th IFIP WG 6.1 International Conference, DAIS 2017, Held as Part of the 12th International Federated Conference on Distributed Computing Techniques, DisCoTec 2017, Neuchâtel, Switzerland, June 19-22, 2017, Proceedings

Abstract
With the increasing number of connected devices, it becomes essential to find novel data management solutions that can leverage their computational and storage capabilities. However, developing very large scale data management systems requires tackling a number of interesting distributed systems challenges, namely continuous failures and high levels of node churn. In this context, epidemic-based protocols proved suitable and effective and have been successfully used to build DataFlasks, an epidemic data store for massive scale systems. Ensuring resiliency in this data store comes with a significant cost in storage resources and network bandwidth consumption. Deduplication has proven to be an efficient technique to reduce both costs, but applying it to a large-scale distributed storage system is not a trivial task. In fact, achieving significant space-savings without compromising the resiliency and decentralized design of these storage systems is a relevant research challenge. In this paper, we extend DataFlasks with deduplication to design DDFlasks. This system is evaluated in a real world scenario using Wikipedia snapshots, and the results are twofold. We show that deduplication is able to decrease storage consumption by up to 63% and decrease network bandwidth consumption by up to 20%, while maintaining a fully decentralized and resilient design. © IFIP International Federation for Information Processing 2017.
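
To make the general idea of deduplication concrete, here is a minimal single-process sketch of a content-addressed chunk store. It is not the DDFlasks design, which is distributed and epidemic-based. The class name `DedupStore`, the fixed-size chunking, and the choice of SHA-256 are assumptions made only for this illustration.

```python
import hashlib


class DedupStore:
    """Toy store that keeps each distinct chunk once (hypothetical design)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # chunk digest -> chunk bytes, stored once
        self.objects = {}  # object key -> ordered list of chunk digests

    def put(self, key, data):
        digests = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk already present under this digest is not stored again.
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.objects[key] = digests

    def get(self, key):
        # Rebuild the object by concatenating its chunks in order.
        return b"".join(self.chunks[d] for d in self.objects[key])

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())


if __name__ == "__main__":
    store = DedupStore(chunk_size=4)
    store.put("v1", b"aaaabbbbcccc")
    store.put("v2", b"aaaabbbbdddd")  # shares two chunks with v1
    print(store.stored_bytes(), "bytes stored for 24 bytes written")
```

Successive snapshots of the same content, such as the Wikipedia snapshots used in the evaluation, tend to share many chunks, which is where this kind of scheme saves space.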

2017

Prepared scan: efficient retrieval of structured data from HBase

Authors
Neves, F; Vilaça, R; Pereira, JO; Oliveira, R;

Publication
Proceedings of the Symposium on Applied Computing, SAC 2017, Marrakech, Morocco, April 3-7, 2017

Abstract
The ability of NoSQL systems to scale better than traditional relational databases motivates a large set of applications to migrate their data to NoSQL systems, even without aiming to exploit the provided schema flexibility. However, accessing structured data is costly due to such flexibility, incurring high bandwidth and processing-unit usage. In this paper, we analyse this cost in Apache HBase and propose a new scan operation, named Prepared Scan, that optimizes access to regularly structured data by taking advantage of a schema well known to the application. Using an industry standard benchmark, we show that Prepared Scan improves throughput by up to 29% and decreases network bandwidth consumption by up to 20%. © 2017 ACM.