Publications - INESC TEC

Publications

Publications by HASLab

2016

Welcome to EICS 2016

Authors
Luyten, K; Palanque, P; Campos, JC; Schmidt, A; Signer, B; Roussel, N

Publication
EICS 2016 - 8th ACM SIGCHI Symposium on Engineering Interactive Computing Systems


2016

"Keep definition, change category" - A practical approach to state-based system calculi

Authors
Oliveira, JN; Miraldo, VC

Publication
JOURNAL OF LOGICAL AND ALGEBRAIC METHODS IN PROGRAMMING

Abstract
Faced with the need to quantify software (un)reliability in the presence of faults, the semantics of state-based systems is urged to evolve towards quantified (e.g. probabilistic) nondeterminism. When one is approaching such semantics from a categorical perspective, this inevitably calls for some technical elaboration, in a monadic setting. This paper proposes that such an evolution be undertaken without sacrificing the simplicity of the original (qualitative) definitions, by keeping quantification implicit rather than explicit. The approach is a monad lifting strategy whereby, under some conditions, definitions can be preserved provided the semantics moves to another category. The technique is illustrated by showing how to introduce probabilism in an existing software component calculus, by moving to a suitable category of matrices and using linear algebra in the reasoning. The paper also addresses the problem of preserving monadic strength in the move from original to target (Kleisli) categories, a topic which bears relationship to recent studies in categorial physics.
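
The idea of moving to a category of matrices can be illustrated with a small sketch. It is not taken from the paper: the `faulty_not` component, its fault probability, and the helper names are assumptions made for illustration only. A probabilistic function is represented as a column-stochastic matrix (column = input, row = output), and sequential composition becomes ordinary matrix multiplication.

```python
from fractions import Fraction


def compose(m, n):
    """Matrix product m . n, i.e. run n first and then m."""
    return [[sum(m[i][k] * n[k][j] for k in range(len(n)))
             for j in range(len(n[0]))]
            for i in range(len(m))]


def is_column_stochastic(m):
    """Each column must be a probability distribution over outputs."""
    return all(sum(row[j] for row in m) == 1 for j in range(len(m[0])))


# Hypothetical faulty negation: with probability p the input is passed
# through unchanged instead of being negated.
p = Fraction(1, 10)
faulty_not = [[p, 1 - p],   # row 0: probability of output 0 for inputs 0 and 1
              [1 - p, p]]   # row 1: probability of output 1 for inputs 0 and 1

# Composing the component with itself is just a matrix product.
twice = compose(faulty_not, faulty_not)
```

Under these assumptions, `twice[0][0]` is the probability that double negation returns 0 on input 0. The definition of composition stays the same shape as in the deterministic case; only the underlying arithmetic changes.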


2016

Design of an RDMA Communication Middleware for Asynchronous Shuffling in Analytical Processing

Authors
Goncalves, RC; Pereira, J; Jimenez Peris, R

Publication
PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND SERVICES SCIENCE, VOL 1 (CLOSER)

Abstract
A key component in a distributed parallel analytical processing engine is shuffling, the distribution of data to multiple nodes such that the computation can be done in parallel. In this paper we describe the initial design of a communication middleware to support asynchronous shuffling of data among multiple processes in a distributed memory environment. The proposed middleware relies on RDMA (Remote Direct Memory Access) operations to transfer data, and provides basic operations to send and queue data on remote machines, and to retrieve this queued data. Preliminary results show that the RDMA-based middleware can provide a 75% reduction in communication costs when compared with a traditional sockets implementation.
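
The send, queue, and retrieve operations mentioned in the abstract can be pictured with a minimal in-memory sketch. It does not use RDMA and is not the authors' middleware: the class `InMemoryShuffle`, its method names, and the CRC32-based partitioning are all assumptions chosen for illustration. Per-node queues stand in for remote memory regions.

```python
import zlib
from collections import deque


class InMemoryShuffle:
    """Toy stand-in for a shuffle layer: one queue per destination node."""

    def __init__(self, num_nodes):
        self.queues = [deque() for _ in range(num_nodes)]

    def partition(self, key):
        # Deterministic hash partitioning: equal keys map to the same node.
        return zlib.crc32(key.encode("utf-8")) % len(self.queues)

    def send(self, key, value):
        # The sender enqueues on the destination and continues immediately;
        # it does not wait for the receiver to consume the record.
        self.queues[self.partition(key)].append((key, value))

    def retrieve(self, node):
        # The receiver drains whatever has been queued for it so far.
        queue = self.queues[node]
        items = list(queue)
        queue.clear()
        return items


shuffle = InMemoryShuffle(num_nodes=3)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    shuffle.send(key, value)
received = [shuffle.retrieve(node) for node in range(3)]
```

In this sketch every record with the same key ends up in the same entry of `received`, which is the property a downstream parallel aggregation or join relies on.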


2016

Efficient Deduplication in a Distributed Primary Storage Infrastructure

Authors
Paulo, J; Pereira, J

Publication
ACM TRANSACTIONS ON STORAGE

Abstract
A large amount of duplicate data typically exists across volumes of virtual machines in cloud computing infrastructures. Deduplication allows reclaiming these duplicates while improving the cost-effectiveness of large-scale multitenant infrastructures. However, traditional archival and backup deduplication systems impose prohibitive storage overhead for virtual machines hosting latency-sensitive applications. Primary deduplication systems reduce this penalty but rely on special cluster filesystems, centralized components, or restrictive workload assumptions. Also, some of these systems reduce storage overhead by confining deduplication to off-peak periods, which may be scarce in a cloud environment. We present DEDIS, a dependable and fully decentralized system that performs cluster-wide off-line deduplication of virtual machines' primary volumes. DEDIS works on top of any unsophisticated storage backend, centralized or distributed, as long as it exports a basic shared block device interface. Also, DEDIS does not rely on data locality assumptions and incorporates novel optimizations for reducing deduplication overhead and increasing its reliability. The evaluation of an open-source prototype shows that minimal I/O overhead is achievable even when deduplication and intensive storage I/O are executed simultaneously. Also, our design scales out and allows collocating DEDIS components and virtual machines on the same servers, thus sparing the need for additional hardware.
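
The general idea of off-line block deduplication (blocks are written first, and duplicates are found and merged later by a background pass) can be sketched as follows. This is a generic illustration, not the DEDIS design: the function `deduplicate`, the dictionary-based block store, and the use of SHA-256 as the content fingerprint are assumptions.

```python
import hashlib


def deduplicate(logical_to_physical, physical_blocks):
    """Remap logical blocks with identical content to one physical copy.

    logical_to_physical: dict mapping a logical block id to a physical address.
    physical_blocks: dict mapping a physical address to the block's bytes.
    Returns the list of physical addresses that were freed.
    """
    index = {}   # content digest -> physical address of the copy that is kept
    freed = []
    for logical, phys in sorted(logical_to_physical.items()):
        digest = hashlib.sha256(physical_blocks[phys]).hexdigest()
        kept = index.setdefault(digest, phys)
        if kept != phys:
            # Same content already stored elsewhere: share that copy.
            logical_to_physical[logical] = kept
            if phys not in freed:
                freed.append(phys)
    for phys in freed:
        del physical_blocks[phys]
    return freed


# Two hypothetical VM volumes that share one identical 4 KiB block.
blocks = {0: b"A" * 4096, 1: b"B" * 4096, 2: b"A" * 4096}
mapping = {"vm1:0": 0, "vm1:1": 1, "vm2:0": 2}
freed = deduplicate(mapping, blocks)
```

A real system additionally has to handle writes to shared blocks (copy-on-write), reference counting, and concurrent I/O during the pass; none of that is modeled here.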


2016

An RDMA Middleware for Asynchronous Multi-stage Shuffling in Analytical Processing

Authors
Goncalves, RC; Pereira, J; Jimenez Peris, R

Publication
DISTRIBUTED APPLICATIONS AND INTEROPERABLE SYSTEMS, DAIS 2016

Abstract
A key component in large-scale distributed analytical processing is shuffling, the distribution of data to multiple nodes such that the computation can be done in parallel. In this paper we describe the design and implementation of a communication middleware to support data shuffling for executing multi-stage analytical processing operations in parallel. The middleware relies on RDMA (Remote Direct Memory Access) to provide basic operations to asynchronously exchange data among multiple machines. Experimental results show that the RDMA-based middleware can provide a 75% reduction in the cost of communication operations in parallel analytical processing tasks when compared with a sockets middleware.


2016

The CloudMdsQL Multistore System

Authors
Kolev, B; Bondiombouy, C; Valduriez, P; Peris, RJ; Pau, R; Pereira, J

Publication
Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016

Abstract
The blooming of different cloud data management infrastructures has turned multistore systems into a major topic in today's cloud landscape. In this demonstration, we present a Cloud Multidatastore Query Language (CloudMdsQL) and its query engine. CloudMdsQL is a functional SQL-like language, capable of querying multiple heterogeneous data stores (relational and NoSQL) within a single query that may contain embedded invocations to each data store's native query interface. The major innovation is that a CloudMdsQL query can exploit the full power of local data stores by simply allowing some local data store native queries (e.g. a breadth-first search query against a graph database) to be called as functions, and at the same time be optimized. Within our demonstration, we focus on two use cases, each involving four diverse data stores (graph, document, relational, and key-value) with their corresponding CloudMdsQL queries. The query execution flows are visualized by an embedded real-time monitoring subsystem. Users can also try out different ad hoc queries, not necessarily in the context of the use cases.
