2021
Autores
Pereira, A; Proenca, A;
Publicação
COMPUTER PHYSICS COMMUNICATIONS
Abstract
Software to analyse very large sets of experimental data often relies on a pipeline of irregular computational tasks with decisions to remove irrelevant data from further processing. A user-centred framework was designed and deployed, HEP-Frame, which aids domain experts to develop applications for scientific data analyses and to monitor and control their efficient execution. The key feature of HEP-Frame is the performance portability of the code across different heterogeneous platforms, due to a novel adaptive multi-layer scheduler, seamlessly integrated into the tool, an approach not available in competing frameworks. The multi-layer scheduler transparently allocates parallel data/tasks across the available heteroge-neous resources, dynamically balances threads among data input and computational tasks, adaptively reorders in run-time the parallel execution of the pipeline stages for each data stream, respecting data dependencies, and efficiently manages the execution of library functions in accelerators. Each layer implements a specific scheduling strategy: one balances the execution of the computational stages of the pipeline, distributing the execution of the stages of the same or different dataset elements among the available computing threads; another controls the order of the pipeline stages execution, so that most data is filtered out earlier and later stages execute the computationally heavy tasks; yet another adaptively balances the automatically created threads among data input and the computational tasks, taking into account the requirements of each application. Simulated data analyses from sensors in the ATLAS Experiment at CERN evaluated the scheduler efficiency, on dual multicore Xeon servers with and without accelerators, and on servers with the many-core Intel KNL. Experimental results show significant improved performance of these data analyses due to HEP-Frame features and the codes scaled well on multiple servers. Results also show the improved HEP-Frame scheduler performance over the key competitor, the HEFT list scheduler. The best overall performance improvement over a real fine tuned sequential data analysis was impressive in both homogeneous and heterogeneous multicore servers and in many-core servers: 81x faster in the homogeneous 24+24 core Skylake server, 86x faster in the heterogeneous 12+12 core Ivy Bridge server with the Kepler GPU, and 252x faster in the 64-core KNL server. Program summary Program Title: HEP-Frame CPC Library link to program files: https://doi.org/10.17632/m2jwxshtfz.1 Licencing provisions: GPLv3 Programming language: C++. Supplementary material: The current HEP-Frame public release available at https://bitbucket.org/ ampereira/hep-frame/wiki/Home . Nature of problem: Scientific data analysis applications are often developed to process large amounts of data obtained through experimental measurements or Monte Carlo simulations, aiming to identify patterns in the data or to test and/or validate theories. These large inputs are usually processed by a pipeline of computational tasks that may filter out irrelevant data (a task and its filter is addressed as a proposition in this communication), preventing it from being processed by subsequent tasks in the pipeline. This data filtering, coupled with the fact that propositions may have different computational intensities, contribute to the irregularity of the pipeline execution. This can lead to scientific data analyses I/O-, memory-, or compute-bound performance limitations, depending on the implemented algorithms and input data. To allow scientists to process more data with more accurate results their code and data structures should be optimized for the computing resources they can access. Since the main goal of most scientists is to obtain results relevant to their scientific fields, often within strict deadlines, optimizing the performance of their applications is very time consuming and is usually overlooked. Scientists require a software framework to aid the design and development of efficient applications and to control their parallel execution on distinct computing platforms. Solution method: This work proposes HEP-Frame, a framework to aid the development and efficient execution of pipelined scientific analysis applications on homogeneous and heterogeneous servers. HEP-Frame is a user-centred framework to aid scientists to develop applications to analyse data from a large number of dataset elements, with a flexible pipeline of propositions. It not only stresses the interface to domain experts so that code is more robust and is developed faster, but it also aims high-performance portability across different types of parallel computing platforms and desirable sustainability features. This framework aims to provide efficient parallel code execution without requiring user expertise in parallel computing. Frameworks to aid the design and deployment of scientific code usually fall into two categories: (i) resource-centred, closer to the computing platforms, where execution efficiency and performance portability are the main goals, but forces developers to adapt their code to strict framework con-straints; (ii) user-centred, which stresses the interface to domain experts to improve their code development speed and robustness, aiming to provide desirable sustainability features but disregarding the execution performance. There are also a set of frameworks that merge these two categories (Liu et al., 2015 [1]; Deelman et al., 2015 [2]) for scientific computing. While they do not have steep learning curves, concessions have to be made to their ease of use to allow for their broader scope of targeted applications. HEP-Frame attempts to merge this gap, placing itself between a fully user-or resource-centred framework, so that users develop code quickly and do not have to worry about the computational efficiency of the code It handles (i) by ensuring efficient execution of applications according to their computational requirements and the available resources on the server through a multi-layer scheduler, while (ii) is addressed by automatically generating code skeletons and transparently managing the data structure and automating repetitive tasks. Additional comments: An early stage proof-of-concept was published in a conference proceedings (Pereira et al., 2015). However, the HEP-Frame version presented in this communication only shares a very small portion of the code related to the skeleton generation (less than 5% of the overall code), while the rest of the user interface, multi-layer scheduler, and parallelization strategies were completely redesigned and re-implemented.
2020
Autores
Silva, F; Alonso, AN; Pereira, J; Oliveira, R;
Publicação
DAIS
Abstract
The performance and scalability of byzantine fault-tolerant (BFT) protocols for state machine replication (SMR) have recently come under scrutiny due to their application in the consensus mechanism of blockchain implementations. This led to a proliferation of proposals that provide different trade-offs that are not easily compared as, even if these are all based on message passing, multiple design and implementation factors besides the message exchange pattern differ between each of them. In this paper we focus on the impact of different combinations of cryptographic primitives and the message exchange pattern used to collect and disseminate votes, a key aspect for performance and scalability. By measuring this aspect in isolation and in a common framework, we characterise the design space and point out research directions for adaptive protocols that provide the best trade-off for each environment and workload combination.
2020
Autores
Carvalho, H; Cruz, D; Pontes, R; Paulo, J; Oliveira, R;
Publicação
DAIS
Abstract
Cloud Computing services for data analytics are increasingly being sought by companies to extract value from large quantities of information. However, processing data from individuals and companies in third-party infrastructures raises several privacy concerns. To this end, different secure analytics techniques and systems have recently emerged. These initial proposals leverage specific cryptographic primitives lacking generality and thus having their application restricted to particular application scenarios. In this work, we contribute to this thriving body of knowledge by combining two complementary approaches to process sensitive data. We present SafeSpark, a secure data analytics framework that enables the combination of different cryptographic processing techniques with hardware-based protected environments for privacy-preserving data storage and processing. SafeSpark is modular and extensible therefore adapting to data analytics applications with different performance, security and functionality requirements. We have implemented a SafeSpark’s prototype based on Spark SQL and Intel SGX hardware. It has been evaluated with the TPC-DS Benchmark under three scenarios using different cryptographic primitives and secure hardware configurations. These scenarios provide a particular set of security guarantees and yield distinct performance impact, with overheads ranging from as low as 10% to an acceptable 300% when compared to an insecure vanilla deployment of Apache Spark.
2020
Autores
Pereira, JC; Machado, N; Pinto, JS;
Publicação
TAP@STAF
Abstract
Data races, a condition where two memory accesses to the same memory location occur concurrently, have been shown to be a major source of concurrency bugs in distributed systems. Unfortunately, data races are often triggered by non-deterministic event orderings that are hard to detect when testing complex distributed systems. In this paper, we propose Spider, an automated tool for identifying data races in distributed system traces. Spider encodes the causal relations between the events in the trace as a symbolic constraint model, which is then fed into an SMT solver to check for the presence of conflicting concurrent accesses. To reduce the constraint solving time, Spider employs a pruning technique aimed at removing redundant portions of the trace. Our experiments with multiple benchmarks show that Spider is effective in detecting data races in distributed executions in a practical amount of time, providing evidence of its usefulness as a testing tool.
2020
Autores
de Matos, A; Leucker, M; Pereira, D; Pinto, JS;
Publicação
2020 INTERNATIONAL SYMPOSIUM ON THEORETICAL ASPECTS OF SOFTWARE ENGINEERING (TASE 2020)
Abstract
This paper introduces a synthesis procedure for the satisfiability problem of RMTL-integral formulas as SAT solving modulo theories. RMTL-integral is a real-time version of metric temporal logic (MTL) extended by a duration quantifier allowing to measure time durations. For any given formula, a SAT instance modulo the theory of arrays, uninterpreted functions with equality and non-linear real-arithmetic is synthesized and may then be further investigated using appropriate SMT solvers. We show the benefits of using RMTL-integral with the given SMT encoding on a diversified set of examples that include in particular its application in the area of schedulability analysis. Therefore, we introduce a simple language for formalizing schedulability problems and show how to formulate timing constraints as RMTL-integral formulas. Our practical evaluation based on our synthesis and Z3 as back-end SMT solver also shows the feasibility of the overall approach.
2020
Autores
Enes, V; Baquero, C; Rezende, TF; Gotsman, A; Perrin, M; Sutra, P;
Publicação
PROCEEDINGS OF THE FIFTEENTH EUROPEAN CONFERENCE ON COMPUTER SYSTEMS (EUROSYS'20)
Abstract
Online applications now routinely replicate their data at multiple sites around the world. In this paper we present ATLAS, the first state-machine replication protocol tailored for such planet-scale systems. ATLAS does not rely on a distinguished leader, so clients enjoy the same quality of service independently of their geographical locations. Furthermore, clientperceived latency improves as we add sites closer to clients. To achieve this, ATLAS minimizes the size of its quorums using an observation that concurrent data center failures are rare. It also processes a high percentage of accesses in a single round trip, even when these conflict. We experimentally demonstrate that ATLAS consistently outperforms state-of-the-art protocols in planet-scale scenarios. In particular, ATLAS is up to two times faster than Flexible Paxos with identical failure assumptions, and more than doubles the performance of Egalitarian Paxos in the YCSB benchmark.
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.