2019
Authors
Ferreira, L; Coelho, F; Alonso, AN; Pereira, J;
Publication
CLOSER: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND SERVICES SCIENCE
Abstract
In the context of the CloudDBAppliance (CDBA) project, fault tolerance and high availability are provided in layers: within each appliance, within a data centre and between data centres. This paper presents the proposed replication architecture for providing fault tolerance and high availability within a data centre. This layer configuration, along with specific deployment constraints, requires a custom replication architecture. In particular, replication must be implemented at the middleware level, to avoid constraining the backing operational database. This paper focuses on the design of the CDBA Replication Manager, along with an evaluation, using micro-benchmarking, of components for the replication middleware. Results show the impact, on both throughput and latency, of the replication mechanisms in place.
2019
Authors
Abreu, H; Ferreira, L; Coelho, F; Alonso, AN; Pereira, J;
Publication
PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE, TECHNOLOGY AND APPLICATIONS (DATA)
Abstract
In the context of the CloudDBAppliance (CDBA) project, fault tolerance and high availability are provided in layers: within each appliance, within a data centre and between data centres. This paper presents the recovery mechanisms in place to provide high availability within a data centre. The recovery mechanism takes advantage of CDBA's in-middleware replication mechanism to bring failed replicas up to date. Along with the description of different variants of the recovery mechanism, this paper provides their comparative evaluation, focusing on the time it takes to recover a failed replica and how the recovery process impacts throughput.
2020
Authors
Ferreira, L; Coelho, F; Pereira, J;
Publication
Distributed Applications and Interoperable Systems - 20th IFIP WG 6.1 International Conference, DAIS 2020, Held as Part of the 15th International Federated Conference on Distributed Computing Techniques, DisCoTec 2020, Valletta, Malta, June 15-19, 2020, Proceedings
Abstract
Fault tolerance is a core feature in distributed database systems, particularly the ones deployed in cloud environments. The dependability of these systems often relies on middleware components that abstract the DBMS logic from the replication itself. The highly configurable nature of these systems makes their throughput very dependent on the correct tuning for a given workload. Given the high complexity involved, machine learning techniques are often considered to guide the tuning process and decompose the relations established between tuning variables. This paper presents a machine learning mechanism based on reinforcement learning that attaches to a hybrid replication middleware connected to a DBMS to dynamically live-tune the configuration of the middleware according to the workload being processed. Along with the vision for the system, we present a study conducted over a prototype of the self-tuned replication middleware, showcasing the performance improvements achieved, including an improvement of 370.99% on some of the considered metrics. © IFIP International Federation for Information Processing 2020.