
Publications by António Luís Sousa

2013

Adaptive query processing in cloud database systems

Authors
Costa, CM; Sousa, AL;

Publication
Proceedings - 2013 IEEE 3rd International Conference on Cloud and Green Computing, CGC 2013 and 2013 IEEE 3rd International Conference on Social Computing and Its Applications, SCA 2013

Abstract
In cloud environments, resources should be acquired and released automatically and quickly at runtime. Consequently, traditional query optimization strategies can perform poorly on cloud platforms, because they cannot predict the future availability and/or release of resources. In such scenarios, adaptive query processing can adjust itself to the available resources to run queries and, consequently, deliver an acceptable query response time. However, the main objective of both traditional and adaptive query optimizers is to reduce response time, whereas in the context of cloud computing, users and service providers expect to get answers in time to guarantee the SLA. Therefore, we propose a framework that uses adaptive query processing based on heuristic rules and on the cost of failing the SLA. It will be implemented on structured data, considering that some cloud computing platforms support SQL queries directly or indirectly, which makes this problem relevant. © 2013 IEEE.
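The SLA-cost idea in this abstract can be sketched roughly as follows. The plan names, time estimates, and the linear penalty rule are illustrative assumptions, not the paper's actual heuristics:

```python
# Hypothetical sketch: choose a query plan not only by estimated response
# time but by the expected cost of violating the SLA. All names and
# numbers are illustrative, not taken from the paper.

def sla_penalty(estimated_time, sla_limit, penalty_per_sec):
    """Cost charged when a plan's estimated time exceeds the SLA limit."""
    return max(0.0, estimated_time - sla_limit) * penalty_per_sec

def choose_plan(plans, sla_limit, penalty_per_sec):
    """Heuristic rule: minimise estimated time plus SLA-violation cost."""
    return min(
        plans,
        key=lambda p: p["est_time"]
        + sla_penalty(p["est_time"], sla_limit, penalty_per_sec),
    )

plans = [
    {"name": "hash-join", "est_time": 8.0},    # fast, but resource-hungry
    {"name": "nested-loop", "est_time": 12.0},  # slow, low resource usage
]
best = choose_plan(plans, sla_limit=10.0, penalty_per_sec=5.0)
print(best["name"])  # hash-join: within the SLA limit, so no penalty
```

Under this rule a cheap-but-slow plan loses as soon as its SLA-violation cost outweighs its resource savings, which is the trade-off the framework targets.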

2013

Towards an accurate evaluation of deduplicated storage systems

Authors
Paulo, J; Reis, P; Pereira, J; Sousa, A;

Publication
COMPUTER SYSTEMS SCIENCE AND ENGINEERING

Abstract
Deduplication has proven to be a valuable technique for eliminating duplicate data in backup and archival systems and is now being applied to new storage environments with distinct requirements and performance trade-offs. Namely, deduplication systems are now targeting large-scale cloud computing storage infrastructures holding unprecedented data volumes with a significant share of duplicate content. It is, however, hard to assess the usefulness of deduplication in particular settings and which techniques provide the best results. In fact, existing disk I/O benchmarks follow simplistic approaches for generating data content, leading to unrealistic amounts of duplicates that do not evaluate deduplication systems accurately. Moreover, deduplication systems are now targeting heterogeneous storage environments, with specific duplication ratios, that benchmarks must also simulate. We address these issues with DEDISbench, a novel micro-benchmark for evaluating the disk I/O performance of block-based deduplication systems. As the main contribution, DEDISbench generates content by following realistic duplicate-content distributions extracted from real datasets. Then, as a second contribution, we analyze and extract the duplicates found on three real storage systems, showing that DEDISbench can easily simulate several workloads. The usefulness of DEDISbench is shown by comparing it with the Bonnie++ and IOzone open-source disk I/O micro-benchmarks in assessing two open-source deduplication systems, Opendedup and Lessfs, using Ext4 as a baseline. Our results lead to novel insights into the performance of these file systems.
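The key point, that benchmark data must follow a realistic duplicate distribution rather than being uniformly unique or uniformly repeated, can be illustrated with a minimal sketch. This is not DEDISbench itself; the skewed "hot blocks" model and all parameters are assumptions for illustration:

```python
import hashlib
import random

# Minimal sketch (not DEDISbench) of generating block payloads that follow
# a given duplicate distribution: a small pool of "hot" blocks supplies the
# duplicates, while the remaining writes carry unique content.

def make_workload(n_blocks, hot_blocks, dup_ratio, block_size=4096, seed=42):
    """Yield n_blocks payloads; roughly dup_ratio of them repeat hot content."""
    rng = random.Random(seed)
    hot = [rng.randbytes(block_size) for _ in range(hot_blocks)]
    for _ in range(n_blocks):
        if rng.random() < dup_ratio:
            yield rng.choice(hot)            # duplicate content
        else:
            yield rng.randbytes(block_size)  # unique content

blocks = list(make_workload(10_000, hot_blocks=100, dup_ratio=0.3))
unique = len({hashlib.sha1(b).hexdigest() for b in blocks})
print(f"{len(blocks)} writes, {unique} unique blocks")
```

A deduplication system fed this stream sees a controllable share of duplicate blocks, which is what a benchmark needs in order to vary the duplication ratio per target environment.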

2015

Service Response Time Measurement Model of Service Level Agreements in Cloud Environment

Authors
Costa, CM; Maia Leite, CRM; Sousa, AL;

Publication
2015 IEEE INTERNATIONAL CONFERENCE ON SMART CITY/SOCIALCOM/SUSTAINCOM (SMARTCITY)

Abstract
In cloud environments, resources should be acquired and released automatically and quickly at runtime. Therefore, ensuring the desired QoS is a great challenge for the cloud service provider, and the challenge grows when large amounts of data must be manipulated in this environment. Accordingly, performance is an important requirement for most customers when they migrate their applications to the cloud. In this paper, we propose a model for measuring the estimated Service Response Time for different request types on large databases available in a cloud environment. This work allows the cloud service provider and its customers to establish an appropriate SLA with respect to the expected performance of the services available in the cloud. Finally, the model was evaluated on the Amazon EC2 cloud infrastructure, and a TPC-DS-like benchmark was used to generate a database of structured data, considering that some cloud computing platforms support SQL queries directly or indirectly. This makes the proposed solution relevant for this kind of problem.
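A per-request-type response-time estimate of the sort described above might be sketched as follows. The mean-plus-spread aggregation rule, the request-type names, and the sample values are assumptions for illustration, not the paper's model:

```python
from statistics import mean, pstdev

# Illustrative sketch: derive a per-request-type Service Response Time (SRT)
# estimate from measured samples, so that provider and customer can agree on
# an SLA value. The safety margin of two standard deviations is an assumption.

def estimate_srt(samples, safety_stdevs=2.0):
    """SRT estimate = mean + k * stdev of measured response times (seconds)."""
    return mean(samples) + safety_stdevs * pstdev(samples)

measured = {
    "point-lookup": [0.02, 0.03, 0.02, 0.04],
    "analytic-scan": [4.1, 3.8, 5.2, 4.5],
}
sla = {req: round(estimate_srt(ts), 3) for req, ts in measured.items()}
print(sla)
```

The point is simply that different request types get different SRT targets, rather than one SLA value covering lookups and analytic scans alike.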

2016

Efficient SQL Adaptive Query Processing in Cloud Databases Systems

Authors
Costa, CM; Maia Leite, CRM; Sousa, AL;

Publication
PROCEEDINGS OF THE 2016 IEEE CONFERENCE ON EVOLVING AND ADAPTIVE INTELLIGENT SYSTEMS (EAIS)

Abstract
Nowadays, many companies have migrated their applications and data to the cloud. Among the benefits of this technology, the ability to respond quickly to business requirements has been one of the main motivations. In cloud environments, resources should therefore be acquired and released automatically and quickly at runtime. To ensure QoS, the major cloud providers emphasize availability, CPU instance type, and cost measures in their SLAs (Service Level Agreements). Performance-related QoS, however, is either not handled or inappropriately treated in SLAs, although from the user's point of view it is one of the main QoS parameters. Therefore, the aim of this work is the development of a solution for efficient query processing on large databases available in cloud environments. It integrates adaptive re-optimization at query runtime, with costs based on the SRT (Service Response Time) QoS performance parameter of the SLA. Finally, the solution was evaluated on the Amazon EC2 cloud infrastructure, and a TPC-DS-like benchmark was used to generate a database.

2018

Assessment of an IoT platform for data collection and analysis for medical sensors

Authors
Rei, J; Brito, C; Sousa, A;

Publication
Proceedings - 4th IEEE International Conference on Collaboration and Internet Computing, CIC 2018

Abstract
Health facilities produce an increasing and vast amount of data that must be efficiently analyzed. New approaches for healthcare monitoring are being developed every day, and the Internet of Things (IoT) came to fill the still-existing void in real-time monitoring. A new generation of mechanisms and techniques is being used to facilitate the practice of medicine, promoting faster diagnosis and prevention of diseases. We propose a system that relies on IoT for storing and monitoring medical sensor data, with analytic capabilities. To this end, we chose two approaches for storing this data, which were thoroughly evaluated. Apache HBase presents a higher rate of data ingestion than Apache Cassandra when integrated with the Kaa IoT platform, exhibiting good performance when storing unstructured data such as that found in a healthcare environment. The outcome of this system has shown that a large number of medical sensors can be simultaneously connected to the same platform (6000 records sent per second, or 48 ECG sensors at a frequency of 125 Hz). The results presented in this paper are promising and should be further investigated, as a comprehensive system would benefit not only the patient's diagnosis but also the physicians. © 2018 IEEE.
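The throughput figure quoted in the abstract follows directly from the sensor count and sampling rate, as this back-of-the-envelope check shows:

```python
# Sanity check of the figure above: 48 ECG sensors sampling at 125 Hz
# produce 48 * 125 = 6000 records per second, matching the platform's
# measured ingestion rate.

sensors, sample_rate_hz = 48, 125
records_per_second = sensors * sample_rate_hz
print(records_per_second)  # 6000

# Conversely, a given ingestion budget bounds the number of sensors:
def max_sensors(ingest_rate, sample_rate_hz):
    return ingest_rate // sample_rate_hz

print(max_sensors(6000, 125))  # 48
```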

2019

Electrocardiogram beat-classification based on a ResNet network

Authors
Brito, C; Machado, A; Sousa, A;

Publication
Studies in Health Technology and Informatics

Abstract
When dealing with electrocardiography (ECG), the main focus is the classification of the heart's electrical activity, and deep learning has been proving its value over the years in classifying heartbeats, exhibiting great performance in doing so. Following these assumptions, we propose a deep learning model based on a ResNet architecture with 1D convolutional layers to classify each beat into one of 4 classes: normal, atrial premature contraction, premature ventricular contraction, and others. Experimental results with the MIT-BIH Arrhythmia Database confirmed that the model performs well, obtaining an accuracy of 96% when using stochastic gradient descent (SGD) and 83% when using adaptive moment estimation (Adam); SGD also obtained F1-scores over 90% for the four proposed classes. A larger dataset was created and tested as unforeseen data for the trained model, showing that further testing is needed to improve its accuracy. © 2019 International Medical Informatics Association (IMIA) and IOS Press. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).