Publications

Publications by CSE

2020

A Survey and Classification of Software-Defined Storage Systems

Authors
Macedo, R; Paulo, J; Pereira, J; Bessani, A;

Publication
ACM COMPUTING SURVEYS

Abstract
The exponential growth of digital information is imposing increasing scale and efficiency demands on modern storage infrastructures. As infrastructure complexity increases, so does the difficulty in ensuring quality of service, maintainability, and resource fairness, raising unprecedented performance, scalability, and programmability challenges. Software-Defined Storage (SDS) addresses these challenges by cleanly disentangling control and data flows, easing management, and improving control functionality of conventional storage systems. Despite its momentum in the research community, many aspects of the paradigm are still unclear, undefined, and unexplored, leading to misunderstandings that hamper the research and development of novel SDS technologies. In this article, we present an in-depth study of SDS systems, providing a thorough description and categorization of each plane of functionality. Further, we propose a taxonomy and classification of existing SDS solutions according to different criteria. Finally, we provide key insights about the paradigm and discuss potential future research directions for the field.
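
The abstract's core idea is the clean separation of control and data flows. A minimal sketch of that split is below; all class and policy names are illustrative assumptions, not the survey's taxonomy.

```python
# Minimal sketch of an SDS-style control/data plane split.
# Names and the IOPS policy are illustrative, not from the survey.

class ControlPlane:
    """Holds management policies (e.g., per-tenant I/O caps) and exposes
    them to data-plane stages; it never sits on the I/O path itself."""
    def __init__(self):
        self.policies = {}            # tenant -> max I/O operations per second

    def set_policy(self, tenant, iops_limit):
        self.policies[tenant] = iops_limit

class DataPlaneStage:
    """Sits on the I/O path and enforces whatever the control plane last
    installed, keeping enforcement separate from decision-making."""
    def __init__(self, control_plane):
        self.control = control_plane

    def handle_io(self, tenant, request):
        limit = self.control.policies.get(tenant)
        # Real stages would throttle, route, or cache here.
        return f"served {request} for {tenant} under limit {limit}"

cp = ControlPlane()
cp.set_policy("tenant-a", 1000)
stage = DataPlaneStage(cp)
print(stage.handle_io("tenant-a", "read(block 42)"))
```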

2020

InDubio: A Combinator Library to Disambiguate Ambiguous Grammars

Authors
Macedo, JN; Saraiva, J;

Publication
COMPUTATIONAL SCIENCE AND ITS APPLICATIONS, ICCSA 2020, PART IV

Abstract
Inferring an abstract model from source code is one of the main tasks of most software quality analysis methods. Such an abstract model is called an Abstract Syntax Tree, and the inference task is called parsing. A parser is usually generated from a grammar specification of a (programming) language, and it converts source code of that language into said abstract tree representation. Several techniques then traverse this tree to assess the quality of the code (for example, by computing source code metrics) or build new data structures (e.g., flow graphs) to perform further analyses (such as detecting code clones or dead code). Parsing is a well-established technique. Modern languages, however, are inherently ambiguous and can only be fully handled by ambiguous grammars. In this setting, disambiguation rules, which are usually included as part of the grammar specification of the ambiguous language, need to be defined. This approach has a severe limitation: disambiguation rules are not first-class citizens. Parser generators offer a small set of rules that cannot be extended or changed, so grammar writers are unable to manipulate existing rules or define the new, language-specific rules their grammar requires. In this paper we present a tool, named InDubio, that consists of an extensible combinator library of disambiguation filters together with a generalized parser generator for ambiguous grammars. InDubio defines a set of basic disambiguation rules as abstract syntax tree filters that can be combined into more powerful rules. Moreover, the filters are independent of the parser generator and parsing technology and, consequently, can be easily extended and manipulated. This paper presents InDubio in detail and reports our first experimental results.
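
To illustrate the "disambiguation filters as composable combinators" idea, here is a hedged sketch: a filter maps the set of candidate parse trees to a smaller set, and filters compose into stronger ones. The function names and tree encoding are assumptions for illustration, not InDubio's actual (Haskell) API.

```python
# Sketch: disambiguation filters as composable functions over candidate parses.

def prefer(predicate):
    """Keep only trees satisfying `predicate`, unless that removes all of them."""
    def f(trees):
        kept = [t for t in trees if predicate(t)]
        return kept or trees
    return f

def compose(*filters):
    """Apply filters left to right, shrinking the candidate set at each step."""
    def f(trees):
        for flt in filters:
            trees = flt(trees)
        return trees
    return f

# Two candidate parses of "1 - 2 - 3", encoded as (op, left, right) tuples.
left_assoc  = ("-", ("-", 1, 2), 3)
right_assoc = ("-", 1, ("-", 2, 3))

# A user-defined rule: prefer trees whose right child is a leaf (left-associativity).
def is_left_assoc(t):
    return not (isinstance(t, tuple) and isinstance(t[2], tuple))

disambiguate = compose(prefer(is_left_assoc))
print(disambiguate([left_assoc, right_assoc]))   # keeps only the left-assoc tree
```

Because filters are plain values, a grammar writer can define new rules and combine them freely, which is exactly the "first-class citizen" property the abstract argues conventional parser generators lack.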

2020

Cross-Sensor Quality Assurance for Marine Observatories

Authors
Diamant, R; Shachar, I; Makovsky, Y; Ferreira, BM; Cruz, NA;

Publication
REMOTE SENSING

Abstract
Measuring and forecasting changes in coastal and deep-water ecosystems and climates requires sustained long-term measurements from marine observation systems. One of the key considerations in analyzing data from marine observatories is quality assurance (QA). The data acquired by these infrastructures accumulates into gigabytes and terabytes per year, necessitating accurate automatic identification of false samples. A particular challenge in the QA of oceanographic datasets is avoiding the disqualification of data samples that, while appearing to be outliers, actually represent real short-term phenomena of importance. In this paper, we present a novel cross-sensor QA approach that validates the disqualification decision for a data sample from an examined dataset by comparing it to samples from related datasets. This group of related datasets is chosen to reflect the same oceanographic phenomena and thereby enable some prediction of the examined dataset. In our approach, a disqualification is validated only if the detected anomaly is present in the examined dataset but not in its related datasets. Results for a surface water temperature dataset recorded by our Texas A&M-Haifa Eastern Mediterranean Marine Observatory (THEMO) over a period of 7 months show an improved trade-off between accurate and false disqualification rates when compared to two standard benchmark schemes.
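
The validation rule in the abstract reduces to a simple check: disqualify an outlier only if no related dataset shows a concurrent anomaly. Below is a hedged sketch of that rule; the z-score detector and its threshold are illustrative stand-ins, not the paper's actual anomaly detector.

```python
import statistics

def is_outlier(series, i, z=1.5):
    """Crude z-score outlier test; the threshold is illustrative only."""
    mu = statistics.mean(series)
    sd = statistics.stdev(series)
    return sd > 0 and abs(series[i] - mu) / sd > z

def validate_disqualification(examined, related, i):
    """True when sample i should really be disqualified: it is anomalous
    in `examined` but in none of the related datasets."""
    if not is_outlier(examined, i):
        return False
    return not any(is_outlier(r, i) for r in related)

temp  = [20.1, 20.3, 20.2, 27.9, 20.4]   # surface temperature with a spike
winds = [[3.1, 3.0, 3.2, 3.1, 3.0]]      # related dataset, no concurrent spike
print(validate_disqualification(temp, winds, 3))  # True: anomaly is isolated
```

If the spike also appeared in the related series, the check would return False, preserving the sample as a plausible real short-term phenomenon.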

2020

Experimenting with Liveness in Cloud Infrastructure Management

Authors
Lourenco, P; Dias, JP; Aguiar, A; Ferreira, HS; Restivo, A;

Publication
EVALUATION OF NOVEL APPROACHES TO SOFTWARE ENGINEERING

Abstract
Cloud computing has played a significant role in the provisioning of services over the Internet since its inception. However, developers still face several challenges limiting its full potential. The difficulties are mostly due to the large, ever-growing, and ever-changing catalog of services offered by cloud providers. As a consequence, developers must deal with many different cloud services in their systems, each managed almost individually and continually growing in complexity. This heterogeneity may limit the view developers have of their system architectures and make the task of managing these resources more complex. This work explores the use of liveness as a way to shorten the feedback loop between developers and their systems in an interactive and immersive way as they develop and integrate cloud-based systems. The designed approach allows real-time visualization of cloud infrastructures using a visual city metaphor. To assess the viability of this approach, the authors built a proof-of-concept and carried out experiments with developers to evaluate its feasibility.
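
The city metaphor maps live infrastructure state onto visual properties. The sketch below shows one plausible such mapping; the metric-to-dimension choices are assumptions for illustration, not the authors' actual encoding.

```python
# Sketch of a city-metaphor mapping: each cloud resource becomes a
# "building" whose dimensions encode live metrics. Mapping is illustrative.

def to_building(resource):
    return {
        "label":  resource["name"],
        "height": resource["cpu_percent"],   # taller = busier
        "width":  resource["memory_gb"],     # wider  = more memory
        "color":  "red" if resource["errors"] else "green",
    }

resources = [
    {"name": "api-gateway",  "cpu_percent": 72, "memory_gb": 4, "errors": 0},
    {"name": "worker-queue", "cpu_percent": 15, "memory_gb": 8, "errors": 3},
]
for building in map(to_building, resources):
    print(building)
```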

2020

Job Scheduling in Fog Paradigm - A Proposal of Context-aware Task Scheduling Algorithms

Authors
Barros, C; Rocio, V; Sousa, A; Paredes, H;

Publication
2020 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY SYSTEMS AND INNOVATION (ICITSI)

Abstract
To the best of the authors' knowledge, task scheduling in the fog paradigm is highly complex, and there are still few studies on it in the literature. In cloud architectures, by contrast, scheduling is widely studied, and much of that research approaches it from the perspective of service providers. Aiming to bring innovative contributions to these areas, in this paper we propose a solution to the context-aware task-scheduling problem for the fog paradigm. In our proposal, different context parameters are normalized through Min-Max normalization, request priorities are defined by applying the Multiple Linear Regression (MLR) technique, and scheduling is performed using a Multi-Objective Non-Linear Programming Optimization (MONLIP) technique.
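
The first two steps of the pipeline are standard and easy to sketch: Min-Max normalization of the context parameters, then a weighted linear combination standing in for the fitted MLR model. The parameters and weights below are illustrative assumptions; in the paper the weights would come from the regression fit, and the final assignment from the MONLIP optimization.

```python
# Sketch of Min-Max normalization + linear priority scoring for requests.

def min_max(values):
    """Rescale a list of values into [0, 1]; constant columns map to 0."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

# Context parameters per request: (latency_ms, battery_pct, cpu_load).
requests = [(120, 80, 0.6), (40, 20, 0.9), (300, 95, 0.1)]
cols = list(zip(*requests))                    # one column per parameter
norm = list(zip(*(min_max(c) for c in cols)))  # normalized rows

weights = (0.5, 0.2, 0.3)   # stand-in for coefficients fitted via MLR
priority = [sum(w * x for w, x in zip(weights, row)) for row in norm]
order = sorted(range(len(requests)), key=lambda i: -priority[i])
print("scheduling order:", order)
```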

2020

Role of Content Analysis in Improving the Curation of Experimental Data

Authors
Aguiar Castro, JD; Landeira, C; da Silva, JR; Ribeiro, C;

Publication
Int. J. Digit. Curation

Abstract
As researchers increasingly seek tools and specialized support for research data management activities, collaboration with data curators can be fruitful. Yet establishing a timely collaboration between researchers and data curators, grounded in sound communication, is often demanding. In this paper we propose manual content analysis as an approach to streamline the data curator workflow. With content analysis, curators can obtain the domain-specific concepts used to describe experimental configurations in scientific publications, making it easier for researchers to understand the notion of metadata and supporting the development of metadata tools. We present three case studies from experimental domains: one related to sustainable chemistry, one to photovoltaic generation, and another to nanoparticle synthesis. The curator started by performing content analysis on research publications, proceeded to create a metadata template based on the extracted concepts, and then interacted with researchers. The approach was validated by the researchers with a high rate of accepted concepts, 84 per cent; researchers also provided feedback on how to improve some proposed descriptors. Content analysis has the potential to be a practical, proactive task that can be extended to multiple experimental domains and bridge the communication gap between curators and researchers. [This paper is a conference pre-print presented at IDCC 2020 after lightweight peer review.]
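
A metadata template of the kind the workflow produces is essentially a structured set of domain descriptors for researchers to fill in. The sketch below illustrates that shape; the descriptor names are invented examples for the nanoparticle-synthesis domain, not the templates used in the case studies.

```python
# Sketch: a metadata template built from concepts extracted via content
# analysis. Descriptor names are illustrative, not from the paper.

template = {
    "domain": "nanoparticle synthesis",
    "descriptors": {
        "precursor":     None,   # chemical compound used
        "reaction_time": None,   # minutes
        "temperature":   None,   # degrees Celsius
        "particle_size": None,   # nanometres
    },
}

def fill(template, **values):
    """Return a copy of the template with researcher-supplied values."""
    t = {**template, "descriptors": dict(template["descriptors"])}
    for key, value in values.items():
        if key not in t["descriptors"]:
            raise KeyError(f"not a descriptor in this template: {key}")
        t["descriptors"][key] = value
    return t

record = fill(template, precursor="silver nitrate", temperature=60)
print(record["descriptors"])
```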
