2014
Authors
Sillero, N; Oliveira, MA; Sousa, P; Sousa, F; Goncalves Seco, L;
Publication
AMPHIBIA-REPTILIA
Abstract
The Societas Europaea Herpetologica (SEH) decided in 2006, through its Mapping Committee, to implement the New Atlas of Amphibians and Reptiles of Europe (NA2RE: http://na2re.ismai.pt) as a chorological database system. Initially designed as a system of distributed databases, NA2RE quickly evolved into a Spatial Data Infrastructure, a system of geographically distributed systems. Each individual system has a national focus and is implemented in an online network, accessible through standard interfaces, thus allowing for interoperable communication and sharing of spatial-temporal data with one another. A Web interface gives the user access to all participating data systems as if they were a single virtual integrated data source. Upon user request, the Web interface searches all distributed data sources for the requested data, integrating the answers into a continuously updated, interactive map. This infrastructure implements methods for fast updating of national observation records, as well as for the use of a common taxonomy and systematics. Using this approach, data duplication is avoided, national systems are maintained in their own countries, and national organisations are responsible for their own data curation and management. The database can be built with different representation and resolution levels of data, and filtered according to species conservation matters. We present the first prototype of NA2RE, composed of the latest data compilation performed by the SEH (Sillero et al., 2014). This system is implemented using only open-source software: a PostgreSQL database with the PostGIS extension, GeoServer, and OpenLayers.
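The federated-query behaviour described above can be sketched in miniature. The snippet below is a hypothetical illustration, not NA2RE code: the source names, record fields, and synonym table are invented, and two mock national systems stand in for the remote standard-interface services.

```python
# Hypothetical sketch of the federated query behind NA2RE: each
# national system answers the same species query, and the Web
# interface merges the answers into one virtual result set.

SYNONYMS = {"Rana temporaria ssp.": "Rana temporaria"}  # invented common-taxonomy mapping

def normalise(record):
    """Map a national taxon name onto the shared taxonomy."""
    name = record["species"]
    return {**record, "species": SYNONYMS.get(name, name)}

def federated_query(sources, species):
    """Ask every national system and integrate the answers."""
    results = []
    for country, fetch in sources.items():
        for rec in fetch(species):        # a remote call in the real SDI
            results.append({**normalise(rec), "source": country})
    return results

# Two mock national systems standing in for remote services.
pt = lambda sp: [{"species": sp, "utm": "29TNF66"}]
es = lambda sp: [{"species": "Rana temporaria ssp.", "utm": "30TUN90"}]

hits = federated_query({"PT": pt, "ES": es}, "Rana temporaria")
```

The point of the sketch is that taxonomy is reconciled at query time, so each national organisation keeps curating its own records.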
2014
Authors
de Carvalho, AV; Oliveira, MA; Rocha, A;
Publication
PROCEEDINGS OF THE 2014 9TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI 2014)
Abstract
A considerable number of domains deal with large and complex volumes of temporal data. The management of these volumes, from capture and storage to search, transfer, analysis, and visualization, still poses interesting challenges. One critical task is the efficient retrieval of data (raw data or intermediate results from analytic tools). Previous work proposed the TravelLight method, which reduced the turnaround time and improved interactive retrieval of data from large temporal datasets by exploring the temporal consistency of records in a database. In this work we propose improvements to the method by adopting a new paradigm focused on the management of time intervals instead of solely on data items. A major advantage of this paradigm shift is that it separates the method's implementation from any particular temporal data source, as it is autonomous and efficient in the management of retrieved data. Our work demonstrates that the overheads introduced by the new paradigm are smaller than the prior overall overheads, further reducing the turnaround time. Reported results concern experiments with a temporally linear navigation across two datasets of one million items. From the obtained results it is possible to conclude that the improvements presented in this work further reduce turnaround time, thus enhancing the response of interactive tasks over very large temporal datasets.
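The interval-centred paradigm can be illustrated with a small sketch. Assuming, hypothetically, that the cache records already-retrieved time intervals as (start, end) pairs, a new request only needs to hit the data source for the uncovered gaps; all names below are illustrative, not the paper's implementation.

```python
# Illustrative sketch: manage retrieved *time intervals* rather than
# individual data items, so a request is answered from cache except
# for the sub-intervals not yet covered.

def uncovered(request, cached):
    """Return the parts of `request` (start, end) not covered by the
    disjoint intervals in `cached`."""
    start, end = request
    gaps, cursor = [], start
    for lo, hi in sorted(cached):
        if hi <= cursor:          # interval entirely behind the cursor
            continue
        if lo >= end:             # interval entirely past the request
            break
        if lo > cursor:           # a gap before this cached interval
            gaps.append((cursor, lo))
        cursor = max(cursor, hi)
    if cursor < end:              # trailing gap after the last interval
        gaps.append((cursor, end))
    return gaps

# Cache already holds [0,10) and [20,30); asking for [5,25) only
# needs the gap [10,20) from the data source.
print(uncovered((5, 25), [(0, 10), (20, 30)]))  # [(10, 20)]
```

This is also why the approach decouples from any particular data source: the cache reasons purely about intervals, never about item identities.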
2013
Authors
Queiros, R; Leal, JP;
Publication
JOURNAL OF UNIVERSAL COMPUTER SCIENCE
Abstract
E-learning frameworks are conceptual tools to organize networks of e-learning services. Most frameworks cover areas that go beyond the scope of e-learning, from course to financial management, and neglect the typical everyday activities of teachers and students at schools, such as the creation, delivery, resolution, and evaluation of assignments. This paper presents the Ensemble framework, an e-learning framework exclusively focused on the teaching-learning process through the coordination of pedagogical services. The framework presents an abstract data, integration, and evaluation model based on content and communication specifications. These specifications underpin the implementation of networks in specialized domains with complex evaluations. In this paper we specialize the framework for two domains with complex evaluation: computer programming and computer-aided design (CAD). For each domain we highlight two Ensemble hotspots: data and evaluation procedures. In the former we formally describe the exercise and present possible extensions. In the latter, we describe the automatic evaluation procedures.
2013
Authors
Queiros, R; Leal, JP;
Publication
INNOVATIONS IN XML APPLICATIONS AND METADATA MANAGEMENT: ADVANCING TECHNOLOGIES
Abstract
Several standards have appeared in recent years to formalize the metadata of learning objects, but they are still insufficient to fully describe a specialized domain. In particular, the programming exercise domain requires interdependent resources (e.g., test cases, solution programs, exercise descriptions) usually processed by different services in the programming exercise lifecycle. Moreover, the manual creation of these resources is time-consuming and error-prone, hindering the fast development of programming exercises of good quality. This chapter focuses on the definition of an XML dialect called PExIL (Programming Exercises Interoperability Language). The aim of PExIL is to consolidate all the data required in the programming exercise lifecycle, from when an exercise is created to when it is graded, covering also the resolution, the evaluation, and the feedback. The authors introduce the XML Schema used to formalize the relevant data of the programming exercise lifecycle. The validation of this approach is made through the evaluation of the usefulness and expressiveness of the PExIL definition. For the former, the authors present the tools that consume the PExIL definition to automatically generate the specialized resources. For the latter, they use the PExIL definition to capture all the constraints of a set of programming exercises stored in a learning objects repository.
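As a rough illustration of what consolidating the lifecycle resources in one XML document might look like, the sketch below builds a simplified, hypothetical exercise descriptor. The element names are invented for illustration and do not follow the actual PExIL schema.

```python
# Hypothetical sketch: one XML document bundling the interdependent
# resources of a programming exercise (description, solution, test
# cases). Element names are illustrative, not the PExIL schema.
import xml.etree.ElementTree as ET

def make_exercise(title, solution, tests):
    """Build a simplified exercise descriptor as an XML tree."""
    ex = ET.Element("exercise", title=title)
    ET.SubElement(ex, "description").text = f"Write a program: {title}"
    ET.SubElement(ex, "solution", lang="python").text = solution
    cases = ET.SubElement(ex, "testcases")
    for stdin, stdout in tests:
        case = ET.SubElement(cases, "test")
        ET.SubElement(case, "input").text = stdin
        ET.SubElement(case, "output").text = stdout
    return ex

ex = make_exercise("Echo", "print(input())", [("hi", "hi")])
doc = ET.tostring(ex, encoding="unicode")
```

Keeping description, solution, and test cases in one document is what lets downstream services (generation, evaluation, feedback) consume a single artifact instead of loose files.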
2013
Authors
Paulo, J; Reis, P; Pereira, J; Sousa, A;
Publication
COMPUTER SYSTEMS SCIENCE AND ENGINEERING
Abstract
Deduplication has proven to be a valuable technique for eliminating duplicate data in backup and archival systems and is now being applied to new storage environments with distinct requirements and performance trade-offs. Namely, deduplication systems are now targeting large-scale cloud computing storage infrastructures holding unprecedented data volumes with a significant share of duplicate content. It is, however, hard to assess the usefulness of deduplication in particular settings and which techniques provide the best results. In fact, existing disk I/O benchmarks follow simplistic approaches for generating data content, leading to unrealistic amounts of duplicates that do not evaluate deduplication systems accurately. Moreover, deduplication systems are now targeting heterogeneous storage environments, with specific duplication ratios, that benchmarks must also simulate. We address these issues with DEDISbench, a novel micro-benchmark for evaluating the disk I/O performance of block-based deduplication systems. As the main contribution, DEDISbench generates content by following realistic duplicate-content distributions extracted from real datasets. As a second contribution, we analyze and extract the duplicates found on three real storage systems, showing that DEDISbench can easily simulate several workloads. The usefulness of DEDISbench is shown by comparing it with the Bonnie++ and IOzone open-source disk I/O micro-benchmarks in assessing two open-source deduplication systems, Opendedup and Lessfs, using Ext4 as a baseline. Our results lead to novel insights on the performance of these file systems.
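The main contribution described above, generating write content that follows a realistic duplicate distribution, can be sketched as follows. The distribution below is a made-up stand-in for one extracted from a real dataset, and all names are hypothetical rather than DEDISbench's own.

```python
# Sketch of the core idea: sample block identifiers from a weighted
# distribution so that some block contents recur realistically,
# instead of writing unique or naively repeated blocks.
import random

def make_generator(distribution, seed=42):
    """distribution: list of (block_id, weight) pairs; blocks with
    higher weight are written more often, creating duplicates."""
    ids = [b for b, _ in distribution]
    weights = [w for _, w in distribution]
    rng = random.Random(seed)
    def next_block():
        (bid,) = rng.choices(ids, weights=weights)
        # Derive deterministic 4 KiB content from the block id, so
        # equal ids yield byte-identical (deduplicable) blocks.
        return (str(bid).encode() * 4096)[:4096]
    return next_block

# A made-up skewed distribution: a few "hot" blocks dominate.
gen = make_generator([("hot", 8), ("warm", 3), ("cold", 1)])
blocks = [gen() for _ in range(1000)]
dup_ratio = 1 - len(set(blocks)) / len(blocks)
```

Varying the weights is what would let such a generator simulate storage environments with different duplication ratios.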
2013
Authors
de Carvalho, AV; Oliveira, MA; Rocha, A;
Publication
PROCEEDINGS OF THE 2013 8TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI 2013)
Abstract
Many tasks dealing with temporal data, such as interactive browsing through temporal datasets, require intensive retrieval from the database. Depending on the user's task, the data retrieved may be too large to fit in local memory. Even if it fits, the time taken to retrieve the data may compromise user interaction. This work proposes a method, TravelLight, which improves interactive traveling across very large temporal datasets by exploring the temporal consistency of data items. The proposed method consists of two algorithms, one for data retrieval and one for memory management, both contributing to improved memory usage and, most importantly, to reduced turnaround time. Results are reported concerning experiments with a temporally linear navigation across two datasets of one million items, which differ in the average time span of items. From the obtained results it is possible to conclude that the proposed method reduces turnaround time, thus enhancing the response of interactive tasks over very large temporal datasets.
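The temporal-consistency idea behind the two algorithms can be sketched as follows, assuming forward linear navigation with overlapping windows. The function and data structures are hypothetical illustrations, not the TravelLight implementation.

```python
# Illustrative sketch: as the view window slides forward in time,
# items whose validity span still overlaps the window stay cached,
# and only the newly exposed part of the timeline is fetched.

def slide_window(cache, fetch, old, new):
    """Move the view window from `old` to `new` (both (start, end)).
    `cache` maps item id -> (start, end) validity span."""
    # Memory management: evict items no longer overlapping the window.
    for item, (s, e) in list(cache.items()):
        if e <= new[0] or s >= new[1]:
            del cache[item]
    # Data retrieval: fetch only the newly exposed timeline segment
    # (a refetched item simply overwrites its cached span here).
    for item, span in fetch(max(new[0], old[1]), new[1]):
        cache[item] = span
    return cache

# A mock data source: items with (start, end) validity spans.
store = {1: (0, 50), 2: (40, 120), 3: (90, 200)}
def fetch(a, b):
    return [(i, s) for i, s in store.items() if s[1] > a and s[0] < b]

cache = {1: (0, 50), 2: (40, 120)}            # window was (0, 100)
slide_window(cache, fetch, (0, 100), (60, 160))
```

Items with long average time spans survive many window moves, which is why the abstract's two datasets differ precisely in that parameter.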