Publications

Publications by Gabriel David

2013

The impact of time in link-based Web ranking

Authors
Nunes, S; Ribeiro, C; David, G;

Publication
INFORMATION RESEARCH-AN INTERNATIONAL ELECTRONIC JOURNAL

Abstract
Introduction. The strong dynamic nature of the Web is a well-known reality. Nonetheless, research on Web dynamics is still a minor part of mainstream Web research. This is largely the case in Web link analysis. In this paper we investigate and measure the impact of time in link-based ranking algorithms on a particular subset of the Web, specifically blogs. Method. Using a large collection of blog posts that span more than three years, we compare a traditional link-based ranking algorithm with a time-biased alternative, providing some insights into the evolution of link data over time. We designed two experiments to evaluate the use of temporal features in authority estimation algorithms. In the first experiment we compare time-independent and time-sensitive ranking algorithms with a reference rank based on the total number of visits to each blog. In the second, we use feedback from communication media domain experts to contrast different rankings of Portuguese news Websites. Results. The distribution of citations to a Web document over time contains valuable information. Based on several examples we show that time-independent algorithms are unable to capture the correct popularity of sites with high citation activity. Using a reference rank based on the number of visits to a site, we show that a time-biased approach has a better performance. Conclusions. Although both time-independent and time-aware approaches are based on the same raw data, the experiments indicate that they can be treated as complementary signals for relevance assessment by information retrieval systems. We show that temporal information present in blogs can be used to derive stable time-dependent features, which can be successfully used in the context of Web document ranking.


2013

Preservation of Data Warehouses: Extending the SIARD System with DWXML Language and Tools

Authors
Aldeias, C; David, G; Ribeiro, C;

Publication
INNOVATIONS IN XML APPLICATIONS AND METADATA MANAGEMENT: ADVANCING TECHNOLOGIES

Abstract
Data warehouses are used in many application domains, and there is no established method for their preservation. A data warehouse can be implemented in multidimensional structures or in relational databases that represent the dimensional model concepts in the relational model. The focus of this work is on describing the dimensional model of a data warehouse and migrating it to an XML model, in order to achieve a long-term preservation format. This chapter presents the definition of the XML structure that extends the SIARD format used for the description and archive of relational databases, enriching it with a layer of metadata for the data warehouse components. Data Warehouse Extensible Markup Language (DWXML) is the XML language proposed to describe the data warehouse. An application that combines the SIARD format and the DWXML metadata layer supports the XML language and helps to acquire the relevant metadata for the warehouse and to build the archival format. Copyright (C) 2013, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.


2013

SIARD archive browser - The components

Authors
Rahman, AU; David, G; Ribeiro, C;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
The Software-Independent Archival of Relational Databases (SIARD) project developed a tool known as the "SIARD Suite" for preserving relational databases. The tool converts a relational database to an XML format. This paper presents the components of the SIARD Archive Browser, a simple-to-use, platform-independent tool for browsing a SIARD archive. This may be helpful for users interested in using the software. Moreover, it may be useful for people who want to reuse the code and develop software for browsing a SIARD archive with more functionality. © Springer International Publishing 2013.


2014

ViBest SHM: an information system and data repository for structural health monitoring

Authors
da Costa, FP; Cunha, A; David, G;

Publication
EURODYN 2014: IX INTERNATIONAL CONFERENCE ON STRUCTURAL DYNAMICS

Abstract
This project has been motivated by the need to standardize, preserve, and share the data sets of the Laboratory of Vibrations and Structural Monitoring (ViBest, www.fe.up.pt/vibest) of FEUP, produced by several long-term, individually managed projects. The solution presented is meant to support the process of Structural Health Monitoring, offering features to catalogue the projects, their goals and components, to store and visualize their acquired and processed data through time, and to preserve the data in a form that is standardized for the whole research unit and extensible to future applications. The result is a digital archive with automatic ingestion of new data files and a Web interface with access control and tools for information management. There is a batch export functionality to deal with large data transfers. It is being used on monitoring data related to different kinds of structural health monitoring applications. The standardization and preservation of all data sets acquired in multiple applications will certainly be a solid basis for further research, either locally or in the context of international joint cooperation.


2013

Definition of a retrospective health information policy based on (re)use study

Authors
Goncalves, F; David, G;

Publication
Handbook of Research on ICTs and Management Systems for Improving Efficiency in Healthcare and Social Care

Abstract
Medical information produced in hospitals is, simultaneously, used (1) to support health care provided to patients, (2) in research work performed by internal and external health professionals, and (3) as legal proof with various objectives. The co-existence of electronic and paper health information, the integration constraints of the various computer applications, and the storage of massive volumes of retrospective paper-based patient records are dominant concerns for São João Hospital Center (SJHC). These problems must be considered in the adoption of an Electronic Patient Record (EPR) in order to ensure that hospitals and patients fully benefit from the technological investments. The contribution of this chapter is the design and conduct of a (re)use study, which consisted of an analysis of the paper-based records management activities and of the content of the patients' records. A survey on the (re)use of the paper-based patient records was conducted in order to characterize the (re)use in terms of objective and type of hospital encounter, and the documents accessed were identified and organized in an access frequency table. The results support the paper-based patient records strategy to be implemented in SJHC as part of the hospital's EPR adoption project. © 2013, IGI Global.


2015

Database Preservation: The DBPreserve Approach

Authors
Rahman, AU; Muzammal, M; David, G; Ribeiro, C;

Publication
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS

Abstract
In many institutions relational databases are used as a tool for managing information related to day-to-day activities. Institutions may be required to keep the information stored in relational databases accessible for many reasons, including legal requirements and institutional policies. However, the evolution of technology and the change in users with the passage of time put the information stored in relational databases in danger. In the long term the information may become inaccessible when the operating system, database management system or application software is no longer available, or the contextual information not stored in the database may be lost, thus affecting the authenticity and understandability of the information. This paper presents an approach for preserving relational databases for the long term. The proposal involves migrating a relational database to a dimensional model which is simple to understand and easy to write queries against. Practical transformation rules are developed by carrying out multiple case studies. One of the case studies is presented as a running example in the paper. Systematic implementation of the rules ensures no loss of information in the process except for the unwanted details. The database preserved using the approach is converted to an open format but may be reloaded to a database management system in the long term.
