Publications

Publications by Vítor Santos Costa

2007

An integrated approach to feature invention and model construction for drug activity prediction

Authors
Davis, J; Costa, VS; Ray, S; Page, D;

Publication
ACM International Conference Proceeding Series

Abstract
We present a new machine learning approach for 3D-QSAR, the task of predicting binding affinities of molecules to target proteins based on 3D structure. Our approach predicts binding affinity by using regression on substructures discovered by relational learning. We make two contributions to the state-of-the-art. First, we use multiple-instance (MI) regression, which represents a molecule as a set of 3D conformations, to model activity. Second, the relational learning component employs the "Score As You Use" (SAYU) method to select substructures for their ability to improve the regression model. This is the first application of SAYU to multiple-instance, real-valued prediction. We evaluate our approach on three tasks and demonstrate that (i) SAYU outperforms standard coverage measures when selecting features for regression, (ii) the MI representation improves accuracy over standard single feature-vector encodings and (iii) combining SAYU with MI regression is more accurate for 3D-QSAR than either approach by itself.


2012

Scheduling cyclic task graphs with SCC-map

Authors
Sardinha, A; Alves, TAO; Marzulo, LAJ; Franca, FMG; Barbosa, VC; Costa, VS;

Publication
Proceedings - 3rd Workshop on Applications for Multi-Core Architecture, WAMCA 2012

Abstract
The Dataflow execution model has been shown to be a good way of exploiting TLP, making parallel programming easier. In this model, tasks must be mapped to processing elements (PEs) considering the trade-off between communication and parallelism. Previous work on scheduling dependency graphs has mostly focused on directed acyclic graphs, which are not suitable for dataflow (loops in the code become cycles in the graph). Thus, we present SCC-Map: a novel static mapping algorithm that considers the importance of cycles during the mapping process. To validate our approach, we ran a set of benchmarks on our dataflow simulator, varying the communication latency, the number of PEs in the system and the placement algorithm. Our results show that the benchmark programs run significantly faster when mapped with SCC-Map. Moreover, we observed that SCC-Map is more effective than the other mapping algorithms when communication latency is higher. © 2012 IEEE.


2010

TALM: A hybrid execution model with distributed speculation support

Authors
Marzulo, LAJ; Alves, TAO; Franca, FMG; Costa, VS;

Publication
Proceedings - 22nd International Symposium on Computer Architecture and High Performance Computing Workshops, SBAC-PADW 2010, 1st Workshop on Applications for Multi and Many Core Architectures, WAMMCA

Abstract
Parallel programming has become mandatory to fully exploit the potential of modern CPUs. The data-flow model provides a natural way to exploit parallelism. However, traditional data-flow programming is not trivial: specifying dependencies and control using fine-grained tasks (such as instructions) can be complex and present unwanted overheads. To address this issue we have built a coarse-grained data-flow model with speculative execution support to be used on top of widespread architectures, implemented as a hybrid von Neumann/data-flow execution system. We argue that speculative execution fits naturally with the data-flow model. Using speculative execution liberates the programmer to consider only the main dependencies, and still allows correct data-flow execution of coarse-grained tasks. Moreover, our speculation mechanism does not demand centralised control, which is a key feature for upcoming many-core systems, where scalability has become an important concern. An initial study on an artificial bank server application suggests that there is a wide range of scenarios where speculation can be very effective. © 2010 IEEE.


2005

ReGS: User-level reliability in a grid environment

Authors
Sanches, JAL; Vargas, PK; De Dutra, IC; Costa, VS; Geyer, CFR;

Publication
2005 IEEE International Symposium on Cluster Computing and the Grid, CCGrid 2005

Abstract
Grid environments are ideal for executing applications that require a huge amount of computational work, both due to the large number of tasks to execute and to the large amount of data to be analysed. Unfortunately, current tools may require that users themselves deal with corrupted outputs or early termination of tasks. This becomes inconvenient as the number of parallel runs grows to easily exceed the thousands. ReGS is user-level software designed to provide automatic detection and restart of corrupted or early terminated tasks. ReGS uses a web interface to allow the setup and control of grid execution, and provides automatic input data setup. ReGS allows the automatic detection of job dependencies, through the GRID-ADL task management language. Our results show that besides automatically and effectively managing a huge number of tasks in grid environments, ReGS is also a good monitoring tool to spot grid node pitfalls. © 2005 IEEE.


2000

IAP for dummies: The YAP design

Authors
Eduardo Correia, M; Santos Costa, V;

Publication
Electronic Notes in Theoretical Computer Science

Abstract
One of the advantages of logic programming is the fact that it offers several sources of implicit parallelism. One particularly interesting form of And-Parallelism is Independent And-Parallelism (IAP). Most work on the implementation of IAP is based on Hermenegildo's RAP-WAM. Unfortunately there are some drawbacks associated with the classical approaches based on the use of parcalls and markers. One first observation is that the introduction of parcall frames significantly slows down sequential execution. Moreover, it may result in fine-grained parallel work. We found these problems to be particularly significant in the context of the implementation of combined AND/OR systems. In this paper we take a fresh look at this issue. Our goal is to start from a standard sequential Prolog implementation and try to discover the minimal number of changes that would be required for an efficient implementation of IAP. The key ideas in our design are (i) to always take advantage of the analogy between or-parallelism and IAP; (ii) to avoid creating new structures by adapting preexisting WAM data-structures wherever possible; and (iii) to avoid major changes to the compiler. The authors would like to acknowledge and thank the contribution and support from Fernando Silva. The work has also benefitted from discussions with Luis Fernando Castro, Ines de Castro Dutra, Kish Shen, Gopal Gupta, and Enrico Pontelli. Our work has been partly supported by Fundação da Ciência e Tecnologia and JNICT under the projects Melodia (JNICT/PBIC/C/TIT/2495/95) and Dolphin (PRAXIS/2/2.1/TIT/1577/95). © 2000 Published by Elsevier B.V.


1993

And-Or parallel Prolog: A recomputation based approach

Authors
Gupta, G; Hermenegildo, MV; Costa, VS;

Publication
New Generation Computing

Abstract
We argue that in order to exploit both Independent And- and Or-parallelism in Prolog programs there is an advantage in recomputing some of the independent goals, as opposed to all their solutions being reused. We present an abstract model, called the Composition-tree, for representing and-or parallelism in Prolog programs. The Composition-tree closely mirrors sequential Prolog execution by recomputing some independent goals rather than fully re-using them. We also outline two environment representation techniques for And-Or parallel execution of full Prolog based on the Composition-tree model abstraction. We argue that these techniques have advantages over earlier proposals for exploiting and-or parallelism in Prolog. © 1993 Ohmsha, Ltd. and Springer.
