Publications

Publications by João Paiva Cardoso

2011

Introduction

Authors
Cardoso, JMP; Hübner, M;

Publication
Reconfigurable Computing

Abstract

2011

Reconfigurable Computing

Authors
Cardoso, JMP; Hübner, M;

Publication

Abstract

2001

Compilation increasing the scheduling scope for multi-memory-FPGA-based custom computing machines

Authors
Cardoso, JMP; Neto, HC;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
This paper presents new achievements on the automatic mapping of abstract algorithms, written in imperative software programming languages, to custom computing machines. The reconfigurable hardware element of the target architecture consists of one field-programmable gate array coupled with one or more memories. The compilation flow exposes operation- and functional-level parallelism, and speculative execution. Such expositions are efficiently represented in a hierarchical model. In order to take full advantage of such representation, the scheduling scope is significantly improved by merging basic blocks at loop boundaries and by considering the parallel execution of exposed concurrent loops. The paper describes the scheduling technique, shows a study on the impact of the merge operation, and reveals the improvements achieved when the exposed parallelism is fully satisfied. © Springer-Verlag Berlin Heidelberg 2001.

CloseRead Abstract

2009

On Simplifying Placement and Routing by Extending Coarse-Grained Reconfigurable Arrays with Omega Networks

Authors
Ferreira, R; Damiany, A; Vendramini, J; Teixeira, T; Cardoso, JMP;

Publication
RECONFIGURABLE COMPUTING: ARCHITECTURES, TOOLS AND APPLICATIONS

Abstract
Most reconfigurable computing architectures suffer from computationally demanding Placement and Routing (P&R) steps which might hamper their use in contexts requiring dynamic compilation (e.g., to guarantee application portability in embedded systems). Bearing in mind the simplification of P&R steps, this paper presents and analyzes a coarse-grained reconfigurable array extended with global Omega Networks. We show that integrating one or two Omega Networks in a coarse-grained array simplifies the P&R stage with both low hardware resource overhead and low performance degradation (18% for an 8 x 8 array). The experimental results included permit to compare the coarse-grained array with one or two Omega Networks with a coarse-grained array based on a grid of processing elements with neighbor connections. When comparing the execution time to perform the P&R stage needed for the two arrays, we show that the array using two Omega Networks needs a far simple P&R which for the benchmarks used completed on average in about 20x less time.

CloseRead Abstract

2011

Fast placement and routing by extending coarse-grained reconfigurable arrays with Omega Networks

Authors
Ferreira, RS; Cardoso, JMP; Damiany, A; Vendramini, J; Teixeira, T;

Publication
JOURNAL OF SYSTEMS ARCHITECTURE

Abstract
Reconfigurable computing architectures are commonly used for accelerating applications and/or for achieving energy savings. However, most reconfigurable computing architectures suffer from computationally demanding placement and routing (P&R) steps. This problem may disable their use in systems requiring dynamic compilation (e.g., to guarantee application portability in embedded systems). Bearing in mind the simplification of P&R steps, this paper presents and analyzes a coarse-grained reconfigurable array (CGRA) extended with global multistage interconnect networks, specifically Omega Networks. We show that integrating one or two Omega Networks in a CGRA permits to simplify the P&R stage resulting in both low hardware resource overhead and low performance degradation (18% for an 8 x 8 array). We compare the proposed CGRA, which integrates one or two Omega Networks, with a CGRA based on a grid of processing elements with reach neighbor interconnections and with a torus topology. The execution time needed to perform the P&R stage for the two array architectures shows that the array using two Omega Networks needs a far simpler and faster P&R. The P&R stage in our approach completed on average in about 16 x less time for the 17 benchmarks used. Similar fast approaches needed CGRAs with more complex interconnect resources in order to allow most of the benchmarks used to be successfully placed and routed.

CloseRead Abstract

2011

Selected papers from the 17th reconfigurable architectures workshop (RAW2010)

Authors
Dasu, A; Cardoso, JMP; Bozorgzadeh, E; Becker, J;

Publication
International Journal of Reconfigurable Computing

Abstract