Publications

Publications by HumanISE

2007

Aggressive loop pipelining for reconfigurable architectures

Authors
Menotti, R; Marques, E; Cardoso, JMP;

Publication
2007 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS, PROCEEDINGS, VOLS 1 AND 2

Abstract

2007

On adapting power estimation models for embedded soft-core processors

Authors
de Holanda, JA; Assumpcao, J; Wolf, DE; Marques, E; Cardoso, JMP;

Publication
2007 INTERNATIONAL SYMPOSIUM ON INDUSTRIAL EMBEDDED SYSTEMS

Abstract
The increasing use of battery-powered embedded systems has motivated the development of power consumption models to help designers build low-power systems. Due to the configurability of FPGAs, the adoption of systems containing one or more soft-core processors on a single chip is becoming increasingly attractive. This paper presents an adaptation of the instruction-level power estimation model to soft-core processors implemented in FPGAs. The model estimated the power dissipated by eleven test applications with a maximum error of 4.78%. Ongoing work includes efforts towards a software power estimation model for multi-core systems embedded in a single FPGA device.

2007

An FPGA implementation for a Kalman filter with application to mobile robotics

Authors
Bonato, V; Peron, R; Wolf, DF; de Holanda, JAM; Marques, E; Cardoso, JMP;

Publication
2007 INTERNATIONAL SYMPOSIUM ON INDUSTRIAL EMBEDDED SYSTEMS

Abstract
The problem of simultaneous localization and mapping has been studied by the mobile robotics scientific community over the last two decades. Most solutions to this problem are based on probabilistic theory in order to represent the uncertainty in robot perception and action. One of the most efficient probabilistic methods is the Extended Kalman Filter (EKF). However, the EKF demands a considerable amount of computing power and is usually processed by high-end laptops coupled to the robots. In this work, we present an implementation of the EKF targeting an embedded system based on an FPGA device. In order to improve performance, our approach combines a soft-core processor with customized hardware. We present experiments with four different FPGA implementations: the first based purely on software, the second using custom instruction logic directly connected to the processor's ALU, the third using hardware accelerators connected to the processor's data bus, and the fourth combining those two hardware/software solutions. In the experiments conducted, a small addition of hardware resources improved the performance of the global system by 2x to 4x.

2007

A polynomial placement algorithm for data driven coarse-grained reconfigurable architectures

Authors
Ferreira, R; Garcia, A; Teixeira, T; Cardoso, JMP;

Publication
IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI, PROCEEDINGS: EMERGING VLSI TECHNOLOGIES AND ARCHITECTURES

Abstract
Coarse-grained reconfigurable computing architectures vary widely in the number and characteristics of the processing elements (cells) and routing topologies used. In order to exploit several different topologies, a place and route framework able to deal with such a vast design exploration space is of paramount importance. Bearing this in mind, this paper proposes a placement scheme able to target different topologies when considering data-driven reconfigurable architectures. Our approach uses graph models for the target architecture and for the dataflow representation of the application being mapped. Our placement algorithm is guided by a depth-first traversal of both the architecture and the application graphs. Two versions of the placement algorithm are presented, with O(e) and O(e + n^3) computational complexities, respectively, where e is the number of edges in the dataflow representation of the application and n is the number of cells in the graph model of the architecture. The experimental results show that our approach can be useful for exploiting different interconnect topologies in coarse-grained reconfigurable computing architectures.

2007

Using Rewriting Logic to Match Patterns of Instructions from a Compiler Intermediate Form to Coarse-Grained Processing Elements

Authors
Morra, C; Cardoso, JMP; Becker, J;

Publication
21st International Parallel and Distributed Processing Symposium (IPDPS 2007), Proceedings, 26-30 March 2007, Long Beach, California, USA

Abstract
This paper presents a new and retargetable method to identify patterns of instructions with direct support in coarse-grained processing elements (PEs). The method uses a three-address code SSA (static single assignment) representation of the kernel being mapped and Rewriting Logic for template matching and algebraic optimizations. This approach is able to identify sets of SSA instructions that can be mapped to the different PE complexities available in coarse-grained reconfigurable computing architectures. As a proof of concept, results of the approach with a number of benchmark kernels, as far as coverage of template instructions is concerned, are included. © 2007 IEEE.

2007

A data-driven approach for pipelining sequences of data-dependent loops

Authors
Rodrigues, R; Cardoso, JMP; Diniz, PC;

Publication
FCCM 2007: 15TH ANNUAL IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, PROCEEDINGS

Abstract
Many video and image/signal processing applications can be structured as sequences of data-dependent tasks using a consumer/producer communication paradigm and are therefore amenable to pipelined execution. This paper presents an execution technique to speed up the overall execution of successive, data-dependent tasks on a reconfigurable architecture. The technique pipelines sequences of data-dependent tasks by overlapping their execution subject to data dependences. It decouples the concurrent data-path and control units and uses a custom, application data-driven, fine-grained synchronization and buffering scheme. In addition, the execution scheme allows for out-of-order, but data-dependent, producer-consumer pairs not allowed by previous data-driven pipelining approaches. The approach has been exploited in the context of a high-level compiler targeting FPGAs. The preliminary experimental results reveal noticeable performance improvements and buffer size reductions for a number of benchmarks over traditional approaches.
