2025
Authors
Pires, C; Nunes, S; Teixeira, LF;
Publication
CoRR
Abstract
2025
Authors
Maia, HC; Ariel, P; Nunes, S;
Publication
AI Ethics
Abstract
2025
Authors
Santos, T; Bispo, J; Cardoso, JMP; Hoe, JC;
Publication
2025 IEEE 33rd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
Abstract
Heterogeneous CPU-FPGA C/C++ applications may rely on High-level Synthesis (HLS) tools to generate hardware for critical code regions. Because typical HLS tools restrict the language features they support, we propose several code transformations that improve synthesizability and thereby increase the size and variety of offloaded regions. These transformations include: struct and array flattening; moving dynamic memory allocations out of a region; transforming dynamic memory allocations into static ones; and asynchronously executing host functions, e.g., printf(). We evaluate the impact of these transformations on code region size using three real-world applications whose critical regions are limited by non-synthesizable C/C++ language features.
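As an illustration of the struct and array flattening mentioned in this abstract, the following is a minimal sketch and not the authors' tool output. The `Image` type, the `ROWS`/`COLS` sizes, and both function names are hypothetical; the sketch only assumes that flattening means splitting a struct into its fields and replacing a 2D array with a 1D array indexed as `i * COLS + j`.

```c
#include <stddef.h>

#define ROWS 3  /* hypothetical dimensions, chosen for illustration */
#define COLS 4

/* Original form: a struct that wraps a scalar and a 2D array. */
typedef struct {
    int scale;
    int data[ROWS][COLS];
} Image;

int sum_original(const Image *img) {
    int acc = 0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            acc += img->scale * img->data[i][j];
    return acc;
}

/* Flattened form: the struct is split into one parameter per field,
 * and the 2D array becomes a 1D array addressed as i * COLS + j. */
int sum_flattened(int img_scale, const int img_data[ROWS * COLS]) {
    int acc = 0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            acc += img_scale * img_data[i * COLS + j];
    return acc;
}
```

Both functions compute the same result for the same contents; the flattened version simply avoids the aggregate type and the multi-dimensional indexing.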
2025
Authors
Santos, T; Bispo, J; Cardoso, JMP;
Publication
2025 IEEE 33rd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
Abstract
Critical performance regions of software applications are often accelerated by offloading them onto an FPGA. An efficient end result requires the judicious application of two processes: hardware/software (hw/sw) partitioning, which identifies the regions for offloading, and the optimization of those regions for efficient High-level Synthesis (HLS). Both processes are commonly applied separately, not relying on any potential interplay between them, and not revealing how the decisions made in one process could positively influence the other. This paper describes our primary efforts and contributions made so far, and our work-in-progress, in an approach that combines both hw/sw partitioning and optimization into a unified, holistic process, automated using source-to-source compilation. By using an Extended Task Graph (ETG) representation of a C/C++ application, and expanding the synthesizable code regions, our approach aims at creating clusters of tasks for offloading by a) maximizing the potential optimizations applied to the cluster, b) minimizing the global communication cost, and c) grouping tasks that share data in the same cluster.
2025
Authors
Cardoso, JMP; Najjar, WA;
Publication
ARC
Abstract
The International Symposium on Applied Reconfigurable Computing (ARC) is an annual forum for the discussion and dissemination of research, notably applying the Reconfigurable Computing (RC) concept to real-world problems. The first edition of ARC took place in 2005, and in 2024, ARC celebrated its 20th edition. During those 20 years, the field of reconfigurable computing saw tremendous growth in its underlying technology. ARC contributed very significantly to the presentation and dissemination of new ideas, innovative applications, and fruitful discussions, all of which have helped shape novel lines of research. Here, we present selected papers from the first 20 years of ARC that we believe represent the corpus of work and reflect the ARC spirit by covering a broad spectrum of RC applications, benchmarks, tools, and architectures.
2025
Authors
Santos, T; Bispo, J; Cardoso, JMP; Hoe, JC;
Publication
MCSoC
Abstract
On a CPU-FPGA system, C/C++ applications are typically accelerated by offloading specific code regions onto the FPGA using High-level Synthesis (HLS). Although modern FPGAs can implement increasingly large and complex designs, the size and variety of potential offloading code regions remain constrained by the limitations of HLS tools (e.g., no support for dynamic memory allocation and system calls). This paper proposes automated C/C++ source-to-source transformations that tackle these limitations in two steps. Firstly, transformations reduce the entropy of an input C/C++ application by converting it into a subset of C, e.g., by flattening arrays and structs. Secondly, additional transformations make a selected code region synthesizable, e.g., by moving dynamic memory allocations out of the region, converting them to static memory, and offloading non-synthesizable C standard library calls, such as printf(), to the CPU. We evaluate the impact of these transformations showing results obtained through Vitis HLS for four real-world examples: the disparity and texture-synthesis benchmarks from CortexSuite, which contain dynamic memory allocations and indirect pointers in their hotspots; llama2, a Large Language Model that calls printf() every time it predicts a new word; and the spam-filter benchmark from Rosetta, as a debugging showcase.
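The second step described in this abstract, making a region synthesizable by dealing with dynamic memory allocation, can be sketched as follows. This is a hand-written illustration under stated assumptions, not output from the authors' transformations: the function names, the scratch-buffer pattern, and the `MAX_N` bound are all hypothetical.

```c
#include <stdlib.h>

#define MAX_N 64  /* hypothetical upper bound on n, assumed known at compile time */

/* Original region: allocates its scratch buffer with malloc(),
 * which HLS tools typically cannot synthesize. */
int region_original(const int *in, size_t n) {
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    int acc = 0;
    for (size_t i = 0; i < n; i++)
        tmp[i] = in[i] * in[i];
    for (size_t i = 0; i < n; i++)
        acc += tmp[i];
    free(tmp);
    return acc;
}

/* Variant A: the allocation is moved out of the region.
 * The caller (CPU side) owns tmp and passes it in. */
int region_hoisted(const int *in, int *tmp, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; i++)
        tmp[i] = in[i] * in[i];
    for (size_t i = 0; i < n; i++)
        acc += tmp[i];
    return acc;
}

/* Variant B: the dynamic allocation is replaced by a fixed-size
 * buffer, valid only when n never exceeds MAX_N. */
int region_static(const int *in, size_t n) {
    int tmp[MAX_N];
    if (n > MAX_N)
        return -1;
    int acc = 0;
    for (size_t i = 0; i < n; i++)
        tmp[i] = in[i] * in[i];
    for (size_t i = 0; i < n; i++)
        acc += tmp[i];
    return acc;
}
```

Variant A keeps the buffer size flexible at the cost of an extra argument crossing the CPU-FPGA boundary, while Variant B removes that argument but requires a size bound that holds for every input.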