Cookies Policy
The website need some cookies and similar means to function. If you permit us, we will use those means to collect data on your visits for aggregated statistics to improve our service. Find out More
Accept Reject
  • Menu
About

About

João M. P. Cardoso received his PhD degree in Electrical and Computer Engineering from the IST/UTL (Technical University of Lisbon), Lisbon, Portugal in 2001. He is currently Full Professor at the Department of Informatics Eng., Faculty of Eng. of the University of Porto, Porto, Portugal, and a research member of INESC TEC. Before, he was with the IST/UTL (2006-2008), a senior researcher at INESC-ID (2001-2009), and with the University of Algarve (1993-2006). In 2001/2002, he worked for PACT XPP Technologies, Inc., Munich, Germany. He has been involved in the organization and served as a Program Committee member for many international conferences. For example, he was general Co-Chair of IEEE/IFIP EUC’2015 and IEEE CSE’2015, General Chair of FPL’2013, General Co-Chair of ARC’2014 and ARC’2006, Program Co-Chair of ARCS’2016, DASIP’2014, and RAW’2010. He has (co-)authored over 150 scientific publications on subjects related to compilers, embedded systems, and reconfigurable computing. He has coordinated a number of research projects. He is a senior member of IEEE, a member of IEEE Computer Society, and a senior member of ACM.  His research interests include compilation techniques, domain-specific languages, reconfigurable computing, application-specific architectures, and high-performance computing with a particular emphasis in embedded computing.

Interest
Topics
Details

Details

  • Name

    João Paiva Cardoso
  • Role

    Senior Researcher
  • Since

    01st July 2011
003
Publications

2025

On improving the HLS compatibility of large C/C plus plus code regions

Authors
Santos, T; Bispo, J; Cardoso, JMP; Hoe, JC;

Publication
2025 IEEE 33RD ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, FCCM

Abstract
Heterogeneous CPU-FPGA C/C++ applications may rely on High-level Synthesis (HLS) tools to generate hardware for critical code regions. As typical HLS tools have several restrictions in terms of supported language features, to increase the size and variety of offloaded regions, we propose several code transformations to improve synthesizability. Such code transformations include: struct and array flattening; moving dynamic memory allocations out of a region; transforming dynamic memory allocations into static; and asynchronously executing host functions, e.g., printf(). We evaluate the impact of these transformations on code region size using three realworld applications whose critical regions are limited by nonsynthesizable C/C++ language features.

2025

Ph.D. Project: Holistic Partitioning and Optimization of CPU-FPGA Applications Through Source-to-Source Compilation

Authors
Santos, T; Bispo, J; Cardoso, JMP;

Publication
2025 IEEE 33RD ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, FCCM

Abstract
Critical performance regions of software applications are often accelerated by offloading them onto an FPGA. An efficient end result requires the judicious application of two processes: hardware/software (hw/sw) partitioning, which identifies the regions for offloading, and the optimization of those regions for efficient High-level Synthesis (HLS). Both processes are commonly applied separately, not relying on any potential interplay between them, and not revealing how the decisions made in one process could positively influence the other. This paper describes our primary efforts and contributions made so far, and our work-in-progress, in an approach that combines both hw/sw partitioning and optimization into a unified, holistic process, automated using source-to-source compilation. By using an Extended Task Graph (ETG) representation of a C/C++ application, and expanding the synthesizable code regions, our approach aims at creating clusters of tasks for offloading by a) maximizing the potential optimizations applied to the cluster, b) minimizing the global communication cost, and c) grouping tasks that share data in the same cluster.

2025

First Twenty Years of the International Symposium on Applied Reconfigurable Computing (ARC): A Selection of Papers

Authors
Cardoso, JMP; Najjar, WA;

Publication
Applied Reconfigurable Computing. Architectures, Tools, and Applications - 21st International Symposium, ARC 2025, Seville, Spain, April 9-11, 2025, Proceedings

Abstract
The International Symposium on Applied Reconfigurable Computing (ARC) is an annual forum for the discussion and dissemination of research, notably applying the Reconfigurable Computing (RC) concept to real-world problems. The first edition of ARC took place in 2005, and in 2024, ARC celebrated its 20th edition. During those 20 years, the field of reconfigurable computing saw a tremendous growth in its underlying technology. ARC contributed very significantly to the presentation and dissemination of new ideas, innovative applications, and fruitful discussions, all of which have resulted in the shaping of novel lines of research. Here, we present selected papers from the first 20 years of ARC, that we believe represent the corpus of work and reflect the ARC spirit by covering a broad spectrum of RC applications, benchmarks, tools, and architectures. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

2024

A Fast and Energy-Efficient Method for Online and Incremental Pareto-Front Update

Authors
Ferreira, PJS; Moreira, JM; Cardoso, JMP;

Publication
10th IEEE World Forum on Internet of Things, WF-IoT 2024, Ottawa, ON, Canada, November 10-13, 2024

Abstract
Self-adaptive Systems (SaS) are becoming increasingly important for adapting to dynamic environments and for optimizing performance on resource-constrained devices. A practical approach to achieving self-adaptability involves using a Pareto-Front (PF) to store the system's hyper-parameters and the outcomes of hyperparameter combinations. This paper proposes a novel method to approximate a PF, offering a configurable number of solutions that can be adapted to the device's limitations. We conducted extensive experiments across various scenarios, where all PF solutions were replaced, and real world scenarios were performed using actual measurements from a Human Activity Recognition (HAR) system. Our results show that our method consistently outperforms previous methods, mainly when the maximum number of PF solutions is in the order of hundreds. The effectiveness of our method is most apparent in real-case scenarios where it achieves, when executed in a Raspberry Pi 5, up to 87% energy consumption reduction and lower execution times than the second-best algorithm. Additionally, our method ensures a more evenly distributed solution across the PF, preventing the high concentration of solutions. © 2024 IEEE.

2024

Proceedings of the 14th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, HEART 2024, Porto, Portugal, June 19-21, 2024

Authors
Josipovic, L; Zhou, P; Shanker, S; Cardoso, JMP; Anderson, J; Yuichiro, S;

Publication
HEART

Abstract