Publications - INESC TEC

Publications

Publications by João Bispo

2021

FPGAs as General-Purpose Accelerators for Non-Experts via HLS: The Graph Analysis Example

Authors
Silva, PF; Bispo, J; Paulino, N;

Publication
2021 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (ICFPT)

Abstract
We discuss the concept of FPGA-unfriendliness, the property of certain algorithms, programs, or domains which may limit their applicability to FPGAs. Specifically, we look at graph analysis, which has recently seen increased interest in combination with High-Level Synthesis, but has yet to find great success compared to established acceleration mechanisms. To this end, we make use of Xilinx's Vitis Graph Library to implement Single-Source Shortest Paths (SSSP) and PageRank (PR), and present a custom kernel written from the ground up for Distinctiveness Centrality (DC, a novel graph centrality measure). We use public datasets to test these implementations, and analyse power consumption and execution time. Our comparisons against published data for GPU and CPU execution show FPGA slowdowns in execution time between around 18.5x and 328x for SSSP, and around 1.8x and 195x for PR, respectively. In some instances, we obtained FPGA speedups versus CPU of up to 2.5x for PR. Regarding DC, results show speedups from 0.1x to 3.5x, and energy efficiency increases from 0.8x to 6x. Lastly, we provide some insights regarding the applicability of FPGAs in FPGA-unfriendly domains, and comment on the future as FPGA and HLS technology advances.


2022

A Flexible HLS Hoeffding Tree Implementation for Runtime Learning on FPGA

Authors
Sousa, LM; Paulino, N; Ferreira, JC; Bispo, J;

Publication
2022 IEEE 21ST MEDITERRANEAN ELECTROTECHNICAL CONFERENCE (IEEE MELECON 2022)

Abstract
Decision trees are often preferred when implementing Machine Learning in embedded systems for their simplicity and scalability. Hoeffding Trees are a type of decision tree that takes advantage of the Hoeffding Bound to learn patterns in data without having to continuously store the data samples for future reprocessing. This makes them especially suitable for deployment on embedded devices. In this work we highlight the features of an HLS implementation of the Hoeffding Tree. The implementation parameters include the feature size of the samples (D), the number of output classes (K), and the maximum number of nodes to which the tree is allowed to grow (Nd). We target a Xilinx MPSoC ZCU102, and evaluate: the design's resource requirements and clock frequency for different numbers of classes and feature sizes; the execution time on several synthetic datasets of varying sizes (N); and the execution time and accuracy for two datasets from UCI. For a problem size of D=3, K=5, and N=40000, a single decision tree operating at 103 MHz is capable of 8.3x faster inference than the 1.2 GHz ARM Cortex-A53 core. Compared to a reference implementation of the Hoeffding tree, we achieve comparable classification accuracy for the UCI datasets.
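
For readers unfamiliar with the Hoeffding Bound mentioned in this abstract, the following is a minimal Python sketch of the standard textbook formula and of the split test it commonly supports in Hoeffding Trees. It is not taken from the paper's HLS implementation: the function names (hoeffding_bound, should_split) and the parameters value_range, delta, and n are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Standard Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    With probability 1 - delta, the true mean of a variable with range R
    is within epsilon of the mean observed over n independent samples.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Hypothetical helper: split a leaf once the gap between the two best
    split candidates exceeds the bound, so no raw samples need be stored."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # The bound shrinks as more samples are seen at a leaf.
    for n in (100, 1000, 10000):
        print(n, hoeffding_bound(1.0, 1e-7, n))
```

Because the bound depends only on the sample count and on running statistics, a leaf can decide when to split without revisiting earlier samples, which is the property the abstract highlights for embedded deployment.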


2022

E-APK: Energy Pattern Detection in Decompiled Android Applications

Authors
Gregório, N; Fernandes, JP; Bispo, J; Medeiros, S;

Publication
SBLP

Abstract
Energy efficiency is a non-functional requirement that developers must consider. This requirement is particularly relevant when building software for battery-operated devices like mobile ones: a long-lasting battery is an essential requirement for an enjoyable user experience. It has been shown that many mobile applications include inefficiencies that cause the battery to drain faster than necessary. Some of these inefficiencies result from software patterns that have been catalogued in the literature. The catalogues often provide more energy-efficient alternatives. While the related literature is vast, most approaches so far assume as a fundamental requirement that one has access to the source code of an application in order to be able to analyse it. This requirement makes independent energy analysis challenging, or even impossible, e.g. for a mobile user or, most significantly, an App Store trying to provide information on how efficient an application being submitted for publication is. Our work studies the viability of looking for known energy patterns in applications by decompiling them and analysing the resulting code. For this, we decompiled and analysed 236 open-source applications. We extended an existing tool to aid in this process, making it capable of seamlessly decompiling and analysing Android applications. With the collected data, we performed a comparative analysis of the presence of energy patterns between the source code and the decompiled code. While further research is required to say more assertively whether this type of static analysis is viable, our results point in a promising direction with 163 applications, approximately 69%, containing the same number of detected patterns in both source code and the release APK.


2025

Detecting Resource Leaks on Android with Alpakka

Authors
Santos, G; Bispo, J; Mendes, A;

Publication
PROCEEDINGS OF SLE 2025 18TH ACM SIGPLAN INTERNATIONAL CONFERENCE ON SOFTWARE LANGUAGE ENGINEERING, SLE 2025

Abstract
Mobile devices have become integral to our everyday lives, yet their utility hinges on their battery life. In Android apps, resource leaks caused by inefficient resource management are a significant contributor to battery drain and poor user experience. Our work introduces Alpakka, a source-to-source compiler for Android's Smali syntax. To showcase Alpakka's capabilities, we developed an Alpakka library capable of detecting and automatically correcting resource leaks in Android APK files. We demonstrate Alpakka's effectiveness through empirical testing on 124 APK files from 31 real-world Android apps in the DroidLeaks [12] dataset. In our analysis, Alpakka identified 93 unique resource leaks, of which we estimate 15% are false positives. From these, we successfully applied automatic corrections to 45 of the detected resource leaks.


2025

TranspileJS, an Intelligent Framework for Transpiling JavaScript to WebAssembly

Authors
Ferreira, JP; Bispo, J; Lima, S;

Publication
PROCEEDINGS OF SLE 2025 18TH ACM SIGPLAN INTERNATIONAL CONFERENCE ON SOFTWARE LANGUAGE ENGINEERING, SLE 2025

Abstract
WebAssembly (Wasm) has emerged as a powerful binary format, enabling the seamless integration of languages like C and Rust into web applications. JavaScript (JS), the dominant language for client-side web development, is susceptible to tampering and intellectual property theft because its code is transparent in browser environments. We introduce TranspileJS, a novel tool designed to enhance code security by automatically selecting and translating JS snippets into Wasm. TranspileJS leverages a multi-stage architecture that converts JS to TypeScript, which is compiled into Wasm using the AssemblyScript compiler. TranspileJS addresses the challenges posed by the fundamental differences between JS and Wasm, including dynamic typing, runtime behaviour mismatches, and standard library discrepancies, ensuring that the original behaviour of the code is preserved while maximising the amount of code transpiled. Our experiments show that TranspileJS successfully transpiles approximately one-third of the code in our dataset, with a performance impact of up to a 12.3% increase in execution time. The transpilation process inherently obfuscates code, creating effects similar to standard obfuscation techniques, and generates a stealthy and resilient output. Furthermore, combining transpilation with WebAssembly-specific obfuscation techniques opens new possibilities for code protection and resistance against reverse engineering.


2025

SIMD Acceleration of Matrix-Vector Operations on RISC-V for Variable Precision Neural Networks

Authors
Salinas, G; Sequeira, G; Rodriguez, A; Bispo, J; Paulino, N;

Publication
2025 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW

Abstract
The rapid proliferation of Edge AI applications demands efficient, low-power computing architectures tailored to specific workloads. The RISC-V ecosystem is a promising solution, and has led to a fast growth of implementations based on custom instruction extensions, but with varying degrees of functionality and support, which may hinder easy adoption. In this paper, we extensively review existing RISC-V extensions targeting primarily the AI domain and their respective compilation flows, highlighting challenges in deployment, usability, and compatibility. We further implement and provide usable containerized environments for two of these works. To address the identified challenges, we then propose an approach for lightweight early validation of custom instructions via source-to-source transformations, without the need for compiler modifications. We target our own Single Instruction Multiple Data (SIMD) accelerator, which we integrate into a CORE-V cv32e40px baseline core through custom instructions, achieving up to 11.9x speedup over that baseline for matrix-vector operations.
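
As a rough illustration of why variable precision pairs naturally with SIMD, the sketch below packs several low-bit-width signed integers into one 32-bit word and computes a dot product lane by lane. It is a generic software model of sub-word packing, not the accelerator or the custom instructions described in the paper; the helper names (pack, unpack, packed_dot) and the 32-bit word size are assumptions made for this example.

```python
WORD_BITS = 32  # assumed register width for this illustration


def pack(values, bits):
    """Pack signed integers of the given bit width into one 32-bit word."""
    lanes = WORD_BITS // bits
    if len(values) != lanes:
        raise ValueError(f"expected {lanes} values for {bits}-bit lanes")
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    mask = (1 << bits) - 1
    word = 0
    for i, v in enumerate(values):
        if not lo <= v <= hi:
            raise ValueError(f"{v} does not fit in {bits} signed bits")
        word |= (v & mask) << (i * bits)  # two's complement lane
    return word


def unpack(word, bits):
    """Recover the signed lane values from a packed 32-bit word."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    out = []
    for i in range(WORD_BITS // bits):
        lane = (word >> (i * bits)) & mask
        out.append(lane - (1 << bits) if lane & sign else lane)
    return out


def packed_dot(a_word, b_word, bits):
    """Dot product of two packed words; hardware would do all lanes at once."""
    return sum(x * y for x, y in zip(unpack(a_word, bits), unpack(b_word, bits)))


if __name__ == "__main__":
    a = pack([1, -2, 3, -4], 8)       # four 8-bit lanes
    b = pack([5, 6, -7, 8], 8)
    print(packed_dot(a, b, 8))        # 1*5 + (-2)*6 + 3*(-7) + (-4)*8
```

Halving the bit width doubles the number of lanes per word, so a matrix-vector product (one such dot product per matrix row) needs fewer word-level operations at lower precision.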
