Cookies Policy
The website need some cookies and similar means to function. If you permit us, we will use those means to collect data on your visits for aggregated statistics to improve our service. Find out More
Accept Reject
  • Menu
About

About

I'm a PhD student in INESC TEC and Faculty of Engineering of the University of Porto (FEUP), at the Department of Informatics Engineering (DEI). I obtained my MSc degree in 2011, also in FEUP, with the final dissertation about the development of an Aspect-Oriented Programming language named LARA. I work with compiler-related topics such as domain-specific languages and compiler optimizations. Current work focus on runtime adaptability and specialization following an aspect-oriented programming methodology.

Interest
Topics
Details

Details

  • Name

    Tiago Diogo Carvalho
  • Role

    Senior Researcher
  • Since

    01st January 2013
Publications

2024

Foundations for a Rust-Like Borrow Checker for C

Authors
Silva, T; Bispo, J; Carvalho, T;

Publication
PROCEEDINGS OF THE 25TH ACM SIGPLAN/SIGBED INTERNATIONAL CONFERENCE ON LANGUAGES, COMPILERS, AND TOOLS FOR EMBEDDED SYSTEMS, LCTES 2024

Abstract
Memory safety issues in C are the origin of various vulnerabilities that can compromise a program's correctness or safety from attacks. We propose a different approach to tackle memory safety, the replication of Rust's Mid-level Intermediate Representation (MIR) Borrow Checker, through the usage of static analysis and successive source-to-source code transformations, to be composed upstream of the compiler, thus ensuring maximal compatibility with most build systems. This allows us to approximate a subset of C to Rust's core concepts, applying the memory safety guarantees of the rustc compiler to C. In this work, we present a survey of Rust's efforts towards ensuring memory safety, and describe the theoretical basis for a C borrow checker, alongside a proof-of-concept that was developed to demonstrate its potential. This prototype correctly identified violations of the ownership and aliasing rules, and accurately reported each error with a level of detail comparable to that of the rustc compiler.

2024

Time-predictable task-to-thread mapping in multi-core processors

Authors
Samadi, M; Royuela, S; Pinho, LM; Carvalho, T; Quinones, E;

Publication
JOURNAL OF SYSTEMS ARCHITECTURE

Abstract
The performance of time-predictable systems can be improved in multi-core processors using parallel programming models (e.g., OpenMP). However, schedulability analysis of parallel applications is a big challenge due to their sophisticated structure. The common drawbacks of current task-to-thread mapping approaches in OpenMP are that they (i) utilize a global queue in the mapping process, which may increase contention, (ii) do not apply heuristic techniques, which may reduce the predictability and performance of the system, and (iii) use basic analytical techniques, which may cause notable pessimism in the temporal conditions. Accordingly, this paper proposes a task-to-thread mapping method in multi-core processors based on the OpenMP framework. The mapping process is carried out through two phases: allocation and dispatching. Each thread has an allocation queue in order to minimize contention, and the allocation and dispatching processes are performed using several heuristic algorithms to enhance predictability. In the allocation phase, each task-part from the OpenMP DAG is allocated to one of the allocation queues, which includes both sibling and child task-parts. A suitable thread (i.e., allocation queue) is selected using one of the suggested heuristic allocation algorithms. In the dispatching phase, when a thread is idle, a task-part is selected from its allocation queue using one of the suggested heuristic dispatching algorithms and then dispatched to and executed by the thread. The performance of the proposed method is evaluated under different conditions (e.g., varying the number of tasks and the number of threads) in terms of application response time and overhead of the mapping process. The simulation results show that the proposed method surpasses the other methods, especially in the scenario that includes overhead of the mapping. In addition, a prototype implementation of the main heuristics is evaluated using two kernels from real-world applications, showing that the methods work better than LLVM's default scheduler in most of the configurations.

2023

A DSL-based runtime adaptivity framework for Java

Authors
Carvalho, T; Bispo, J; Pinto, P; Cardoso, JMP;

Publication
SOFTWAREX

Abstract
This article presents Kadabra, a Java source-to-source compiler that allows users to make code queries, code analysis and code transformations, all user-programmable using the domain-specific language LARA. We show how Kadabra can be used as the basis for developing a runtime autotuning and adaptivity framework, able to adapt existing source Java code in order to take advantage of runtime autotuning. Specifically, this article presents the framework, consisting of Kadabra and an API for runtime adaptivity. We show the use of the framework to extend Java applications with autotuning and runtime adaptivity mechanisms to target performance improvement and/or energy saving goals.(c) 2023 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

2023

Framework for the Analysis and Configuration of Real-Time OpenMP Applications

Authors
Carvalho, T; Pinho, LM; Samadi, M; Royuela, S; Munera, A; Quiñones, E;

Publication
2023 IEEE 21ST INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS, INDIN

Abstract
High-performance cyber-physical applications impose several requirements with respect to performance, functional correctness and non-functional aspects. Nowadays, the design of these systems usually follows a model-driven approach, where models generate executable applications, usually with an automated approach. As these applications might execute in different parallel environments, their behavior becomes very hard to predict, and making the verification of non-functional requirements complicated. In this regard, it is crucial to analyse and understand the impact that the mapping and scheduling of computation have on the real-time response of the applications. In fact, different strategies in these steps of the parallel orchestration may produce significantly different interference, leading to different timing behaviour. Tuning the application parameters and the system configuration proves to be one of the most fitting solutions. The design space can however be very cumbersome for a developer to test manually all combinations of application and system configurations. This paper presents a methodology and a toolset to profile, analyse, and configure the timing behaviour of highperformance cyber-physical applications and the target platforms. The methodology leverages on the possibility of generating a task dependency graph representing the parallel computation to evaluate, through measurements, different mapping configurations and select the one that minimizes response time.

2022

Configuration of Parallel Real-Time Applications on Multi-Core Processors

Authors
Gharajeh, MS; Carvalho, T; Pinho, LM;

Publication
2022 IEEE 20TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN)

Abstract
Parallel programming models (e.g., OpenMP) are more and more used to improve the performance of real-time applications in modern processors. Nevertheless, these processors have complex architectures, being very difficult to understand their timing behavior. The main challenge with most of existing works is that they apply static timing analysis for simpler models or measurement-based analysis using traditional platforms (e.g., single core) or considering only sequential algorithms. How to provide an efficient configuration for the allocation of the parallel program in the computing units of the processor is still an open challenge. This paper studies the problem of performing timing analysis on complex multi-core platforms, pointing out a methodology to understand the applications' timing behavior, and guide the configuration of the platform. As an example, the paper uses an OpenMP-based program of the Heat benchmark on a NVIDIA Jetson AGX Xavier. The main objectives are to analyze the execution time of OpenMP tasks, specify the best configuration of OpenMP directives, identify critical tasks, and discuss the predictability of the system/application. A Linux perf based measurement tool, which has been extended by our team, is applied to measure each task across multiple executions in terms of total CPU cycles, the number of cache accesses, and the number of cache misses at different cache levels, including L1, L2 and L3. The evaluation process is performed using the measurement of the performance metrics by our tool to study the predictability of the system/application.