Publications

Publications by HASLab

2025

Exploiting Trusted Execution Environments and Distributed Computation for Genomic Association Tests

Authors
Brito C.V.; Ferreira P.G.; Paulo J.T.;

Publication
IEEE Journal of Biomedical and Health Informatics

Abstract
Breakthroughs in sequencing technologies led to an exponential growth of genomic data, providing novel biological insights and therapeutic applications. However, analyzing large amounts of sensitive data raises key data privacy concerns, specifically when the information is outsourced to untrusted third-party infrastructures for data storage and processing (e.g., cloud computing). We introduce Gyosa, a secure and privacy-preserving distributed genomic analysis solution. By leveraging trusted execution environments (TEEs), Gyosa allows users to confidentially delegate their GWAS analysis to untrusted infrastructures. Gyosa implements a computation partitioning scheme that reduces the computation done inside the TEEs while safeguarding the users' genomic data privacy. By integrating this security scheme in Glow, Gyosa provides a secure and distributed environment that facilitates diverse GWAS studies. The experimental evaluation validates the applicability and scalability of Gyosa, reinforcing its ability to provide enhanced security guarantees.

2025

KEIGO: Co-designing Log-Structured Merge Key-Value Stores with a Non-Volatile, Concurrency-aware Storage Hierarchy

Authors
Adao, R; Wu, ZJ; Zhou, CJ; Balmau, O; Paulo, J; Macedo, R;

Publication
PROCEEDINGS OF THE VLDB ENDOWMENT

Abstract
We present Keigo, a concurrency- and workload-aware storage middleware that enhances the performance of log-structured merge key-value stores (LSM KVS) when they are deployed on a hierarchy of storage devices. The key observation behind Keigo is that there is no one-size-fits-all placement of data across the storage hierarchy that optimizes for all workloads. Hence, to leverage the benefits of combining different storage devices, Keigo places files across different devices based on their parallelism, I/O bandwidth, and capacity. We introduce three techniques: concurrency-aware data placement, persistent read-only caching, and context-based I/O differentiation. Keigo is portable across different LSMs, is adaptable to dynamic workloads, and does not require extensive profiling. Our system enables established production KVS such as RocksDB, LevelDB, and Speedb to benefit from heterogeneous storage setups. We evaluate Keigo using synthetic and realistic workloads, showing that it improves the throughput of production-grade LSMs up to 4x for write- and 18x for read-heavy workloads when compared to general-purpose storage systems and specialized LSM KVS.

2025

Keigo: Co-designing Log-Structured Merge Key-Value Stores with a Non-Volatile, Concurrency-aware Storage Hierarchy (Extended Version)

Authors
Adão, R; Wu, Z; Zhou, C; Balmau, O; Paulo, J; Macedo, R;

Publication
CoRR

Abstract

2025

Modelling sustainability in cyber-physical systems: A systematic mapping study

Authors
Barisic, A; Cunha, J; Ruchkin, I; Moreira, A; Araújo, J; Challenger, M; Savic, D; Amaral, V;

Publication
SUSTAINABLE COMPUTING-INFORMATICS & SYSTEMS

Abstract
Supporting sustainability through modelling and analysis has become an active area of research in Software Engineering. Therefore, it is important and timely to survey the current state of the art in sustainability in Cyber-Physical Systems (CPS), one of the most rapidly evolving classes of complex software systems. This work presents the findings of a Systematic Mapping Study (SMS) that aims to identify key primary studies reporting on CPS modelling approaches that address sustainability over the last 10 years. Our literature search retrieved 2209 papers, of which 104 primary studies were deemed relevant for a detailed characterisation. These studies were analysed based on nine research questions designed to extract information on sustainability attributes, methods, models/meta-models, metrics, processes, and tools used to improve the sustainability of CPS. These questions also aimed to gather data on domain-specific modelling approaches and relevant application domains. The final results report findings for each of our questions, highlight interesting correlations among them, and identify literature gaps worth investigating in the near future.

2025

Let's Talk About It: Making Scientific Computational Reproducibility Easy

Authors
Costa, L; Barbosa, S; Cunha, J;

Publication
CoRR

Abstract

2025

CompRep: A Dataset For Computational Reproducibility

Authors
Costa, L; Barbosa, S; Cunha, J;

Publication
PROCEEDINGS OF THE 3RD ACM CONFERENCE ON REPRODUCIBILITY AND REPLICABILITY, ACM REP 2025

Abstract
Reproducibility in computational science is increasingly dependent on the ability to faithfully re-execute experiments involving code, data, and software environments. However, assessing the effectiveness of reproducibility tools is difficult due to the lack of standardized benchmarks. To address this, we collected 38 computational experiments from diverse scientific domains and attempted to reproduce each using 8 different reproducibility tools. From this initial pool, we identified 18 experiments that could be successfully reproduced using at least one tool. These experiments form our curated benchmark dataset, which we release along with reproducibility packages to support ongoing evaluation efforts. This article introduces the curated dataset, incorporating details about software dependencies, execution steps, and configurations necessary for accurate reproduction. The dataset is structured to reflect diverse computational requirements and methodologies, ranging from simple scripts to complex, multi-language workflows, ensuring it presents the wide range of challenges researchers face in reproducing computational studies. It provides a universal benchmark by establishing a standardized dataset for objectively evaluating and comparing the effectiveness of reproducibility tools. Each experiment included in the dataset is carefully documented to ensure ease of use. We added clear instructions following a standard, so each experiment has the same kind of instructions, making it easier for researchers to run each of them with their own reproducibility tool. The utility of the dataset is demonstrated through extensive evaluations using multiple reproducibility tools.
