Publications

Publications by HASLab

2025

Logic and Calculi for All on the occasion of Luis Barbosa's 60th birthday

Authors
Madeira, A; Oliveira, JN; Proença, J; Neves, R;

Publication
JOURNAL OF LOGICAL AND ALGEBRAIC METHODS IN PROGRAMMING

Abstract

2025

Towards Efficient Client-Side Transactions for Heterogeneous Cloud Data Stores

Authors
Sousa, PA; Faria, N; Pereira, J; Alonso, AN;

Publication
2025 20TH EUROPEAN DEPENDABLE COMPUTING CONFERENCE, EDCC

Abstract
Data intensive applications increasingly make use of multiple data stores in the cloud, providing a diversity of data and query models, as well as durability and scale trade-offs. However, this has a severe impact on reliability, as the key fault-tolerance mechanism for database systems, i.e. ACID transactions, is no longer available. Although it is possible to implement transactions without changes to the database servers, this either requires a proxy server, which compromises scale and availability, or a client-side layer that changes the data schema, excludes legacy applications, and adds significant overhead. We address this challenge with a proposal to delegate functionality from a client-side transactional layer to a server-side query engine such that compatibility with legacy applications is restored. We implemented a proof-of-concept and show that it significantly improves performance for analytical applications.

2025

Rethinking BFT: Leveraging Diverse Software Components with LLMs

Authors
Imperadeiro, J; Alonso, AN; Pereira, J;

Publication
2025 55TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS-SUPPLEMENTAL VOLUME, DSN-S

Abstract
Diversity is crucial in systems that tolerate Byzantine faults. Traditionally, system builders have relied on standardized interfaces (e.g., POSIX for operating systems) to obtain off-the-shelf components or on n-version programming for custom functionality. Unfortunately, standardized alternatives are rare, and the independent development of multiple versions of the same software is costly and justified only on the most critical applications. In this paper, we show that a limited and focused use of LLMs for translation opens up the possibility of leveraging the existing diversity in functionally equivalent but non-standardized components. Specifically, we show that LLMs can produce functionally correct database query translations with minimal guidance and adapt to diverse data models and query contexts, enabling the use of radically different database models, both SQL and NoSQL, together in a Byzantine fault-tolerant replicated system. We outline an approach to achieve this in practice and discuss future research directions.

2025

Towards Adaptive Transactional Consistency for Georeplicated Datastores

Authors
Braga, R; Pereira, J; Coelho, F;

Publication
40TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING

Abstract
Developers of data-intensive georeplicated applications face a difficult decision when selecting a database system. As captured by the CAP theorem, CP systems such as Spanner provide strong consistency that greatly simplifies application development. AP systems such as AntidoteDB, which provide Transactional Causal Consistency (TCC), ensure availability in the face of network partitions and isolate performance from wide-area round-trip times, but avoid lost-update anomalies only when values can be merged. Ideally, an application should be able to adapt to current data and network conditions by selecting which transactional consistency to use for each transaction. In this paper, we test the hypothesis that a georeplicated database system can be built to provide only TCC at its core, hence being AP, while allowing an application to execute some transactions under Snapshot Isolation (SI), hence CP. Our main result is showing that this can be achieved even when all the interaction happens through the TCC database system, without additional communication channels between the participants. A preliminary experimental evaluation with a proof-of-concept implementation using AntidoteDB shows that this approach is feasible.
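The remark that lost updates are avoided "only when values can be merged" can be illustrated with a small sketch. It is not taken from the paper and does not use AntidoteDB's API; the names `lww_merge` and `counter_merge` are hypothetical. Two concurrent transactions read the same snapshot and add to a balance: a plain register that keeps the latest write drops one update, while a mergeable counter keeps both.

```python
# Illustrative only: hypothetical helpers, not AntidoteDB's API.

def lww_merge(a, b):
    """Merge two (timestamp, value) writes to a plain register: keep the later one."""
    return max(a, b)

def counter_merge(base, deltas):
    """Merge concurrent increments to a counter: every delta is preserved."""
    return base + sum(deltas)

base = 10

# Both transactions read the same snapshot (base = 10) and write concurrently.
t1 = (1, base + 5)   # transaction 1 adds 5
t2 = (2, base + 3)   # transaction 2 adds 3

register_result = lww_merge(t1, t2)[1]        # the +5 from t1 is lost
counter_result = counter_merge(base, [5, 3])  # both increments survive

print(register_result, counter_result)
```

When the value has no such merge function, the lost update can only be prevented by stronger isolation, which is the case the abstract addresses with Snapshot Isolation.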

2025

CRDV: Conflict-free Replicated Data Views

Authors
Faria, N; Pereira, J;

Publication
Proc. ACM Manag. Data

Abstract
There are now multiple proposals for Conflict-free Replicated Data Types (CRDTs) in SQL databases aimed at distributed systems. Some, such as ElectricSQL, provide only relational tables as convergent replicated maps, but this omits semantics that would be useful for merging updates. Others, such as Pg_crdt, provide access to a rich library of encapsulated column types. However, this puts merge and query processing outside the scope of the query optimizer and restricts the ability of an administrator to influence access paths with materialization and indexes. Our proposal, CRDV, overcomes this challenge by using two layers implemented as SQL views: the first provides a replicated relational table from an update history, while the second implements varied and rich types on top of the replicated table. This allows the definition of merge semantics, or even entire new data types, in SQL itself, and enables global optimization of user queries together with merge operations. Therefore, it naturally extends the scope of query optimization and local transactions to operations on replicated data, can be used to reproduce the functionality of common CRDTs with simple SQL idioms, and results in better performance than alternatives.
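The two-layer idea can be sketched with SQLite from Python's standard library. This is a minimal illustration under assumptions, not the CRDV implementation: the table `history`, the views `replicated` and `lww`, and the last-writer-wins rule with a site-identifier tie-break are all hypothetical choices made for the example.

```python
import sqlite3

# Hypothetical schema, for illustration only (not the paper's schema).
SCHEMA = """
-- Append-only update history; replicas exchange rows of this table.
CREATE TABLE history (key TEXT, value TEXT, ts INTEGER, site TEXT);

-- Layer 1: a replicated relational table derived from the history.
-- DISTINCT makes re-delivered updates harmless (idempotent merge).
CREATE VIEW replicated AS
SELECT DISTINCT key, value, ts, site FROM history;

-- Layer 2: a last-writer-wins register per key, defined in SQL itself.
-- Ties on the timestamp are broken by the site identifier.
CREATE VIEW lww AS
SELECT r.key, r.value
FROM replicated AS r
WHERE NOT EXISTS (
    SELECT 1 FROM replicated AS o
    WHERE o.key = r.key
      AND (o.ts > r.ts OR (o.ts = r.ts AND o.site > r.site))
);
"""

def new_replica():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def apply_updates(conn, updates):
    """Deliver updates to a replica by appending them to its history."""
    conn.executemany("INSERT INTO history VALUES (?, ?, ?, ?)", updates)

def read_lww(conn):
    return dict(conn.execute("SELECT key, value FROM lww ORDER BY key").fetchall())

updates = [
    ("x", "a", 1, "s1"),
    ("x", "b", 2, "s2"),
    ("y", "c", 1, "s1"),
    ("x", "b", 2, "s2"),  # duplicate delivery of the same update
]

r1, r2 = new_replica(), new_replica()
apply_updates(r1, updates)
apply_updates(r2, list(reversed(updates)))  # different delivery order

print(read_lww(r1), read_lww(r2))
```

Because the merge rule lives in a view, a query over `lww` is planned by the database together with the merge logic, which is the property the abstract attributes to defining types as SQL views.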

2025

BLADE - Byzantine-tolerant Learning under an Asynchronous and Decentralized Environment

Authors
Ferreira, G; Alonso, AN; Pereira, J;

Publication
2025 20TH EUROPEAN DEPENDABLE COMPUTING CONFERENCE COMPANION PROCEEDINGS, EDCC-C

Abstract
Machine learning models are growing, with some large language models reaching a scale of billions of trainable parameters. Training these models has since become one of the most data-hungry and computation-heavy tasks. Efforts to distribute the training task mostly follow a federated approach, where a central server oversees the training process. This approach: 1) raises concerns about data privacy; and 2) creates a single point of failure. Current proposals for a fully decentralized approach often rely on costly broadcasts to disseminate model updates and do not tolerate heterogeneity in the training data, as it makes detecting Byzantine contributions harder. We propose BLADE, a generalized fully decentralized (and asynchronous) Byzantine fault-tolerant machine learning algorithm. BLADE was designed to be configurable and to adapt to harsh environments, and it significantly reduces the communication overhead compared to the state of the art. We performed a comprehensive empirical evaluation, and the results confirm that models trained with BLADE can achieve an accuracy comparable to a centralized training instance, even if the data distribution among peers is heterogeneous, and that BLADE robustly aggregates model updates in the presence of Byzantine attacks, even against sporadic Byzantine majorities.
