About

I am a researcher at HASLab and a professor at U. Minho. My research focuses on dependable distributed systems. I am mainly interested in data management, including database replication and SQL processing over NoSQL systems, and in group communication, including consensus and epidemic dissemination protocols for large-scale systems. I am also interested in techniques and tools for testing, evaluating, and observing dependable distributed systems. More information is available on my personal page.

Details

  • Name

    José Orlando Pereira
  • Position

    Coordinating Researcher
  • Since

    1 November 2011
Publications

2025

Towards Efficient Client-Side Transactions for Heterogeneous Cloud Data Stores

Authors
Sousa, PA; Faria, N; Pereira, J; Alonso, AN

Publication
2025 20TH EUROPEAN DEPENDABLE COMPUTING CONFERENCE, EDCC

Abstract
Data intensive applications increasingly make use of multiple data stores in the cloud, providing a diversity of data and query models, as well as durability and scale trade-offs. However, this has a severe impact on reliability, as the key fault-tolerance mechanism for database systems, i.e. ACID transactions, is no longer available. Although it is possible to implement transactions without changes to the database servers, this either requires a proxy server, which compromises scale and availability, or a client-side layer that changes the data schema, excludes legacy applications, and adds significant overhead. We address this challenge with a proposal to delegate functionality from a client-side transactional layer to a server-side query engine such that compatibility with legacy applications is restored. We implemented a proof-of-concept and show that it significantly improves performance for analytical applications.

2025

Rethinking BFT: Leveraging Diverse Software Components with LLMs

Authors
Imperadeiro, J; Alonso, AN; Pereira, J

Publication
2025 55TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS-SUPPLEMENTAL VOLUME, DSN-S

Abstract
Diversity is crucial in systems that tolerate Byzantine faults. Traditionally, system builders have relied on standardized interfaces (e.g., POSIX for operating systems) to obtain off-the-shelf components or on n-version programming for custom functionality. Unfortunately, standardized alternatives are rare, and the independent development of multiple versions of the same software is costly and justified only for the most critical applications. In this paper, we show that a limited and focused use of LLMs for translation opens up the possibility of leveraging the existing diversity in functionally equivalent but non-standardized components. Specifically, we show that LLMs can produce functionally correct database query translations with minimal guidance and adapt to diverse data models and query contexts, enabling the use of radically different database models, both SQL and NoSQL, together in a Byzantine fault-tolerant replicated system. We outline an approach to achieve this in practice and discuss future research directions.

2025

Uma extensão de Raft com propagação epidémica

Authors
Gonçalves, A; Alonso, AN; Pereira, J; Oliveira, R

Publication
CoRR

Abstract

2025

Towards Adaptive Transactional Consistency for Georeplicated Datastores

Authors
Braga, R; Pereira, J; Coelho, F

Publication
40TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING

Abstract
Developers of data-intensive georeplicated applications face a difficult decision when selecting a database system. As captured by the CAP theorem, CP systems such as Spanner provide strong consistency that greatly simplifies application development. AP systems such as AntidoteDB, which provide Transactional Causal Consistency (TCC), ensure availability in the face of network partitions and isolate performance from wide-area round-trip times, but avoid lost-update anomalies only when values can be merged. Ideally, an application should be able to adapt to current data and network conditions by selecting which transactional consistency to use for each transaction. In this paper, we test the hypothesis that a georeplicated database system can be built to provide only TCC at its core, hence being AP, while allowing an application to execute some transactions under Snapshot Isolation (SI), hence CP. Our main result is showing that this can be achieved even when all the interaction happens through the TCC database system, without additional communication channels between the participants. A preliminary experimental evaluation with a proof-of-concept implementation using AntidoteDB shows that this approach is feasible.
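
The remark that lost updates are avoided only when values can be merged can be illustrated with a small sketch. It is not taken from the paper and does not use the AntidoteDB API: `LWWRegister` and `GCounter` are hypothetical toy classes, and the replica names `r1` and `r2` are assumptions. Two replicas concurrently add one to a value that both read as 0; the last-writer-wins register keeps a single write, while the mergeable counter keeps both increments.

```python
from dataclasses import dataclass, field


@dataclass
class LWWRegister:
    """Toy last-writer-wins register (hypothetical, for illustration only)."""
    value: int = 0
    ts: tuple = (0, "")  # (logical time, replica id) used to order writes

    def write(self, value, ts):
        # Keep the write only if it is newer than what we already hold.
        if ts > self.ts:
            self.value, self.ts = value, ts

    def merge(self, other):
        self.write(other.value, other.ts)


@dataclass
class GCounter:
    """Toy grow-only counter: one slot per replica, merged by taking the maximum."""
    counts: dict = field(default_factory=dict)

    def increment(self, replica, n=1):
        self.counts[replica] = self.counts.get(replica, 0) + n

    def merge(self, other):
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)

    @property
    def value(self):
        return sum(self.counts.values())


# Both replicas read 0 and concurrently write 0 + 1 to a plain register.
lww_a, lww_b = LWWRegister(), LWWRegister()
lww_a.write(1, (1, "r1"))
lww_b.write(1, (1, "r2"))
lww_a.merge(lww_b)
lww_b.merge(lww_a)

# The same two increments applied to a mergeable counter.
cnt_a, cnt_b = GCounter(), GCounter()
cnt_a.increment("r1")
cnt_b.increment("r2")
cnt_a.merge(cnt_b)
cnt_b.merge(cnt_a)

print(lww_a.value, lww_b.value)  # register: one increment is lost
print(cnt_a.value, cnt_b.value)  # counter: both increments survive
```

In this sketch the register converges to 1 on both replicas, whereas the counter converges to 2; a value with no such merge function would need a stronger isolation level to avoid the lost update.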

2025

CRDV: Conflict-free Replicated Data Views

Authors
Faria, N; Pereira, J

Publication
Proc. ACM Manag. Data

Abstract
There are now multiple proposals for Conflict-free Replicated Data Types (CRDTs) in SQL databases aimed at distributed systems. Some, such as ElectricSQL, provide only relational tables as convergent replicated maps, but this omits semantics that would be useful for merging updates. Others, such as Pg_crdt, provide access to a rich library of encapsulated column types. However, this puts merge and query processing outside the scope of the query optimizer and restricts the ability of an administrator to influence access paths with materialization and indexes. Our proposal, CRDV, overcomes this challenge by using two layers implemented as SQL views: the first provides a replicated relational table from an update history, while the second implements varied and rich types on top of the replicated table. This allows the definition of merge semantics, or even entire new data types, in SQL itself, and enables global optimization of user queries together with merge operations. Therefore, it naturally extends the scope of query optimization and local transactions to operations on replicated data, can be used to reproduce the functionality of common CRDTs with simple SQL idioms, and results in better performance than alternatives.
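
The two-layer idea described in this abstract can be sketched with plain SQL views. This is only an illustration under stated assumptions, not the CRDV implementation: the schema, the view names (`history`, `latest`, `counter`, `register`), and the merge rules are all hypothetical, and SQLite is used through Python's standard `sqlite3` module purely for convenience. The sketch also assumes each replica uses a distinct timestamp for each of its writes to a key.

```python
import sqlite3

SCHEMA = """
-- Hypothetical update history: one row per write, tagged with replica and logical time.
CREATE TABLE history (k TEXT, replica TEXT, v INTEGER, ts INTEGER);

-- Layer 1: a relational table derived from the history,
-- keeping the most recent write of each replica for each key.
CREATE VIEW latest AS
SELECT h.k, h.replica, h.v, h.ts
FROM history h
WHERE h.ts = (SELECT MAX(o.ts) FROM history o
              WHERE o.k = h.k AND o.replica = h.replica);

-- Layer 2a: counter semantics, summing each replica's latest contribution.
CREATE VIEW counter AS
SELECT k, SUM(v) AS total FROM latest GROUP BY k;

-- Layer 2b: last-writer-wins register, ties broken by replica id.
CREATE VIEW register AS
SELECT l.k, l.v
FROM latest l
WHERE NOT EXISTS (
    SELECT 1 FROM latest o
    WHERE o.k = l.k
      AND (o.ts > l.ts OR (o.ts = l.ts AND o.replica > l.replica)));
"""


def build(updates):
    """Create an in-memory database and load the given updates into the history."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO history VALUES (?, ?, ?, ?)", updates)
    return conn


updates = [
    ("x", "r1", 1, 1),
    ("x", "r1", 3, 2),
    ("x", "r2", 2, 1),
]

# Two databases receive the same updates in different orders.
a = build(updates)
b = build(list(reversed(updates)))

counter_a = a.execute("SELECT k, total FROM counter ORDER BY k").fetchall()
counter_b = b.execute("SELECT k, total FROM counter ORDER BY k").fetchall()
register_a = a.execute("SELECT k, v FROM register ORDER BY k").fetchall()
register_b = b.execute("SELECT k, v FROM register ORDER BY k").fetchall()

print(counter_a, register_a)
```

Because both merge rules are ordinary views over the same history, the two databases return identical results regardless of the order in which updates arrive, and the queries stay visible to the SQL engine's optimizer rather than being hidden inside an opaque column type.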