About

My research interests cover data management in eventually consistent settings, distributed data aggregation, and causality tracking. In recent years I have collaborated with my co-authors on data summary mechanisms such as Scalable Bloom Filters, on causality tracking for dynamic settings with Interval Tree Clocks and Dotted Version Vectors, and on predictable eventual consistency with Conflict-Free Replicated Data Types. My recent work has been applied in the Riak distributed database and in Akka Distributed Data, and runs in production systems serving millions of users worldwide.

Details

  • Name: Carlos Baquero
  • Cluster: Computer Science
  • Role: Area Manager
  • Since: 1st November 2011
Publications

2022

Picking Publication Targets

Authors
Baquero, C;

Publication
COMMUNICATIONS OF THE ACM

Abstract
[No abstract available]

2022

Is Having AI Generate Text Cheating?

Authors
Baquero, C;

Publication
COMMUNICATIONS OF THE ACM

Abstract
Carlos Baquero on whether using artificial intelligence provides an unfair advantage to writers.

2022

The Dynamics of Remembering and Forgetting

Authors
Baquero, C; Cabecinhas, R;

Publication
COMMUNICATIONS OF THE ACM

Abstract
[No abstract available]

2021

The CoronaSurveys System for COVID-19 Incidence Data Collection and Processing

Authors
Baquero, C; Casari, P; Anta, AF; Garcia Garcia, A; Frey, D; Garcia Agundez, A; Georgiou, C; Girault, B; Ortega, A; Goessens, M; Hernandez Roig, HA; Nicolaou, N; Stavrakis, E; Ojo, O; Roberts, JC; Sanchez, I;

Publication
FRONTIERS IN COMPUTER SCIENCE

Abstract
CoronaSurveys is an ongoing interdisciplinary project developing a system to infer the incidence of COVID-19 around the world using anonymous open surveys. The surveys have been translated into 60 languages and are continuously collecting participant responses from any country in the world. The responses collected are pre-processed, organized, and stored in a version-controlled repository, which is publicly available to the scientific community. In addition, the CoronaSurveys team has devised several estimates computed on the basis of survey responses and other data, and makes them available on the project's website in the form of tables, as well as interactive plots and maps. In this paper, we describe the computational system developed for the CoronaSurveys project. The system includes multiple components and processes, including the web survey, the mobile apps, the cleaning and aggregation process of the survey responses, the process of storage and publication of the data, the processing of the data and the computation of estimates, and the visualization of the results. In this paper we describe the system architecture and the major challenges we faced in designing and deploying it.

2021

Efficient replication via timestamp stability

Authors
Enes, V; Baquero, C; Gotsman, A; Sutra, P;

Publication
PROCEEDINGS OF THE SIXTEENTH EUROPEAN CONFERENCE ON COMPUTER SYSTEMS (EUROSYS '21)

Abstract
Modern web applications replicate their data across the globe and require strong consistency guarantees for their most critical data. These guarantees are usually provided via state-machine replication (SMR). Recent advances in SMR have focused on leaderless protocols, which improve the availability and performance of traditional Paxos-based solutions. We propose Tempo - a leaderless SMR protocol that, in comparison to prior solutions, achieves superior throughput and offers predictable performance even in contended workloads. To achieve these benefits, Tempo timestamps each application command and executes it only after the timestamp becomes stable, i.e., all commands with a lower timestamp are known. Both the timestamping and stability detection mechanisms are fully decentralized, thus obviating the need for a leader replica. Our protocol furthermore generalizes to partial replication settings, enabling scalability in highly parallel workloads. We evaluate the protocol in both real and simulated geo-distributed environments and demonstrate that it outperforms state-of-the-art alternatives. © 2021 ACM.

Supervised theses

2021

Optimizing Operation-based Conflict-Free Replicated Data Types

Author
Georges Younes

Institution
UM

2021

Bloom filters for stream windows

Author
Ana Catarina Gomes Rodrigues

Institution
UM

2021

Modeling a Decentralized Market-Based Scheme for Responsive Demands

Author
Tiago de Sousa Garcia

Institution
UP-FEUP

2021

Probabilistic Data Types

Author
Pedro Henrique Moreira Gomes Fernandes

Institution
UM

2021

Test Case Mutations to Improve Tests Quality

Author
Carolina Vasconcelos Castro Azevedo

Institution
UP-FEUP