Publications

Publications by José Orlando Pereira

2025

BLADE - Byzantine-tolerant Learning under an Asynchronous and Decentralized Environment

Authors
Ferreira, G; Alonso, AN; Pereira, J;

Publication
2025 20th European Dependable Computing Conference Companion Proceedings, EDCC-C

Abstract
Machine learning models are growing, with some large language models reaching a scale of billions of trainable parameters. Training these models has since become one of the most data-hungry and computation-heavy tasks. Efforts to distribute the training task mostly follow a federated approach, where a central server oversees the training process. This approach: 1) raises concerns about data privacy; and 2) creates a single point of failure. Current proposals for a fully decentralized approach often rely on costly broadcasts to disseminate model updates and do not tolerate heterogeneity in the training data, as it makes detecting Byzantine contributions harder. We propose BLADE, a generalized fully decentralized (and asynchronous) Byzantine fault-tolerant machine learning algorithm. BLADE was designed to be configurable and adapt to harsh environments, and significantly reduces the communication overhead compared to the state of the art. We performed a comprehensive empirical evaluation, and results confirm models trained with BLADE can achieve an accuracy comparable to a centralized training instance, even if the data distribution among peers is heterogeneous, and robustly aggregate model updates in the presence of Byzantine attacks, and even against sporadic Byzantine majorities.

2013

DEDIS

Authors
Paulo, J; Pereira, J;

Publication
Proceedings of the 4th annual Symposium on Cloud Computing

Abstract