Decentralized Learning Achieves Centralized Performance via Gibbs Algorithms

Researchers present a theoretical framework showing that decentralized learning can match centralized performance without sharing raw data. Under the framework of empirical risk minimization with relative-entropy regularization (ERM-RER) and a forward-backward communication pattern, clients exchange locally computed Gibbs measures rather than datasets. Using the Gibbs measure from client k as the reference for client k+1, the authors prove that the decentralized procedure attains the same population performance as a centralized ERM-RER learner with access to all data, provided the regularization factors are scaled appropriately with local sample sizes. This reframes collaboration as sharing inductive bias via reference measures, with clear implications for privacy-preserving federated and peer-to-peer training.
What happened
The new paper by Yaiza Bermudez, Samir Perlaza, and Iñaki Esnaola proves for the first time that a decentralized learning protocol can achieve the same performance as a centralized learner without sharing local datasets. The result holds for empirical risk minimization with relative-entropy regularization (ERM-RER), combined with a forward-backward communication scheme in which clients share their locally obtained Gibbs measures as reference priors.
Technical details
The authors analyze a chain-like communication sequence in which the Gibbs measure produced by client k becomes the reference measure for client k+1. They show that, when local regularization weights are scaled in a specific way relative to local sample sizes, the decentralized composition of these Gibbs updates yields the same performance as a centralized ERM-RER that has access to the union of datasets. Key technical ingredients include relative-entropy regularization to control deviation from the reference and an explicit identification of the required scaling for the regularization factors.
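The chaining argument can be checked numerically on a toy problem. The sketch below is our own illustration, not the paper's formal construction: it assumes a finite hypothesis set, squared loss, and the scaling lambda_k = lambda * N / n_k (the paper identifies the required scaling; the exact constants here are an assumption). Because the ERM-RER solution has the form Q(theta) ∝ P(theta) exp(-L_k(theta) / lambda_k), composing the per-client updates multiplies the exponential factors, which is exactly the centralized Gibbs measure on the pooled data.

```python
# Toy check (our illustrative assumptions, not the paper's exact statement):
# chaining per-client Gibbs updates with lambda_k = lambda * N / n_k
# reproduces the centralized ERM-RER Gibbs measure on the pooled data.
import numpy as np

rng = np.random.default_rng(0)

thetas = np.linspace(-2.0, 2.0, 9)                 # finite model class
datasets = [rng.normal(0.5, 1.0, size=n) for n in (20, 35, 45)]
N = sum(len(d) for d in datasets)
lam = 0.7                                          # global regularization factor

def emp_risk(data):
    # squared loss of each candidate theta, averaged over the given data
    return np.array([np.mean((data - t) ** 2) for t in thetas])

def gibbs(reference, risk, lam_k):
    # ERM-RER solution: Q(theta) ∝ reference(theta) * exp(-risk(theta)/lam_k)
    w = reference * np.exp(-risk / lam_k)
    return w / w.sum()

# Decentralized chain: client k's Gibbs measure is client k+1's reference
q = np.full(len(thetas), 1.0 / len(thetas))        # uniform initial reference
for data in datasets:
    q = gibbs(q, emp_risk(data), lam * N / len(data))

# Centralized ERM-RER with the same initial reference and the pooled data
pooled = np.concatenate(datasets)
q_central = gibbs(np.full(len(thetas), 1.0 / len(thetas)), emp_risk(pooled), lam)

assert np.allclose(q, q_central)
```

The equivalence holds because each client contributes the factor exp(-n_k L_k / (lambda N)) regardless of ordering, and the intermediate normalization constants cancel when the final measure is renormalized.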
Protocol primitives and practical implications
The paper proposes a minimal set of primitives that practitioners can map to implementations:
- local computation of Gibbs posteriors under ERM-RER
- forward-backward exchange of these Gibbs measures (no raw data shared)
- per-client regularization calibrated to local sample sizes
These elements keep communication payloads compact when Gibbs measures admit succinct parameterizations, and they substitute sharing of inductive bias for sharing of examples.
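One concrete case of a succinct parameterization (our illustration, not from the paper): when the local risk is quadratic and the reference measure is Gaussian, every Gibbs measure in the chain stays Gaussian, so "exchanging a Gibbs measure" reduces to forwarding two floats. The update rule and the lambda_k = lambda * N / n_k scaling below are assumptions chosen for the sketch.

```python
# Hedged sketch: with squared loss and a Gaussian reference, each Gibbs
# update maps (mean, precision) to (mean, precision), so the communication
# payload per hop is two numbers. Scaling and notation are our assumptions.
import numpy as np

rng = np.random.default_rng(1)
datasets = [rng.normal(1.0, 1.0, size=n) for n in (10, 30, 60)]
N = sum(len(d) for d in datasets)
lam = 1.0

def gibbs_update(mu, tau, data, lam_k):
    # Reference N(mu, 1/tau); empirical risk L(t) = mean((x - t)^2).
    # Multiplying by exp(-L/lam_k) adds precision 2/lam_k and pulls the
    # mean toward the local sample mean.
    tau_new = tau + 2.0 / lam_k
    mu_new = (tau * mu + 2.0 * np.mean(data) / lam_k) / tau_new
    return mu_new, tau_new

# Forward chain: each client updates the measure and forwards (mu, tau)
mu, tau = 0.0, 1.0                      # shared initial reference N(0, 1)
for data in datasets:
    mu, tau = gibbs_update(mu, tau, data, lam * N / len(data))

# Centralized Gibbs measure on the pooled data, same initial reference
mu_c, tau_c = gibbs_update(0.0, 1.0, np.concatenate(datasets), lam)
assert np.isclose(mu, mu_c) and np.isclose(tau, tau_c)
```

Outside such conjugate settings, the measures would need sampling-based or variational approximations, which is where the practical cost noted below comes in.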
Context and significance
This work reframes collaboration in federated and decentralized learning. Instead of averaging gradients or model weights, or exchanging encrypted updates, the paper formalizes sharing a distributional prior over models as the communication primitive. That matters because it provides a theoretically grounded pathway to match centralized performance while strengthening data locality and privacy. The result ties together ideas from statistical mechanics (Gibbs measures), information-theoretic regularization, and distributed optimization, and it complements recent trends that favor implicit regularization and distributional model fusion over raw-parameter aggregation.
Limitations and assumptions
The guarantees require precise scaling of regularization parameters with local sample counts and assume that Gibbs measures can be communicated and composed in the proposed manner. Practical efficiency will depend on representation choices for the Gibbs measures, computational cost of sampling or approximating them, and the topology of client communication beyond the chain-like forward-backward pattern studied.
What to watch
Validate the approach on nontrivial model classes and heterogeneous data distributions, and investigate compressed or parametric Gibbs representations to make the scheme practical in real federated systems. Extensions to richer topologies and asynchronous updates are natural next steps.
Scoring Rationale
The paper delivers a novel theoretical guarantee that decentralized protocols can match centralized performance without data sharing, a meaningful advance for federated learning theory. It is primarily theoretical and needs empirical and engineering follow-ups, which keeps the impact in the notable research category.

