Federated Learning in Healthcare

Medical data is rich but siloed: privacy law, ethics and competition keep records inside each hospital. Federated learning (FL) lets many institutions collaboratively train one model while every patient record stays on-premises. Only model parameters — never raw data — leave the institution.

Working principle

A central server holds the global model. In each round it sends the current weights to the participating sites; each site trains locally on its private data and returns only its updated parameters (or their difference from the weights it received), never the data itself. The server aggregates these, classically by Federated Averaging (FedAvg), a mean weighted by each site's sample count, and broadcasts the improved model. Over many rounds the global model typically converges to one that has, in effect, learned from all datasets.
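The aggregation step can be sketched in a few lines. This is a minimal illustration of the sample-weighted mean only, assuming each site's model is flattened into a plain list of floats; the function name `fed_avg`, the three hospitals and their sample counts are hypothetical, and local training and networking are left out.

```python
from typing import List, Sequence


def fed_avg(site_weights: Sequence[Sequence[float]],
            site_sizes: Sequence[int]) -> List[float]:
    """Return the sample-weighted mean of the per-site parameter vectors."""
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need one sample count per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(site_weights[0])
    global_weights = [0.0] * dim
    for weights, n in zip(site_weights, site_sizes):
        if len(weights) != dim:
            raise ValueError("all sites must return vectors of equal length")
        for j, w in enumerate(weights):
            # Each site contributes in proportion to its share of the samples.
            global_weights[j] += (n / total) * w
    return global_weights


# Hypothetical round: three hospitals return locally trained parameters.
hospital_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
hospital_sizes = [100, 300, 600]  # assumed local sample counts
new_global = fed_avg(hospital_weights, hospital_sizes)
print(new_global)
```

The largest site (600 of 1,000 samples) pulls the result closest to its own parameters, which is the intended effect of weighting by sample count.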

Figure 1. Cross-silo federated round. Raw images and records never leave the hospital; only encrypted parameter updates are transmitted.

Privacy hardening

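Updates can still leak information about the underlying training data (for example through membership-inference or model-inversion attacks), so FL is commonly combined with differential privacy (calibrated noise added to updates), secure aggregation (the server sees only the sum, not individual contributions), and homomorphic encryption (updates are aggregated while still encrypted) for stronger guarantees.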
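Two of these mechanisms can be illustrated with a toy sketch. It assumes updates are short lists of floats, uses clip-then-add-Gaussian-noise as the differential-privacy step, and uses pairwise additive masks over integers modulo `MODULUS` as the secure-aggregation step. All names and constants (`MODULUS`, `SCALE`, `clip_and_noise`, `pairwise_masks`) are hypothetical; a real protocol derives the masks from pairwise key agreement between sites rather than one shared random generator, handles dropouts, and calibrates the noise to a target privacy budget, none of which is shown here.

```python
import math
import random
from typing import List

MODULUS = 2 ** 32   # assumed ring size for masked integer arithmetic
SCALE = 10_000      # assumed fixed-point scale (4 decimal places)


def clip_and_noise(update: List[float], clip_norm: float,
                   noise_std: float, rng: random.Random) -> List[float]:
    """Clip the update to L2 norm <= clip_norm, then add Gaussian noise."""
    norm = math.sqrt(sum(x * x for x in update))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * factor + rng.gauss(0.0, noise_std) for x in update]


def encode(update: List[float]) -> List[int]:
    """Map floats to fixed-point integers in the ring [0, MODULUS)."""
    return [round(x * SCALE) % MODULUS for x in update]


def decode(values: List[int]) -> List[float]:
    """Map ring elements back to signed floats."""
    half = MODULUS // 2
    return [((v + half) % MODULUS - half) / SCALE for v in values]


def pairwise_masks(num_sites: int, dim: int,
                   rng: random.Random) -> List[List[int]]:
    """For each pair (i, j), site i adds a random value and site j subtracts it,
    so all masks cancel when the server sums the masked updates."""
    masks = [[0] * dim for _ in range(num_sites)]
    for i in range(num_sites):
        for j in range(i + 1, num_sites):
            for k in range(dim):
                r = rng.randrange(MODULUS)
                masks[i][k] = (masks[i][k] + r) % MODULUS
                masks[j][k] = (masks[j][k] - r) % MODULUS
    return masks


rng = random.Random(0)  # fixed seed so the sketch is repeatable
updates = [[0.5, -0.25], [0.1, 0.2], [-0.3, 0.05]]  # hypothetical site updates
masks = pairwise_masks(len(updates), 2, rng)

# Each site sends only its masked, encoded update to the server.
masked = [[(e + m) % MODULUS for e, m in zip(encode(u), mask)]
          for u, mask in zip(updates, masks)]

# The server sums what it receives; the masks cancel in the total.
summed = [sum(column) % MODULUS for column in zip(*masked)]
aggregate = decode(summed)
print(aggregate)

# Differential-privacy step on a single update before it is sent.
noisy = clip_and_noise([3.0, 4.0], clip_norm=1.0, noise_std=0.1, rng=rng)
print(noisy)
```

Each masked vector on its own looks like random integers to the server, yet the decoded total equals the sum of the original updates up to the fixed-point rounding.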

Table 1. Centralised vs. federated training in healthcare
Aspect	Centralised	Federated
Data movement	Pooled to one server	Stays at each site
Regulatory risk	High (HIPAA/GDPR)	Lower — data localised
Data diversity	Limited to shared set	Spans many institutions
Key challenge	Consent & transfer	Non-IID data, comms cost

Key challenge: Hospital datasets are non-IID (not independent and identically distributed), with different demographics, scanners and labelling practices at each site. This statistical heterogeneity is FL's central technical challenge and motivates algorithms such as FedProx and personalised FL.
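FedProx addresses this drift by adding a proximal term, (mu/2)·||w − w_global||², to each site's local objective, which penalises local weights for moving far from the global model. The sketch below shows that idea on a deliberately tiny case: a scalar model y ≈ w·x trained by plain gradient descent. The function `local_train`, the data and the hyperparameters are hypothetical choices for illustration, not settings from the FedProx paper.

```python
from typing import List, Tuple


def local_train(global_w: float, data: List[Tuple[float, float]],
                mu: float, lr: float = 0.1, steps: int = 200) -> float:
    """Minimise 0.5 * mean((w*x - y)^2) + (mu/2) * (w - global_w)^2
    by gradient descent, starting from the global weight.
    With mu = 0 this reduces to ordinary local training as in FedAvg."""
    w = global_w
    n = len(data)
    for _ in range(steps):
        grad_loss = sum((w * x - y) * x for x, y in data) / n
        grad_prox = mu * (w - global_w)  # pulls w back towards the global model
        w -= lr * (grad_loss + grad_prox)
    return w


# Hypothetical site whose local data follow y = 3x, while the global weight is 1.
site_data = [(1.0, 3.0), (2.0, 6.0)]
global_w = 1.0

w_plain = local_train(global_w, site_data, mu=0.0)  # no proximal term
w_prox = local_train(global_w, site_data, mu=1.0)   # with proximal term
print(w_plain, w_prox)
```

Without the proximal term the site fits its own data fully; with it, the local solution settles between the site optimum and the global weight, which limits how far heterogeneous sites pull the aggregate apart. Larger `mu` keeps sites closer to the global model at the cost of a poorer local fit.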

Applications

Tumour segmentation across hospitals (e.g. brain-tumour federations)
Sepsis and readmission risk prediction from ICU data
Drug-response and rare-disease modelling spanning institutions

References & further reading

McMahan et al., “Communication-Efficient Learning of Deep Networks from Decentralized Data,” AISTATS 2017 (introduces FedAvg).
Rieke et al., “The future of digital health with federated learning,” npj Digital Medicine, 2020.
Li et al., “Federated Optimization in Heterogeneous Networks,” MLSys 2020 (introduces FedProx).