CSE · Seminar 04 · Train on patient data without moving it

Federated Learning in Healthcare

Federated learning trains a shared model across hospitals by exchanging model updates instead of raw patient records, preserving privacy and regulatory compliance.

federated learning · FedAvg · privacy · HIPAA · differential privacy

Medical data is rich but siloed: privacy law, ethics and competition keep records inside each hospital. Federated learning (FL) lets many institutions collaboratively train one model while every patient record stays on-premises. Only model parameters — never raw data — leave the institution.

Working principle

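A central server holds the global model. In each round it sends the current weights to participating sites; each site trains locally on its private data and returns only its model update (the locally trained weights, or their difference from the global weights), never the data itself. The server aggregates these updates, classically by Federated Averaging (FedAvg), a mean weighted by each site's sample count, and broadcasts the improved model. Repeated over many rounds, this typically yields a model that has effectively learned from all datasets, although convergence can slow or degrade when the sites' data differ strongly.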

[Figure 1 diagram: a continuous cycle of five steps: (1) server sends global model, (2) local training at hospital, (3) send weight updates, (4) secure aggregation (FedAvg), (5) update global model. Diagram title: One federated training round (cross-silo).]
Figure 1. Cross-silo federated round. Raw images and records never leave the hospital; only encrypted parameter updates are transmitted.
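
The round described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a production system: the linear-regression model, the synthetic "hospital" datasets, and the names local_train, fedavg and run_rounds are all assumptions made for the example. Each site here returns its locally trained weights, and the server takes their sample-weighted mean.

```python
import numpy as np


def local_train(weights, X, y, lr=0.1, epochs=5):
    """Run a few epochs of full-batch gradient descent on one site's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w


def fedavg(site_weights, site_sizes):
    """Sample-weighted mean of the sites' locally trained weights."""
    sizes = np.asarray(site_sizes, dtype=float)
    stacked = np.stack(site_weights)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()


def run_rounds(sites, dim, rounds=50):
    """sites: list of (X, y) pairs; in a real deployment these never leave their owner."""
    global_w = np.zeros(dim)
    for _ in range(rounds):
        # Each site starts from the current global weights and trains locally.
        local_ws = [local_train(global_w, X, y) for X, y in sites]
        # The server only ever sees weights and sample counts, not records.
        global_w = fedavg(local_ws, [len(y) for _, y in sites])
    return global_w


# Synthetic stand-ins for three hospitals of different sizes (hypothetical data).
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
sites = []
for n in (50, 200, 120):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    sites.append((X, y))

global_w = run_rounds(sites, dim=2)
```

Weighting by sample count means a site with 200 records influences the average four times as much as a site with 50, which matches the sample-weighted mean described for FedAvg.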

Privacy hardening

Updates can still leak information, so FL is combined with differential privacy (calibrated noise added to updates), secure aggregation (the server only sees the sum, not individual contributions), and homomorphic encryption for stronger guarantees.
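
Two of these ideas can be illustrated with a toy sketch. The function names, the Gaussian noise scale and the pairwise-mask scheme below are assumptions chosen for illustration: a real differential-privacy deployment also needs privacy accounting to choose the noise level, and real secure-aggregation protocols use cryptographic key agreement and modular arithmetic rather than plain floating-point masks.

```python
import numpy as np


def clip_and_noise(update, clip_norm, noise_multiplier, rng):
    """Clip an update to L2 norm <= clip_norm, then add Gaussian noise."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise


def pairwise_masks(num_sites, dim, rng):
    """Toy secure-aggregation masks: each pair (i, j) shares a random vector
    that site i adds and site j subtracts, so all masks cancel in the sum."""
    masks = [np.zeros(dim) for _ in range(num_sites)]
    for i in range(num_sites):
        for j in range(i + 1, num_sites):
            shared = rng.normal(size=dim)
            masks[i] += shared
            masks[j] -= shared
    return masks


rng = np.random.default_rng(0)
updates = [rng.normal(size=4) for _ in range(3)]  # hypothetical site updates
masks = pairwise_masks(num_sites=3, dim=4, rng=rng)

# The server receives only masked updates; each one looks random on its own.
masked = [u + m for u, m in zip(updates, masks)]

# Summing the masked updates cancels the masks and recovers the true total.
server_sum = np.sum(masked, axis=0)
```

Clipping bounds how much any single site can shift the aggregate, which is what lets the added noise be calibrated; the masks show why the server can compute the sum without seeing any individual contribution.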

Table 1. Centralised vs. federated training in healthcare
Aspect          | Centralised           | Federated
Data movement   | Pooled to one server  | Stays at each site
Regulatory risk | High (HIPAA/GDPR)     | Lower (data localised)
Data diversity  | Limited to shared set | Spans many institutions
Key challenge   | Consent & transfer    | Non-IID data, communication cost

Key challenge: Hospital datasets are non-IID (not independent and identically distributed): they differ in demographics, scanners and labelling. This statistical heterogeneity is FL's central technical challenge and motivates algorithms such as FedProx and personalised FL.
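
As an illustration of how FedProx addresses heterogeneity, the sketch below adds its proximal term, (mu / 2) * ||w - w_global||^2, to a site's local objective so that local training cannot drift far from the global model. The linear-regression loss, the synthetic data, the hyperparameter values and the function name fedprox_local_train are assumptions made for this example, not part of any particular library.

```python
import numpy as np


def fedprox_local_train(global_w, X, y, mu=0.1, lr=0.1, epochs=5):
    """Local gradient descent on: MSE loss + (mu / 2) * ||w - global_w||^2."""
    w = global_w.copy()
    for _ in range(epochs):
        loss_grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of the local loss
        prox_grad = mu * (w - global_w)               # gradient of the proximal term
        w -= lr * (loss_grad + prox_grad)
    return w


# One site whose data pulls strongly away from the current global model.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([5.0, -3.0])
global_w = np.zeros(2)

# mu = 0 reduces to plain local training; mu > 0 limits the drift.
drift_plain = np.linalg.norm(fedprox_local_train(global_w, X, y, mu=0.0) - global_w)
drift_prox = np.linalg.norm(fedprox_local_train(global_w, X, y, mu=1.0) - global_w)
```

A larger mu keeps site models closer together, which tends to stabilise aggregation on non-IID data at the cost of slower local progress; the right value depends on the federation.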

Applications

  • Tumour segmentation across hospitals (e.g. brain-tumour federations)
  • Sepsis and readmission risk prediction from ICU data
  • Drug-response and rare-disease modelling spanning institutions

References & further reading

  1. McMahan et al., “Communication-Efficient Learning of Deep Networks from Decentralized Data,” AISTATS 2017 (FedAvg).
  2. Rieke et al., “The future of digital health with federated learning,” npj Digital Medicine, 2020.
  3. Li et al., “Federated Optimization in Heterogeneous Networks,” MLSys 2020 (FedProx).