Federated learning trends: from academic insights to industry applications

In December, the annual NeurIPS conference took place. With the biggest conference in machine learning research happening, I was curious to see what was going on in the world of federated learning. Here are my five takeaways on how federated learning can be applied in industry today.

With the biggest conference in machine learning research under way (3,584 accepted papers, 14 tutorials, and 58 workshops), I was curious to see what was going on in the world of federated learning and which trends are emerging.

The topic also came up in a conversation with a customer, which got me thinking about how this research could be applied in industry and how the businesses we work with might benefit.

There were a total of 37 papers covering "federated learning". I categorized these into:

Federated learning infrastructure
Privacy and security
Other learnings

Below is my summary of the key trends from academic research and their potential applications in industry.

Recap: what is federated learning (FL)?

Federated learning, originally proposed by Google, is a technique where the training of machine learning models is distributed across multiple devices or servers, known as nodes.

Instead of sharing data, these nodes train models locally and only exchange model updates or parameters.

This decentralized structure ensures that the actual data remains on the local device and that only intermediate parameters are exchanged, seemingly providing an elegant solution to enhance privacy and security.
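To make the mechanics concrete, here is a minimal sketch of one common aggregation scheme, federated averaging, in plain Python. It is an illustration under assumptions of my own, not a production setup: the toy linear model, the synthetic node datasets, and the helper names (`local_update`, `federated_average`, `make_node_data`) are all hypothetical.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Train a tiny linear model y = w*x + b on one node's private data."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)

def federated_average(updates, sizes):
    """Average node parameters, weighted by each node's sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

def make_node_data(rng, n, lo, hi):
    """Hypothetical private dataset: y = 2x + 1 plus a little noise."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
# Three nodes, each holding data from a different input range.
nodes = [
    make_node_data(rng, 20, 0, 1),
    make_node_data(rng, 30, -1, 0),
    make_node_data(rng, 50, -1, 1),
]

global_weights = (0.0, 0.0)
for _ in range(50):
    # Each node trains locally; only the parameters leave the node.
    updates = [local_update(global_weights, data) for data in nodes]
    global_weights = federated_average(updates, [len(d) for d in nodes])

print(global_weights)  # should end up close to the true (2, 1)
```

The raw `(x, y)` pairs never leave `nodes`; the coordinator only ever sees the `(w, b)` tuples returned by each node.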

However, without a proper investigation of the inner workings of the FL machinery, security risks such as training data poisoning or data leakage can turn into serious operational, privacy, and legal problems, with reputational, economic, societal, and individual harms among the consequences.

Papers and references in this field are being published at an increasing rate (see the Google Trends data below), but not all findings make it to industry.

Figure: federated learning trend over time

Key learnings about federation: what industry can take from research

Enhancing model security and robustness

What research says: The security of FL systems is a hot topic, with studies like RECESS and Lockdown focusing on defending against model poisoning and backdoor attacks.

Why it’s relevant: For industries handling sensitive information, like finance and healthcare, robust security protocols in FL are non-negotiable.

When and how this should be considered or applied: For example, banks can implement FL security protocols, including network protocols and governance policies, to safeguard customer data. Healthcare institutions can use secure FL models to protect patient records from cyber threats, especially when combined with additional governance, privacy, and security controls.
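To illustrate why the aggregation step matters for robustness, here is a small sketch comparing a plain mean with a coordinate-wise median. This is not the RECESS or Lockdown method; it is a simpler, well-known robust aggregator that I am swapping in for illustration, and the update values are made up.

```python
from statistics import median

def mean_aggregate(updates):
    """Plain average of each coordinate across all client updates."""
    return [sum(col) / len(col) for col in zip(*updates)]

def median_aggregate(updates):
    """Coordinate-wise median: a few extreme values cannot drag it far."""
    return [median(col) for col in zip(*updates)]

# Hypothetical updates from four honest clients and one poisoned client.
honest = [[0.9, -1.1], [1.0, -1.0], [1.1, -0.9], [1.0, -1.0]]
poisoned = [[100.0, 100.0]]
updates = honest + poisoned

mean_result = mean_aggregate(updates)      # pulled far away by the outlier
median_result = median_aggregate(updates)  # stays with the honest majority
print(mean_result, median_result)
```

A median only helps against crude, large-magnitude manipulation from a minority of clients. Stealthy backdoor attacks need the kind of dedicated defenses the papers above study.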

Enhancing personalization and privacy of learning algorithms

What research says: Personalization in FL is gaining traction, as seen in methodologies like pFedHR. Simultaneously, privacy-preserving techniques such as Meta-Variational Dropout are being developed to protect user data.

Why it’s relevant: In customer-centric industries like e-commerce or banking, personalization is key to user engagement, while data privacy remains paramount.

When and how this should be considered or applied: Digital platforms can employ FL for personalized content recommendations while adhering to data protection and privacy regulations and AI governance requirements. For example, financial institutions can adopt privacy-preserving FL methods to secure sensitive customer data while providing tailored product recommendations.
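As one example of what a privacy-preserving step can look like in code, here is a sketch of update clipping plus Gaussian noise, the basic building block of differentially private FL. It is not Meta-Variational Dropout or pFedHR. The function names and parameter values are hypothetical, noise is added on the client side as an assumption, and the sketch does not compute a formal privacy budget.

```python
import math
import random

def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))

def clip_update(update, max_norm):
    """Scale the update down so its L2 norm is at most max_norm."""
    norm = l2_norm(update)
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]

def privatize(update, max_norm, noise_multiplier, rng):
    """Clip, then add Gaussian noise with std = noise_multiplier * max_norm."""
    clipped = clip_update(update, max_norm)
    sigma = noise_multiplier * max_norm
    return [v + rng.gauss(0.0, sigma) for v in clipped]

rng = random.Random(42)
raw_update = [3.0, 4.0]  # hypothetical model update with L2 norm 5
clipped = clip_update(raw_update, max_norm=1.0)
noisy_update = privatize(raw_update, max_norm=1.0, noise_multiplier=0.5, rng=rng)
print(l2_norm(clipped), noisy_update)
```

Clipping bounds how much any single client can influence the model, and the noise masks individual contributions. The trade-off is accuracy: more noise means more privacy and a less precise model.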

Addressing data and model heterogeneity

What research says: Research in FL is increasingly focused on tackling the challenges posed by heterogeneous data and models. Papers like EvoFed and FedICON emphasize strategies to manage data diversity across different clients, ensuring robust model performance despite varied data sources.

Why it’s relevant: In industries like healthcare, where data sources are diverse and vast, addressing heterogeneity is crucial for accurate and reliable insights. Accounting for the diversity of populations and data sources reduces the risk of introducing representation bias into ML models.

When and how this should be considered or applied: Healthcare systems can leverage FL to analyze patient data from multiple hospitals while ensuring data privacy.

For example, different hospitals may code similar symptoms in different ways. Similarly, when working on content personalisation, regional variations such as language may degrade model performance.

A data preparation step is therefore needed before FL to make sure the data is free of historical, representation, or measurement biases.
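The hospital coding example can be sketched as a local harmonization step that each site runs before training. The site names, codes, and shared vocabulary below are entirely hypothetical; real deployments would typically rely on an agreed clinical terminology.

```python
# Hypothetical per-site mappings from local codes to a shared vocabulary.
SHARED_VOCAB = {
    "hospital_a": {"FEV": "fever", "CGH": "cough"},
    "hospital_b": {"pyrexia": "fever", "tussis": "cough"},
}

def harmonize(site, records):
    """Map local codes to shared labels; collect anything unmapped for review."""
    mapping = SHARED_VOCAB[site]
    mapped, unmapped = [], []
    for code in records:
        if code in mapping:
            mapped.append(mapping[code])
        else:
            unmapped.append(code)
    return mapped, unmapped

mapped_a, unmapped_a = harmonize("hospital_a", ["FEV", "CGH", "FEV"])
mapped_b, unmapped_b = harmonize("hospital_b", ["pyrexia", "rash"])
print(mapped_a, unmapped_a)
print(mapped_b, unmapped_b)
```

Because the mapping runs at each site, the raw records stay local. Only the list of unmapped codes needs human review before training starts.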

Improving communication and algorithmic efficiency (reducing costs)

What research says: Reducing communication costs is a significant research focus, with FedSep being a prime example. Lower communication overhead is crucial for efficient FL processes.

Why it’s relevant: Industries with limited bandwidth, such as those running remote machinery or IoT deployments, benefit greatly from reduced data transmission requirements.

When and how this should be considered or applied: IoT devices or machinery in remote areas can use FL to process data locally or at the edge, reducing the need for continuous data transmission and saving bandwidth.

For example, machine manufacturers need operational fleet data to predict failures or schedule pre-emptive maintenance. Applying FL means the data can be accessed algorithmically without being transmitted, thereby reducing connectivity costs.
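One generic way to cut the size of what a device sends is top-k sparsification: transmit only the largest entries of an update. This is not the FedSep approach, just a simple and widely used compression idea shown for illustration, with made-up values and hypothetical helper names.

```python
def top_k_sparsify(update, k):
    """Keep the k largest-magnitude entries as (index, value) pairs."""
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    keep = sorted(ranked[:k])
    return [(i, update[i]) for i in keep]

def densify(sparse, size):
    """Rebuild a full-length update on the receiving side, zeros elsewhere."""
    dense = [0.0] * size
    for i, v in sparse:
        dense[i] = v
    return dense

update = [0.01, -0.9, 0.02, 0.5, -0.03, 0.0]  # hypothetical device update
sparse = top_k_sparsify(update, k=2)          # only 2 of 6 values are sent
restored = densify(sparse, len(update))
print(sparse, restored)
```

The compression is lossy, so practical schemes often keep the dropped remainder on the device and add it to the next round's update.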

Federated learning convergence and optimization

What research says: Convergence in FL algorithms is crucial for accurate models. Research is focusing on optimization techniques to enhance this aspect of FL.

Why it’s relevant: A model that does not converge is of no use in any industry. Industries that depend on rapid decision-making, such as healthcare diagnostics or stock trading, additionally need models that converge quickly and efficiently.

When and how this should be considered or applied: FL can be used in such analyses, with algorithms designed for fast convergence to aid timely decision-making. Data quality, architecture design, and the choice of hyperparameters all matter a great deal here.
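In practice, "has the model converged?" is often answered with a simple stopping rule on the global loss. Here is a minimal sketch of such a rule. The loss values are invented, and the `tol` and `patience` thresholds are hypothetical defaults rather than recommendations.

```python
def has_converged(losses, tol=1e-3, patience=3):
    """True if the best loss improved by less than tol over the last `patience` rounds."""
    if len(losses) <= patience:
        return False
    best_before = min(losses[:-patience])
    best_recent = min(losses[-patience:])
    return best_before - best_recent < tol

# Hypothetical global validation loss reported after each federated round.
simulated_losses = [1.0, 0.5, 0.3, 0.2999, 0.2998, 0.2997, 0.2996]

history = []
stopped_at = None
for round_idx, loss in enumerate(simulated_losses):
    history.append(loss)
    if has_converged(history):
        stopped_at = round_idx  # stop training; later rounds add little
        break

print(stopped_at)
```

Stopping early saves communication rounds, which ties convergence back to the cost considerations in the previous section.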

FL can help in getting more data diversity by allowing new actors to participate in training. When combined with additional privacy and security controls, FL begins to unlock complementary data value in cross-boundary collaborations.

Emerging trends in federated technologies: Practical application challenges

Several papers address specific applications or practical challenges in FL infrastructure, privacy, and security.

Discussions on autonomous driving datasets and image classification tasks suggest a focus on real-world implications and use cases of FL.

The emphasis on novel algorithmic approaches and theoretical contributions introduces new perspectives and techniques to the FL field.

Together, these papers highlight the ongoing research and development efforts in federated learning and their application to real-world scenarios.

Reference papers:

RECESS: Vaccine for Federated Learning
Lockdown: Backdoor Defense for Federated Learning
pFedHR: Towards Personalized Federated Learning via Heterogeneous Model Reassembly
Federated Learning via Meta-Variational Dropout
EvoFed: Leveraging Evolutionary Strategies for Communication-Efficient FL
FedICON: Is Heterogeneity Notorious? Taming Heterogeneity to Handle Test-Time Shift in Federated Learning
FedSep: Resolving the Tug-of-War: A Separation of Communication and Learning in Federated Learning
