If you are considering federated learning, there are several open-source frameworks and software options available. The right choice depends heavily on the purpose and nature of your use case.
The most important questions you should ask yourself are:
How often do you want to apply federated learning?
How standardized does the setup need to be?
How much support do you need with implementation and maintenance?
In this article, we want to give you an overview of the most prominent open-source frameworks.
All of the projects listed are extremely valuable, and it is great to see that privacy in multi-partner data collaborations, machine learning, and AI is playing an ever-increasing role in today’s world!
Nonetheless, here is a disclaimer:
At Apheris we are building an enterprise-grade platform for federated and privacy-preserving data science. As part of our own research and development, we spent thousands of hours evaluating and working with the different frameworks and are also an active part of the OpenMined Community. During this process, we found it necessary to build a platform that is secure, intuitive to use, easy to deploy, and can be offered as an end-to-end solution to set up multi-partner data collaborations. The federated learning frameworks shown here have guided us in our own development. They often require a large amount of engineering and deployment effort to get them running.
Enterprise Federated Learning
In professional, production-grade federated learning scenarios, one faces many complexities. There is a big difference between a student setting up a federated learning architecture for a university project and a company implementing it for a use case with multiple partners in the manufacturing, pharmaceutical, finance, or healthcare industry.
To operate an open-source solution in a business context, you need a full team of developers, engineers, and data scientists to build and maintain it. And because federated learning is used in collaboration with partners, legal and contractual frameworks must be set up to ensure compliance and privacy and to protect the IP of each party.
Great challenges arise when handling real-world data, which often requires prior data cleansing and harmonization of data structures. And although the open-source frameworks have been in development for years, some are still in a beta stage and should not be used for critical production workloads.
That is why we built the Apheris Platform. If you are interested in adopting federated learning at your company, we have compiled the most critical features of any enterprise-grade solution in our Buyer's Guide to Secure Multi-Partner Data Collaboration to support your evaluation process.
Open-Source Software for Federated Learning
For proofs-of-concept and experiments, open-source frameworks are often sufficient. In the list below you can find some of the most important frameworks out there. We made a subjective selection, but if you want to delve deeper into open-source frameworks around federated learning, here is an exhaustive list compiled by awesomeopensource.com.
Each of the communities, engineers, and data scientists involved in these projects has made a fantastic effort to further push the research and development in federated learning and privacy-preserving data science:
FATE (Federated AI Technology Enabler)
Substra
PySyft + PyGrid
Open Federated Learning (OpenFL)
TensorFlow Federated (TFF)
IBM Federated Learning
NVIDIA CLARA
FATE
FATE (Federated AI Technology Enabler) is an open-source project that aims to support a secure and federated AI ecosystem. FATE is available for standalone and cluster deployment setups. The open-source framework is backed by WeBank, a privately owned neobank based in Shenzhen, China.
To use FATE and write custom models for it, you need some knowledge of protocol buffers. FATE consists of several components:
FATEFlow is the main component of FATE, and it represents the end-to-end machine learning orchestration pipeline. The pipeline supports machine learning tasks such as data preprocessing, model training and testing, publishing, and serving. FATEFlow supports DAG-based pipeline definitions, scheduling, monitoring, and custom pipeline components.
FederatedML is the component that implements many standard machine learning algorithms along with utility modules such as DataIO, Intersect, and OneHot Encoder.
FATEBoard is a collection of visualization/dashboarding tools for federated learning to explore, analyze, and understand models easily. FATEBoard supports both standalone and distributed deployment setups.
FATE Serving is the component responsible for serving federated learning models for production usage. It supports online inferencing, dynamic loading of models, A/B testing scenarios, and caching.
Federated Network is the communication layer between the federated learning parties.
KubeFATE is the distributed systems infrastructure required to manage federated workloads. KubeFATE supports docker-compose and Kubernetes cluster deployment setups.
FATE-Client is an optional component used to interact with different FATE components.
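The DAG-style orchestration that FATEFlow performs can be pictured with a toy scheduler. The sketch below is plain Python and conceptual only; it does not use FATE's actual API, and the stage names are hypothetical:

```python
# Toy DAG scheduler illustrating FATEFlow-style pipeline orchestration.
# Conceptual sketch only -- this is not FATE's API.
from graphlib import TopologicalSorter

def run_pipeline(steps, dependencies):
    """Run pipeline steps in an order that respects their dependencies."""
    order = TopologicalSorter(dependencies).static_order()
    results = {}
    for name in order:
        results[name] = steps[name](results)
    return results

# Hypothetical stages mirroring a federated training pipeline.
steps = {
    "data_io":    lambda r: "rows loaded",
    "preprocess": lambda r: f"cleaned ({r['data_io']})",
    "train":      lambda r: f"model from {r['preprocess']}",
    "evaluate":   lambda r: f"metrics for {r['train']}",
}
# Each key lists the stages that must complete before it can run.
dependencies = {
    "preprocess": {"data_io"},
    "train":      {"preprocess"},
    "evaluate":   {"train"},
}
results = run_pipeline(steps, dependencies)
```

A real FATEFlow pipeline additionally handles distributed scheduling and monitoring across parties, but the dependency-ordered execution above is the core idea of a DAG pipeline.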
Substra
Substra is a federated learning software framework developed by a multi-partner research project around Owkin, a French startup founded in 2016. Substra focuses on the medical field, with an emphasis on data ownership and privacy. Today, it is used in the MELLODDY project for drug discovery in the pharmaceutical industry.
Substra supports a wide variety of interfaces for different types of users: a Python library for data scientists, command-line interfaces for admins, and graphical user interfaces for project managers and other high-level users. In terms of deployment, Substra requires a complex Kubernetes setup for every node.
The key features of Substra are:
Privacy: Substra uses trusted execution environments (also called enclaves) that enable setting aside private regions for code and data
Traceability: Substra writes all operations on the platform to an immutable ledger
Security: Substra encrypts model updates, data at rest, and network communication
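Substra's immutable ledger makes every operation tamper-evident. The idea can be sketched with a simple hash-chained log in plain Python. This is a conceptual illustration only; Substra itself uses a distributed ledger, not this code:

```python
# Minimal append-only, hash-chained log: editing any past entry breaks
# every later hash, so tampering is detectable. Conceptual sketch only.
import hashlib
import json

class AuditLog:
    def __init__(self):
        self.entries = []

    def append(self, operation):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = json.dumps({"op": operation, "prev": prev_hash}, sort_keys=True)
        self.entries.append({
            "op": operation,
            "prev": prev_hash,
            "hash": hashlib.sha256(payload.encode()).hexdigest(),
        })

    def verify(self):
        """Recompute every hash; any edited entry breaks the chain."""
        prev_hash = "0" * 64
        for e in self.entries:
            payload = json.dumps({"op": e["op"], "prev": prev_hash}, sort_keys=True)
            if e["prev"] != prev_hash or \
               e["hash"] != hashlib.sha256(payload.encode()).hexdigest():
                return False
            prev_hash = e["hash"]
        return True

log = AuditLog()
log.append("register_dataset: hospital_a")   # hypothetical operations
log.append("train: model_v1 on hospital_a")
```

After these two appends, `log.verify()` returns `True`; if any entry's `op` field is later modified, verification fails.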
PySyft + PyGrid
PySyft is an open-source Python 3-based library that enables federated learning for research purposes, combining FL with differential privacy and encrypted computation. It was developed by the OpenMined community and works mainly with deep learning frameworks such as PyTorch and TensorFlow.
PySyft supports two types of computations:
Dynamic computations over data that cannot be seen
Static computations, which are graphs of computations that can be executed later on in a different computing environment
PySyft defines objects, machine learning algorithms, and abstractions. On its own, however, PySyft cannot tackle real data science problems that involve communication across networks; that requires a companion library called PyGrid.
PyGrid implements federated learning on the web, on mobile and edge devices, and on other types of terminals. PyGrid is the API to manage and deploy PySyft at scale, and it can be controlled using PyGrid Admin.
PyGrid consists of three different components:
Domain: A Flask-based application used to store private data and models for federated learning
Worker: An ephemeral compute instance managed by domain components to perform computations on data
Network: A Flask-based application to monitor and control different domain components
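The core PySyft idea, orchestrating computation on data you never see directly, can be sketched with a toy pointer abstraction. This is a conceptual illustration in plain Python, not PySyft's real API; the class and worker names are hypothetical:

```python
# Conceptual sketch of pointer-based remote computation: the caller
# holds only references, while the data never leaves the worker.
class VirtualWorker:
    """Holds private data and executes operations on behalf of callers."""
    def __init__(self):
        self._store = {}
        self._next_id = 0

    def put(self, value):
        self._next_id += 1
        self._store[self._next_id] = value
        return Pointer(self, self._next_id)

    def apply(self, op, a_id, b_id):
        # Computation happens here, on the worker's side.
        return self.put(op(self._store[a_id], self._store[b_id]))

class Pointer:
    """A reference to data living on a worker; supports remote arithmetic."""
    def __init__(self, worker, obj_id):
        self.worker, self.obj_id = worker, obj_id

    def __add__(self, other):
        elementwise = lambda a, b: [u + v for u, v in zip(a, b)]
        return self.worker.apply(elementwise, self.obj_id, other.obj_id)

    def get(self):
        # Explicitly retrieve a result -- the only point where data moves.
        return self.worker._store[self.obj_id]

bob = VirtualWorker()
x = bob.put([1, 2, 3])    # the data stays on bob's side
y = bob.put([10, 20, 30])
z = x + y                 # returns a new pointer, not the values
```

Only the final `z.get()` call would move a result to the caller; all arithmetic happens where the data lives, which is the property that lets federated learning respect data locality.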
Open Federated Learning (OpenFL)
Intel® Open Federated Learning (OpenFL) is a Python 3 open-source project developed by Intel to implement FL on sensitive data. OpenFL provides deployment scripts in bash and leverages certificates to secure communication, but requires users of the framework to handle most of this themselves.
The library consists of two components: the collaborator, which uses a local dataset to train global models, and the aggregator, which receives the model updates and combines them to create the global model. OpenFL comes with a Python API and a command-line interface.
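The collaborator/aggregator split can be illustrated with a single federated-averaging round in plain Python. This sketches the general FedAvg idea rather than OpenFL's implementation, and the collaborator names, gradients, and dataset sizes are hypothetical:

```python
# One federated-averaging round: collaborators take local training steps,
# the aggregator combines the resulting weights, weighted by dataset size.
# A sketch of the general FedAvg idea, not OpenFL's code.

def local_update(global_weights, local_grad, lr=0.1):
    """Each collaborator takes a gradient step using its private data."""
    return [w - lr * g for w, g in zip(global_weights, local_grad)]

def aggregate(updates, sizes):
    """Aggregator: dataset-size-weighted average of collaborator weights."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[d] * n for u, n in zip(updates, sizes)) / total
            for d in range(dim)]

global_weights = [1.0, 1.0]
# Hypothetical per-collaborator gradients and local dataset sizes.
grads = {"hospital_a": [0.2, -0.4], "hospital_b": [1.0, 0.0]}
sizes = [300, 100]   # hospital_a holds three times more data
updates = [local_update(global_weights, g) for g in grads.values()]
new_global = aggregate(updates, sizes)
```

Weighting by dataset size means the collaborator with more data pulls the global model further toward its local update, which is the standard FedAvg behavior.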
Communication between the nodes is secured with mTLS, so certificates are required and each node in the federation must be certified. OpenFL supports lossy and lossless compression of data to reduce communication costs, and it allows developers to customize logging, data split methods, and aggregation logic.
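As an example of lossy compression of model updates, top-k sparsification keeps only the largest-magnitude entries of an update and transmits (index, value) pairs. The sketch below is illustrative only and is not necessarily the scheme OpenFL uses:

```python
# Top-k sparsification: a common lossy compression for model updates.
# Illustrative only; OpenFL's built-in compression pipelines may differ.

def sparsify_top_k(update, k):
    """Keep only the k largest-magnitude entries as (index, value) pairs."""
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]),
                    reverse=True)
    kept = sorted(ranked[:k])
    return [(i, update[i]) for i in kept]

def densify(pairs, length):
    """Reconstruct the full (lossy) update vector on the receiving side."""
    dense = [0.0] * length
    for i, v in pairs:
        dense[i] = v
    return dense

update = [0.01, -2.5, 0.0, 3.2, -0.04, 1.1]   # hypothetical gradient values
compressed = sparsify_top_k(update, k=3)       # 3 pairs instead of 6 floats
restored = densify(compressed, len(update))
```

The small entries are simply dropped (hence "lossy"), halving the payload in this toy example; lossless schemes would instead encode all values more compactly.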
The OpenFL design philosophy is based on the Federated Learning (FL) Plan. It is a YAML file that defines required collaborators, aggregators, connections, models, data, and any required configuration. OpenFL runs on docker containers to isolate federation environments.
TensorFlow Federated (TFF)
TensorFlow Federated (TFF) is a Python 3 open-source framework for federated learning developed by Google. The main motivation behind TFF was Google's need to implement mobile keyboard predictions and on-device search. TFF is actively used at Google to support customer needs.
TFF consists of two main API layers:
Federated Core (FC) API
FC is a low-level programming environment for implementing distributed computations, in which each participant performs complex tasks locally and communicates over the network to coordinate and align its work with the others. FC uses pseudocode-like abstractions to express programs that can run in various target runtimes (mobile devices, sensors, computers, embedded systems, etc.), and a performant multi-machine runtime is included with TFF.
Federated Learning (FL) API
A high-level API that enables plugging existing machine learning models into TFF without requiring a deep dive into how federated learning algorithms work. The FL API is built on top of the FC API.
Federated Learning API consists of three main parts:
Models: Classes and helper functions that enable the wrapping of existing models with TFF
Federated Computation Builders: Helper functions to construct federated computations
Datasets: Canned collections of data to use for simulation scenarios
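What a "federated computation builder" does can be pictured as a helper that constructs a reusable computation over client-held values. The toy below is plain Python, not TFF's actual API:

```python
# Conceptual sketch of a federated computation builder: a function that
# constructs and returns a computation over per-client values. Plain
# Python illustration only -- not TFF's API.

def build_federated_mean():
    """Build a computation that averages values reported by clients."""
    def federated_mean(client_values):
        # In a real runtime, each value lives on a different device and
        # the aggregation happens on a server; here we just simulate it.
        return sum(client_values) / len(client_values)
    return federated_mean

federated_mean = build_federated_mean()
result = federated_mean([0.2, 0.4, 0.9])   # one value per simulated client
```

In TFF, the analogous builders produce computations that the runtime can place across real devices; the separation between building a computation and executing it is the key design idea.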
The separation between the FL and FC layers is intended to serve different kinds of users: the Federated Learning API helps machine learning developers apply FL to TensorFlow models and FL researchers introduce new algorithms, while the Federated Core API is aimed at systems researchers.
IBM Federated Learning
IBM Federated Learning provides a basic fabric for FL on which advanced features can be added. It is not dependent on any specific machine learning framework and supports different learning topologies (e.g., a shared aggregator) and protocols.
It is meant to provide a solid basis for federated learning that enables a large variety of federated learning models and topologies in enterprise and hybrid-cloud settings. IBM Federated Learning supports multiple machine learning approaches, including:
Models written in Keras, PyTorch, and TensorFlow
Linear classifiers/regressions (with regularizer): logistic regression, linear SVM, ridge regression, and more
Decision Tree ID3
Deep Reinforcement Learning algorithms including DQN, DDPG, PPO, and more
NVIDIA CLARA
NVIDIA CLARA is an application framework designed for healthcare use cases. It includes full-stack GPU-accelerated libraries, SDKs, and reference applications for developers, data scientists, and researchers to create real-time, secure, and scalable federated learning solutions. An active deployment of CLARA can be found at the French startup Therapixel, which uses NVIDIA's technology to improve the accuracy of breast cancer diagnosis.
NVIDIA CLARA supports the following use cases:
Clara AGX for medical devices
Clara Discovery for Drug Discovery
Clara Guardian for Hospitals
Clara Imaging for Medical Images
Clara Parabricks for Genomics
Enterprise Federated Learning
Federated learning is a powerful technique to train machine learning models while maintaining privacy and without ever having to share data. Many industries benefit from this approach, such as the healthcare sector, where patient data are considered highly confidential, or manufacturing, where strong IP protection is needed.
When looking at examples and tutorials of open-source software for federated learning, one is tempted to think that good results come quickly. This is mostly because the provided data sets are clean, so you can easily split the data across clients and get started with simple FL experiments. In the real world, however, you face additional complexities.
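One of those complexities is that real federations are non-IID: each partner holds data with its own label mix and distribution. Even simulating this realistically takes deliberate effort, as the hedged sketch below shows, where each simulated client only sees a subset of the label space (all names and numbers here are hypothetical):

```python
# Tutorials often split benchmark data uniformly across clients; real
# federations are non-IID. A label-skewed split makes simulations more
# realistic. Conceptual sketch with hypothetical data.
import random

def label_skewed_split(samples, num_clients, labels_per_client, seed=0):
    """Give each simulated client data from only a subset of the labels."""
    rng = random.Random(seed)
    all_labels = sorted({label for _, label in samples})
    client_labels = [rng.sample(all_labels, labels_per_client)
                     for _ in range(num_clients)]
    shards = [[] for _ in range(num_clients)]
    for sample in samples:
        # Route each sample to a random client that is allowed its label;
        # samples whose label no client holds are simply dropped.
        eligible = [c for c in range(num_clients)
                    if sample[1] in client_labels[c]]
        if eligible:
            shards[rng.choice(eligible)].append(sample)
    return shards

samples = [(i, i % 4) for i in range(40)]   # 40 samples, labels 0-3
shards = label_skewed_split(samples, num_clients=3, labels_per_client=2)
```

Models trained on shards like these converge very differently from the uniform splits in tutorials, which is one reason results from toy experiments rarely transfer directly to production federations.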
If you are working in an industry that has the highest requirements in compliance, security, and availability, you might need to work with an enterprise-grade platform for federated & privacy-preserving data science, such as Apheris.
We take care of the full end-to-end process: contractual, legal, and compliance frameworks; role-based authentication; privacy controls; traceability via a logging solution; and support for the full data science workflow. Apheris provides the fastest and most secure way to progress from experimentation to high-impact collaborations with partners that drive tangible business results from AI.
To securely collaborate with partners on AI and sensitive data in an enterprise context, Federated Learning must be seen from a broader perspective. We go more into depth on what it takes to implement federated learning into enterprise MLOps pipelines in our whitepaper.
Reach out if you want to learn more!