Vertically distributed federated learning

We live in a world in which data is an asset for improving processes and the overall quality of resources. To run advanced analytics or train machine learning models, the data needs to be large and diverse. However, the essential features for the same samples are often spread across multiple parties, each of which owns part of the data. This is referred to as vertically distributed data.
[Figure: horizontally and vertically distributed data]
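The difference between the two layouts can be illustrated with a small sketch. The records, feature names, and the helper functions `horizontal_split` and `vertical_split` below are hypothetical and exist only for illustration. In a horizontal split each party holds all features for different samples; in a vertical split each party holds different features for the same samples.

```python
# Hypothetical toy records; all identifiers and values are made up.
full_table = {
    "p1": {"age": 54, "blood_pressure": 130, "gene_variant": "A"},
    "p2": {"age": 61, "blood_pressure": 145, "gene_variant": "B"},
    "p3": {"age": 47, "blood_pressure": 120, "gene_variant": "A"},
}


def horizontal_split(table, ids_a):
    """Each party holds every feature, but for different samples."""
    party_a = {k: v for k, v in table.items() if k in ids_a}
    party_b = {k: v for k, v in table.items() if k not in ids_a}
    return party_a, party_b


def vertical_split(table, features_a):
    """Each party holds the same samples, but different features."""
    party_a = {
        k: {f: x for f, x in row.items() if f in features_a}
        for k, row in table.items()
    }
    party_b = {
        k: {f: x for f, x in row.items() if f not in features_a}
        for k, row in table.items()
    }
    return party_a, party_b


h_a, h_b = horizontal_split(full_table, {"p1", "p2"})
v_a, v_b = vertical_split(full_table, {"age", "blood_pressure"})
```

In the vertical case, neither party can train on the complete feature set alone, which is why the samples first have to be linked across parties.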
At Apheris, we have developed a unique solution to make use of vertically distributed data without giving up on data privacy. We combine cryptographic techniques and data privacy technologies to build an end-to-end privacy-preserving analytics system that is composed of two major steps:

  • STEP 1 - Privacy-preserving record linkage: Specialized cryptographic tools make it possible to link the data samples that “belong together” across the parties without revealing any information beyond the fact that these samples belong together.
  • STEP 2 - Federated and privacy-preserving analytics: The data stays with its respective data owner, and the computation is split across the data owners. Further privacy mechanisms, such as adding differentially private noise, guarantee that no data is leaked to any other party. Both simple analysis tasks and complex tasks, such as training a machine learning model, can be carried out.
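The two steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the system described in the whitepaper: the linkage uses a keyed hash (HMAC-SHA256) with a pre-shared key as a stand-in for a real cryptographic linkage protocol, and the privacy mechanism is a plain Laplace mechanism applied to a counting query. The key, identifiers, feature values, and all function names are hypothetical.

```python
import hashlib
import hmac
import random

# Assumption: both parties agreed on this secret key out of band (made-up value).
SHARED_KEY = b"demo-key-not-for-production"


def link_token(identifier, key=SHARED_KEY):
    """Keyed hash of a normalised identifier; a stand-in for a real linkage protocol."""
    normalised = identifier.strip().lower().encode("utf-8")
    return hmac.new(key, normalised, hashlib.sha256).hexdigest()


def tokenise(identifiers):
    """Run locally by each party: map token -> local row index."""
    return {link_token(x): i for i, x in enumerate(identifiers)}


def link_records(tokens_a, tokens_b):
    """Return (row_a, row_b) index pairs for tokens that both parties hold."""
    shared = sorted(set(tokens_a) & set(tokens_b))
    return [(tokens_a[t], tokens_b[t]) for t in shared]


def laplace_noise(scale, rng):
    """The difference of two i.i.d. exponentials is Laplace(0, scale)."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(flags, epsilon, rng):
    """Counting query (sensitivity 1) released via the Laplace mechanism."""
    return sum(flags) + laplace_noise(1.0 / epsilon, rng)


# STEP 1: party A (e.g. EHR data) and party B (e.g. genomic data) link records.
ids_a = ["Alice@example.org", "bob@example.org", "carol@example.org"]
ids_b = ["dave@example.org", "carol@example.org ", "alice@example.org"]
pairs = link_records(tokenise(ids_a), tokenise(ids_b))

# STEP 2: party B computes on its own feature for the linked rows only
# and releases a noisy aggregate instead of the raw values.
carries_variant_b = [True, False, True]  # party B's private feature, made up
linked_flags = [carries_variant_b[j] for _, j in pairs]
released = noisy_count(linked_flags, epsilon=1.0, rng=random.Random(0))
```

A smaller `epsilon` adds more noise and gives stronger privacy at the cost of accuracy. A keyed hash alone offers weaker protection than dedicated protocols such as private set intersection, so it serves here only to show the shape of the workflow.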

Our approach is generic and can be applied in any industry where joint computations on vertically distributed data are necessary. The workflow and the privacy techniques are explained in detail in our whitepaper, both in general terms and using the example of a healthcare scenario with vertically distributed genomic and electronic health record (EHR) data.

Get access to our full whitepaper

If you are interested in reading the full use case and there is collaboration potential, we are happy to send you our whitepaper. Just fill out the form below.

Get more info!

If you are interested in our open-source technology, check out the GitHub repository we co-developed and read the accompanying blog post we co-authored with OpenMined.

Do you have a similar use case?

Contact us if you want to find out how we can improve your models via a privacy-preserving federated learning setup.