Pharmaceutical AI initiatives have been constrained by a fundamental trade-off: collaboration improves models, but data sharing introduces unacceptable IP and privacy risks. The Federated OpenFold3 Initiative demonstrates that this trade-off is no longer necessary.
In under ten weeks, five pharmaceutical companies jointly trained a state-of-the-art structural biology model, without sharing any underlying data and without compromising enterprise security boundaries. OpenFold3 (OF3) was fine-tuned across proprietary structural datasets from AbbVie, Johnson & Johnson, Astex Pharmaceuticals, Bristol Myers Squibb, and Takeda, each contributing several thousand experimentally determined protein/small-molecule structures. The resulting federated OpenFold3 model shows stronger interface-focused metrics and a broader applicability domain than the public OpenFold3 and any single-party model in the comparison. Training ran entirely inside each company’s secure environment, with no confidential data leaving its boundary. Locally trained model updates were aggregated in a privacy-preserving manner, producing a federated checkpoint that learns from signal across all participating datasets while each partner retains full control of its data and IP. This article outlines what it took to execute federated learning at this scale in a multi-pharma setup.
OF3 was not built for a cross-company federated setup. Running it across five independent enterprise environments introduced additional constraints on memory usage, communication patterns, synchronization and fault tolerance. Before any production training could begin, the system had to be hardened and stress-tested under those conditions. We worked closely with Dr. AlQuraishi’s team to adapt OpenFold3 for a heterogeneous, multi-region network spanning five separate enterprise environments. This involved aligning the model and orchestration layer with the memory, communication and infrastructure constraints inherent to cross-company federation.
Apheris’ federation capability is underpinned by NVIDIA FLARE, providing an enterprise-grade foundation for cross-company training under strict security constraints. During preparatory testing for our federated setup, the Apheris team uncovered and reproduced a memory leak in a serialization package used by NVIDIA FLARE that would have risked stopping production runs (Figure 1). We validated the root cause and developed tested fixes to stabilize training, contributing the changes back to the open-source project (see https://github.com/NVIDIA/NVFlare/pull/3828).
With these technical foundations in place, the Federated OF3 Initiative operated under three structural constraints typical of industrial pharma, here synchronized for the first time across five independent organizations:
Alignment on security and infrastructure: training had to run inside five separate enterprise environments, each with its own security, infrastructure, and approval processes. Federated training had to be meticulously coordinated to minimize compute idle time and maximize training time.
Alignment of compute windows: the fine-tuning workload focused on a compute-intensive co-folding model and was trained on highly sensitive data, only accessible to the owning pharma. The one-month training window had to be aligned across all partners.
Model delivery requirements: the output had to withstand internal scrutiny. Partners needed a federated checkpoint they could benchmark against proprietary baselines.
For the Federated OF3 Initiative we decided to focus on synchronous federated learning techniques, primarily to ensure that all pharma participants contributed equally to the trained model and to mitigate the risk of bias if some partners train faster than others. At this scale, federated training works only if all environments are ready simultaneously. A successful synchronized training window requires preparation, and nimbleness to resolve the unknown unknowns. If one environment cannot run, the entire cycle pauses. Infrastructure readiness therefore sets the pace for the whole initiative.
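The synchronous, equal-contribution pattern described above can be sketched as simple federated averaging, a minimal illustration assuming plain-Python parameter vectors and equal weighting; the function names are hypothetical and this is not the production Apheris/NVIDIA FLARE implementation:

```python
# Minimal synchronous FedAvg sketch (illustrative only). Each "update" is a
# dict of parameter-name -> list of floats standing in for a model update
# produced inside one partner's secure environment.

def aggregate_round(updates):
    """Equal-weight average of one synchronous round's updates.

    A round only proceeds once *every* participant has submitted its update,
    which is what ties the overall pace to the slowest environment.
    """
    if not updates:
        raise ValueError("a synchronous round requires all participants")
    n = len(updates)
    return {
        key: [sum(u[key][i] for u in updates) / n
              for i in range(len(updates[0][key]))]
        for key in updates[0]
    }

# Five partners, each contributing one shared parameter vector.
round_updates = [
    {"w": [1.0, 2.0]},
    {"w": [3.0, 4.0]},
    {"w": [5.0, 6.0]},
    {"w": [7.0, 8.0]},
    {"w": [9.0, 10.0]},
]
global_update = aggregate_round(round_updates)  # {"w": [5.0, 6.0]}
```

Because every partner's update carries the same weight, no single dataset dominates the aggregated checkpoint, which is the bias-mitigation property the synchronous design was chosen for.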
Infrastructure readiness alone would not have been sufficient. Federated co-folding required deliberate alignment at the data layer. Each partner contributed several thousand experimentally determined protein/small-molecule structures prepared under different internal conventions and preprocessing histories. Harmonizing these datasets required close collaboration with computational scientists and structural biology SMEs at each company. Work focused on:
Converging on a shared schema and preprocessing standard
Running structural and metadata validation locally
Aligning inclusion criteria and quality thresholds
Agreeing on a standard for blind dataset splitting to mitigate data leakage between pharmas
Resolving edge cases surfaced during early dry runs
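The local validation step above can be illustrated with a small check of each record against a shared specification; the field names, required keys, and resolution threshold here are hypothetical stand-ins, not the actual schema the partners agreed on:

```python
# Illustrative local validation against a shared specification.
# Field names and thresholds are hypothetical; each partner would run
# checks like these inside its own environment, so no data leaves it.

SHARED_SPEC = {
    "required_fields": {"structure_id", "ligand_smiles", "resolution_angstrom"},
    "max_resolution_angstrom": 3.0,  # example inclusion threshold
}

def validate_record(record, spec=SHARED_SPEC):
    """Return a list of violations; an empty list means the record passes."""
    problems = []
    missing = spec["required_fields"] - set(record)
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    res = record.get("resolution_angstrom")
    if res is not None and res > spec["max_resolution_angstrom"]:
        problems.append(f"resolution {res} exceeds threshold")
    return problems

ok = validate_record(
    {"structure_id": "X1", "ligand_smiles": "CCO", "resolution_angstrom": 2.1}
)
bad = validate_record({"structure_id": "X2", "resolution_angstrom": 3.5})
```

Only the violation summaries (not the records themselves) need to be discussed across companies, which is how discrepancies can be resolved without central inspection.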
All validation occurred locally, so data never left partner environments. Discrepancies were resolved through joint technical discussions and federated evaluations, since central inspection was not an option. By the start of training, each environment was working with an independently prepared dataset that matched a shared specification. This alignment process establishes a practical blueprint for cross-company data interoperability, one of the most persistent challenges in collaborative AI. Importantly, it shows that standardization can be achieved without centralizing data.
A federated training cycle spanning five pharmaceutical enterprises required close coordination between infrastructure, governance, and security teams. Pharmaceutical companies have one common denominator when software needs to be installed in their environments: rigor. Not meeting their understandably thorough requirements can result in a lack of trust, project delays, or even failure. We therefore treated partner approvals and deployment steps as a defined checklist that had to be completed at each company before federation could start. Each partner:
Vetted then approved the federated computing product under strict internal IT and security requirements
Deployed the Gateway inside its secure environment with support available from Apheris if required
Configured controlled outbound communication for aggregation
A standardized deployment footprint was used across all participants, leveraging Amazon Web Services (AWS) infrastructure to provide compute for training and evaluation workflows. The federated layer integrated into existing enterprise infrastructure without architectural redesign, allowing five organizations to move in parallel while keeping integration friction low; this is an essential condition for scalability across the industry. After deployment, the result was a world-first setup operated within each company’s security boundary:
Total compute: 10 p5.48xlarge instances (8 H100 GPUs each) provided coordinated use of a combined 80 H100 GPUs over close to one month of continuous training.
Per-partner allocation: 16 GPUs per partner.
Orchestration footprint: The Apheris Gateway was installed via Helm chart or on-prem installer; training data was mounted directly on computation pods to optimize performance; networking was configured as egress-only (ingress is not an option in these environments).
Training cadence: Aggregation was performed under stringent network constraints (10 Mb/s). Models were too large to aggregate every batch, so aggregation ran every 10 optimizer steps; a single step accumulated gradients over 8 batches. The gradient accumulation was necessary to match the pre-training regime. We established these hyperparameters during initial experiments on NVIDIA DGX Cloud.
Hosting: The setup is hosting agnostic and works on any cloud/on-prem.
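The training cadence above can be made concrete with a counter-only sketch: each optimizer step accumulates gradients over 8 batches, and a federated aggregation fires every 10 optimizer steps, so one aggregation round covers 80 batches. This is a schematic loop, not the production training code:

```python
# Sketch of the cadence described above. Counters only; the actual loss
# computation, optimizer, and aggregation transport are elided.

ACCUM_BATCHES = 8    # batches accumulated per optimizer step
STEPS_PER_AGG = 10   # optimizer steps between federated aggregations

def run_partner(total_batches):
    """Count optimizer steps and aggregation rounds for one partner."""
    optimizer_steps = 0
    aggregations = 0
    for batch_idx in range(1, total_batches + 1):
        # ... compute loss and accumulate gradients locally ...
        if batch_idx % ACCUM_BATCHES == 0:
            optimizer_steps += 1   # one optimizer step per 8 batches
            if optimizer_steps % STEPS_PER_AGG == 0:
                aggregations += 1  # ship a model update for averaging
    return optimizer_steps, aggregations

steps, aggs = run_partner(total_batches=800)
# 800 batches -> 100 optimizer steps -> 10 aggregation rounds
```

Spacing aggregations 80 batches apart is what keeps the multi-gigabyte update traffic feasible over a 10 Mb/s link while preserving the effective batch size of the pre-training regime.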
Once deployments were live, the infrastructure was stable and predictable enough to support a multi-week federated training run across all partners.
In synchronous federated training each round advances only after every participating company has completed its local step and transmitted its update. The overall pace is therefore determined by the slowest environment. Minor performance differences compound quickly. This dynamic is inherent to cross-company federation and becomes more pronounced as workloads scale.
To make this concrete, consider a federated run across five pharmaceutical companies. Every few minutes, each company completes a local training step and returns a model update of roughly 2 GB. If four companies finish in two minutes but one takes 24 seconds longer, the aggregation step cannot begin until that final update arrives. Over 100 rounds, that seemingly small 24 s delay accumulates to 40 minutes; over 500 rounds, it extends to more than three hours.
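The back-of-the-envelope arithmetic behind the straggler numbers above is worth writing out; the function name is ours, chosen for illustration:

```python
# Cumulative cost of a synchronous straggler: one partner finishing
# delay_seconds later than the rest in every round delays every round.

def straggler_overhead_minutes(delay_seconds, rounds):
    """Total extra waiting time, in minutes, across a run."""
    return delay_seconds * rounds / 60

per_100 = straggler_overhead_minutes(24, 100)   # 40.0 minutes
per_500 = straggler_overhead_minutes(24, 500)   # 200.0 minutes (~3.3 hours)
```

The linear growth is the point: a delay that is invisible in any single round dominates the schedule once runs span hundreds of rounds.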
Because compute reservations were scheduled in advance across all partners, timing was tightly coupled from the start. That coordination did not eliminate performance differences between environments, but it made them visible immediately. During early runs we found that available network bandwidth could vary considerably between participants. Because roles and escalation paths had been carefully designed and agreed up front, the technical leads could treat it as a shared engineering issue rather than a local firefight. Engineers from the partner organizations, AWS, and Apheris worked together to identify and remove the bottleneck, and the issue was resolved without letting the schedule drift. To keep federated runs stable over many rounds, two operating principles mattered.
First, participation was gated: a company entered a federated cycle only after compute provisioning, deployment validation, and security approvals were complete; if a partner was not ready, they joined the subsequent cycle.
Second, environments were monitored closely during execution so deviations were caught early, before small delays accumulated across rounds.
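The readiness gating in the first principle can be sketched as a simple partition over a per-partner checklist; the partner names and status dictionary are hypothetical, and the check names mirror the prerequisites listed in the text:

```python
# Illustrative readiness gate: a partner joins a federated cycle only when
# every prerequisite is complete; otherwise it defers to the next cycle.

READINESS_CHECKS = (
    "compute_provisioned",
    "deployment_validated",
    "security_approved",
)

def partition_participants(partner_status):
    """Split partners into this cycle's participants and deferred joiners."""
    ready, deferred = [], []
    for partner, status in partner_status.items():
        if all(status.get(check, False) for check in READINESS_CHECKS):
            ready.append(partner)
        else:
            deferred.append(partner)
    return sorted(ready), sorted(deferred)

ready, deferred = partition_participants({
    "partner_a": {"compute_provisioned": True,
                  "deployment_validated": True,
                  "security_approved": True},
    "partner_b": {"compute_provisioned": True,
                  "deployment_validated": False,
                  "security_approved": True},
})
```

Gating on the full checklist, rather than admitting a nearly-ready partner, is what prevents a single unprepared environment from stalling every synchronous round.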
While synchronous federation introduces coordination dependencies, where delays in one environment can affect the overall cycle, these risks were mitigated through strict readiness gating, predefined escalation paths, and continuous monitoring. With that approach we avoided a run that slowly degraded into waiting time and rescheduling.
The Federated OpenFold3 Initiative showed that multi-pharma federation can function as a coordinated, repeatable operating model and still deliver a state-of-the-art checkpoint on an ambitious, pre-defined timeline. Subsequent training cycles can now be executed faster, with lower coordination overhead, and with compounding performance gains. Just as importantly, the initiative left behind infrastructure and working habits that make the next cycle easier than the first: repeatable deployments, governance for releasing checkpoints, and a shared way to evaluate updates locally on proprietary benchmarks. That last piece is where the value starts to compound: teams can compare new releases against their baselines, integrate what holds up, and iterate without re-opening the same operational work each time. The AISB Network is now building on this foundation, expanding the OpenFold3 Initiative to additional modalities and extending into broader small- and large-molecule use cases. We welcome additional partners to join the industry’s largest federated structural biology network, already operating successfully at scale.
Authors: Ian Hales, Nicolas Gautier, Benedict W. J. Irwin, Avelino Javer, José-Tomás (JT) Prieto, Mark Sharpley