Co‑folding models such as AlphaFold 3, Boltz‑2, and OpenFold3 can predict the structure of protein–ligand and protein–protein complexes. While these models perform strongly on public benchmarks, their accuracy often decreases on novel targets that are underrepresented in the training data. The Apheris Co‑Folding Application is a secure, locally hosted tool for using co‑folding models on sensitive proprietary data. By running Boltz-2 and OpenFold3 on in‑house targets, researchers can assess model accuracy in their specific domain and easily integrate these models into their wider workflows. The product provides visualization tools and streamlined data preparation support, and it generates auditable, reproducible records for use in regulated environments.
Co-folding models predict protein–ligand and protein–protein complex structures in a single forward pass by leveraging statistical patterns learned from large structural datasets. Traditional physics‑based approaches, such as molecular docking or molecular dynamics, instead simulate molecular interactions using equations derived from first principles (i.e., physics) and parameterized force fields fitted to experimental and quantum‑mechanical data. Co-folding models offer a statistical, template‑free route to complex structures that is orders of magnitude faster than physics‑based sampling and requires less user expertise. They are particularly useful for:
Proposing plausible binding poses for ligands or interacting proteins, especially when experimental structures are unavailable.
Generating structural hypotheses to rationalize structure–activity relationships, helping, for example, guide lead optimization in early stages when crystallographic data is unavailable.
Working across diverse modalities such as antibody–antigen interactions, de novo binder design, and other complex biologics. Co-folding models can suggest realistic conformations and interfaces prior to experimental validation.
The current generation of foundation models follows a rapid trajectory:
| Model | License | Key capability | Reference |
|---|---|---|---|
| AlphaFold3 | Research-only, no commercial use | First unified diffusion model for proteins, nucleic acids & small molecules | https://shorturl.at/j5Y88 |
| Boltz-2 | Permissive open-source software | Open-sourced diffusion architecture with an explicit affinity head | https://shorturl.at/H2NEU |
| OpenFold3 (OF3) | Full OSS | First community reproduction that reaches or exceeds AF3 accuracy (pre-print) | Open community reports |
While co-folding methods are rapidly improving and gaining traction as indispensable tools in pharmaceutical, biological, and chemical research, the current generation still has important limitations. Most of these arise from the training data, as current models are trained and evaluated almost exclusively on public Protein Data Bank (PDB) structures. Peter Škrinjar and colleagues examined the consequences of this limited data diversity in their recent study: Have protein‑ligand co‑folding methods moved beyond memorization? The authors constructed an independent benchmark (“Runs N’ Poses”) of 2,600 structures published after the models’ training cut‑offs. They evaluated model performance using stringent success criteria (LDDT‑PLI > 0.8 and ligand‑pose RMSD < 2 Å). They clustered similar structures and analyzed those with few examples in the training data (< 100 examples). They observed an almost linear drop in success rate as training set coverage declined, reaching as low as ~20% for the sparsest bin (Fig. 1 in the paper). The study shows that current co‑folding models make accurate predictions for cases similar to their training data (interpolation) but struggle with distant chemotypes and novel targets (extrapolation).
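The study's pass/fail logic is straightforward to reproduce on your own evaluation sets. The sketch below applies the same thresholds (LDDT‑PLI > 0.8 and ligand‑pose RMSD < 2 Å) to pre‑computed metrics and summarizes the success rate by training‑data coverage; the record layout and field names are illustrative assumptions, not the benchmark's actual code or data format.

```python
# Minimal sketch: apply Runs N' Poses-style success criteria to pre-computed
# metrics and summarize success rate per training-coverage bin.
# The record layout (dicts with "lddt_pli", "rmsd", "train_coverage") is an
# assumption for illustration, not the benchmark's actual data format.
from collections import defaultdict

LDDT_PLI_MIN = 0.8   # success criterion from the study
RMSD_MAX = 2.0       # ligand-pose RMSD threshold in Angstroms

def is_success(pred: dict) -> bool:
    """A prediction counts as a success only if both criteria are met."""
    return pred["lddt_pli"] > LDDT_PLI_MIN and pred["rmsd"] < RMSD_MAX

def success_rate_by_coverage(predictions: list[dict]) -> dict[str, float]:
    """Group predictions by how well their cluster is covered in training data."""
    bins = defaultdict(list)
    for pred in predictions:
        # "<100" for sparsely covered clusters, ">=100" otherwise
        label = "<100" if pred["train_coverage"] < 100 else ">=100"
        bins[label].append(is_success(pred))
    return {label: sum(flags) / len(flags) for label, flags in bins.items()}

# Example usage with toy values
preds = [
    {"lddt_pli": 0.91, "rmsd": 1.2, "train_coverage": 350},
    {"lddt_pli": 0.55, "rmsd": 4.8, "train_coverage": 12},
]
print(success_rate_by_coverage(preds))  # {'>=100': 1.0, '<100': 0.0}
```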
To fully leverage the strengths of co‑folding models while mitigating their limitations, researchers should:
Benchmark models on in-house data: Different models excel at different tasks. Rapid deployment and side-by-side comparison on proprietary targets help identify the best fit for a given use case, especially since public benchmarks reflect training distributions that often differ significantly from industrial data (see the benchmarking sketch after this list).
Preserve IP and compliance: Proprietary ligands and targets are often absent from the PDB and cannot be shared with public inference servers due to confidentiality and regulatory constraints. Local model deployment is essential to retain control over sensitive assets.
Fine-tune on proprietary data: Adapting publicly trained models to internal datasets improves accuracy for your specific chemical space and target classes, especially where proprietary ligands or sequences diverge from public training data.
Ensure modular integration and traceability: Tooling should be easy to integrate into existing pipelines, version-controlled, and auditable to support reproducible research and compliance in regulated environments.
Consider joining federated data networks: Increasing the diversity and size of the training data broadens the models' applicability domain and generalizability; the AI Structural Biology network is one example.
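As referenced above, a side-by-side benchmark on proprietary targets can be as simple as a loop over models and targets with a shared success criterion. The sketch below outlines that idea; `run_cofolding` and `pose_rmsd` are hypothetical placeholders for whatever local inference and scoring tooling you use, not functions of any specific package.

```python
# Illustrative sketch of a side-by-side benchmark on in-house targets.
# `run_cofolding` and `pose_rmsd` are hypothetical placeholders; in practice
# they would wrap your local inference and scoring tools of choice.

def run_cofolding(model_name: str, target: dict) -> dict:
    """Placeholder: run one model locally on one proprietary target."""
    raise NotImplementedError("wrap your local Boltz-2 / OpenFold3 inference here")

def pose_rmsd(prediction: dict, reference: dict) -> float:
    """Placeholder: compare predicted and experimental ligand poses."""
    raise NotImplementedError("plug in your preferred RMSD implementation")

def benchmark(models: list[str], targets: list[dict]) -> dict[str, float]:
    """Fraction of targets each model predicts within 2 A of the reference."""
    results = {}
    for model in models:
        hits = [pose_rmsd(run_cofolding(model, t), t["reference"]) < 2.0
                for t in targets]
        results[model] = sum(hits) / len(hits)
    return results

# e.g. benchmark(["boltz-2", "openfold3"], in_house_targets)
```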
The Apheris Co‑Folding Application is a self‑contained, on‑premise toolkit that lets computational chemists and biologists deploy and use the latest open models, such as Boltz-2 and OpenFold3, without giving up control over sensitive structures, sequences, or ligand libraries. Curious to see it in action? Watch our 3-minute demo video to get a quick walkthrough of how the Co-Folding Application works in practice. Let’s now look at the main design choices and implementation considerations:
| Pillar | Implementation | Why it matters |
|---|---|---|
| Local deployment | All components are packaged as Docker containers and can be deployed directly with Docker or, with a few clicks, via AWS CloudFormation. | No data leaves the corporate network; complies with internal security policies. |
| Streamlined data preparation | Data preparation toolkit to support benchmarking and fine-tuning, covering activities such as deduplication, chain parsing, and MSA validation. | Ensures stable inference and comparable benchmarks across datasets. |
| Model portfolio | Pre-installed Boltz-2 & OF3; users may add custom models; more models will be added in the future. | Compare architectures head-to-head on the same proprietary dataset. |
| Inference API & GUI | Python SDK plus web interface with 3-D viewer. | Effective for both quick visual checks of individual runs and large-scale automated analysis. |
| Comprehensive metrics | Each model returns the 3D protein structures (in PDB format) with associated confidence metrics, including pLDDT scores and Predicted Aligned Error (PAE) or Predicted Distance Error (PDE). These are stored locally and downloadable from the web interface (see the triage sketch after this table). | Rapid triage of false positives; metrics exportable to SD files or dashboards. |
| Optional fine-tuning on own or partner network datasets* | Fine-tune models on own or partner data via the Apheris Gateway. | Enables highest accuracy for your industrial use cases and increases model generalizability for new targets without exposing raw structures. |
*This feature is not part of the initial release but is already prioritized for the next update.
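As an example of what the locally stored confidence metrics make possible, the sketch below scans a results directory and keeps only runs above a pLDDT cutoff. The directory layout, the `confidence.json` file name, the field names, and the cutoff of 70 are assumptions for illustration, not the application's actual output format; the real SDK calls and outputs are described in the product documentation.

```python
# Illustrative triage sketch: scan locally stored run outputs and keep only
# high-confidence predictions. The directory layout, file name, and field
# names are assumptions, not the application's actual output format.
import json
from pathlib import Path

PLDDT_CUTOFF = 70.0  # assumed threshold for flagging low-confidence models

def triage_predictions(results_dir: str, cutoff: float = PLDDT_CUTOFF) -> list[dict]:
    """Keep runs whose mean pLDDT meets the cutoff.

    Assumes each run writes a `confidence.json` file alongside the structure,
    containing mean pLDDT and PAE/PDE summaries (an assumed layout).
    """
    kept = []
    for conf_file in Path(results_dir).glob("*/confidence.json"):
        metrics = json.loads(conf_file.read_text())
        if metrics.get("mean_plddt", 0.0) >= cutoff:
            kept.append({"run": conf_file.parent.name, **metrics})
    return kept

high_confidence = triage_predictions("./cofolding_runs")
print(f"{len(high_confidence)} runs passed the pLDDT >= {PLDDT_CUTOFF} filter")
```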
Internally, each run generates a traceable and immutable record, including model version, configuration parameters, and input hashes, ensuring full reproducibility and enabling auditability for regulatory submissions and scientific validation. Curious how this works in a real research setting? Read our "How to Get Started" user story to follow a step-by-step example of the Co-Folding Application in action.
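The sketch below shows, in general terms, how such a record can be assembled from input hashes, the model version, and the configuration; the field names and structure are illustrative assumptions rather than the application's actual record format.

```python
# Minimal sketch of a traceable run record: hash the inputs, capture model
# version and configuration, and freeze the record as JSON. The field names
# are assumptions for illustration, not the application's actual format.
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def file_sha256(path: str) -> str:
    """Content hash of an input file, so the exact inputs can be verified later."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def build_run_record(model_version: str, config: dict, input_files: list[str]) -> str:
    record = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "config": config,
        "input_hashes": {f: file_sha256(f) for f in input_files},
    }
    # Serialize with sorted keys so the same run always yields the same record text.
    return json.dumps(record, sort_keys=True, indent=2)

# e.g. build_run_record("boltz-2.1.0", {"num_samples": 5}, ["target.fasta", "ligand.sdf"])
```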
Over the past two years, advances in model architecture have improved the predictive accuracy of deep‑learning–based structure‑prediction models. For example, the algorithmic innovations introduced in AlphaFold 3 relative to its predecessor AlphaFold 2 led to significantly better performance on the PoseBusters benchmark, which evaluates both structural accuracy and physical plausibility of protein–ligand complexes. Across multiple target classes, success rates increased substantially; for antibody–antigen complexes in particular, the success rate rose from roughly one‑third to about two‑thirds. Yet because PoseBusters and related assessments are built from public PDB entries, the observed gains by AlphaFold 3 may largely reflect improved interpolation within regions of structural space well-represented in the training data. Co‑folding models perform well in‑distribution but often struggle with out‑of‑distribution predictions (i.e. extrapolation), such as novel folds or ligands with distinct scaffolds. Because benchmarks rarely reflect proprietary sequence or chemotype space, organizations should evaluate model performance on their own data. Building an internal capability to deploy, benchmark, and compare models is the essential first step to fully leveraging the rapid advancements in model development. The Apheris Co-Folding Application enables you to:
Benchmark models on your in-house targets to determine their strengths and limitations and assess model applicability for your specific research tasks
Standardize data preparation, which is the prerequisite for reliable benchmarks, fine‑tuning, and collaborative model training (see the data-preparation sketch after this list)
Generate traceable and fully local predictions, including records that support compliance and enable reproducible research
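As referenced above, standardized data preparation typically boils down to a small set of deterministic checks. The sketch below illustrates two of them, exact-match sequence deduplication and a basic MSA length check; the function signatures and checks are assumptions for illustration, not the application's actual preparation toolkit.

```python
# Illustrative data-preparation checks run before benchmarking or fine-tuning.
# The interfaces and checks are assumptions for illustration only.
import hashlib

def deduplicate_sequences(sequences: dict[str, str]) -> dict[str, str]:
    """Keep one entry per unique chain sequence (exact-match deduplication)."""
    seen, unique = set(), {}
    for name, seq in sequences.items():
        key = hashlib.sha256(seq.upper().encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique[name] = seq
    return unique

def validate_msa(aligned_query: str, aligned_rows: list[str]) -> None:
    """Fail fast if the MSA is malformed: all rows must match the alignment width."""
    width = len(aligned_query)
    bad = [i for i, row in enumerate(aligned_rows) if len(row) != width]
    if bad:
        raise ValueError(f"MSA rows with wrong alignment length: {bad}")

# e.g. validate_msa(aligned_query, rows) before submitting a fine-tuning dataset
```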
Once you know how well a model performs on your in-house data, you can choose to improve it by further customizations such as fine-tuning or post-hoc analyses. To further increase the model's applicability domain and generalizability, you can join a federated data network. The AI Structural Biology (AISB) Network is such an example, addressing the challenge of limited availability of protein–ligand structural data in the public domain. Through the network, participants collaboratively train AI models across their distributed proprietary datasets without ever sharing raw data and while keeping data IP fully protected. These collaborative networks are a powerful next step to improve model performance, but they all start with the same foundation: understanding how models behave on your own data. The Apheris Co‑Folding Application gives you the infrastructure to do exactly that. It helps you identify model blind spots, generate high‑quality structured inputs, and prepare your organization for local fine‑tuning or secure federated training later on.
For installation steps, API references, or configuration tips, access the full product documentation.
Ready to put OpenFold3 and Boltz-2 to work in your own environment? Check out our Co-Folding Application Site.