
Fine-tuning OpenFold3 on a small set of structures: The PDE10A case study

We fine-tuned OpenFold3 on just 10 PDE10A protein–ligand complexes and evaluated on 17 held-out structures. Even this low-n setup corrected systematic pose errors and improved interface metrics, making predictions more usable for design decisions.

The goal of this experiment was to assess whether fine-tuning the public OpenFold3 model on a very small set of protein–ligand structures can meaningfully improve prediction quality for a specific drug discovery context, without degrading general performance. We focused on human phosphodiesterase 10A (PDE10A), a target where pose accuracy and SAR alignment are critical for design decisions. This short blog documents the experiment; the goal is to:

  1. show that small numbers of liganded structures can materially increase the capabilities of co-folding models, and

  2. provide enough detail that a structural modelling team could reproduce or extend the experiment in their own stack.

1. Question and setup

The practical question was:

Can we correct systematic pose errors of OpenFold3 for a specific target and chemotype using only a handful of protein–ligand complexes?

We chose to evaluate OF3 fine-tuning on a subset of 27 structures from the PDE10A dataset published by Roche in 2022, as it met two key requirements:

  • Held out from the OF3 training set (published after the 2021 training cutoff)

  • The base OF3 model was performing poorly on these structures

We treated this as a realistic “low-n” fine-tuning scenario:

  • Target: human PDE10A

  • Model: OpenFold3 (public weights as of late 2025)

  • Hardware: single NVIDIA H100 GPU

  • Total wall-clock for fine-tuning: ~20 hours

2. Data: training and evaluation splits

Training set (10 complexes, used for fine-tuning) PDB IDs:

5SDY, 5SIQ, 5SI7, 5SIG, 5SI5, 5SI8, 5SIY, 5SG5, 5SGL, 5SIH

Evaluation set (17 held-out complexes, no gradient updates) PDB IDs:

5SH0, 5SE0, 5SHR, 5SJL, 5SH8, 5SF4, 5SFG, 5SE5, 5SHK, 5SEE, 5SFL, 5SJU, 5SKE, 5SKU, 5SKO, 5SEA, 5SKR

3. Fine-tuning protocol

We followed a minimal, single-target fine-tuning setup:

  • Starting weights: public OpenFold3

  • Objective: standard OpenFold3 structure loss on protein–ligand complexes

  • Steps: ~350 gradient steps over the 10 training complexes

  • Optimiser / schedule: reduced the EMA decay factor from the default 0.999 to 0.99, so the model can change meaningfully within a small number of gradient steps; reduced the learning-rate warmup from 1,000 steps to 50; reduced the learning rate from 0.0018 to 0.0003

  • MSAs / templates: as in baseline OpenFold3, but without use of templates
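The EMA change matters more than it may look. With the default decay of 0.999, the averaged weights used for inference would still be dominated by the starting checkpoint after ~350 updates. A minimal sketch of the arithmetic (the helper name is ours, not part of the OpenFold3 training code):

```python
def ema_init_fraction(decay: float, steps: int) -> float:
    """Fraction of the EMA weights still attributable to the initial
    checkpoint after `steps` updates of the standard rule
        ema <- decay * ema + (1 - decay) * current,
    i.e. the initial weights retain a factor of decay ** steps."""
    return decay ** steps

# Default decay: after ~350 steps the EMA is still ~70% base model.
print(ema_init_fraction(0.999, 350))  # ~0.70
# Lowered decay: the EMA is dominated by the fine-tuned weights.
print(ema_init_fraction(0.99, 350))   # ~0.03
```

This is why, at ~350 gradient steps, lowering the EMA decay was necessary for the fine-tuning signal to reach the evaluated weights at all.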

All training was run via ApherisFold in a private environment, with the PDE10A complexes staying inside the organisation’s own infrastructure.

4. Evaluation protocol

For each complex in the 17-structure evaluation set we:

  1. Ran 5 independent co-folding samples with the baseline OpenFold3 model.

  2. Ran 5 independent co-folding samples with the fine-tuned model.

  3. For each model, selected the highest-confidence sample according to the model’s internal confidence score (pLDDT).

  4. Computed the following metrics vs. the experimental structure:

    • global GDT

    • intra-protein lDDT

    • intra-ligand lDDT

    • protein–ligand interface lDDT

    • DockQ (interface-focused composite score)
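The per-complex selection in step 3 is a simple best-of-5 pick by model confidence. A sketch of that step, assuming each sample is represented as a record carrying its mean pLDDT (the record layout is our assumption, not an OpenFold3 API):

```python
def pick_highest_confidence(samples):
    """Given co-folding samples for one complex, return the sample the
    model is most confident about (highest mean pLDDT)."""
    return max(samples, key=lambda s: s["mean_plddt"])

# Hypothetical example: 5 samples for one complex with their confidences.
samples = [
    {"id": i, "mean_plddt": p}
    for i, p in enumerate([71.2, 84.5, 79.9, 83.1, 68.4])
]
best = pick_highest_confidence(samples)
print(best["id"])  # -> 1, the sample with mean pLDDT 84.5
```

Note that selection uses only the model's own confidence, never the experimental structure, so the comparison against the crystal structures in step 4 remains blind.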

5. Results

The bar plot (baseline vs fine-tuned across these metrics, with error bars across the 17 structures) summarises the results.

Results PDE10A case study

Qualitatively:

  • For 5SH8, the baseline model places the ligand in a wrong pose relative to the pocket; key ring systems are mis-registered and several interactions are missing.

  • The fine-tuned model produces a pose that overlays well with the crystal structure and recovers the expected interaction pattern.

  • Inspection shows that the fine-tuned model appears to combine elements of two training complexes (notably 5SDY and 5SI7) to achieve the correct binding mode.

The complex 5SH8 (see purple structure) was used as a central qualitative example:

  • Pink: baseline OpenFold3 prediction (incorrect pose)

  • Yellow: the fine-tuned model’s significantly more accurate prediction

  • Purple: the reference structure of complex 5SH8

  • Green and grey: the two training structures most similar to 5SH8 (5SDY and 5SI7, ranked by spatial overlap with the purple reference). These two complexes are particularly close analogues and illustrate how the fine-tuned model “interpolates” between known binding modes when predicting 5SH8.

Output structures of the PDE10A case study

Quantitatively (across all 17 held-out complexes):

  • All structural metrics showed a clear shift in favour of the fine-tuned model.

  • Improvements were most pronounced at the protein–ligand interface (interface lDDT and DockQ), which is exactly where medicinal chemists care about accuracy.

  • Global protein metrics (GDT, intra-protein lDDT) improved slightly but were already high; the main gain was correcting ligand orientation and local interactions.

Overall, fine-tuning on 10 carefully chosen complexes was enough to move predictions for this chemotype from qualitatively unreliable to experimentally plausible for decision support.

