Protenix-v1 is now available in ApherisFold

Protenix-v1, a fully open-source co-folding model reported to reach AlphaFold3-level performance under matched conditions, is now available in ApherisFold. Teams can run and benchmark it locally on proprietary datasets, compare it side-by-side against OpenFold3 and Boltz-2 on in-house targets, and decide where it is reliable enough to support structure-informed decisions within real DMTA workflows.

Why Protenix-v1 is a meaningful addition for DMTA-driven programs

Protenix-v1 is presented as the first fully open-source biomolecular structure prediction model to reach AlphaFold3-level performance under a matched training-data cutoff, model size, and inference budget. This matters because algorithmic performance can only be evaluated meaningfully under comparable training conditions; otherwise, observed differences may simply reflect expanded or newer training datasets. Across multiple benchmarks and modalities, Protenix-v1 improves over representative public baselines (e.g., Boltz-1, Chai-1, HF3, and prior Protenix releases). This is shown across:

  • Protein–protein docking

  • Antibody–antigen interface prediction

  • Protein–ligand co-folding

FoldBench-corrected results from the Protenix-v1 pre-print.

The paper also reports several concrete improvements relative to AlphaFold3 across key structural tasks. For example, Protenix-v1 achieves a 52.31% DockQ success rate for antibody–antigen interfaces, compared to 48.75% reported for AlphaFold3. For protein–protein docking, Protenix-v1 reports 72.70% DockQ success rate versus 71.73% for AlphaFold3. Improvements are also observed for protein–RNA complexes, where Protenix-v1 reaches 68.46% DockQ success rate compared to 65.22%, and for RNA monomer prediction, where Protenix-v1 reports 0.6547 lDDT versus 0.6140. Performance is similar in protein–ligand prediction, where AlphaFold3 reports 62.59% DockQ success rate compared to 62.54% for Protenix-v1, although Protenix-v1 slightly exceeds this when using four inference seeds. One area where AlphaFold3 remains stronger is protein–DNA docking, where it reports 75.91% DockQ success rate compared to 69.13% for Protenix-v1.
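For readers less familiar with the metric, the DockQ success rates quoted above are typically computed as the fraction of evaluated targets whose prediction reaches DockQ ≥ 0.23, the CAPRI-derived "acceptable quality" threshold. A minimal sketch, using made-up per-target scores for illustration:

```python
# Sketch: how a DockQ "success rate" is typically computed.
# The 0.23 threshold marks "acceptable" quality in the
# CAPRI-derived DockQ convention; scores below are illustrative.
def dockq_success_rate(scores, threshold=0.23):
    """Fraction of targets whose prediction reaches the threshold."""
    if not scores:
        raise ValueError("no scores provided")
    return sum(s >= threshold for s in scores) / len(scores)

# Hypothetical per-target DockQ scores, for illustration only
example = [0.81, 0.10, 0.45, 0.22, 0.67]
print(f"{dockq_success_rate(example):.2%}")  # 3 of 5 scores clear the threshold
```

Reported percentages are then averaged or aggregated across benchmark subsets, which is why target coverage and clustering (discussed below) matter so much for comparability.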

Figure 2 PXM Main Results. Performance across distinct PXM subsets is listed. ‘Median’ denotes the median performance across 5×5 sampled structures (solid filled bars), while ‘Selected’ denotes the performance of structures selected by confidence score (outlined bars with no fill).

These results are reported using the expanded PXM-22to25 benchmark suites, which increase target coverage and reduce sensitivity to single-year dataset effects, providing a more stable basis for cross-model comparison in practical drug discovery settings.

Table 5 PXM-22to25-Antibody and PXM-22to25-Ligand Results. ∗ denotes that this evaluation dataset contains part of the Boltz-2 training data. Antibody subset results are averaged across 498 complexes with 359 clusters, while ligand subset results are averaged across 625 complexes with 252 clusters, after intersection with Boltz-1 and Chai-1.

1) Inference-time scaling behavior

One of the most relevant aspects for DMTA cycles is the documented inference-time scaling behavior. The paper shows that increasing the sampling budget leads to consistent improvements in DockQ success rate and lDDT, particularly for antibody–antigen complexes.

Figure 1 Evaluation Results. (b) Protenix-v1 exhibits robust inference-time scaling on PXM-22to25-Antibody and FoldBench subsets. On PXM-22to25-Antibody, Protenix-v1 results are bootstrapped from 100 seeds.

This establishes a predictable compute–accuracy trade-off. For difficult targets or ambiguous interfaces, teams can allocate additional sampling budget and obtain measurable improvements rather than relying on single-run stochastic outputs. In a program context, that makes structural evaluation more controllable and less dependent on chance variation between runs.
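The best-of-N pattern behind this trade-off can be sketched as follows. Here `predict_structure` and `confidence` are toy placeholders standing in for a co-folding model and its ranking score, not an actual Protenix or ApherisFold API:

```python
import random

def best_of_n(predict_structure, confidence, n_seeds):
    """Sample one structure per seed and keep the highest-confidence one."""
    candidates = [predict_structure(seed=s) for s in range(n_seeds)]
    return max(candidates, key=confidence)

# Toy stand-ins: deterministic per-seed "predictions" whose confidence
# tracks quality, so a larger sampling budget can only help.
def predict_structure(seed):
    return {"seed": seed, "dockq": random.Random(seed).random()}

def confidence(structure):
    return structure["dockq"]

for n in (1, 4, 16):
    best = best_of_n(predict_structure, confidence, n)
    print(f"n={n:2d}  best DockQ (toy) = {best['dockq']:.3f}")
```

Because the candidate sets are nested as the seed budget grows, the selected quality is non-decreasing in N; in practice the gain per extra seed diminishes, which is what makes the compute–accuracy trade-off plannable.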

2) Year-stratified and expanded benchmarks

The authors introduce year-based evaluation suites (PXM-2024, PXM-2025, PXM-22to25-Antibody, PXM-22to25-Ligand) to reduce dataset bias and increase statistical power. Antibody–antigen evaluation is particularly sensitive to cluster sparsity and inconsistent subset reporting. The expanded antibody benchmark and bootstrapped variance reporting address this directly.
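The bootstrapped variance reporting idea can be illustrated with a short sketch: resample the evaluation clusters with replacement and recompute the metric, so each model is reported with a spread rather than a single point estimate. The per-cluster scores below are made up for illustration:

```python
import random
import statistics

def bootstrap_metric(cluster_scores, n_boot=1000, seed=0):
    """Mean and standard deviation of the metric over bootstrap resamples."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.choices(cluster_scores, k=len(cluster_scores)))
        for _ in range(n_boot)
    ]
    return statistics.fmean(means), statistics.stdev(means)

# Hypothetical per-cluster success indicators (1 = DockQ success)
toy = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
mean, spread = bootstrap_metric(toy)
print(f"success rate ~ {mean:.2f} +/- {spread:.2f}")
```

With sparse antibody–antigen clusters, the bootstrap spread makes clear when an apparent gap between two models is within noise, which is exactly the failure mode inconsistent subset reporting tends to hide.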

Figure 4 Antibody Test Sets. Existing antibody–antigen tests yield inconsistent conclusions.

3) Practical feature parity

Beyond benchmark performance, Protenix-v1 includes components relevant for applied co-folding workloads and brings its inputs and training data construction closer to AlphaFold3-style pipelines while remaining fully open-source. Protenix-v1 integrates:

  • Protein template features

  • RNA MSA support

  • Expanded disorder-focused distillation

  • Large-scale monomer distillation (MGnify-based)

Together, these additions result in a model configuration that can leverage template information, RNA sequence context, and distillation-derived structural signal to improve robustness across diverse biomolecular inputs, without relying on proprietary components.

A second variant for applied settings

In addition to the strict cutoff model used for controlled comparison, the authors release Protenix-v1-20250630, trained on a more recent dataset to improve performance on newly released targets. This separation between benchmark-aligned and application-oriented variants reflects a practical distinction relevant for drug discovery programs: controlled evaluation versus maximum performance on contemporary structural space.

What this enables inside ApherisFold

With Protenix-v1 now available alongside OpenFold3 and Boltz-2, teams can:

  • Benchmark models side-by-side on proprietary targets

  • Re-run the same evaluation setup as new checkpoints are released

  • Assess applicability across specific chemotypes and interface classes

  • Integrate predictions directly into generative design, screening, and prioritization workflows

  • Inspect and compare structural outputs within the same environment

For organizations operating under tight DMTA timelines, the objective is not simply higher benchmark scores. It is the ability to generate structure-informed decisions that are reproducible, comparable across model versions, and defensible when they influence compound selection or make/no-make calls. Protenix-v1 expands the model portfolio in ApherisFold with a high-performing open-source option that can now be evaluated and deployed under full IP control within real drug programs.

Access ApherisFold

For teams who want to build a co-folding capability into their discovery engine, ApherisFold provides an enterprise-ready way to run models like Protenix-v1, OpenFold3 and Boltz-2 locally, benchmark them on proprietary structures, and operate fine-tuning workflows across programs as a standard part of R&D.

