ApherisFold Application🔗

Welcome to the ApherisFold Application. This guide will walk you through the steps needed to deploy, configure, and begin using the application in your local or cloud environment.

Overview🔗

The ApherisFold Application allows you to predict the 3D structures of macromolecule complexes with an emphasis on protein–ligand, protein–protein, and antibody–antigen interfaces using OpenFold3 (OF3) and Boltz-2. The affinity prediction capabilities of Boltz-2 are retained.

Key capabilities include:

Quick and convenient deployment to your local, cloud, or on-premises environment: deploy via a pre-built container, an AWS CloudFormation template, or a build from source
Model inference (for both OF3 and Boltz-2) with input validation, batch execution programmatically or via a scientist-friendly GUI with built-in visualizations (pLDDT, PAE, structure viewers)

We will soon release additional support for data preparation, benchmarking and fine-tuning.

System Overview🔗

Apheris has created Docker images for its models, built on open-source versions and equipped with an HTTP API wrapper for easy integration with third-party systems.

Models and Requirements🔗

OpenFold3 and Boltz-2 are included in the initial release of the ApherisFold application. Both models are large and can require 25GB of disk space or more. These models have been packaged by Apheris as Docker images and can be freely downloaded. The images can be pulled in advance if pulling from our repository is not an option for your environment. Please contact us for assistance with this process if needed.

We target the latest release from the respective model providers. For the most up-to-date information about model versions, see the model description from within the Hub when installing a model.

Recommended Hardware:

Modern GPU with at least 48GB GPU memory and CUDA 11+ support (e.g. NVIDIA A100, H100, L40S, RTX 6000 and others). In AWS, the G6e instance is an example of a cost-effective machine that supports OpenFold3.
At least 300GB of disk space
Docker environment with Nvidia GPU drivers and the Nvidia Container Toolkit installed

Architecture🔗

When you select a model for inference in the Apheris Hub, a corresponding Docker container is launched. Initially, only the model container is started; system resources, including the GPU, are not allocated until a query is submitted. When a query is picked up for processing, the model starts in a subprocess and the GPU resources are allocated. After the query completes, the GPU is released for future use.

You can submit multiple inference queries at the same time. However, because each query requires full access to the GPU, they are processed sequentially.

When a query is submitted, the input is provided in JSON format as specified in the Running Inference section below, along with any assets uploaded via the "assets" option. Both are saved to the input folder specified during the deployment process (see Deployment Guides).

Once inference is complete, the model's predictions and any associated logs are written to the respective output folder and immediately viewable in the UI or accessible by your own external tooling.

Getting Started🔗

Deploy the Apheris Hub🔗

The ApherisFold Application is the first application available within the Apheris Hub and can be easily installed by following our Deployment Guides.

You can install the Hub in a few ways: from a pre-built container, via the AWS CloudFormation template, or by building from source.

After installation, you can access the Apheris Hub from your web browser.

If something does not work as expected, see the Troubleshooting Guide, which covers common operational problems and provides step-by-step techniques to help identify, address, or confirm issues.

If you need help or encounter any issues, and the Hub is running, you can generate a Support ZIP archive directly from your web browser (Settings > Support > Download Support ZIP) or via the API. This archive includes logs and diagnostic information that will help our team assist you more efficiently. Please email the Support ZIP file, along with a description of your issue, to support@apheris.com.

Install Models🔗

The Apheris Hub allows you to download multiple models (currently OpenFold3 and Boltz-2) for the ApherisFold application. These models have been packaged by Apheris as Docker images and can be freely downloaded. They come with a standardized query payload and can be called via an HTTP API. See Running Inference for more information.

To install a model, navigate to the models screen and select a model and version from the install model list.

Mock Model🔗

You can also download a Mock Model, which is useful for installing the Hub locally and testing the full end-to-end workflow of the ApherisFold application.

The Mock Model:

Is pulled from the same repository as all other models
Only validates and works with the example queries
Produces results similar to those of OpenFold3
Produces results that can be visualized and downloaded like those of any other model

Managing Models🔗

Once you have installed a model, you can start and stop it via the model settings.

The settings page also displays model details including:

MSA server configuration
Container status and uptime
Container port

Uninstall Models🔗

In addition to starting and stopping, you can also uninstall the model from here by clicking the "Uninstall" button.

Multiple Sequence Alignment (MSA)🔗

MSAs for ApherisFold can be provided either via pre-computed .a3m files, or by using a compatible MSA server implementation.

The default behavior for models is to use user-supplied pre-generated MSA (.a3m) files with an option to use an existing MSA server.

ApherisFold supports ColabFold and Foldify servers, and these can be configured using the MSA page in the Apheris Hub. From this page you can specify the type of server, its URL and, in the case of Foldify, the API key.

For a limited time, the Hub ships with free evaluation access to an Apheris-hosted Foldify server that can optionally be used. The Apheris-hosted public MSA server should be used only for evaluation and should be treated as a public MSA API. If you'd like to host your own private Foldify server, please contact us.

Important

Any queries submitted using the Apheris-hosted Foldify server or public ColabFold versions of models will be sent to an external MSA server hosted by at least one of ColabFold, Foldify, or Apheris. If you do not want your sequences to leave your environment, use the default model and supply your pre-generated MSA files, or use a self-deployed ColabFold or Foldify server.

Configurable MSA Server🔗

You can easily add your own ColabFold or Foldify MSA server via the MSA option.

To configure your server simply supply:

Server Name
Server Type: ColabFold or Foldify
Server URL
Number of Returned Sequences
API Key

MSA Usage🔗

Unless an MSA server is specifically enabled, ApherisFold expects you to provide MSAs for protein sequences by uploading an .a3m file. You can do this by clicking the "ADD MSA ASSET" button on the Query Builder, then selecting the .a3m file from your file system. With this method, the ApherisFold application will not contact any external servers. For each protein chain in the request, add an "msa": "filename.a3m" field (input validation fails if the msa field is missing).
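The rule that every protein chain needs an msa field can be checked before a request is submitted. The following is a minimal sketch, not the Hub's own validator: it assumes the request follows the queries → chains layout and the molecule_type, sequence, smiles, and msa field names shown in the JSON examples elsewhere in this guide. The function name and the example sequences are hypothetical.

```python
from typing import Dict, List


def find_chains_missing_msa(queries: Dict[str, dict]) -> List[str]:
    """Return 'query_name[chain_index]' labels for protein chains without an 'msa' field."""
    missing = []
    for query_name, query in queries.items():
        for index, chain in enumerate(query.get("chains", [])):
            # Only protein chains need an MSA; ligands, RNA and DNA are skipped.
            if chain.get("molecule_type") == "protein" and not chain.get("msa"):
                missing.append(f"{query_name}[{index}]")
    return missing


# Hypothetical request: one protein chain with an MSA, one without, plus a ligand.
example_queries = {
    "example-query": {
        "chains": [
            {"molecule_type": "protein", "sequence": "MKTAYIAK", "msa": "chain_a.a3m"},
            {"molecule_type": "protein", "sequence": "GSHMLEDP"},
            {"molecule_type": "ligand", "smiles": "CCO"},
        ]
    }
}

print(find_chains_missing_msa(example_queries))
```

Running a check like this locally gives quicker feedback than waiting for input validation to reject the request.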

To enable an MSA server for your query, use the switch below the Query Builder entitled "Allow [MSA Server Name] usage". When activating this, you will receive a warning stating that your sequences will be processed by an external service. If accepted, this will not show again for the duration of your browser session.

Enabling this option generates the MSAs for the request at inference time and ignores any supplied MSA files. You will also be prompted to confirm this choice, and the prompt advises that doing so may send requests to an external server.

See the Running Inference section for an example of using your own MSA files for inference.

Apheris can also help you set up a private MSA server. In that case, please contact us.

Running Inference🔗

Running inference can be done via the UI or the Hub API. Here, we cover running inference using the UI.

When you first click on Inference, you will need to choose a model to start. This is because the schema validation for the model parameters must first be obtained from that model, as this may differ between models and versions.

Query Builder🔗

You can build an inference query with the graphical Query Builder or the JSON Queries Editor.

You can adjust the query to fit your needs, for example by editing existing entries or by adding further components or chains.

Here are some notable ways to configure a query:

Molecule Type: Select between Protein, Ligand, RNA, and DNA. There are some type-specific features such as the ability to use SMILES vs CCD code for ligands.
Copies: Adjusts the number of copies for the chain. A name is automatically created and usually does not need to be modified. If you'd like to adjust the name, this can be done in the JSON Queries tab.
Add Chain: Adds a chain to the query. If you need to delete the chain, hover the mouse to the top right. There must always be at least one chain per query.
Add Query: Add a new query to the request. Hover the mouse to the top right of the query to delete it. Delete will not appear if there is only one query as there must always be at least one query.

JSON Queries Editor🔗

The JSON Queries Editor view is considered an advanced view and it is recommended to instead use the Query Builder to form new queries. The queries made via the Query Builder are reflected in the JSON Editor view but not all changes in the JSON Editor will be reflected back to the Query Builder.

For advanced users, the JSON Editor offers many convenience features such as syntax highlighting, keyword auto-completion, and data type validation.

Tags🔗

Query names supplied in the JSON payload are automatically added as tags for the request. These tags can later be used to search for specific inference requests.

Custom Tags🔗

You can supply your own tags by typing directly in the "Tags" field. These tags will appear with the results and can be searched for fast lookups across all queries and results. Custom tags are a great way to distinguish between requests that might have similar queries.

Examples🔗

The ApherisFold application comes with examples to help with getting started. Each of these examples can be run across all models. These can be used as starting templates for your queries or as a simple way to evaluate the full workflow.

The examples provide MSA files and do not need an MSA server to run. The examples can also be used with configured MSA servers, as described in Multiple Sequence Alignment (MSA).

Below we provide a bit more detail on each example:

Protein Ligand: The simple use-case of co-folding a small molecule with a single protein chain. The protein chain corresponds to the sequence of MCL1, and the small molecule is Aceclidine.
Protein-Protein Multimer: Hemoglobin is an example of a query consisting of a multimer of protein chains, with no ligand present.
Multi-Protein Ligand: A single query consisting of multiple sub-queries. In this case, MCL1 is co-folded with two different ligands (Aceclidine and 1-benzothiophene-2-carboxylic acid).
Monomer Protein: A single protein chain, Ubiquitin.
Homomer Protein: A dimer of identical protein chains, in this case the GCN4 Leucine Zipper.

Evaluation Examples🔗

The remaining examples illustrate how the ApherisFold platform can be used to evaluate models on reference structures. Each one is identified by a PDB ID (since the data in these examples is derived from the public domain) and is accompanied by a "{query_name}-ground-truth.cif" file attached as an asset.

Note

If you provide your own .cif files to serve as ground-truth, the name of the file should match the name of the query in your request.

After predicting the structure, the prediction is aligned to this reference structure, and several metrics of structural similarity are computed. The superimposed structures and metrics can then be viewed in the corresponding results page.

Assets and MSA files🔗

If your request requires additional assets, such as an MSA file, they can be supplied by clicking the Assets button or dropping in files.

Once you have uploaded your assets, they need to be included in your query by clicking the MSA Asset selection dropdown to choose from already added files or upload a new file. The JSON Editor will reflect this with the msa field.

If you want to set your request up as an evaluation of model performance on a known structure, make sure to upload the reference CIF file with the naming schema "{query_name}-ground-truth.cif".
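The naming schema can be checked programmatically before an evaluation request is submitted. This is a small sketch under the assumption that the reference files sit in a single asset folder on disk; the helper names are hypothetical and not part of the Hub.

```python
from pathlib import Path
from typing import Iterable, List


def ground_truth_name(query_name: str) -> str:
    """File name a reference structure must have for the given query name."""
    return f"{query_name}-ground-truth.cif"


def missing_ground_truth(query_names: Iterable[str], asset_dir: Path) -> List[str]:
    """Return the expected reference files that are not present in asset_dir."""
    return [
        ground_truth_name(name)
        for name in query_names
        if not (asset_dir / ground_truth_name(name)).is_file()
    ]


print(ground_truth_name("5sfc"))
```

An empty list from missing_ground_truth means every query has a matching reference file in the folder.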

Model Settings🔗

Some models have additional model-specific settings that can be configured, such as the number of diffusion samples. These can be changed on the Inference screen by selecting the "Model Settings" tab at the top.

You can click the "Defaults" button to see the available model settings and make changes.

Results Management🔗

Whatever you submit in the JSON payload (one or multiple queries) is submitted as a single request.

Submitting additional requests will place those requests in a queue. The GPU is fully consumed per request. Parallelism and multi-GPU support will come in follow-up releases.

All submitted requests and results can be viewed on the Results page.

Request Search🔗

You can also filter down results based on tags set for the queries or for metadata such as model name.

Request Status🔗

Submitted requests can have a few statuses:

Pending
Running
Failed
Done: when completed, the status shows the total query runtime.

In addition to the request status, you can also see when the request was created, the model and version, and any tags associated with the request.

Results Deletion🔗

Request results are persisted in the output folder specified when you deployed the Apheris Hub for the first time.

Currently, there is no built-in way to delete these results. Instead, they are managed through the file system. This is an intentional measure to avoid any accidental deletion of valuable insights.

Analyzing Results🔗

The ApherisFold application comes pre-packaged with specialized analysis tools to support analyzing inference results within the UI.

You can also Download Results by clicking the raw results download button at the top of the results screen.

Each of these visual components serves a specific purpose in interpreting co-folding predictions:

3D viewer: Understand overall fold and domain organization
PAE/PDE: Assess inter-residue or inter-chain confidence
pLDDT plot: Gauge per-residue reliability
Sequence section: Review input-output fidelity and interpret ligand participation

3D Structure Viewer (Left Panel)🔗

This panel displays the 3D atomic structure of the predicted protein or protein-ligand complex. The cartoon representation emphasizes the secondary structure elements (helices, sheets, loops).

Viewer Color Coding🔗

The structure is colored based on per-residue pLDDT scores, which indicate the model’s confidence in the predicted atomic positions. The scale typically follows:

Blue: Very high confidence (pLDDT > 90)
Green: Confident (70–90)
Yellow/Orange: Low confidence (50–70)
Red: Very low confidence (< 50)

This can be helpful for visualizing local and global structure quality, identifying disordered or uncertain regions, and verifying expected fold topology.
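When working with downloaded results outside the UI, the same confidence bands can be reproduced in a few lines. The sketch below is illustrative only: the listed ranges do not say which band the exact values 50, 70, and 90 belong to, so the boundary handling here (90 counts as "confident", 70 as "confident", 50 as "low") is an assumption, and the function name is hypothetical.

```python
# Colors as listed in the viewer legend above.
BAND_COLORS = {
    "very high": "blue",
    "confident": "green",
    "low": "yellow/orange",
    "very low": "red",
}


def plddt_band(score: float) -> str:
    """Map a per-residue pLDDT score (0-100) to its confidence band."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"pLDDT must be between 0 and 100, got {score}")
    if score > 90:
        return "very high"
    if score >= 70:  # assumption: boundary values fall into the higher band
        return "confident"
    if score >= 50:
        return "low"
    return "very low"


for value in (95.2, 81.0, 63.5, 42.0):
    band = plddt_band(value)
    print(value, band, BAND_COLORS[band])
```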

PAE / PDE Matrix (Right Panel)🔗

The Predicted Aligned Error (PAE) or Predicted Distance Error (PDE) matrices visualize the model's expected error in positioning residue pairs relative to one another. This is essential for interpreting inter-domain and inter-chain interaction confidence and is especially relevant in multi-chain or protein-ligand complexes.

PDE (specific to Boltz-2): a measure of the model's uncertainty in the distance between two residues or ligand atoms in the prediction, which is useful for uncertainty-aware modeling.
PAE: additionally captures the model's uncertainty in the relative orientation, as well as the distance, of pairs of residues or ligand atoms in the prediction. For further reading, the European Molecular Biology Laboratory (EMBL) provides a guide with an excellent visualization.

Both of these are most easily visualized as a heat map, where each axis corresponds to the sequential amino acid, DNA, or RNA residues or ligand atoms.

Matrix Color Coding🔗

Darker cells imply lower predicted error (higher confidence), while lighter regions suggest areas of structural uncertainty.
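For a single number that summarizes inter-chain confidence, a common approach is to average the off-diagonal blocks of the matrix that link two chains. The sketch below assumes the matrix is available as a square nested list whose rows and columns follow the chains in order, one after another. How ApherisFold exports the matrix is not specified here, so treat the input layout and the function name as assumptions.

```python
from typing import List, Sequence


def inter_chain_mean_pae(
    pae: Sequence[Sequence[float]],
    chain_lengths: List[int],
    chain_a: int,
    chain_b: int,
) -> float:
    """Mean predicted error over both off-diagonal blocks linking two chains."""
    # Start index of each chain within the concatenated residue order.
    starts = [sum(chain_lengths[:k]) for k in range(len(chain_lengths))]
    range_a = range(starts[chain_a], starts[chain_a] + chain_lengths[chain_a])
    range_b = range(starts[chain_b], starts[chain_b] + chain_lengths[chain_b])
    # PAE is not symmetric, so both the A-to-B and B-to-A blocks are included.
    values = [pae[i][j] for i in range_a for j in range_b]
    values += [pae[i][j] for i in range_b for j in range_a]
    if not values:
        raise ValueError("selected chains contain no residues")
    return sum(values) / len(values)


# Toy 3x3 matrix: chain 0 has two residues, chain 1 has one.
toy_pae = [
    [0.0, 1.0, 8.0],
    [1.0, 0.0, 6.0],
    [7.0, 5.0, 0.0],
]
print(inter_chain_mean_pae(toy_pae, [2, 1], 0, 1))
```

A lower mean indicates higher confidence in how the two chains are placed relative to each other.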

pLDDT Plot🔗

This line graph shows the pLDDT score per residue, providing a quick global overview of prediction confidence. It's ideal for identifying low-confidence loops or disordered regions and useful for downstream filtering (e.g., in docking or dynamics simulations).

X-axis: Residue index across all chains
Y-axis: pLDDT score (0–100)
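For downstream filtering, low-confidence stretches can be located from the per-residue scores directly. This is a minimal sketch that assumes the scores are available as a plain list in residue order; the threshold of 70 and the minimum run length of 3 are illustrative defaults, not values prescribed by ApherisFold.

```python
from typing import List, Sequence, Tuple


def low_confidence_regions(
    plddt: Sequence[float],
    threshold: float = 70.0,
    min_length: int = 3,
) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, 0-based with end exclusive, of runs below threshold."""
    regions = []
    start = None
    for index, score in enumerate(plddt):
        if score < threshold:
            if start is None:
                start = index  # a new low-confidence run begins here
        elif start is not None:
            if index - start >= min_length:
                regions.append((start, index))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(plddt) - start >= min_length:
        regions.append((start, len(plddt)))
    return regions


scores = [92, 88, 65, 60, 55, 91, 40, 95, 50, 45, 30]
print(low_confidence_regions(scores))
```

The single low score at index 6 is ignored because it is shorter than min_length; lowering that parameter to 1 reports isolated residues as well.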

Full-Screen Molecular Viewer🔗

The molecular viewer can be maximized to occupy most of the screen space for detailed structural inspection. In full-screen mode, the 3D viewer, PAE/PDE plot, and sequence bar are displayed together to provide a comprehensive view of the model. Each component can be minimized or expanded individually as needed, while the 3D viewer remains as the main background element for structure visualization.


Chains Sequence Display🔗

This part is helpful for verifying sequence input/output consistency. Ligand info is key for those focused on drug binding or active site modeling.

Displays the full amino acid sequence used for the prediction.
Includes chain identifiers (A, B, C, etc.).

Ligand Representation (SMILES)🔗

If a ligand is present (as in co-folding), it is shown in SMILES format along with a 2D molecular rendering.

Guide: Benchmarking Using ApherisFold🔗

In a future release, ApherisFold will support interactive benchmarking directly within the Apheris Hub UI, allowing you to compare models and outputs side by side.

In the meantime, you can perform benchmarking on your data programmatically using the Apheris Hub API. This guide walks you through benchmarking OpenFold3 models, capturing metrics, and recording them locally using a short Python script. If you wish, you can adapt the approach to any language that supports HTTP requests.

Querying OpenFold3 Using the Hub API🔗

Before we begin, here's a small helper function for pretty-printing API responses:

import json
import requests

def pretty_print_dict(response: dict) -> None:
    print(json.dumps(response, indent=2))

Verifying Installed Models🔗

To check which models are installed and running in your Apheris Hub, run:

HUB_API_URL = "https://replace-me-with-your-url/api/v1/"
response = requests.get(HUB_API_URL + "applications/installed")
pretty_print_dict(response.json())

Example response:

{
  "data": {
    "openfold3": {
      "versions": {
        "openfold3:0.21.0": {
          "status": "started",
          "container": {
            "id": "9b2b44afb4ba84a69ca683300e2523d35261470a440f5cd7884859e06efeede6",
            "port": 7770,
            "name": "openfold3:0.21.0-OJai2U",
            "endpoint": "http://0.0.0.0:7770/predict",
            "state": "running",
            "status": "Up 1 weeks"
          },
          "message": "Container ID \"9b2b44af...\" status \"running\""
        }
      }
    }
  }
}

Success: OpenFold3 is installed and running!
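Rather than reading the response by eye, you can extract the running versions and their endpoints in code. The sketch below relies only on the field names visible in the example response above (data, versions, status, container, state, endpoint); other status values the Hub may return are not covered, and the function name is hypothetical.

```python
from typing import Dict, List


def running_model_versions(installed: dict) -> List[Dict[str, str]]:
    """List model versions whose status is 'started' and whose container is 'running'."""
    running = []
    for model_name, model in installed.get("data", {}).items():
        for version_name, version in model.get("versions", {}).items():
            container = version.get("container") or {}
            if version.get("status") == "started" and container.get("state") == "running":
                running.append(
                    {
                        "model": model_name,
                        "version": version_name,
                        "endpoint": container.get("endpoint", ""),
                    }
                )
    return running


# Trimmed copy of the example response shown above.
example_response = {
    "data": {
        "openfold3": {
            "versions": {
                "openfold3:0.21.0": {
                    "status": "started",
                    "container": {
                        "port": 7770,
                        "endpoint": "http://0.0.0.0:7770/predict",
                        "state": "running",
                    },
                }
            }
        }
    }
}

print(running_model_versions(example_response))
```

In a script, pass response.json() from the request above instead of the hard-coded example.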

Let’s move on to preparing input data.

Preparing Inputs🔗

To perform inference using ApherisFold via the Hub API, create JSON files specifying the protein sequences, ligand identifiers (SMILES), or cofactors/DNA/RNA to co-fold.

Example schemas can be found in the Apheris Hub documentation.

You can also generate queries directly from CIF files (PDB/mmCIF) to benchmark OpenFold3. Use the endpoint returned in the "endpoint" field of the previous API response:

payload = {
    "input_path": "CIF_FILE_DIRECTORY",
    "cif_file_names": ["5sfc", "5shh"],
}
# HUB_API_URL already ends with "/", so strip it to avoid a double slash in the URL.
response = requests.post(HUB_API_URL.rstrip("/") + "/query_from_mmcif", json=payload)
pretty_print_dict(response.json())

Example response (truncated for brevity):

{
  "queries": {
    "5sfc": {
      "chains": [
        { "molecule_type": "protein", "chain_ids": "1.A", "sequence": "..." },
        { "molecule_type": "ligand", "chain_ids": "1.G", "smiles": "..." }
      ]
    },
    "5shh": { ... }
  },
  "errors": {}
}

Container Volume Structure🔗

The ApherisFold container ingests files through a mounted host volume (/apheris/input). Your CIF directory must be located within this mount point. For example:

/apheris/input/
├── cif
│   └── directory
│       ├── 5sfc.cif
│       ├── 5shh.cif

If errors occur, failing file names will appear in the errors field with details. Each successfully extracted query appears under queries.
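A short check of the response makes it easy to see which files were converted and which were not. This sketch assumes that both queries and errors are keyed by the same names that were sent in cif_file_names, and it treats the error details as opaque values because their exact shape is not documented here. The function name and the example error text are hypothetical.

```python
from typing import Any, Dict, Iterable


def extraction_report(body: Dict[str, Any], requested: Iterable[str]) -> Dict[str, Any]:
    """Sort requested CIF names into extracted, failed, and unaccounted-for."""
    queries = body.get("queries", {})
    errors = body.get("errors", {})
    report: Dict[str, Any] = {"ok": [], "failed": {}, "unaccounted": []}
    for name in requested:
        if name in queries:
            report["ok"].append(name)
        elif name in errors:
            report["failed"][name] = errors[name]
        else:
            # Neither a query nor an error was returned for this name.
            report["unaccounted"].append(name)
    return report


# Hypothetical response body with one success and one failure.
example_body = {
    "queries": {"5sfc": {"chains": []}},
    "errors": {"5shh": "hypothetical error detail"},
}
print(extraction_report(example_body, ["5sfc", "5shh"]))
```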

Adding Multiple Sequence Alignments (MSAs)🔗

OpenFold3 can run without MSAs, but predictions will be of lower quality. You’ll need to map protein sequences to MSA filenames and inject them into each query:

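# PROTEIN_SEQUENCE_TO_MSA_NAME is a dict you provide yourself; it maps each
# protein sequence to the file name of its .a3m alignment.
query_dict = response.json()["queries"]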
for query_name, query in query_dict.items():
    for chain in query["chains"]:
        if chain["molecule_type"] == "protein":
            chain["msa"] = PROTEIN_SEQUENCE_TO_MSA_NAME[chain["sequence"]]
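One way to build the sequence-to-file mapping is to read the query sequence from each alignment file. The sketch below assumes the usual A3M convention that the first record in the file is the query sequence, and that it matches the chain sequence in the request exactly. The function names and the folder argument are hypothetical.

```python
from pathlib import Path
from typing import Dict


def read_query_sequence(a3m_text: str) -> str:
    """Return the first sequence record of an A3M file (by convention, the query)."""
    sequence_lines = []
    seen_header = False
    for line in a3m_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines some tools write first
        if line.startswith(">"):
            if seen_header:
                break  # second record reached, the query is complete
            seen_header = True
            continue
        if seen_header:
            sequence_lines.append(line)
    # Remove gap characters in case the query row contains any.
    return "".join(sequence_lines).replace("-", "")


def build_sequence_to_msa_name(msa_dir: Path) -> Dict[str, str]:
    """Map the query sequence of every .a3m file in msa_dir to its file name."""
    mapping = {}
    for path in sorted(msa_dir.glob("*.a3m")):
        sequence = read_query_sequence(path.read_text())
        if sequence:
            mapping[sequence] = path.name
    return mapping


example_a3m = ">query\nMKTAYIAK\n>hit_1\nMKT-YIAK\n"
print(read_query_sequence(example_a3m))
```

If two files share the same query sequence, the later file name (in sorted order) wins, so keep one alignment per unique sequence in the folder.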

Building the Inference Payload🔗

Now, define your benchmark and construct the full payload:

YOUR_BENCHMARK_NAME = "example-benchmark"

payload = {
    "input_path": "CIF_FILE_DIRECTORY",
    "cif_file_names": ["5sfc", "5shh"],
}
# HUB_API_URL already ends with "/", so strip it to avoid a double slash in the URL.
response = requests.post(HUB_API_URL.rstrip("/") + "/query_from_mmcif", json=payload)
query_dict = response.json()["queries"]

for query_name, query in query_dict.items():
    for chain in query["chains"]:
        if chain["molecule_type"] == "protein":
            chain["msa"] = PROTEIN_SEQUENCE_TO_MSA_NAME[chain["sequence"]]

inference_payload = {
    "id": "1",
    "inputPath": f"{YOUR_BENCHMARK_NAME}/assets",
    "outputPath": f"{YOUR_BENCHMARK_NAME}/outputs",
    "requestParams": {"queries": query_dict},
    "modelParams": {"diffusion_samples": 1},
    "modelName": "openfold3",
    "modelVersion": "3.0.0"
}

Adding Ground-Truth Structures🔗

To enable metrics computation, copy your reference CIF files to:

/apheris/input/{YOUR_BENCHMARK_NAME}/assets/{query_name}-ground-truth.cif

Example structure:

/apheris/input/
├── cif
│   └── directory
│       ├── 5sfc.cif
│       ├── 5shh.cif
├── example-benchmark
│   └── assets
│       ├── 5sfc-ground-truth.cif
│       ├── 5shh-ground-truth.cif

Example Python snippet:

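import shutil
from pathlib import Path

# Source CIF folder relative to /apheris/input ("cif/directory" in the tree above).
CIF_FILE_DIRECTORY = "cif/directory"
assets_dir = Path(f"/apheris/input/{YOUR_BENCHMARK_NAME}/assets")
assets_dir.mkdir(parents=True, exist_ok=True)  # shutil.copy needs an existing folder

for query_name in query_dict:
    shutil.copy(
        f"/apheris/input/{CIF_FILE_DIRECTORY}/{query_name}.cif",
        assets_dir / f"{query_name}-ground-truth.cif",
    )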

Skip this step if you’re using your own pipeline to compute alignment metrics.

Copy all MSA files into the same /assets directory:

/apheris/input/{YOUR_BENCHMARK_NAME}/assets/

Unused MSAs in this folder will not cause issues.

Performing Inference🔗

Run the inference request:

# HUB_API_URL already ends with "/", so strip it to avoid a double slash in the URL.
response = requests.post(HUB_API_URL.rstrip("/") + "/predict", json=inference_payload)

Your output directory will be populated with:

.heartbeat → updated every 10 seconds during inference
.error → present only if inference fails
.done → created when inference completes
meta/ → per-query JSON results with aligned CIF strings and metrics
results.zip → raw unaligned predicted structures

Monitoring Benchmark Status🔗

Here’s a simple Python loop to monitor your job:

import time
from pathlib import Path

output_dir = Path(f"/apheris/output/{YOUR_BENCHMARK_NAME}/outputs")

while True:
    if (output_dir / ".error").exists():
        print(f"Benchmark {YOUR_BENCHMARK_NAME} failed.")
        break
    if (output_dir / "results.zip").exists():
        print(f"Benchmark {YOUR_BENCHMARK_NAME} succeeded.")
        break
    time.sleep(10)
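The loop above runs until a marker file appears. A variant that also gives up after a maximum wait and notices a stale heartbeat could look like the sketch below. It relies on the marker files listed earlier (.error, .done, .heartbeat) and assumes that each heartbeat update changes the file's modification time. The 60-second staleness limit, the one-hour maximum wait, and the function names are illustrative choices, not Hub defaults.

```python
import time
from pathlib import Path
from typing import Optional


def request_state(
    output_dir: Path,
    heartbeat_timeout: float = 60.0,
    now: Optional[float] = None,
) -> str:
    """Classify a request as 'failed', 'done', 'running', 'stalled' or 'pending'."""
    if (output_dir / ".error").exists():
        return "failed"
    if (output_dir / ".done").exists():
        return "done"
    heartbeat = output_dir / ".heartbeat"
    if heartbeat.exists():
        current = time.time() if now is None else now
        age = current - heartbeat.stat().st_mtime
        # Assumption: a heartbeat older than the timeout means the job has stopped.
        return "running" if age <= heartbeat_timeout else "stalled"
    return "pending"  # nothing written yet, for example while the request is queued


def wait_for_request(
    output_dir: Path,
    poll_seconds: float = 10.0,
    max_wait_seconds: float = 3600.0,
) -> str:
    """Poll until the request ends, stalls, or max_wait_seconds has passed."""
    deadline = time.monotonic() + max_wait_seconds
    while time.monotonic() < deadline:
        state = request_state(output_dir)
        if state in ("failed", "done", "stalled"):
            return state
        time.sleep(poll_seconds)
    return "timeout"
```

For example, wait_for_request(Path(f"/apheris/output/{YOUR_BENCHMARK_NAME}/outputs")) returns one of "done", "failed", "stalled", or "timeout".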

Note

Each inference payload should contain only a few queries. For large benchmarks, loop over multiple smaller payloads for better performance.
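Splitting a large benchmark into smaller payloads can be sketched as follows. The payload fields mirror the inference payload shown earlier in this guide. The batch size of 2, the per-batch output subfolder (batch-N), and the incrementing id are assumptions made for illustration; check how your Hub version expects ids and output paths to be set.

```python
from typing import Dict, Iterator, List


def chunk_queries(query_dict: Dict[str, dict], chunk_size: int) -> Iterator[Dict[str, dict]]:
    """Yield successive groups of at most chunk_size queries, keeping their order."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    names = list(query_dict)
    for start in range(0, len(names), chunk_size):
        yield {name: query_dict[name] for name in names[start:start + chunk_size]}


def build_payloads(
    query_dict: Dict[str, dict],
    benchmark_name: str,
    model_name: str,
    model_version: str,
    chunk_size: int = 2,
) -> List[dict]:
    """Build one inference payload per group of queries."""
    payloads = []
    for index, chunk in enumerate(chunk_queries(query_dict, chunk_size), start=1):
        payloads.append(
            {
                "id": str(index),  # assumption: one id per payload
                "inputPath": f"{benchmark_name}/assets",
                # Hypothetical layout: a separate output folder per batch.
                "outputPath": f"{benchmark_name}/outputs/batch-{index}",
                "requestParams": {"queries": chunk},
                "modelParams": {"diffusion_samples": 1},
                "modelName": model_name,
                "modelVersion": model_version,
            }
        )
    return payloads


demo = build_payloads({"a": {}, "b": {}, "c": {}}, "example-benchmark", "openfold3", "3.0.0")
print([list(p["requestParams"]["queries"]) for p in demo])
```

Each payload can then be posted to the predict endpoint in turn. Because requests are processed sequentially on the GPU, the batches run one after another.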

Summary🔗

You’ve now learned how to:

Query and verify OpenFold3 installations via the Hub API.
Generate benchmark queries from CIF files.
Prepare inference payloads and MSAs.
Run benchmarks and monitor outputs.

In future releases, these benchmarking workflows will be available interactively in the Apheris Hub UI.

Support and Next Steps🔗

To get access to source code, troubleshoot deployment issues, or inquire about connecting to federated environments, please contact: support@apheris.com.