External Async Wrapper on Kubernetes

Use this guide when the Hub runs on Kubernetes and must submit inference jobs to a customer-owned async wrapper, which then calls an Apheris model wrapper.

Hub -> customer async wrapper -> Apheris model wrapper

If the Hub talks directly to an Apheris model wrapper, use admission submission mode.

If the Hub talks to a customer-owned async wrapper that owns queueing and execution, use fire_and_forget submission mode.

For one example implementation of the customer wrapper, see the External Async Wrapper reference implementation.

Quick setup

1. Point the Hub to the customer wrapper

The Hub must resolve the model to the customer wrapper URL, not to the downstream Apheris wrapper URL. The Hub also calls GET /weights and GET /schema on that same discovered target, so the customer wrapper must serve both endpoints.

Discovery example:

version: "modeldiscovery/2.1.0"
data:
  - model: openfold3
    url: http://openfold3-async-wrapper.apheris-hub.svc.cluster.local:8080
    submissionMode: fire_and_forget

Helm values example:

models:
  instances:
    openfold3-external:
      id: openfold3-external
      model: openfold3
      url: http://openfold3-async-wrapper.apheris-hub.svc.cluster.local:8080
      submissionMode: fire_and_forget

2. Make network paths work

Required connectivity:

  • Hub pod -> customer wrapper service
  • Customer wrapper -> downstream Apheris wrapper service
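Both hops can be checked with a plain TCP connection test before any inference job is submitted. The following is a minimal sketch using only the Python standard library. The helper names `host_port_from_url` and `can_connect` are hypothetical, and a successful TCP connection only shows that the service is reachable, not that the wrapper API behaves correctly.

```python
import socket
from urllib.parse import urlsplit


def host_port_from_url(url: str) -> tuple[str, int]:
    """Extract (host, port) from a service URL, defaulting the port by scheme."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host in URL: {url!r}")
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_port_from_url(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Run the check from inside the pod that originates each hop (the Hub pod for the first, the customer wrapper pod for the second), because cluster DNS names and network policies are evaluated from the caller's side.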

3. Make inputPath valid downstream

Recommended:

  • Mount the same PVC at the same mount path in the Hub and the downstream wrapper so the same inputPath works in both pods.

Use path rewriting only if the customer wrapper explicitly owns path translation.

Forward inputPath unchanged only if both pods resolve it to the same location.
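If the customer wrapper does own path translation, the rewrite can be as small as swapping one mount prefix for another. The sketch below assumes the Hub and the downstream wrapper mount the same volume under different roots. The function name `translate_input_path` and the example roots are hypothetical and not part of any Apheris API.

```python
from pathlib import PurePosixPath


def translate_input_path(input_path: str, hub_root: str, downstream_root: str) -> str:
    """Map a path under hub_root to the same relative path under downstream_root."""
    path = PurePosixPath(input_path)
    # Reject relative paths and parent-directory segments instead of guessing.
    if not path.is_absolute() or ".." in path.parts:
        raise ValueError(f"inputPath must be absolute without '..': {input_path!r}")
    # relative_to raises ValueError if the path lies outside hub_root.
    relative = path.relative_to(PurePosixPath(hub_root))
    return str(PurePosixPath(downstream_root) / relative)
```

For example, with a Hub root of `/data/hub` and a downstream root of `/mnt/shared`, the path `/data/hub/jobs/42/input.json` becomes `/mnt/shared/jobs/42/input.json`. Paths outside the Hub root raise an error rather than being forwarded unchanged.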

4. Persist tickets durably

The customer wrapper must persist ticket state across pod restarts.

At minimum, keep:

  • Hub-facing uuid
  • downstream jobId
  • current status
  • last error, if any
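The four fields above fit in a single table. The following is a minimal sketch of a durable ticket store built on SQLite from the Python standard library. It is not the reference implementation: the class name `TicketStore`, the column names, and the choice of SQLite are all assumptions made for illustration.

```python
import sqlite3
from typing import Optional


class TicketStore:
    """Stores ticket state in a SQLite file so it survives process restarts."""

    def __init__(self, db_path: str) -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            """CREATE TABLE IF NOT EXISTS tickets (
                   uuid TEXT PRIMARY KEY,   -- Hub-facing uuid
                   job_id TEXT,             -- downstream jobId, once known
                   status TEXT NOT NULL,    -- current status
                   last_error TEXT          -- last error, if any
               )"""
        )
        self._conn.commit()

    def upsert(self, uuid: str, job_id: Optional[str], status: str,
               last_error: Optional[str] = None) -> None:
        """Insert a ticket or overwrite the existing row with the same uuid."""
        with self._conn:  # commits on success, rolls back on exception
            self._conn.execute(
                "INSERT OR REPLACE INTO tickets (uuid, job_id, status, last_error) "
                "VALUES (?, ?, ?, ?)",
                (uuid, job_id, status, last_error),
            )

    def get(self, uuid: str) -> Optional[dict]:
        """Return the ticket as a dict, or None if the uuid is unknown."""
        row = self._conn.execute(
            "SELECT uuid, job_id, status, last_error FROM tickets WHERE uuid = ?",
            (uuid,),
        ).fetchone()
        if row is None:
            return None
        return dict(zip(("uuid", "job_id", "status", "last_error"), row))

    def close(self) -> None:
        self._conn.close()
```

On Kubernetes, a file-backed store like this only survives pod restarts if the database file lives on a persistent volume. A wrapper that runs several replicas would more likely need an external database.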

5. Optional Hub outbound headers

If the Hub must send headers to the customer wrapper, configure them in the Hub config file:

hub:
  targetHeaders:
    headers:
      - name: Authorization
        valueFromEnv: ASYNC_WRAPPER_AUTH_HEADER

The Hub sends these headers on every request to the customer wrapper: GET /weights, GET /schema, POST /ticket, GET /ticket/{uuid}, and GET /results/{uuid}.
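On the receiving side, the customer wrapper may want to verify the header before serving any of these endpoints. The sketch below assumes the wrapper has access to the same secret through an environment variable that holds the full expected header value. The function name `is_authorized` and the reuse of the variable name `ASYNC_WRAPPER_AUTH_HEADER` on the wrapper side are illustrative assumptions, not documented behavior.

```python
import hmac
import os
from typing import Mapping


def is_authorized(headers: Mapping[str, str],
                  env_var: str = "ASYNC_WRAPPER_AUTH_HEADER") -> bool:
    """Compare the incoming Authorization header with the expected value."""
    expected = os.environ.get(env_var, "")
    if not expected:
        # Fail closed when no secret is configured.
        return False
    # HTTP header names are case-insensitive, so match on the lowercased name.
    received = next(
        (value for name, value in headers.items() if name.lower() == "authorization"),
        "",
    )
    # Constant-time comparison avoids leaking how much of the value matched.
    return hmac.compare_digest(received.encode(), expected.encode())
```

Applying the same check to all five endpoints keeps discovery calls (GET /weights, GET /schema) and job calls under one policy.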

Go-live checklist

Run through this list before handing the setup to end users:

  • Discovery points to the customer wrapper URL.
  • Discovery uses submissionMode: fire_and_forget.
  • GET /weights works on the customer wrapper URL.
  • GET /schema works on the customer wrapper URL.
  • Hub can reach the customer wrapper service.
  • Customer wrapper can reach the downstream Apheris wrapper service.
  • inputPath is valid in the downstream wrapper pod.
  • POST /ticket returns 200 or 202 with a non-empty uuid.
  • GET /ticket/{uuid} returns only PENDING, RUNNING, COMPLETED, or FAILED.
  • FAILED_* may be used internally by the customer wrapper, but Hub-facing terminal status is FAILED.
  • GET /results/{uuid} returns a complete results.predictions array.
  • If your operating model requires it, GET /weights and GET /schema still work while the downstream execution wrapper is temporarily down.
  • Customer wrapper persists ticket state across pod restarts.
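Several of the checklist items concern response contents and can be expressed as small checks. The sketch below shows one way to collapse internal FAILED_* states into the Hub-facing FAILED status and to sanity-check ticket and results responses. The function names are hypothetical, and the JSON layout (a top-level `uuid` key, and `predictions` nested under `results`) is an assumption inferred from the checklist wording. Confirm the exact shapes against the reference implementation.

```python
HUB_STATUSES = {"PENDING", "RUNNING", "COMPLETED", "FAILED"}


def to_hub_status(internal_status: str) -> str:
    """Map a wrapper-internal status to one of the four Hub-facing statuses."""
    if internal_status.startswith("FAILED_"):
        # Internal failure detail stays inside the wrapper.
        return "FAILED"
    if internal_status not in HUB_STATUSES:
        raise ValueError(f"status not allowed towards the Hub: {internal_status!r}")
    return internal_status


def check_ticket_submission(status_code: int, body: dict) -> bool:
    """POST /ticket: expect 200 or 202 and a non-empty uuid string."""
    uuid = body.get("uuid")
    return status_code in (200, 202) and isinstance(uuid, str) and uuid != ""


def check_results(body: dict) -> bool:
    """GET /results/{uuid}: expect a results object holding a predictions list."""
    results = body.get("results")
    return isinstance(results, dict) and isinstance(results.get("predictions"), list)
```

These checks only cover structure. Whether the predictions array is complete for a given job still has to be verified against the job's inputs.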

For exact payload shapes and example responses, see the External Async Wrapper reference implementation.