Deploying on Kubernetes with Helm🔗

This guide provides detailed, step-by-step instructions for installing, upgrading, and managing the Apheris Compute Gateway via a Helm chart on Kubernetes.

Prerequisites🔗

Before proceeding with the installation or upgrade of the Apheris Compute Gateway, it's important to ensure your environment meets the following prerequisites.

Supported Kubernetes Versions🔗

The Apheris Compute Gateway is compatible with Kubernetes version 1.30 or newer.

Required Tools and Configurations🔗

  • Helm 3: You must have Helm version 3.7 or later installed to manage the Apheris Compute Gateway Helm chart.
  • kubectl: Ensure that kubectl is installed and properly configured to communicate with your Kubernetes cluster. Its version should be compatible with your Kubernetes minor version.
  • Cluster Role: Verify that your Kubernetes user has the necessary cluster role bindings to deploy and manage applications on Kubernetes.
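
As a quick sanity check, you can confirm the tooling and permissions from your shell. This is a minimal sketch: the apheris namespace matches the installation commands below, and the exact permissions your user needs depend on your cluster's RBAC setup.

# Confirm Helm 3.7+ and a kubectl that matches your cluster's minor version
helm version --short
kubectl version

# Example permission check; adapt the verb and resource to your setup
kubectl auth can-i create deployments --namespace apheris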

Installing the Helm Chart🔗

The installation section covers the initial deployment of the Apheris Compute Gateway Helm chart onto your Kubernetes cluster.

  1. Set the Kubernetes Context. Ensure you're using the correct context by checking the available contexts:

    kubectl config get-contexts
    

    To switch to the desired context, execute:

    kubectl config use-context [CONTEXT_NAME]
    
  2. Log into the Apheris Helm Registry. Authenticate with the Apheris Helm registry using the following command:

    helm registry login quay.io --username [HELM_REPO_USERNAME] --password [HELM_REPO_PASSWORD]
    

    * Replace [HELM_REPO_USERNAME] and [HELM_REPO_PASSWORD] with your Helm repository credentials.

    * Credentials can be retrieved from the provided Bitwarden link, shared by your Apheris representative.

      After successful login, you should see "Login Succeeded".
    
  3. Prepare the Values File. Create a values.yaml file with your configuration settings. Below is a minimal example of what this file might include:

    tenant: "tenantId" # Replace with your Apheris tenant identifier
    auth:
      domain: auth.app.apheris.net
      orchestrator:
        clientId: "clientId" # Replace with the actual client ID
        clientSecret: "clientSecret" # Replace with the actual client Secret
    helmRepoUsername: "helmRepoUsername" # Replace with Helm repo username
    helmRepoPassword: "helmRepoPassword" # Replace with Helm repo password
    

    Ensure you replace tenantId, clientId, clientSecret, helmRepoUsername, and helmRepoPassword with the actual values. These can be found via the Bitwarden link. To validate the completed file before installing, see the rendering sketch after these steps.

    To explore the default values.yaml file and additional chart information, run:

    helm show all oci://quay.io/apheris/gateway-agent-chart --version [CHART_VERSION]
    

    Replace [CHART_VERSION] with the specific version of the chart you intend to deploy.

  4. Optional: Configure Access to an S3 Bucket

    The Apheris Data Access Layer (DAL) can access any bucket hosted on an S3 API-compatible solution.

    You will need to grant the DAL access to the respective S3 bucket via the bucket's IAM policy.

    An example bucket policy for AWS S3, assuming you are using IAM roles for service accounts, looks like:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": [
              "arn:aws:iam::ACCOUNT_ID:role/EXAMPLE_DAL_IAM_ROLE_NAME"
            ]
          },
          "Action": [
            "s3:ListBucket",
            "s3:GetObject"
          ],
          "Resource": [
            "arn:aws:s3:::EXAMPLE_BUCKET_NAME",
            "arn:aws:s3:::EXAMPLE_BUCKET_NAME/*"
          ]
        }
      ]
    }
    

    Note that the above example is specific to AWS S3; refer to the documentation of the S3-compatible solution you are using (for instance AWS, MinIO, ...) for the actual policy document format.

    To have the DAL use an S3 bucket as a dataset source, the bucket needs to be added to the Helm release values via the dal.sources.s3 list.

    An example values.yaml snippet looks like:

    dal:
      sources:
        s3:
          - EXAMPLE_BUCKET_NAME_1
          - EXAMPLE_BUCKET_NAME_2
          - ...
    
  5. Optional: Enable Asset Policy Signature Validation

    Please refer to the dedicated guide on asset policy signature validation.

  6. Install the Chart. Deploy the chart with the prepared values.yaml file:

    helm upgrade --install apheris-gateway oci://quay.io/apheris/gateway-agent-chart \
      --namespace=apheris \
      --create-namespace \
      --wait \
      --version [CHART_VERSION] \
      --values values.yaml
    

    Replace [CHART_VERSION] with your chosen chart version.

  7. Verify Deployment. Check if the pods are running correctly:

    kubectl -n apheris get pods
    

    You should see:

    * an apheris-gateway-agent pod.

    * an apheris-gateway-dal pod.

    To inspect the Helm release itself, see the status sketch after these steps.
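
Before installing (or to debug a failed install), you can render the chart locally to catch mistakes in the values.yaml from step 3. This is a minimal sketch using the same [CHART_VERSION] placeholder as above:

helm template apheris-gateway oci://quay.io/apheris/gateway-agent-chart \
  --namespace=apheris \
  --version [CHART_VERSION] \
  --values values.yaml > /dev/null

If the command exits without errors, the chart renders cleanly with your values; templating errors are reported without anything touching the cluster.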

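To inspect the Helm release itself after step 7, the standard Helm commands apply. This sketch uses the release name and namespace from the installation command:

helm list --namespace apheris

helm status apheris-gateway --namespace apheris

helm status reports whether the release is deployed, which chart version it runs, and when it was last updated.
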
Performing an Upgrade🔗

To upgrade the deployed Apheris Compute Gateway on your cluster, execute:

helm upgrade apheris-gateway oci://quay.io/apheris/gateway-agent-chart \
  --namespace=apheris \
  --wait \
  --version [NEW_CHART_VERSION] \
  --values values.yaml

Monitor the upgrade process and verify the deployment status to ensure everything is functioning as expected after the upgrade.
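
For example, you can review the revision history and, if necessary, roll back with standard Helm commands. This sketch assumes the release name and namespace used above:

helm history apheris-gateway --namespace apheris

helm rollback apheris-gateway --namespace apheris

Without an explicit revision number, helm rollback reverts to the previous revision.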

Uninstall🔗

To remove the deployed Apheris Compute Gateway from your cluster, run the command below:

helm uninstall apheris-gateway --namespace apheris

This command will delete the Helm release and remove all associated resources from the specified namespace in your Kubernetes cluster.
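
To confirm the removal, you can list what remains in the namespace; an empty result (or only unrelated resources) indicates a clean uninstall:

kubectl -n apheris get all

Note that helm uninstall does not delete the namespace itself, nor storage resources created outside the chart.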

Frequently Asked Questions (FAQs)🔗

Configuring Cilium Network Policies🔗

Q: How do I enable Cilium Network policies for the Apheris Compute Gateway Helm chart?

A: The Helm chart includes Cilium Network policies designed to restrict communications to only what's necessary. However, to leverage these policies, Cilium Custom Resource Definitions (CRDs) must already be present in your cluster. By default, these rules are not active to accommodate clusters without Cilium. To enable them, incorporate the following snippet into your values.yaml file:

cilium:
  enabled: true
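
To confirm that the Cilium CRDs are present before enabling the policies, you can query one of them directly; this is a quick spot check rather than an exhaustive one:

kubectl get crd ciliumnetworkpolicies.cilium.io

If the CRD is missing, the command returns a NotFound error, and you should install Cilium before enabling this setting.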

Setting Up GPU Workloads🔗

Q: Can I run GPU workloads with the Apheris Compute Gateway, and how do I enable this feature?

A: Yes, the Apheris Compute Gateway supports GPU workloads. To enable this capability, you need to adjust your values.yaml file accordingly. Here is how you can do it:

job:
  gpu: true

Ensure your Kubernetes cluster has the necessary GPU resources and drivers installed to support these workloads.
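
As a quick check that GPUs are schedulable, you can list the allocatable GPU resource per node. This sketch assumes NVIDIA GPUs exposed through the NVIDIA device plugin under the resource name nvidia.com/gpu:

kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'

Nodes without GPU drivers or the device plugin show <none> in the GPU column.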

Configuring HTTP Proxy🔗

Q: My environment requires an HTTP proxy. How can I configure the Apheris Compute Gateway to use it?

A: If your environment operates behind an HTTP proxy, you can configure the Apheris Compute Gateway to utilize this proxy by specifying the proxy settings in your values.yaml file. Here is the configuration you'll need to add:

proxy:
  enabled: true
  url: "PROXY_URL"

Replace "PROXY_URL" with the actual URL of your HTTP proxy. This configuration ensures that all outbound communications from the Apheris Compute Gateway will route through your specified HTTP proxy.

Enabling DAL storage🔗

The DAL can be configured to store intermediate data written by computations. This data does not leave the gateway.

There are different storage options available for the DAL. The storage can be configured in the dal.persistence section of the Helm values.

Existing Persistent Volume Claim🔗

You can specify an existing persistent volume claim (PVC) in the DAL storage settings. This allows the DAL to use a pre-defined storage resource, which can be particularly useful for integrating with existing infrastructure or managing storage independently of the Helm chart.

dal:
  persistence:
    enabled: true
    existingClaim: "my-existing-pvc"
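
If you do not have a claim yet, a standard PersistentVolumeClaim manifest can be created first. This is a generic sketch; the namespace, access mode, and size are assumptions to adapt to your cluster (omitting storageClassName selects the cluster's default storage class):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-existing-pvc
  namespace: apheris
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

Apply it with kubectl apply -f pvc.yaml before installing or upgrading the chart.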

S3 Storage Configuration🔗

If you are installing the S3 CSI driver, you can specify an S3 bucket for the DAL to use. This configuration is particularly useful for deployments running on EKS (Elastic Kubernetes Service), as it helps avoid availability zone (AZ) issues that can occur with EBS volumes used as persistent storage.

dal:
  persistence:
    enabled: true
    s3:
      bucket: "my-s3-bucket"
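
You can verify that an S3 CSI driver is registered before enabling this option. This sketch assumes the Mountpoint for Amazon S3 CSI driver, which registers under the name s3.csi.aws.com:

kubectl get csidriver s3.csi.aws.com

If the driver is not installed, the command returns a NotFound error.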