Data Artifacts API Documentation🔗

Purpose and Usage🔗

The Data Artifacts API provides a standardized way to manage Data Artifacts and their associated files within Apheris. Data Artifacts represent metadata about files generated during job execution. Common Data Artifacts and their uses include:

  • Checkpoints: Saving the state of a model during training to resume from interruptions or for later evaluation.
  • Metrics: Storing performance measurements and evaluation results for model comparison and analysis.
  • Logs: Capturing execution logs and debugging information for audit purposes.
  • Results: Saving final outputs and generated data for downstream processing or visualization.

Additionally, Data Artifacts can be shared between different stages of a workflow.

Relevant Workflows🔗

The following example demonstrates the end-to-end process of managing a Data Artifact using the API. It covers creating the Data Artifact, uploading a file, and then downloading and verifying it. This script assumes you have the necessary authentication token and the base URL for the Orchestrator API.

Complete Data Artifact Management Workflow🔗

  1. Create a Data Artifact metadata record:

    POST https://<Orchestrator API URL>/v1/artifacts
    

    This establishes the metadata for your Data Artifact, including its type, name, and associated job.

  2. Add a file to the Data Artifact:

    PUT https://<Orchestrator API URL>/v1/artifacts/{id}/file/{key}
    

    This creates a file record and returns a pre-signed URL for uploading the actual file content.

  3. Upload the file content: Use the pre-signed URL returned in step 2 to upload the file content directly to the storage service.

    # Example using the requests library
    import requests

    # The response from step 2 returns the pre-signed URL under "URL" and
    # list-valued headers; flatten them to single values for requests.
    upload_headers = {k: v[0] for k, v in upload_info['headers'].items()}
    with open('model.pt', 'rb') as file:
        requests.put(upload_info['URL'], data=file, headers=upload_headers)
    
  4. Mark the file upload as completed:

    PATCH https://<Orchestrator API URL>/v1/artifacts/{id}/file/{key}
    

    This updates the file status to "completed" after the upload is finished.

  5. Retrieve the Data Artifact metadata:

    GET https://<Orchestrator API URL>/v1/artifacts/{id}
    

    This returns the Data Artifact metadata along with information about all associated files.

  6. Get the download URL for a file:

    GET https://<Orchestrator API URL>/v1/artifacts/{id}/file/{key}
    

    This returns a pre-signed URL for downloading the file.

  7. Download the file and verify its integrity: Use the pre-signed URL returned in step 6 to download the file content. If you provided an MD5 hash when uploading the file, you can verify the downloaded file's integrity by comparing hashes.

    # Example using the requests library
    import hashlib
    import requests

    response = requests.get(download_info['URL'])
    response.raise_for_status()
    with open('downloaded_model.pt', 'wb') as file:
        file.write(response.content)
    
    # Verify file integrity using the MD5 hash, if one was provided at upload
    # time. Compare using the same encoding (e.g. hex) used for payloadChecksum.
    if 'hash' in download_info:
        # Calculate the MD5 of the downloaded file in chunks
        md5_hash = hashlib.md5()
        with open('downloaded_model.pt', 'rb') as f:
            for chunk in iter(lambda: f.read(4096), b""):
                md5_hash.update(chunk)
        calculated_hash = md5_hash.hexdigest()
    
        # Compare with the hash from the server
        if calculated_hash == download_info['hash']:
            print("File integrity verified: MD5 hash matches")
        else:
            print("Warning: file may be corrupted; MD5 hash doesn't match")
    
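The seven steps above can be combined into a single upload routine. The following is a minimal sketch, assuming the base URL and token are supplied via environment variables (`ORCHESTRATOR_URL` is a placeholder name, not a documented variable) and that `payloadChecksum` accepts a hex MD5 digest; confirm the expected checksum encoding for your deployment.

```python
import hashlib
import os
import requests

# Placeholders: substitute your Orchestrator API URL.
BASE_URL = os.environ.get("ORCHESTRATOR_URL", "https://orchestrator.example.com")
TOKEN = os.environ.get("APH_AUTH_DATA_ARTIFACT_TOKEN", "")
AUTH = {"Authorization": f"Bearer {TOKEN}"}


def md5_hex(path, chunk_size=4096):
    """Chunked MD5 hex digest of a file."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def upload_artifact_file(job_id, path, key="model.pt"):
    # Step 1: create the artifact metadata record
    r = requests.post(f"{BASE_URL}/v1/artifacts", headers=AUTH, json={
        "jobID": job_id, "type": "checkpoint",
        "name": "model-checkpoint-1", "createdBy": {"user": "user1"},
    })
    r.raise_for_status()
    artifact_id = r.json()["id"]

    # Step 2: register the file and obtain a pre-signed upload URL
    r = requests.put(f"{BASE_URL}/v1/artifacts/{artifact_id}/file/{key}",
                     headers=AUTH,
                     json={"jobID": job_id, "payloadChecksum": md5_hex(path)})
    r.raise_for_status()
    info = r.json()

    # Step 3: upload the content; flatten the list-valued headers first
    upload_headers = {k: v[0] for k, v in info["headers"].items()}
    with open(path, "rb") as f:
        requests.put(info["URL"], data=f, headers=upload_headers).raise_for_status()

    # Step 4: mark the upload as completed
    requests.patch(f"{BASE_URL}/v1/artifacts/{artifact_id}/file/{key}",
                   headers=AUTH, json={"jobID": job_id}).raise_for_status()
    return artifact_id


if __name__ == "__main__":
    upload_artifact_file("9b50c3ba-2c81-44f8-87b8-d8760f91fedc", "model.pt")
```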

Authentication and Permission Requirements🔗

Authentication Method🔗

All API endpoints require authentication using a JWT token. The token must be included in the Authorization header of each request using the Bearer scheme:

Authorization: Bearer <token>

Token Authentication🔗

The Data Artifacts API authenticates requests with Computation Tokens, which are generated automatically for computation pods. These tokens:

  • Contain a computeSpecID claim that identifies the computation
  • Are limited to accessing Data Artifacts related to that specific computation
  • Are generated using HMAC-SHA256 signing (algorithm: HS256)

Token Generation and Mounting for Computation Pods🔗

For computation pods, the authentication token is automatically generated and mounted as an environment variable when a Compute Spec is activated:

  1. When a Compute Spec is activated, a JWT token containing the computeSpecID claim is generated using HMAC-SHA256 signing.
  2. The token is mounted as an environment variable named APH_AUTH_DATA_ARTIFACT_TOKEN in the computation container.
  3. Applications running in the container can use this token in the Authorization header when making requests to the Data Artifacts API.
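For debugging, the computeSpecID claim can be inspected by base64-decoding the token's payload segment. This sketch only decodes the claims; it does not verify the HS256 signature, so it must not be used for authorization decisions:

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT to inspect its claims."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```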

Using the Token in Your Application🔗

Python example:

import os
import requests

# Read the token from the environment variable
token = os.environ.get("APH_AUTH_DATA_ARTIFACT_TOKEN")

# Prepare your data payload
payload = {
    "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
    "type": "checkpoint",
    "name": "model-checkpoint-1",
    "metadata": {
        "epoch": 10,
        "accuracy": 0.95
    },
    "createdBy": {
        "user": "user1"
    }
}

# Use the token in your HTTP requests
headers = {"Authorization": f"Bearer {token}"}
response = requests.post("https://<Orchestrator API URL>/v1/artifacts", json=payload, headers=headers)

Permission Scopes🔗

  • Read: Ability to retrieve artifact metadata and download files
  • Write: Ability to create artifacts and upload files

Organization-Based Access Controls🔗

The Data Artifacts API enforces strict organization-based access control. The fundamental rule is that only the organization that initiates a computation can access the artifacts generated by it.

This means that users can only perform operations (such as reading metadata, adding files, or downloading them) on artifacts created by their own organization's computations. This principle is especially important in multi-organization computations where one entity initiates the process while others only provide data.

For example, consider a scenario with three organizations:

  • Organization A: Initiates a computation that produces artifacts.
  • Organization B: Supplies data for the computation.
  • Organization C: Supplies data for the computation.

In this case, the access permissions are as follows:

  • Organization A: Can access the artifacts.
  • Organization B: Cannot access the artifacts.
  • Organization C: Cannot access the artifacts.

This access model guarantees data isolation and prevents unauthorized access to sensitive information across organizational boundaries.

Request/Response Format with Examples🔗

The Data Artifacts API uses standard HTTP methods and JSON for request and response bodies. All endpoints return appropriate HTTP status codes and structured JSON responses.

API Endpoints Overview🔗

Method  Endpoint                           Description
POST    /v1/artifacts                      Create a new artifact metadata record
GET     /v1/artifacts/{id}                 Retrieve an artifact by ID
PUT     /v1/artifacts/{id}/file/{key}      Add a file to an artifact
GET     /v1/artifacts/{id}/file/{key}      Get download information for an artifact file
PATCH   /v1/artifacts/{id}/file/{key}      Mark an artifact file upload as completed

Create Artifact🔗

Creates a new artifact metadata record.

Endpoint: POST /v1/artifacts

Request Format:

  • Content-Type: application/json
  • Authorization: Bearer token required

Request Body Example:

{
  "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
  "type": "checkpoint",
  "name": "model-checkpoint-1",
  "metadata": {
    "epoch": 10,
    "accuracy": 0.95
  },
  "createdBy": {
    "user": "user1"
  },
  "status": "active"
}

Request Parameters:

  • jobID (UUID, required): ID of the job the artifact belongs to.
  • type (string, required): Artifact type. Allowed values: checkpoint, metric, log, result.
  • name (string, required): Human-readable name of the artifact.
  • metadata (object, optional): Arbitrary user-supplied metadata.
  • createdBy (object, required): JSON object that identifies the creator (e.g. user, service).
  • status (string, optional): Optional status for the artifact.

Response Format:

  • Content-Type: application/json
  • Status Code: 201 Created (success)

Response Example:

{
  "id": "c0fd87ef-20ee-430f-ab33-3369bac2cf97"
}

Status Codes:

  • 201 Created: Artifact created successfully.
  • 400 Bad Request: Invalid request or artifact.
  • 401 Unauthorized: Missing/invalid auth headers.
  • 404 Not Found: Job not found.
  • 500 Internal Server Error: Persistence or unexpected error.
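A small client-side helper can validate the request body before sending it, avoiding avoidable 400 responses. This is a sketch using the field names from the examples above; the `created_by` default is an assumption for illustration:

```python
# Allowed artifact types per the API's validation rules
ALLOWED_TYPES = {"checkpoint", "metric", "log", "result"}


def build_create_payload(job_id, artifact_type, name, metadata=None, created_by=None):
    """Build a request body for POST /v1/artifacts, validating the type client-side."""
    if artifact_type not in ALLOWED_TYPES:
        raise ValueError(
            f"invalid artifact type {artifact_type!r}; allowed: {sorted(ALLOWED_TYPES)}")
    payload = {
        "jobID": job_id,
        "type": artifact_type,
        "name": name,
        "createdBy": created_by or {"user": "unknown"},
    }
    if metadata is not None:
        payload["metadata"] = metadata
    return payload
```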

Get Artifact by ID🔗

Retrieves an artifact's metadata by ID.

Endpoint: GET /v1/artifacts/{id}

Path Parameters:

  • id (UUID, required): Artifact ID.

Request Format:

  • Authorization: Bearer token required
  • No request body needed

Response Format:

  • Content-Type: application/json
  • Status Code: 200 OK (success)

Response Example:

{
  "ID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
  "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
  "type": "checkpoint",
  "name": "model-checkpoint-1",
  "metadata": {
    "epoch": 10,
    "accuracy": 0.95
  },
  "createdBy": {
    "user": "user1"
  },
  "createdAt": "2023-01-01T12:00:00Z",
  "updatedAt": "2023-01-01T12:00:00Z",
  "status": "active",
  "files": [
    {
      "id": "d1fd87ef-20ee-430f-ab33-3369bac2cf98",
      "artifactID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
      "name": "model.pt",
      "s3Path": "artifacts/c0fd87ef-20ee-430f-ab33-3369bac2cf97/model.pt",
      "hash": "md5hash==",
      "status": "completed"
    }
  ]
}

Status Codes:

  • 200 OK: Artifact retrieved successfully.
  • 400 Bad Request: Invalid request.
  • 401 Unauthorized: Missing/invalid auth headers.
  • 404 Not Found: Artifact not found.
  • 500 Internal Server Error: Unexpected error.
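Because the response lists every associated file with its status, it can be used to check whether all uploads have finished before attempting downloads. A minimal sketch over the response shape shown above:

```python
def pending_files(artifact: dict) -> list:
    """Given a GET /v1/artifacts/{id} response body, return the names of
    files whose upload has not yet been marked 'completed'."""
    return [f["name"] for f in artifact.get("files", [])
            if f.get("status") != "completed"]
```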

Add Artifact File🔗

Creates a new file record that belongs to an artifact and returns a pre-signed upload URL.

Endpoint: PUT /v1/artifacts/{id}/file/{key}

Path Parameters:

  • id (UUID, required): Artifact ID.
  • key (string, required): File key / object name.

Request Format:

  • Content-Type: application/json
  • Authorization: Bearer token required

Request Body Example:

{
  "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
  "payloadChecksum": "md5hash=="
}

Request Parameters:

  • jobID (UUID, required): ID of the job the artifact belongs to.
  • payloadChecksum (string, optional): Optional MD5 checksum of the file payload. When provided, this value is stored in the backend and can be used to verify the integrity of files downloaded from S3. Users can compare this checksum with the hash of the downloaded file to ensure the file wasn't corrupted during transfer.

    Note: Only MD5 hash format is supported. Here's an example of how to generate an MD5 hash for a file in Python:

    import hashlib
    
    def calculate_md5(file_path):
        """Calculate MD5 hash of a file."""
        md5_hash = hashlib.md5()
        with open(file_path, "rb") as f:
            # Read file in chunks to handle large files efficiently
            for chunk in iter(lambda: f.read(4096), b""):
                md5_hash.update(chunk)
        return md5_hash.hexdigest()
    
    # Example usage
    file_path = "model.pt"
    md5_hash = calculate_md5(file_path)
    print(f"MD5 hash: {md5_hash}")
    
    # Use this hash in your API request
    payload = {
        "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
        "payloadChecksum": md5_hash
    }
    

Response Format:

  • Content-Type: application/json
  • Status Code: 201 Created (success)

Response Example:

{
  "fileID": "d1fd87ef-20ee-430f-ab33-3369bac2cf98",
  "URL": "https://s3.amazonaws.com/bucket/path/to/file?presigned-params",
  "headers": {
    "Content-MD5": [
      "md5hash=="
    ],
    "Host": [
      "s3.eu-central-1.amazonaws.com"
    ],
    "X-Amz-Meta-jobID": [
      "9b50c3ba-2c81-44f8-87b8-d8760f91fedc"
    ],
    "X-Amz-Meta-Computespec_id": [
      "fff94995-7679-4e01-8fd1-61e26ae26d08"
    ]
  }
}

Status Codes:

  • 201 Created: File record created successfully.
  • 400 Bad Request: Invalid request (bad UUIDs, empty key, or malformed body).
  • 401 Unauthorized: Missing/invalid auth headers or authorization failed.
  • 404 Not Found: Artifact not found.
  • 500 Internal Server Error: Persistence, presign, or unexpected error.
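The pre-signed upload must be sent with the headers returned in the response. Since those come back as lists of values, a small helper (a sketch, assuming one value per header as in the example response) flattens them into the form HTTP clients such as requests expect:

```python
def presign_headers(response_headers: dict) -> dict:
    """Flatten the list-valued headers from the Add Artifact File response
    into a single-valued mapping, taking the first value of each header."""
    return {name: values[0] for name, values in response_headers.items()}
```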

Get Artifact File🔗

Retrieves download information for an artifact file.

Endpoint: GET /v1/artifacts/{id}/file/{key}

Path Parameters:

  • id (UUID, required): Artifact ID.
  • key (string, required): File key/name.

Request Format:

  • Authorization: Bearer token required
  • No request body needed

Response Format:

  • Content-Type: application/json
  • Status Code: 200 OK (success)

Response Example:

{
  "URL": "https://s3.amazonaws.com/bucket/path/to/file?presigned-params",
  "hash": "md5hash=="
}

Status Codes:

  • 200 OK: Download info retrieved successfully.
  • 400 Bad Request: Invalid request parameters or UUIDs.
  • 401 Unauthorized: Missing/invalid auth headers or authorization failed.
  • 404 Not Found: Artifact not found.
  • 409 Conflict: Upload still in progress.
  • 500 Internal Server Error: Service errors (presign failed, S3 path missing, persistence errors).
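Because this endpoint returns 409 Conflict while an upload is still in progress, clients may want to poll until the file becomes available. A minimal sketch (function name and retry parameters are illustrative):

```python
import time

import requests


def get_download_info(base_url, token, artifact_id, key, retries=5, delay=2.0):
    """GET /v1/artifacts/{id}/file/{key}, retrying while the upload is
    still in progress (409 Conflict)."""
    url = f"{base_url}/v1/artifacts/{artifact_id}/file/{key}"
    headers = {"Authorization": f"Bearer {token}"}
    for _ in range(retries):
        resp = requests.get(url, headers=headers)
        if resp.status_code == 409:  # upload still in progress; wait and retry
            time.sleep(delay)
            continue
        resp.raise_for_status()
        return resp.json()
    raise TimeoutError(f"upload of {key!r} still in progress after {retries} attempts")
```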

Update Artifact File Status🔗

Marks an artifact file upload as completed.

Endpoint: PATCH /v1/artifacts/{id}/file/{key}

Path Parameters:

  • id (UUID, required): Artifact ID.
  • key (string, required): File key returned when the upload was initiated.

Request Format:

  • Content-Type: application/json
  • Authorization: Bearer token required

Request Body Example:

{
  "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc"
}

Request Parameters:

  • jobID (UUID, required): ID of the job the artifact belongs to.

Response Format:

  • No content in the response body
  • Status Code: 200 OK (success)

Status Codes:

  • 200 OK: Status updated successfully.
  • 400 Bad Request: Invalid request body or status transition.
  • 401 Unauthorized: Missing/invalid authentication headers.
  • 404 Not Found: Artifact or file not found.
  • 409 Conflict: Status transition not allowed.
  • 500 Internal Server Error: Persistence error or unexpected server error.

Error Handling and Status Codes🔗

The Data Artifacts API uses standard HTTP status codes to indicate the success or failure of requests. All error responses include a JSON body with an error message to help diagnose the issue.

Error Response Format🔗

All error responses have the following JSON structure:

{
  "error": "error description"
}

Note: The error message in the response body will specify the reason for the failure (e.g., "Invalid UUID format for jobID" or "Missing required field: name") to help with debugging.

HTTP Status Codes🔗

Status Code                Description                    Common Causes
200 OK                     Request succeeded              Successful GET or PATCH operations
201 Created                Resource created successfully  Successful POST or PUT operations
400 Bad Request            Invalid request                Malformed JSON, invalid UUIDs, missing required fields
401 Unauthorized           Authentication failed          Missing or invalid token, expired token
403 Forbidden              Permission denied              Token lacks the necessary permissions
404 Not Found              Resource not found             Invalid artifact ID, job ID, or file key
409 Conflict               Resource conflict              Upload still in progress, status transition not allowed
500 Internal Server Error  Server-side error              Database errors, S3 errors, unexpected exceptions

Common Error Scenarios and Handling🔗

Validation Errors (400 Bad Request)🔗

  • Invalid UUID format: Ensure all UUIDs are in the correct format (e.g., "9b50c3ba-2c81-44f8-87b8-d8760f91fedc").
  • Missing required fields: Check that all required fields are included in the request body.
  • Invalid artifact type: Ensure the artifact type is one of the allowed values: checkpoint, metric, log, result.
  • Empty file key: File keys must not be empty strings.

Authentication Errors (401 Unauthorized)🔗

  • Missing token: Ensure the Authorization header is included with the Bearer token.
  • Invalid token: Verify the token is correctly formatted and signed.
  • Token validation failure: The token may be expired or tampered with.

Resource Errors (404 Not Found)🔗

  • Artifact not found: Verify the artifact ID exists.
  • Job not found: Ensure the job ID exists and is accessible.
  • File not found: Check that the file key is correct.

Conflict Errors (409 Conflict)🔗

  • Upload in progress: Wait for the upload to complete before attempting to download.
  • Status transition not allowed: File status can only transition from "pending" to "completed".

Server Errors (500 Internal Server Error)🔗

  • Persistence errors: Database connection issues or constraint violations.
  • Presign failures: S3 service unavailable or configuration issues.
  • Unexpected errors: Contact support with the error message and request details.

Error Handling Best Practices🔗

  1. Retry with exponential backoff for 5xx errors, which may be temporary.
  2. Do not retry for 4xx errors without fixing the underlying issue.
  3. Log error responses for debugging purposes.
  4. Handle specific error codes in your application to provide appropriate user feedback.
  5. Include request IDs (if available in error responses) when reporting issues.
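Practices 1 and 2 above can be sketched as a small wrapper around requests (the function name and backoff parameters are illustrative, not part of the API):

```python
import random
import time

import requests


def request_with_retry(method, url, max_retries=4, base_delay=1.0, jitter=0.5, **kwargs):
    """Send a request, retrying only on 5xx responses with exponential
    backoff plus jitter. 4xx responses are returned immediately, since
    retrying them without fixing the request cannot succeed."""
    for attempt in range(max_retries + 1):
        resp = requests.request(method, url, **kwargs)
        if resp.status_code < 500 or attempt == max_retries:
            return resp
        time.sleep(base_delay * (2 ** attempt) + random.uniform(0, jitter))
```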

Data Models🔗

Artifact🔗

An artifact represents metadata about files generated during job execution.

{
  "ID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
  "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
  "type": "checkpoint",
  "name": "model-checkpoint-1",
  "metadata": {
    "epoch": 10,
    "accuracy": 0.95
  },
  "createdBy": {
    "user": "user1"
  },
  "createdAt": "2023-01-01T12:00:00Z",
  "updatedAt": "2023-01-01T12:00:00Z",
  "status": "active"
}

ArtifactFile🔗

A file associated with an artifact.

{
  "id": "d1fd87ef-20ee-430f-ab33-3369bac2cf98",
  "artifactID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
  "name": "model.pt",
  "s3Path": "artifacts/c0fd87ef-20ee-430f-ab33-3369bac2cf97/model.pt",
  "hash": "md5hash==",
  "status": "pending|completed"
}