Data Artifacts API🔗

Purpose and Usage🔗

The Data Artifacts API provides a standardized way to manage Data Artifacts and their associated files within Apheris. Data Artifacts represent metadata about files generated during job execution. Common Data Artifacts and their uses include:

Checkpoints: Saving the state of a model during training to resume from interruptions or for later evaluation.
Metrics: Storing performance measurements and evaluation results for model comparison and analysis.
Logs: Capturing execution logs and debugging information for audit purposes.
Results: Saving final outputs and generated data for downstream processing or visualization.

Data Artifacts support two access paths:

Compute access (HS256 compute token): Compute pods use /v1/artifacts/* with the APH_AUTH_DATA_ARTIFACT_TOKEN injected by the orchestrator. Access is granted based on the compute spec's organization or explicit access grants.
User access (Auth0 user token): The CLI and Governance Portal use Auth0 user tokens (apheris login) to manage access (/v1/artifacts/*/access) and to browse/download shared artifacts (/v1/shared/artifacts/*).

API Endpoints Overview🔗

Compute Token Endpoints (Create, Read, and Files)🔗

These endpoints are used from computation containers with the compute token:

Method	Endpoint	Description
POST	`/v1/artifacts`	Create a new artifact metadata record
GET	`/v1/artifacts/{id}`	Retrieve an artifact by ID
PUT	`/v1/artifacts/{id}/file/{key}`	Add a file to an artifact
GET	`/v1/artifacts/{id}/file/{key}`	Get download information for an artifact file
PATCH	`/v1/artifacts/{id}/file/{key}`	Mark an artifact file upload as completed

Access Management Endpoints (Grant/Revoke)🔗

These endpoints allow artifact owners (or org-level admins) to manage who can access their artifacts:

Method	Endpoint	Description
GET	`/v1/artifacts/access`	List artifacts you can share
POST	`/v1/artifacts/{id}/access`	Grant access to a user or organization
GET	`/v1/artifacts/{id}/access`	List access grants for an artifact
GET	`/v1/artifacts/{id}/access/{accessID}`	Get a specific access grant
DELETE	`/v1/artifacts/{id}/access/{accessID}`	Revoke an access grant

Shared Artifacts Endpoints (User Access)🔗

These endpoints are for accessing artifacts shared with you from the CLI or Governance Portal:

Method	Endpoint	Description
GET	`/v1/shared/artifacts`	List artifacts you have access to
GET	`/v1/shared/artifacts/{id}`	Get details of a shared artifact
GET	`/v1/shared/artifacts/{id}/file/{key}`	Get pre-signed URL to download a shared file

Relevant Workflows🔗

The following examples demonstrate the end-to-end process of managing Data Artifacts using the API. These workflows cover creating artifacts, uploading files, sharing with collaborators, and accessing shared artifacts. These examples assume you have the necessary authentication token and the base URL for the Orchestrator API.

Workflow 1: Creating and Uploading an Artifact (Artifact Owner)🔗

This workflow demonstrates the complete process of creating an artifact and uploading files to it:

Create a Data Artifact metadata record:
```
POST https://<Orchestrator API URL>/v1/artifacts
```
This establishes the metadata for your Data Artifact, including its type, name, and associated job.
Add a file to the Data Artifact:
```
PUT https://<Orchestrator API URL>/v1/artifacts/{id}/file/{key}
```
This creates a file record and returns a pre-signed URL for uploading the actual file content.

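Upload the file content: Use the pre-signed URL returned by the PUT request above to upload the file content directly to the storage service.

# Example using the requests library; upload_info is the parsed JSON response of the PUT request above
import requests

# Header values are returned as lists, but requests expects plain strings
headers = {name: values[0] for name, values in upload_info['headers'].items()}
with open('model.pt', 'rb') as file:
    requests.put(upload_info['url'], data=file, headers=headers).raise_for_status()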

Mark the file upload as completed:
```
PATCH https://<Orchestrator API URL>/v1/artifacts/{id}/file/{key}
```
This updates the file status to "completed" after the upload is finished.
Retrieve the Data Artifact metadata:
```
GET https://<Orchestrator API URL>/v1/artifacts/{id}
```
This returns the Data Artifact metadata along with information about all associated files.
Get the download URL for a file:
```
GET https://<Orchestrator API URL>/v1/artifacts/{id}/file/{key}
```
This returns a pre-signed URL for downloading the file.

Download the file and verify its integrity: Use the pre-signed URL returned by the previous request to download the file content. If you provided an MD5 hash when uploading the file, you can verify the downloaded file's integrity by comparing hashes.

# Example using the requests library; download_info is the parsed JSON response of the previous request
import hashlib
import requests

response = requests.get(download_info['url'])
response.raise_for_status()  # stop here if the download failed
with open('downloaded_model.pt', 'wb') as file:
    file.write(response.content)

# Verify file integrity using the MD5 hash, if the server returned one
if 'hash' in download_info:
    # Calculate the MD5 of the downloaded file in chunks
    md5_hash = hashlib.md5()
    with open('downloaded_model.pt', 'rb') as f:
        for chunk in iter(lambda: f.read(4096), b""):
            md5_hash.update(chunk)
    calculated_hash = md5_hash.hexdigest()

    # Compare with the hex-encoded hash from the server
    if calculated_hash == download_info['hash']:
        print("File integrity verified: MD5 hash matches")
    else:
        print("Warning: File may be corrupted, MD5 hash doesn't match")
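
The upload half of this workflow can be strung together as in the sketch below. It is a minimal outline under stated assumptions, not the official client: `send` and `put_bytes` are hypothetical callables supplied by the caller (for example, thin wrappers around an HTTP library), the `createdBy` value is a placeholder, and URL-encoding the file key is an assumption about how keys with special characters should be passed.

```python
import hashlib
from urllib.parse import quote


def flatten_headers(headers):
    """Turn {"Name": ["value"]} (as in the Add Artifact File response) into {"Name": "value"}."""
    return {name: values[0] if isinstance(values, list) else values
            for name, values in headers.items()}


def create_and_upload(send, put_bytes, job_id, name, file_key, content,
                      artifact_type="checkpoint"):
    """Create an artifact, upload one file, and mark the upload as completed.

    send(method, path, body) -> dict   hypothetical: performs an authenticated API call
    put_bytes(url, headers, content)   hypothetical: uploads bytes to the pre-signed URL
    """
    # Create the artifact metadata record.
    artifact = send("POST", "/v1/artifacts", {
        "jobID": job_id,
        "type": artifact_type,
        "name": name,
        "createdBy": {"user": "user1"},  # placeholder creator
    })
    artifact_id = artifact["id"]

    # Register the file and obtain a pre-signed upload URL.
    file_path = f"/v1/artifacts/{artifact_id}/file/{quote(file_key, safe='')}"
    upload_info = send("PUT", file_path, {
        "jobID": job_id,
        "payloadChecksum": hashlib.md5(content).hexdigest(),
    })

    # Upload the bytes directly to storage using the returned headers.
    put_bytes(upload_info["url"], flatten_headers(upload_info.get("headers", {})), content)

    # Mark the upload as completed.
    send("PATCH", file_path, {"jobID": job_id})
    return artifact_id
```

Passing the transport in as callables keeps the sequence of calls easy to test without a network and leaves the choice of HTTP library to the caller.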

Workflow 2: Sharing an Artifact (Artifact Owner)🔗

This workflow shows how to grant access to artifacts you own, allowing collaborators to use them:

List your shareable artifacts:
```
GET https://<Orchestrator API URL>/v1/artifacts/access
```
Returns all artifacts created by your organization that can be shared.

Grant access to a user or organization:

POST https://<Orchestrator API URL>/v1/artifacts/{id}/access

Grant access by email or organization ID. Example request body:

{
  "recipientEmail": "collaborator@partner.com"
}

or

{
  "recipientOrgID": "org_abc123"
}

List access grants for an artifact:
```
GET https://<Orchestrator API URL>/v1/artifacts/{id}/access
```
View who has access to your artifact.

Revoke access when needed:

DELETE https://<Orchestrator API URL>/v1/artifacts/{id}/access/{accessID}

Workflow 3: Accessing Shared Artifacts (Artifact Consumer)🔗

This workflow demonstrates accessing artifacts that have been shared with you:

List accessible artifacts:
```
GET https://<Orchestrator API URL>/v1/shared/artifacts
```
Returns all artifacts you have access to (owned or granted).

Optional query parameters:

type: Filter by artifact type (checkpoint, metric, log, result)

Get artifact details:
```
GET https://<Orchestrator API URL>/v1/shared/artifacts/{id}
```
Retrieve metadata and file information for a specific artifact.
Download a file from the shared artifact:
```
GET https://<Orchestrator API URL>/v1/shared/artifacts/{id}/file/{key}
```
Returns a pre-signed URL for downloading the file.

Download the file content and verify integrity:

# Example using the requests library; download_info is the parsed JSON response of the previous request
import hashlib
import requests

response = requests.get(download_info['url'])
response.raise_for_status()  # stop here if the download failed
with open('downloaded_model.pt', 'wb') as file:
    file.write(response.content)

# Verify file integrity using the MD5 hash, if the server returned one
if 'hash' in download_info:
    md5_hash = hashlib.md5()
    with open('downloaded_model.pt', 'rb') as f:
        for chunk in iter(lambda: f.read(4096), b""):
            md5_hash.update(chunk)
    calculated_hash = md5_hash.hexdigest()

    if calculated_hash == download_info['hash']:
        print("File integrity verified: MD5 hash matches")
    else:
        print("Warning: File may be corrupted, MD5 hash doesn't match")

Authentication and Permission Requirements🔗

Authentication Method🔗

All API endpoints require authentication with a JSON Web Token (JWT). The token must be included in the Authorization header of each request using the Bearer scheme:

Authorization: Bearer <token>

The Data Artifacts API uses two different types of authentication tokens depending on the endpoint and use case:

Token Type 1: Computation Tokens (For Compute Pods)🔗

Used for: Creating, uploading, and reading artifacts from compute containers (Workflow 1). Compute tokens can read artifacts that belong to the compute spec's organization or that have been explicitly shared with that org/user.

Endpoints that use Computation Tokens:

POST /v1/artifacts - Create artifact
PUT /v1/artifacts/{id}/file/{key} - Add file to artifact
GET /v1/artifacts/{id}/file/{key} - Get file download URL
PATCH /v1/artifacts/{id}/file/{key} - Complete file upload
GET /v1/artifacts/{id} - Get artifact by ID

How it works:

Computation tokens are automatically generated when a Compute Spec is activated
The token contains a computeSpecID claim that identifies the specific computation
The token is generated using HMAC-SHA256 signing (algorithm: HS256) and is not an Auth0 user token
The token is mounted as an environment variable APH_AUTH_DATA_ARTIFACT_TOKEN in the computation container
Access is evaluated against the compute spec's org/user identity (plus any explicit access grants)
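
When debugging access problems, it can help to look at the claims a token carries. The sketch below decodes a JWT's header and payload without verifying the signature, so it is suitable for inspection only, never for authorization decisions. The demo token is built locally with a placeholder signature; only the `computeSpecID` claim comes from the description above, and any other claims a real token contains are not assumed here.

```python
import base64
import json


def _encode_segment(obj):
    """Base64url-encode a JSON object without padding, as JWT segments are."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _decode_segment(segment):
    """Decode one base64url JWT segment, restoring the stripped padding first."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def inspect_token(token):
    """Return (header, claims) of a JWT WITHOUT verifying its signature."""
    header_b64, claims_b64, _signature = token.split(".")
    return _decode_segment(header_b64), _decode_segment(claims_b64)


# Demo token with a placeholder signature; a real token would be read from
# the APH_AUTH_DATA_ARTIFACT_TOKEN environment variable instead.
demo_token = ".".join([
    _encode_segment({"alg": "HS256", "typ": "JWT"}),
    _encode_segment({"computeSpecID": "fff94995-7679-4e01-8fd1-61e26ae26d08"}),
    "signature-placeholder",
])
header, claims = inspect_token(demo_token)
print(header["alg"], claims["computeSpecID"])
```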

Python example for use during computation:

import os
import requests

# Read the token from the environment variable
token = os.environ.get("APH_AUTH_DATA_ARTIFACT_TOKEN")

# Prepare your data payload
payload = {
    "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
    "type": "checkpoint",
    "name": "model-checkpoint-1",
    "metadata": {
        "epoch": 10,
        "accuracy": 0.95
    },
    "createdBy": {
        "user": "user1"
    }
}

# Use the token in your HTTP requests
headers = {"Authorization": f"Bearer {token}"}
response = requests.post("https://<Orchestrator API URL>/v1/artifacts", json=payload, headers=headers)

Token Type 2: User Authentication Tokens (For CLI and Governance Portal)🔗

Used for: Managing artifact access, viewing shared artifacts, and downloading from the Governance Portal (Workflows 2 & 3)

Endpoints that use User Authentication Tokens:

GET /v1/artifacts/access - List shareable artifacts
POST /v1/artifacts/{id}/access - Grant access to artifact
GET /v1/artifacts/{id}/access - List access grants
GET /v1/artifacts/{id}/access/{accessID} - Get specific access grant
DELETE /v1/artifacts/{id}/access/{accessID} - Revoke access grant
GET /v1/shared/artifacts - List accessible artifacts
GET /v1/shared/artifacts/{id} - Get shared artifact details
GET /v1/shared/artifacts/{id}/file/{key} - Get shared file download URL

How it works:

User tokens are RS256 JWTs issued by Auth0 (CLI: apheris login, or the Governance Portal UI)
The token represents the authenticated user and their organization
Access is based on artifact ownership or explicit access grants
Users with org-level roles can manage access for all artifacts owned by their organization; other users can manage only the artifacts they created

Python example for use with user authentication:

import requests

# Option 1: Token is automatically available if using apheris-cli or SDK after login
# The SDK handles authentication automatically

# Option 2: If making direct API calls, get token from your authentication flow
# (typically managed by the Apheris CLI or SDK)
token = "<your-user-token>"

# List artifacts you can share
headers = {"Authorization": f"Bearer {token}"}
response = requests.get(
    "https://<Orchestrator API URL>/v1/artifacts/access?page_size=50",
    headers=headers
)

# Grant access to a collaborator
grant_payload = {
    "recipientEmail": "collaborator@partner.com"
}
response = requests.post(
    "https://<Orchestrator API URL>/v1/artifacts/c0fd87ef-20ee-430f-ab33-3369bac2cf97/access",
    json=grant_payload,
    headers=headers
)

# List artifacts shared with you
response = requests.get(
    "https://<Orchestrator API URL>/v1/shared/artifacts",
    headers=headers
)

Best Practices:

During computation execution: Use the automatically provided APH_AUTH_DATA_ARTIFACT_TOKEN for creating, uploading, and reading artifacts
From CLI or Governance Portal: User authentication tokens are handled automatically by the Apheris CLI and web interface
For programmatic access: Use Apheris Utils, which manages authentication transparently
Never hard-code tokens: Always retrieve tokens from environment variables or secure configuration
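
As one way to follow the last point, the helper below reads the token from an environment variable and fails early when it is missing. It is a small sketch: the function name `auth_headers` and the `environ` parameter (used to make the helper testable) are illustrative and not part of any Apheris library.

```python
import os


def auth_headers(env_var="APH_AUTH_DATA_ARTIFACT_TOKEN", environ=None):
    """Build the Authorization header from an environment variable."""
    environ = os.environ if environ is None else environ
    token = environ.get(env_var, "").strip()
    if not token:
        # Failing here avoids sending a request that can only return 401.
        raise RuntimeError(f"{env_var} is not set; refusing to send an unauthenticated request")
    return {"Authorization": f"Bearer {token}"}
```

Inside a computation container, `auth_headers()` with no arguments would pick up the injected compute token; for user tokens, pass the name of whichever variable your own setup uses.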

Permission Scopes🔗

Read: Ability to retrieve artifact metadata and download files
Write: Ability to create artifacts and upload files

Organization-Based Access Controls🔗

The Data Artifacts API enforces access control based on two models:

Model 1: Ownership-Based Access (Computation Artifacts)🔗

By default, artifacts created during computations are accessible only to the organization that initiated the computation. This is the baseline model for compute-token access.

Users can only perform operations - such as reading metadata, adding files, or downloading them - on artifacts created by their own organization's computations, unless explicit access grants are in place (see Model 2). This principle is especially important in multi-organization computations where one entity initiates the process while others only provide data.

For example, consider a scenario with three organizations:

Organization A: Initiates a computation that produces artifacts.
Organization B: Supplies data for the computation.
Organization C: Supplies data for the computation.

In this case, the access permissions are as follows:

Organization A: Can access the artifacts.
Organization B: Cannot access the artifacts.
Organization C: Cannot access the artifacts.

Model 2: Grant-Based Access (Shared Artifacts)🔗

Artifact owners can explicitly grant access to their artifacts to:

Individual users (by email address)
Entire organizations (by organization ID)

When an access grant is created:

The recipient can view the artifact in their "shared artifacts" list (/v1/shared/artifacts)
They can download all files associated with that artifact
They can use the artifact as input in their computations
The original owner retains full control and can revoke access at any time

Access Grant Features:

Granular control: Grant access per artifact, not globally
Flexible recipients: Share with individual users or entire organizations
Audit trail: All grants include creation timestamps and creator information
Revocable: Access can be removed at any time by the artifact owner
Compute integration: Granted artifacts can be used in computations, not just downloaded

This dual-model approach ensures data isolation while enabling controlled collaboration.
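
The two models can be pictured as a single access check. The following is a conceptual sketch only, not the server's implementation: the `Artifact` class and `can_access` function are hypothetical, and the check ignores role-specific rules such as how org-level grants are honored for different user roles.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    owner_org: str
    granted_orgs: set = field(default_factory=set)
    granted_emails: set = field(default_factory=set)


def can_access(artifact, requester_org, requester_email=None):
    """Illustrative check combining ownership (Model 1) and grants (Model 2)."""
    if requester_org == artifact.owner_org:
        return True  # Model 1: the owning organization always has access
    if requester_org in artifact.granted_orgs:
        return True  # Model 2: organization-level grant
    # Model 2: user-level grant by email address
    return requester_email is not None and requester_email in artifact.granted_emails


artifact = Artifact(owner_org="org_a")
print(can_access(artifact, "org_a"))  # Organization A initiated the computation
print(can_access(artifact, "org_b"))  # Organization B only supplied data
artifact.granted_orgs.add("org_b")
print(can_access(artifact, "org_b"))  # after an explicit grant to Organization B
```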

Notes on org-level access:

Org-level access grants are honored in the Governance Portal and shared-artifacts endpoints only for users with org-level roles.
Compute-token access checks are based on the compute spec's organization and user identity plus any explicit grants.

Request/Response Format with Examples🔗

The Data Artifacts API uses standard HTTP methods and JSON for request and response bodies. All endpoints return appropriate HTTP status codes and structured JSON responses.

Create Artifact🔗

Creates a new artifact metadata record.

Endpoint: POST /v1/artifacts

Request Format:

Content-Type: application/json

Authorization: Bearer token required

Request Body Example:

{
  "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
  "type": "checkpoint",
  "name": "model-checkpoint-1",
  "metadata": {
    "epoch": 10,
    "accuracy": 0.95
  },
  "createdBy": {
    "user": "user1"
  },
  "status": "active"
}

Request Parameters:

jobID (UUID, required): ID of the job the artifact belongs to.

type (string, required): Artifact type. Allowed values: checkpoint, metric, log, result.

name (string, required): Human-readable name of the artifact.

metadata (object, optional): Arbitrary user-supplied metadata.

createdBy (object, required): JSON object that identifies the creator (e.g. user, service).

status (string, optional): Optional status for the artifact.
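
A client can catch the most common causes of a 400 response before sending the request. The sketch below checks a request body against the parameter list above; the function name and the exact wording of the messages are illustrative, and the server may apply additional rules that are not covered here.

```python
ALLOWED_TYPES = {"checkpoint", "metric", "log", "result"}
REQUIRED_FIELDS = ("jobID", "type", "name", "createdBy")


def validate_create_payload(payload):
    """Return a list of problems found in a Create Artifact request body."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if name not in payload]
    if "type" in payload and payload["type"] not in ALLOWED_TYPES:
        problems.append(f"invalid type: {payload['type']!r}")
    if "createdBy" in payload and not isinstance(payload["createdBy"], dict):
        problems.append("createdBy must be a JSON object")
    if "metadata" in payload and not isinstance(payload["metadata"], dict):
        problems.append("metadata must be a JSON object")
    return problems
```

An empty list means the body passed these local checks; it does not guarantee that the job exists or that the token is allowed to create artifacts for it.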

Response Format:

Content-Type: application/json

Status Code: 201 Created (success)

Response Example:

{
  "id": "c0fd87ef-20ee-430f-ab33-3369bac2cf97"
}

Status Codes:

201 Created: Artifact created successfully.

400 Bad Request: Invalid request or artifact.

401 Unauthorized: Missing/invalid auth headers.

404 Not Found: Job not found.

500 Internal Server Error: Persistence or unexpected error.

Get Artifact by ID🔗

Retrieves an artifact's metadata by ID.

Endpoint: GET /v1/artifacts/{id}

Path Parameters:

id (UUID, required): Artifact ID.

Request Format:

Authorization: Bearer token required

No request body needed

Response Format:

Content-Type: application/json

Status Code: 200 OK (success)

Response Example:

{
  "ID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
  "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
  "type": "checkpoint",
  "name": "model-checkpoint-1",
  "metadata": {
    "epoch": 10,
    "accuracy": 0.95
  },
  "createdBy": {
    "user": "user1"
  },
  "createdAt": "2023-01-01T12:00:00Z",
  "updatedAt": "2023-01-01T12:00:00Z",
  "status": "active",
  "files": [
    {
      "id": "d1fd87ef-20ee-430f-ab33-3369bac2cf98",
      "artifactID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
      "name": "model.pt",
      "s3Path": "artifacts/c0fd87ef-20ee-430f-ab33-3369bac2cf97/model.pt",
      "hash": "md5hash==",
      "status": "completed"
    }
  ]
}

Status Codes:

200 OK: Artifact retrieved successfully.

400 Bad Request: Invalid request.

401 Unauthorized: Missing/invalid auth headers.

404 Not Found: Artifact not found.

500 Internal Server Error: Unexpected error.

Add Artifact File🔗

Creates a new file record that belongs to an artifact and returns a pre-signed upload URL.

Endpoint: PUT /v1/artifacts/{id}/file/{key}

Path Parameters:

id (UUID, required): Artifact ID.

key (string, required): File key / object name.

Request Format:

Content-Type: application/json

Authorization: Bearer token required

Request Body Example:

{
  "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
  "payloadChecksum": "5d41402abc4b2a76b9719d911017c592"
}

Request Parameters:

jobID (UUID, required): ID of the job the artifact belongs to.

payloadChecksum (string, optional): MD5 checksum of the file payload, provided as a 32-character lowercase hex string. When provided, this value is stored in the backend and can be used to verify the integrity of files downloaded from S3: compare it with the hash of the downloaded file to confirm that the file was not corrupted during transfer.

Note: Only MD5 is supported for this field, and the value must use the 32-character hex representation (for example, as produced by hexdigest() in most libraries). Internally, or in other API responses, the same MD5 value may be represented as a base64-encoded checksum, but payloadChecksum in this request must always be hex. The following Python example generates an MD5 hash for a file:
```
import hashlib

def calculate_md5(file_path):
    """Calculate MD5 hash of a file."""
    md5_hash = hashlib.md5()
    with open(file_path, "rb") as f:
        # Read file in chunks to handle large files efficiently
        for chunk in iter(lambda: f.read(4096), b""):
            md5_hash.update(chunk)
    return md5_hash.hexdigest()

# Example usage
file_path = "model.pt"
md5_hash = calculate_md5(file_path)
print(f"MD5 hash: {md5_hash}")

# Use this hash in your API request
payload = {
    "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
    "payloadChecksum": md5_hash
}
```
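
Because the same MD5 value may show up as hex in one place and base64 in another, a pair of conversion helpers can make comparisons easier. This is a small sketch using only the standard library; the helper names are illustrative.

```python
import base64
import hashlib


def md5_hex_to_base64(hex_digest):
    """Convert a 32-character hex MD5 digest to its base64 representation."""
    return base64.b64encode(bytes.fromhex(hex_digest)).decode("ascii")


def md5_base64_to_hex(b64_digest):
    """Convert a base64-encoded MD5 digest back to lowercase hex."""
    return base64.b64decode(b64_digest).hex()


hex_digest = hashlib.md5(b"hello").hexdigest()
print(hex_digest)                     # 5d41402abc4b2a76b9719d911017c592
print(md5_hex_to_base64(hex_digest))  # XUFAKrxLKna5cZ2REBfFkg==
```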

Response Format:

Content-Type: application/json

Status Code: 201 Created (success)

Response Example:

{
  "fileID": "d1fd87ef-20ee-430f-ab33-3369bac2cf98",
  "url": "https://s3.amazonaws.com/bucket/path/to/file?presigned-params",
  "headers": {
    "Content-MD5": [
      "md5hash=="
    ],
    "Host": [
      "s3.eu-central-1.amazonaws.com"
    ],
    "X-Amz-Meta-jobID": [
      "9b50c3ba-2c81-44f8-87b8-d8760f91fedc"
    ],
    "X-Amz-Meta-Computespec_id": [
      "fff94995-7679-4e01-8fd1-61e26ae26d08"
    ]
  }
}

Status Codes:

201 Created: File record created successfully.

400 Bad Request: Invalid request (bad UUIDs, empty key, or malformed body).

401 Unauthorized: Missing/invalid auth headers or authorization failed.

404 Not Found: Artifact not found.

500 Internal Server Error: Persistence, presign, or unexpected error.

Get Artifact File🔗

Retrieves download information for an artifact file.

Endpoint: GET /v1/artifacts/{id}/file/{key}

Path Parameters:

id (UUID, required): Artifact ID.

key (string, required): File key/name.

Request Format:

Authorization: Bearer token required

No request body needed

Response Format:

Content-Type: application/json

Status Code: 200 OK (success)

Response Example:

{
  "url": "https://s3.amazonaws.com/bucket/path/to/file?presigned-params",
  "hash": "5d41402abc4b2a76b9719d911017c592"
}

Status Codes:

200 OK: Download info retrieved successfully.

400 Bad Request: Invalid request parameters or UUIDs.

401 Unauthorized: Missing/invalid auth headers or authorization failed.

404 Not Found: Artifact not found.

409 Conflict: Upload still in progress.

500 Internal Server Error: Service errors (presign failed, S3 path missing, persistence errors).
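
Since a 409 response means the upload has not been marked as completed yet, a caller may want to retry for a while before giving up. The sketch below shows one way to do that with exponential backoff. `fetch` is a hypothetical callable that wraps the GET request and returns a `(status_code, body)` pair; the attempt count and delays are arbitrary defaults, not recommended values.

```python
import time


def wait_for_download_info(fetch, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch() until it stops returning 409 or the attempts run out.

    fetch() -> (status_code, body) is a hypothetical callable wrapping
    GET /v1/artifacts/{id}/file/{key}.
    """
    for attempt in range(attempts):
        status, body = fetch()
        if status == 200:
            return body
        if status != 409:
            # Other errors will not resolve by waiting, so surface them at once.
            raise RuntimeError(f"unexpected status {status}: {body}")
        if attempt < attempts - 1:
            sleep(delay_seconds * (2 ** attempt))  # exponential backoff
    raise TimeoutError("upload still in progress after all attempts")
```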

Update Artifact File Status🔗

Marks an artifact file upload as completed.

Endpoint: PATCH /v1/artifacts/{id}/file/{key}

Path Parameters:

id (UUID, required): Artifact ID.

key (string, required): File key used when the upload was initiated.

Request Format:

Content-Type: application/json

Authorization: Bearer token required

Request Body Example:

{
  "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc"
}

Request Parameters:

jobID (UUID, required): ID of the job the artifact belongs to.

Response Format:

No content in the response body

Status Code: 200 OK (success)

Status Codes:

200 OK: Status updated successfully.

400 Bad Request: Invalid request body or status transition.

401 Unauthorized: Missing/invalid authentication headers.

404 Not Found: Artifact or file not found.

409 Conflict: Status transition not allowed.

500 Internal Server Error: Persistence error or unexpected server error.

Access Management Endpoints🔗

These endpoints allow artifact owners to manage who can access their artifacts.

List Shareable Artifacts🔗

Returns artifacts created by the current user's organization that can be shared with others. Org-level admins see all artifacts in the organization; non-admin users see only artifacts they created.

Endpoint: GET /v1/artifacts/access

Query Parameters:

page_size (integer, optional): Number of items to return. Default: 50. Minimum: 1.
type (string, optional): Filter by artifact type. Allowed values: checkpoint, metric, log, result.
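
The query string for this endpoint can be assembled and checked locally, as in the sketch below. The function name and the base URL used in the example are placeholders.

```python
from urllib.parse import urlencode

ALLOWED_TYPES = {"checkpoint", "metric", "log", "result"}


def shareable_artifacts_url(base_url, page_size=50, artifact_type=None):
    """Build the URL for GET /v1/artifacts/access with validated query parameters."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    params = {"page_size": page_size}
    if artifact_type is not None:
        if artifact_type not in ALLOWED_TYPES:
            raise ValueError(f"unsupported artifact type: {artifact_type!r}")
        params["type"] = artifact_type
    return f"{base_url.rstrip('/')}/v1/artifacts/access?{urlencode(params)}"


print(shareable_artifacts_url("https://orchestrator.example", 10, "metric"))
```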

Request Format:

Authorization: Bearer token required
No request body needed

Response Format:

Content-Type: application/json
Status Code: 200 OK (success)

Response Example:

{
  "artifacts": [
    {
      "ID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
      "name": "model-checkpoint-epoch-10",
      "type": "checkpoint",
      "createdAt": "2024-01-15T10:30:00Z",
      "ownerCreatedBy": {
        "email": "jane@example.com",
        "givenName": "Jane",
        "familyName": "Doe"
      },
      "filesCount": 3
    }
  ]
}

Status Codes:

200 OK: Artifacts retrieved successfully.
400 Bad Request: Invalid query parameters.
401 Unauthorized: Missing/invalid authentication headers.
500 Internal Server Error: Persistence or unexpected error.

Grant Artifact Access🔗

Grant access to an artifact for a specific user (by email) or organization (by organization ID).

Endpoint: POST /v1/artifacts/{id}/access

Path Parameters:

id (UUID, required): Artifact ID.

Request Format:

Content-Type: application/json
Authorization: Bearer token required

Request Body Examples:

Grant access to a user by email:

{
  "recipientEmail": "collaborator@partner.com"
}

Grant access to an organization:

{
  "recipientOrgID": "org_abc123xyz"
}

Request Parameters:

recipientEmail (string, optional): Email address that should get access. Mutually exclusive with recipientOrgID.
recipientOrgID (string, optional): Organization ID that should get access. Mutually exclusive with recipientEmail.

Note: Exactly one of recipientEmail or recipientOrgID must be provided.
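
The "exactly one recipient" rule is easy to enforce on the client side before the request is sent. The helper below is an illustrative sketch; its name and argument names are not part of the API.

```python
def build_grant_payload(recipient_email=None, recipient_org_id=None):
    """Return a Grant Artifact Access request body with exactly one recipient."""
    # The comparison is True when both are missing or both are present.
    if (recipient_email is None) == (recipient_org_id is None):
        raise ValueError("provide exactly one of recipient_email or recipient_org_id")
    if recipient_email is not None:
        return {"recipientEmail": recipient_email}
    return {"recipientOrgID": recipient_org_id}
```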

Response Format:

Content-Type: application/json
Status Code: 201 Created (success)

Response Example:

{
  "id": "f3g4h5i6-j7k8-9012-l3m4-n5o6p7q8r9s0",
  "artifactID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
  "recipientEmail": "collaborator@partner.com",
  "recipientOrgID": null,
  "createdAt": "2024-01-15T11:00:00Z",
  "createdBy": {
    "email": "jane@example.com",
    "orgID": "org_owner123"
  }
}

Status Codes:

201 Created: Access granted successfully.
400 Bad Request: Invalid request (missing recipient, invalid UUIDs).
401 Unauthorized: Missing/invalid authentication headers or not authorized to manage this artifact.
404 Not Found: Artifact not found.
409 Conflict: Access grant already exists for this recipient.
500 Internal Server Error: Persistence or unexpected error.

List Access Grants🔗

List all access grants for a specific artifact.

Endpoint: GET /v1/artifacts/{id}/access

Path Parameters:

id (UUID, required): Artifact ID.

Request Format:

Authorization: Bearer token required
No request body needed

Response Format:

Content-Type: application/json
Status Code: 200 OK (success)

Response Example:

{
  "grants": [
    {
      "id": "f3g4h5i6-j7k8-9012-l3m4-n5o6p7q8r9s0",
      "artifactID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
      "recipientEmail": "collaborator@partner.com",
      "recipientOrgID": null,
      "createdAt": "2024-01-15T11:00:00Z"
    },
    {
      "id": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
      "artifactID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
      "recipientEmail": null,
      "recipientOrgID": "org_partner456",
      "createdAt": "2024-01-15T12:30:00Z"
    }
  ]
}

Status Codes:

200 OK: Grants retrieved successfully.
400 Bad Request: Invalid artifact ID.
401 Unauthorized: Missing/invalid authentication headers or not authorized to view grants.
404 Not Found: Artifact not found.
500 Internal Server Error: Persistence or unexpected error.

Revoke Access Grant🔗

Revoke an existing access grant for an artifact.

Endpoint: DELETE /v1/artifacts/{id}/access/{accessID}

Path Parameters:

id (UUID, required): Artifact ID.
accessID (UUID, required): Access grant ID to revoke.

Request Format:

Authorization: Bearer token required
No request body needed

Response Format:

Status Code: 204 No Content (success)

Status Codes:

204 No Content: Access revoked successfully.
400 Bad Request: Invalid UUIDs.
401 Unauthorized: Missing/invalid authentication headers or not authorized to manage this artifact.
404 Not Found: Artifact or access grant not found.
500 Internal Server Error: Persistence or unexpected error.

Shared Artifacts Endpoints🔗

These endpoints allow users to access artifacts that have been shared with them.

List Accessible Artifacts🔗

Returns all artifacts the current user has access to (owned or explicitly granted). Users with org-level roles can access artifacts shared with their organization; non-admin users can access artifacts shared with their email address.

Endpoint: GET /v1/shared/artifacts

Query Parameters:

type (string, optional): Filter by artifact type. Allowed values: checkpoint, metric, log, result.

Request Format:

Authorization: Bearer token required
No request body needed

Response Format:

Content-Type: application/json
Status Code: 200 OK (success)

Response Example:

[
  {
    "ID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
    "name": "model-checkpoint-epoch-10",
    "type": "checkpoint",
    "status": "active",
    "filesCount": 3,
    "createdAt": "2024-01-15T10:30:00Z",
    "updatedAt": "2024-01-15T11:00:00Z",
    "createdBy": {
      "givenName": "Jane",
      "familyName": "Doe",
      "orgId": "org_owner123",
      "orgName": "Owner Org"
    }
  },
  {
    "ID": "d1a2b3c4-5e6f-7890-a1b2-c3d4e5f6a7b8",
    "name": "shared-metrics",
    "type": "metric",
    "status": "active",
    "filesCount": 1,
    "createdAt": "2024-01-14T15:45:00Z",
    "updatedAt": "2024-01-14T16:00:00Z",
    "createdBy": {
      "orgId": "org_partner456",
      "orgName": "Partner Org"
    }
  }
]

Note: createdBy is privacy-filtered on shared endpoints (no email or roles).

Status Codes:

200 OK: Artifacts retrieved successfully.
400 Bad Request: Invalid query parameters.
401 Unauthorized: Missing/invalid authentication headers.
500 Internal Server Error: Persistence or unexpected error.

Get Shared Artifact Details🔗

Retrieve detailed information about a specific artifact you have access to.

Endpoint: GET /v1/shared/artifacts/{id}

Path Parameters:

id (UUID, required): Artifact ID.

Request Format:

Authorization: Bearer token required
No request body needed

Response Format:

Content-Type: application/json
Status Code: 200 OK (success)

Response Example:

{
  "ID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
  "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
  "type": "checkpoint",
  "name": "model-checkpoint-epoch-10",
  "metadata": {
    "epoch": 10,
    "accuracy": 0.95
  },
  "createdBy": {
    "givenName": "Jane",
    "familyName": "Doe",
    "orgId": "org_owner123",
    "orgName": "Owner Org"
  },
  "createdAt": "2024-01-15T10:30:00Z",
  "updatedAt": "2024-01-15T11:00:00Z",
  "status": "active",
  "files": [
    {
      "id": "d1fd87ef-20ee-430f-ab33-3369bac2cf98",
      "artifactID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
      "name": "model.pt",
      "hash": "5d41402abc4b2a76b9719d911017c592",
      "status": "completed",
      "url": "https://s3.amazonaws.com/bucket/path/to/file?presigned-params"
    },
    {
      "id": "e2fe98f0-31ff-541g-bc44-4470cbd3dga9",
      "artifactID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
      "name": "optimizer.pt",
      "hash": "098f6bcd4621d373cade4e832627b4f6",
      "status": "completed",
      "url": "https://s3.amazonaws.com/bucket/path/to/file?presigned-params"
    }
  ]
}

Status Codes:

200 OK: Artifact retrieved successfully.
400 Bad Request: Invalid artifact ID.
401 Unauthorized: Missing/invalid authentication headers or not authorized to access this artifact.
404 Not Found: Artifact not found or not accessible.
500 Internal Server Error: Unexpected error.

Note: File entries include a url only when the file status is completed.
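
Because only completed files carry a url, client code should filter the file list before attempting downloads. The sketch below does this for a parsed response shaped like the example above; the function name is illustrative.

```python
def downloadable_files(artifact):
    """Map file name -> (url, hash) for files that are ready to download."""
    ready = {}
    for entry in artifact.get("files", []):
        # Skip files whose upload is not completed; they have no url yet.
        if entry.get("status") == "completed" and entry.get("url"):
            ready[entry["name"]] = (entry["url"], entry.get("hash"))
    return ready
```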

Get Shared Artifact File🔗

Retrieve a pre-signed URL to download a file from a shared artifact.

Endpoint: GET /v1/shared/artifacts/{id}/file/{key}

Path Parameters:

id (UUID, required): Artifact ID.
key (string, required): File key/name.

Request Format:

Authorization: Bearer token required
No request body needed

Response Format:

Content-Type: application/json
Status Code: 200 OK (success)

Response Example:

{
  "url": "https://s3.amazonaws.com/bucket/path/to/file?presigned-params",
  "hash": "5d41402abc4b2a76b9719d911017c592"
}

Status Codes:

200 OK: Download info retrieved successfully.
400 Bad Request: Invalid artifact ID or file key.
401 Unauthorized: Missing/invalid authentication headers or not authorized to access this artifact.
404 Not Found: Artifact or file not found or not accessible.
409 Conflict: File upload still in progress.
500 Internal Server Error: Presign failed or S3 path missing.

Using Shared Artifacts in Computations🔗

When an artifact is shared with you via an access grant, you can use it as an input in your computations. Compute pods use the compute token (APH_AUTH_DATA_ARTIFACT_TOKEN) and access shared artifacts through /v1/artifacts/*.

A typical flow looks like this:

Receive or look up the artifact ID (Governance Portal or /v1/shared/artifacts).
Pass the artifact ID into your job payload or configuration when launching a job via the CLI.
In the computation container, use apheris_utils.artifacts.Artifact.get(<artifact_id>) to fetch files.

This enables you to:

Load model checkpoints from shared artifacts
Use shared metrics for comparison or validation
Access shared logs for debugging
Incorporate shared results into your workflows

The grant-based access model ensures that you have the necessary permissions to use these artifacts in your computational workflows while maintaining security and audit trails.

Error Handling and Status Codes🔗

The Data Artifacts API uses standard HTTP status codes to indicate the success or failure of requests. All error responses include a JSON body with an error message to help diagnose the issue.

Error Response Format🔗

All error responses have the following JSON structure:

{
  "error": "error description"
}

Note: The error message in the response body will specify the reason for the failure (e.g., "Invalid UUID format for jobID" or "Missing required field: name") to help with debugging.
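
A client can read the message defensively, since a proxy or gateway in front of the API might return a non-JSON body. The sketch below is a minimal illustration; the function name and the fallback strings are hypothetical.

```python
import json
from typing import Union


def extract_error_message(body: Union[bytes, str]) -> str:
    """Return the "error" field of an error response, or a fallback text."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        # Not valid JSON (for example an HTML error page from a proxy).
        return "unparseable error response"
    if isinstance(payload, dict) and isinstance(payload.get("error"), str):
        return payload["error"]
    return "error response without an 'error' field"
```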

HTTP Status Codes🔗

Status Code	Description	Common Causes
200 OK	Request succeeded	Successful GET or PATCH operations
201 Created	Resource created successfully	Successful POST or PUT operations
400 Bad Request	Invalid request	Malformed JSON, invalid UUIDs, missing required fields
401 Unauthorized	Authentication failed	Missing or invalid token, expired token
403 Forbidden	Permission denied	Token lacks necessary permissions
404 Not Found	Resource not found	Invalid artifact ID, job ID, or file key
409 Conflict	Resource conflict	Upload still in progress, status transition not allowed
500 Internal Server Error	Server-side error	Database errors, S3 errors, unexpected exceptions

Common Error Scenarios and Handling🔗

Validation Errors (400 Bad Request)🔗

Invalid UUID format: Ensure all UUIDs are in the correct format (e.g., "9b50c3ba-2c81-44f8-87b8-d8760f91fedc").

Missing required fields: Check that all required fields are included in the request body.

Invalid artifact type: Ensure the artifact type is one of the allowed values: checkpoint, metric, log, result.

Empty file key: File keys must not be empty strings.
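
Most of these validation errors can be caught on the client before a request is sent. The following sketch mirrors the checks listed above; the function names are hypothetical, and the server remains the authority on what it accepts.

```python
import uuid

# Allowed artifact types as listed in this section.
ALLOWED_ARTIFACT_TYPES = {"checkpoint", "metric", "log", "result"}


def is_canonical_uuid(value: str) -> bool:
    """True if value is a UUID in the hyphenated 8-4-4-4-12 form."""
    try:
        # uuid.UUID also accepts braces and un-hyphenated input, so compare
        # against the canonical string to reject those variants.
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def validate_artifact_request(job_id: str, artifact_type: str, name: str) -> list:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    if not is_canonical_uuid(job_id):
        problems.append("jobID is not a valid UUID")
    if artifact_type not in ALLOWED_ARTIFACT_TYPES:
        problems.append("type must be one of: " + ", ".join(sorted(ALLOWED_ARTIFACT_TYPES)))
    if not name:
        problems.append("name is required")
    return problems


def is_valid_file_key(key: str) -> bool:
    """File keys must not be empty strings."""
    return isinstance(key, str) and key != ""
```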

Authentication Errors (401 Unauthorized)🔗

Missing token: Ensure the Authorization header is included with the Bearer token.

Invalid token: Verify the token is correctly formatted and signed.

Token validation failure: The token may be expired or tampered with.

Resource Errors (404 Not Found)🔗

Artifact not found: Verify the artifact ID exists.

Job not found: Ensure the job ID exists and is accessible.

File not found: Check that the file key is correct.

Conflict Errors (409 Conflict)🔗

Upload in progress: Wait for the upload to complete before attempting to download.

Status transition not allowed: File status can only transition from "pending" to "completed".
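
For the upload-in-progress case, a consumer can poll until the file becomes downloadable. The sketch below assumes a caller-supplied fetch function that returns a (status_code, body) pair; that callable, the attempt count, and the fixed delay are illustrative choices rather than anything prescribed by the API.

```python
import time
from typing import Callable, Tuple


def wait_until_downloadable(
    fetch: Callable[[], Tuple[int, bytes]],
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call fetch() until it stops returning 409 or the attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = fetch()
        if status != 409:
            # Success or a different error: hand it back to the caller.
            return status, body
        if attempt < attempts - 1:
            sleep(delay_seconds)
    # Still 409 after the last attempt; the caller decides what to do next.
    return status, body
```

Passing sleep as a parameter keeps the function easy to exercise without real waiting.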

Server Errors (500 Internal Server Error)🔗

Persistence errors: Database connection issues or constraint violations.

Presign failures: S3 service unavailable or configuration issues.

Unexpected errors: Contact support with the error message and request details.

Error Handling Best Practices🔗

Retry with exponential backoff for 5xx errors, which may be temporary.
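Do not retry 4xx errors without fixing the underlying issue. The exception is a 409 Conflict caused by an upload in progress, which resolves once the upload completes.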
Log error responses for debugging purposes.
Handle specific error codes in your application to provide appropriate user feedback.
Include request IDs (if available in error responses) when reporting issues.
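
The first two practices can be sketched as a small retry helper. It assumes a caller-supplied send function that returns a (status_code, body) pair; the attempt count and base delay are arbitrary example values, and jitter is left out to keep the example short.

```python
import time
from typing import Callable, Tuple


def is_retryable(status: int) -> bool:
    """Only 5xx responses are treated as possibly temporary."""
    return 500 <= status <= 599


def call_with_backoff(
    send: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Retry 5xx responses with exponentially growing delays; return 4xx at once."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if not is_retryable(status) or attempt == max_attempts - 1:
            return status, body
        # Delays grow as base_delay * 1, 2, 4, ...
        sleep(base_delay * 2 ** attempt)
    return status, body
```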

Data Models🔗

Artifact🔗

An artifact represents metadata about files generated during job execution.

{
  "ID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
  "jobID": "9b50c3ba-2c81-44f8-87b8-d8760f91fedc",
  "type": "checkpoint",
  "name": "model-checkpoint-1",
  "metadata": {
    "epoch": 10,
    "accuracy": 0.95
  },
  "createdBy": {
    "user": "user1"
  },
  "createdAt": "2023-01-01T12:00:00Z",
  "updatedAt": "2023-01-01T12:00:00Z",
  "status": "active"
}

ArtifactFile🔗

A file associated with an artifact.

{
  "id": "d1fd87ef-20ee-430f-ab33-3369bac2cf98",
  "artifactID": "c0fd87ef-20ee-430f-ab33-3369bac2cf97",
  "name": "model.pt",
  "s3Path": "artifacts/c0fd87ef-20ee-430f-ab33-3369bac2cf97/model.pt",
  "hash": "md5hash==",
  "status": "pending|completed"
}

Note: hash in ArtifactFile is the base64-encoded MD5 checksum (matches the Content-MD5 header). Download endpoints return a hex-encoded MD5 hash.
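
Because the two encodings represent the same 16-byte digest, converting between them is a matter of decoding and re-encoding. A minimal sketch using only the Python standard library follows; the function names are hypothetical.

```python
import base64


def base64_md5_to_hex(b64_digest: str) -> str:
    """Convert a base64-encoded MD5 (Content-MD5 style) to a hex string."""
    return base64.b64decode(b64_digest, validate=True).hex()


def hex_md5_to_base64(hex_digest: str) -> str:
    """Convert a hex-encoded MD5 to its base64 form."""
    return base64.b64encode(bytes.fromhex(hex_digest)).decode("ascii")


def same_md5(file_hash_b64: str, download_hash_hex: str) -> bool:
    """True if an ArtifactFile hash and a download-endpoint hash are the same digest."""
    return base64_md5_to_hex(file_hash_b64) == download_hash_hex.lower()
```

For example, the hex value 5d41402abc4b2a76b9719d911017c592 used in the download response example corresponds to the base64 value XUFAKrxLKna5cZ2REBfFkg==.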
