AWS ParallelCluster (Slurm)

Importing an AWS ParallelCluster engine connects an existing Slurm-based HPC cluster to AIchor without transferring ownership. AIchor submits workloads to the Slurm scheduler via the cluster's head node, while infrastructure management remains with the cluster administrator.

Prerequisites

  • The AWS ParallelCluster cluster must already exist and be accessible.
  • An IAM role with the necessary permissions must be available. The role ARN follows the format arn:aws:iam::account-id:role/role-name.
  • The IP address of the head node must be known.
  • The CloudFormation stack ARN for the ParallelCluster deployment must be available.
  • The VPC ID and subnet IDs associated with the cluster must be known.
  • The Slurm REST API version running on the cluster must be identified (for example, v0.0.39).
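Before opening the form, it can help to sanity-check the values gathered above. The Python sketch below is illustrative only and is not part of AIchor: the function and key names are hypothetical, and the patterns follow the general AWS layouts for ARNs and resource IDs (a 12-digit account ID, `vpc-` and `subnet-` prefixes followed by 8 or 17 hexadecimal characters), not any validation AIchor is known to perform.

```python
import ipaddress
import re

# Hypothetical patterns based on general AWS identifier layouts.
ROLE_ARN_RE = re.compile(r"arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+")
STACK_ARN_RE = re.compile(
    r"arn:aws[a-z-]*:cloudformation:[a-z0-9-]+:\d{12}"
    r":stack/[A-Za-z][A-Za-z0-9-]*/[0-9a-f-]+"
)
VPC_ID_RE = re.compile(r"vpc-(?:[0-9a-f]{8}|[0-9a-f]{17})")
SUBNET_ID_RE = re.compile(r"subnet-(?:[0-9a-f]{8}|[0-9a-f]{17})")
SLURM_API_VERSION_RE = re.compile(r"v\d+\.\d+\.\d+")  # e.g. v0.0.39


def is_ip_address(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def check_prerequisites(values: dict) -> list:
    """Return the keys of values that look malformed (empty list if none)."""
    problems = []
    if not ROLE_ARN_RE.fullmatch(values["role_arn"]):
        problems.append("role_arn")
    if not is_ip_address(values["head_node_ip"]):
        problems.append("head_node_ip")
    if not STACK_ARN_RE.fullmatch(values["stack_arn"]):
        problems.append("stack_arn")
    if not VPC_ID_RE.fullmatch(values["vpc_id"]):
        problems.append("vpc_id")
    subnets = values["subnet_ids"]
    if not subnets or not all(SUBNET_ID_RE.fullmatch(s) for s in subnets):
        problems.append("subnet_ids")
    if not SLURM_API_VERSION_RE.fullmatch(values["slurm_api_version"]):
        problems.append("slurm_api_version")
    return problems
```

A format check of this kind only catches typos; it does not confirm that the role, stack, or network resources exist or are reachable.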

Steps

  1. In the AIchor UI, open Engines and click Add Engine.
  2. Select In The Cloud, then AWS, then ParallelCluster.
  3. Select Import Existing Engine.
  4. Fill in the form fields described below and submit.

AWS ParallelCluster import form

Form fields

| Field | Required | Description |
|---|---|---|
| Engine Name | Yes | Lowercase alphanumeric characters and hyphens. Must start with a letter. |
| Parallel Cluster Name | Yes | Name of the existing AWS ParallelCluster cluster. |
| Ecosystem | No | Tag passed to infrastructure-as-code tooling. Needed only by specific organisations, on InstaDeep's recommendation. |
| Head Node IP | Yes | IP address of the ParallelCluster head node. It plays the role that the API endpoint plays for Kubernetes-based engines. |
| AWS Region | Yes | Region where the cluster runs. |
| Assume Role ARN | Yes | IAM role ARN with access to the cluster. Format: arn:aws:iam::account-id:role/role-name. |
| Slurm Version | Yes | Version identifier for the Slurm REST API running on the cluster. Example: v0.0.39. |
| EFS Mount Dir | No | Directory used for EFS mounting. Default: /mnt/shared. |
| Slurm User | No | Username for Slurm task submission. Default: slurm. |
| VPC ID | Yes | AWS VPC identifier associated with the cluster. |
| Subnets IDs | Yes | Comma-separated list of subnet IDs within the VPC. |
| PCluster Stack ARN | Yes | CloudFormation stack ARN for the ParallelCluster deployment. |
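The form fields can be mirrored in a small Python structure for a local pre-check before submitting. This is a sketch under stated assumptions: the `ImportForm` class and its attribute names are hypothetical and do not correspond to any AIchor API. Only the required/optional split, the two defaults, the engine-name rule, and the comma-separated subnet format come from the field descriptions.

```python
import re
from dataclasses import dataclass

# Engine Name rule: lowercase alphanumerics and hyphens, starting with a letter.
ENGINE_NAME_RE = re.compile(r"[a-z][a-z0-9-]*")


@dataclass
class ImportForm:
    """Hypothetical local mirror of the import form (not an AIchor type)."""

    # Required fields
    engine_name: str
    parallel_cluster_name: str
    head_node_ip: str
    aws_region: str
    assume_role_arn: str
    slurm_version: str
    vpc_id: str
    subnet_ids: str  # comma-separated, as typed into the form
    pcluster_stack_arn: str
    # Optional fields with their documented defaults
    ecosystem: str = ""
    efs_mount_dir: str = "/mnt/shared"
    slurm_user: str = "slurm"

    def subnet_list(self) -> list:
        """Split the comma-separated subnet field, dropping empty entries."""
        return [s.strip() for s in self.subnet_ids.split(",") if s.strip()]

    def problems(self) -> list:
        """Return the names of fields that are empty or malformed."""
        required = [
            "engine_name", "parallel_cluster_name", "head_node_ip",
            "aws_region", "assume_role_arn", "slurm_version",
            "vpc_id", "subnet_ids", "pcluster_stack_arn",
        ]
        found = [name for name in required if not getattr(self, name).strip()]
        if self.engine_name and not ENGINE_NAME_RE.fullmatch(self.engine_name):
            found.append("engine_name")
        if self.subnet_ids.strip() and not self.subnet_list():
            found.append("subnet_ids")
        return found
```

Leaving `efs_mount_dir` and `slurm_user` unset reproduces the defaults listed in the table, so only the nine required values need to be supplied.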

Authentication

Authentication to the ParallelCluster cluster is performed via IAM role assumption. The Assume Role ARN field specifies the IAM role that AIchor assumes when communicating with the cluster. Ensure the role has sufficient permissions to interact with the head node and the associated AWS resources.
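To illustrate what role assumption involves, the sketch below shows how an administrator might confirm from their own environment that a role can be assumed, using the AWS STS `AssumeRole` call through boto3. It is not AIchor's implementation: the helper names and the session name `aichor-import-check` are hypothetical, and the call succeeds only if the role's trust policy allows the principal making it. Which principal AIchor itself uses is not covered here.

```python
import re

# STS limits: session names are 2-64 characters from this set;
# DurationSeconds ranges from 900 up to the role's maximum (at most 43200).
SESSION_NAME_RE = re.compile(r"[\w+=,.@-]{2,64}")


def build_assume_role_request(role_arn: str,
                              session_name: str = "aichor-import-check",
                              duration_seconds: int = 900) -> dict:
    """Build keyword arguments for the STS AssumeRole call."""
    if not SESSION_NAME_RE.fullmatch(session_name):
        raise ValueError(f"invalid session name: {session_name!r}")
    if not 900 <= duration_seconds <= 43200:
        raise ValueError("duration_seconds must be between 900 and 43200")
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "DurationSeconds": duration_seconds,
    }


def assume_role(role_arn: str) -> dict:
    """Assume the role and return its temporary credentials.

    Requires the third-party boto3 package and working AWS credentials
    for a principal that the role's trust policy allows.
    """
    import boto3  # imported lazily so the helper above works without it

    sts = boto3.client("sts")
    response = sts.assume_role(**build_assume_role_request(role_arn))
    # Contains AccessKeyId, SecretAccessKey, SessionToken and Expiration.
    return response["Credentials"]
```

An access-denied error from `assume_role` usually points at the trust policy or at the caller's own permissions, not at the permissions attached to the role.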