Technical requirements
This section outlines the specifications, resources, and configurations required to deploy AIchor on cloud engines.
It describes the accounts, permissions, network settings, and parameters needed to meet compatibility and performance standards.
It is intended for AIchor administrators; meeting these requirements is essential for system reliability, scalability, and expected behaviour.
Requirements vary with the cloud provider and with whether the administrator creates a new engine or imports an existing one.
Sensitive data storage
Sensitive data used for training is stored in designated storage buckets to ensure data privacy and compliance with security standards.
These buckets are configured with strict access controls and server-side encryption protocols to safeguard the information from unauthorized access.
Additionally, sensitive environment variables, which include configuration details specified by users to run AIchor experiments, are stored in a dedicated database. This database is protected by encryption and role-based access mechanisms so that only authorized processes or users can retrieve or modify these variables.
AWS
Running workloads on a target AWS environment first requires an existing AWS account.
Some resources created by AIchor or by the client administrator are public, but accessing them still requires restricted permissions:
- S3 buckets: created by AIchor and accessible only to the users and experiments that have access to the corresponding AIchor project
- ECR: a registry created by AIchor for each project
- Public VPC: created either by AIchor (create EKS case) or by the administrator (import EKS case)
The resources below need to be created either by AIchor or the administrator on the client account if the engine is deployed on the client AWS account:
Resources | Creation | Import |
---|---|---|
EKS (engine) | Created by AIchor | Created by administrator |
VPC | Created by AIchor | Created by administrator |
Subnet | Created by AIchor | Created by administrator |
Internet Gateway | Created by AIchor | Created by administrator |
Route | Created by AIchor | Created by administrator |
Route table | Created by AIchor | Created by administrator |
Node group | Created by AIchor | Created by administrator |
IAM/OIDC/Instance profile (*) | Created by AIchor | Created by administrator |
Queue/SQS | Created by AIchor | Created by administrator |
Karpenter | Created by AIchor | Created by administrator |
EFS | Created by AIchor | Created by administrator |
ParallelCluster (engine) | Created by AIchor | Created by administrator |
State machine | Created by AIchor | Created by administrator |
Lambda functions | Created by AIchor | Created by administrator |
Step functions | Created by AIchor | Created by administrator |
(*) The permissions listed in Annex 1 are required for AIchor to be able to import EKS engines.
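Annex 1 gives the permissions as quoted `service:Action` strings. As a rough illustration, and not AIchor's own tooling, the sketch below groups such strings by service and wraps them in an IAM policy document. The function names, the `Sid` values, the short sample, and the `"Resource": "*"` scope are assumptions made for this example; the source does not state which resources each action should be scoped to.

```python
import json
import re
from collections import defaultdict

# Matches quoted entries such as "iam:GetRole"; separators between entries are ignored.
ACTION_RE = re.compile(r'"([a-z0-9]+):([A-Za-z0-9*]+)"')


def group_actions(annex_text):
    """Group quoted service:Action strings by service, keeping first-seen order."""
    grouped = defaultdict(list)
    for service, name in ACTION_RE.findall(annex_text):
        action = f"{service}:{name}"
        if action not in grouped[service]:  # drop duplicates
            grouped[service].append(action)
    return dict(grouped)


def build_policy(grouped):
    """Build one Allow statement per service (Resource "*" is an assumption)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": f"Aichor{service.capitalize()}",  # hypothetical naming
                "Effect": "Allow",
                "Action": actions,
                "Resource": "*",
            }
            for service, actions in grouped.items()
        ],
    }


# A short excerpt of Annex 1, used here only as sample input.
SAMPLE = '''
"iam:GetRole",
"iam:DeleteRolePolicy"
"s3:GetObject",
"eks:DescribeCluster"
"kms:Decrypt"
'''

if __name__ == "__main__":
    print(json.dumps(build_policy(group_actions(SAMPLE)), indent=2))
```

Splitting the statements by service keeps each one easy to review and makes it simpler to narrow `Resource` later, once a more tailored permission list is available.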
Import EKS engine
To import an existing EKS cluster, the following conditions must be met:
- An IAM role, referenced by its ARN, with policies attached that allow the actions listed in Annex 1.
Those policies allow AIchor to perform all expected tasks on the target AWS account, such as:
- Create and manage the required IAM roles for AIchor
- Create and manage storage (S3) buckets on the target account
- Create and manage the Docker registry (ECR) on the target account
- Manage EKS clusters on the target account
Note: These policies are being optimized, and a more tailored list of permissions will be published.
Whitelist the following hostnames on the target engine to allow traffic between AIchor and the engine:
- instadeep-infra.eu.auth0.com (authentication)
- ichorai.eu.auth0.com (authentication)
- *.aichor.ai (AIchor)
Provide the storage class (storageclass); this is an input on the form used to import an EKS cluster.
Provide the public IP address of the NLB used to access the EKS engine.
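To sanity-check firewall or proxy rules against the hostname whitelist above, a small matcher can help. This is a minimal sketch using Python's standard `fnmatch` module; the names `ALLOWED_HOST_PATTERNS` and `is_allowed` are hypothetical, and how the whitelist is actually enforced depends on the network setup of the target engine.

```python
from fnmatch import fnmatchcase

# Hostname patterns taken from the whitelist in this section.
ALLOWED_HOST_PATTERNS = (
    "instadeep-infra.eu.auth0.com",  # authentication
    "ichorai.eu.auth0.com",          # authentication
    "*.aichor.ai",                   # AIchor
)


def is_allowed(hostname):
    """Return True if the hostname matches one of the whitelisted patterns."""
    # Normalize: hostnames are case-insensitive and may carry a trailing dot.
    host = hostname.strip().rstrip(".").lower()
    # Note: "*" also matches dots, so nested subdomains of aichor.ai match,
    # while the bare domain "aichor.ai" does not.
    return any(fnmatchcase(host, pattern) for pattern in ALLOWED_HOST_PATTERNS)


if __name__ == "__main__":
    for candidate in ("app.aichor.ai", "ichorai.eu.auth0.com", "example.com"):
        print(candidate, is_allowed(candidate))
```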
Create EKS engine
To create a new EKS cluster, the following condition must be met:
- An IAM role, referenced by its ARN, with the policies specified in Annex 1 attached.
Those policies allow AIchor to perform all expected tasks on the target AWS account, such as:
- Create and manage the required IAM roles for AIchor
- Create and manage storage (S3) buckets on the target account
- Create and manage the Docker registry (ECR) on the target account
- Create and manage EKS clusters on the target account
In the creation scenario, AIchor assumes this role on the target account and creates the required resources in the specified region (an input parameter on the engine creation form).
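For AIchor to assume a role on the target account, that role needs a trust policy that allows `sts:AssumeRole` for the calling principal. The sketch below shows the general shape of such a trust policy. The source does not say which principal AIchor uses, so the ARN here is a hypothetical placeholder, and `build_trust_policy` is an illustrative helper rather than part of AIchor.

```python
import json


def build_trust_policy(principal_arn):
    """Return a trust policy letting the given principal assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Hypothetical placeholder: replace with the principal communicated by AIchor.
EXAMPLE_PRINCIPAL_ARN = "arn:aws:iam::111122223333:root"

trust_policy_json = json.dumps(build_trust_policy(EXAMPLE_PRINCIPAL_ARN), indent=2)

if __name__ == "__main__":
    print(trust_policy_json)
```

The permissions from Annex 1 are attached to the same role as separate permission policies; the trust policy only controls who may assume the role.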
Import ParallelCluster
GCP
Import GKE
Create GKE
MS Azure
Import AKS
Create AKS
Annexes
Annex 1
"iam:CreateRole",
"iam:CreatePolicy",
"iam:CreatePolicyVersion",
"iam:AttachRolePolicy",
"iam:PutRolePolicy",
"iam:UpdateAssumeRolePolicy",
"iam:UpdateRole",
"iam:GetRole",
"iam:GetPolicy",
"iam:GetPolicyVersion",
"iam:ListAttachedRolePolicies",
"iam:ListRolePolicies",
"iam:ListInstanceProfilesForRole",
"iam:ListPolicyVersions",
"iam:GetRolePolicy",
"iam:TagRole",
"iam:ListRoleTags",
"iam:UntagRole",
"iam:TagPolicy",
"iam:UntagPolicy",
"iam:DeleteRole",
"iam:DeletePolicy",
"iam:DeletePolicyVersion",
"iam:DetachRolePolicy",
"iam:DeleteRolePolicy"
"s3:DeleteAccessPoint",
"s3:DeleteAccessPointForObjectLambda",
"s3:GetStorageLensGroup",
"s3:PutLifecycleConfiguration",
"s3:PutObjectTagging",
"s3:DeleteObject",
"s3:PutAccessPointPolicyForObjectLambda",
"s3:GetBucketWebsite",
"s3:DeleteStorageLensConfigurationTagging",
"s3:GetObjectAttributes",
"s3:DeleteObjectVersionTagging",
"s3:InitiateReplication",
"s3:GetObjectLegalHold",
"s3:GetBucketNotification",
"s3:DeleteBucketPolicy",
"s3:GetReplicationConfiguration",
"s3:DescribeMultiRegionAccessPointOperation",
"s3:PutObject",
"s3:PutBucketNotification",
"s3:PutObjectVersionAcl",
"s3:PutBucketObjectLockConfiguration",
"s3:PutAccessPointPolicy",
"s3:GetStorageLensDashboard",
"s3:GetLifecycleConfiguration",
"s3:UntagResource",
"s3:GetBucketTagging",
"s3:GetInventoryConfiguration",
"s3:GetAccessPointPolicyForObjectLambda",
"s3:ReplicateTags",
"s3:ListBucket",
"s3:AbortMultipartUpload",
"s3:PutBucketTagging",
"s3:DeleteBucket",
"s3:PutBucketVersioning",
"s3:ListBucketMultipartUploads",
"s3:PutIntelligentTieringConfiguration",
"s3:PutMetricsConfiguration",
"s3:PutStorageLensConfigurationTagging",
"s3:PutObjectVersionTagging",
"s3:GetBucketVersioning",
"s3:GetAccessPointConfigurationForObjectLambda",
"s3:PutInventoryConfiguration",
"s3:ObjectOwnerOverrideToBucketOwner",
"s3:GetStorageLensConfiguration",
"s3:DeleteStorageLensConfiguration",
"s3:PutBucketWebsite",
"s3:PutBucketRequestPayment",
"s3:PutObjectRetention",
"s3:CreateAccessPointForObjectLambda",
"s3:GetBucketCORS",
"s3:DeleteAccessPointPolicy",
"s3:GetObjectVersion",
"s3:PutAnalyticsConfiguration",
"s3:PutAccessPointConfigurationForObjectLambda",
"s3:GetObjectVersionTagging",
"s3:CreateBucket",
"s3:GetStorageLensConfigurationTagging",
"s3:ReplicateObject",
"s3:GetObjectAcl",
"s3:GetBucketObjectLockConfiguration",
"s3:DeleteBucketWebsite",
"s3:GetIntelligentTieringConfiguration",
"s3:DeleteAccessPointPolicyForObjectLambda",
"s3:GetObjectVersionAcl",
"s3:PutBucketAcl",
"s3:DeleteObjectTagging",
"s3:GetBucketPolicyStatus",
"s3:GetObjectRetention",
"s3:TagResource",
"s3:PutObjectLegalHold",
"s3:PutBucketCORS",
"s3:ListMultipartUploadParts",
"s3:GetObject",
"s3:PutBucketLogging",
"s3:GetAnalyticsConfiguration",
"s3:GetObjectVersionForReplication",
"s3:GetAccessPointForObjectLambda",
"s3:CreateAccessPoint",
"s3:PutAccelerateConfiguration",
"s3:DeleteObjectVersion",
"s3:GetBucketLogging",
"s3:ListBucketVersions",
"s3:RestoreObject",
"s3:GetAccelerateConfiguration",
"s3:GetObjectVersionAttributes",
"s3:GetBucketPolicy",
"s3:ListTagsForResource",
"s3:PutEncryptionConfiguration",
"s3:GetEncryptionConfiguration",
"s3:GetObjectVersionTorrent",
"s3:GetBucketRequestPayment",
"s3:GetAccessPointPolicyStatus",
"s3:DeleteStorageLensGroup",
"s3:GetObjectTagging",
"s3:GetBucketOwnershipControls",
"s3:GetMetricsConfiguration",
"s3:PutObjectAcl",
"s3:GetBucketPublicAccessBlock",
"s3:PutBucketPublicAccessBlock",
"s3:GetAccessPointPolicyStatusForObjectLambda",
"s3:UpdateStorageLensGroup",
"s3:PutBucketOwnershipControls",
"s3:GetBucketAcl",
"s3:BypassGovernanceRetention",
"s3:GetObjectTorrent",
"s3:PutBucketPolicy",
"s3:GetBucketLocation",
"s3:GetAccessPointPolicy",
"s3:ReplicateDelete"
"s3:ListStorageLensConfigurations",
"s3:ListAccessPointsForObjectLambda",
"s3:GetAccessPoint",
"s3:PutAccountPublicAccessBlock",
"s3:GetAccountPublicAccessBlock",
"s3:ListAllMyBuckets",
"s3:ListAccessPoints",
"s3:PutAccessPointPublicAccessBlock",
"s3:CreateStorageLensGroup",
"s3:PutStorageLensConfiguration",
"s3:ListMultiRegionAccessPoints",
"s3:ListStorageLensGroups"
"ecr:CreateRepository",
"ecr:ListTagsForResource",
"ecr:TagResource",
"ecr:UntagResource",
"ecr:SetRepositoryPolicy",
"ecr:PutLifecyclePolicy",
"ecr:PutImageTagMutability",
"ecr:GetRepositoryPolicy",
"ecr:GetLifecyclePolicy",
"ecr:DescribeRepositories",
"ecr:DeleteRepositoryPolicy",
"ecr:DeleteRepository",
"ecr:DeleteLifecyclePolicy"
"eks:DescribeCluster"
"ssm:GetParameter",
"ssm:GetParameters",
"ssm:ListTagsForResource"
"ssm:DescribeParameters"
"kms:Decrypt"