Triggering an Experiment
Requirements
Two files must be present in the repository before an experiment can be submitted:
- Dockerfile: defines the image that will be built and run. Its path is specified in the manifest.
- manifest.yaml: the AIchor manifest, which must be located in the aichor_manifests/ directory at the repository root. It specifies which Dockerfile to use for the image build and configures the execution runtime: the operator (KubeRay, JobSet, etc.), resource requests, and the command to run. See the Manifest Reference for the full set of fields.
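As a quick pre-flight check, the manifests available in a repository can be listed with a few lines of Python. This is a minimal sketch and not part of the AIchor CLI: the helper name list_manifests is hypothetical, and it assumes that manifests use the .yaml extension. It does not check the Dockerfile, because that path is declared inside the manifest itself.

```python
from pathlib import Path


def list_manifests(repo_root: str) -> list[str]:
    """List manifest paths relative to aichor_manifests/ (hypothetical helper).

    Assumption: every manifest is a *.yaml file somewhere under
    <repo_root>/aichor_manifests/, subfolders included.
    """
    manifests_dir = Path(repo_root) / "aichor_manifests"
    if not manifests_dir.is_dir():
        # No manifests directory at all: nothing can be submitted yet.
        return []
    return sorted(
        path.relative_to(manifests_dir).as_posix()
        for path in manifests_dir.rglob("*.yaml")
    )


if __name__ == "__main__":
    found = list_manifests(".")
    if not found:
        print("No manifest found under aichor_manifests/")
    for manifest in found:
        print(manifest)
```

An empty result means the repository is missing the aichor_manifests/ directory or contains no manifest in it.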
Preparing a repository from scratch
aichor local-repo init generates a complete, ready-to-submit project structure including a Dockerfile, manifest.yaml, pyproject.toml, and a minimal source layout:
aichor local-repo init
The generated structure:
<project_name>/
├── Dockerfile
├── README.md
├── pyproject.toml
├── aichor_manifests/
│   └── manifest.yaml
└── src/
    └── <project_name>/
        ├── __init__.py
        └── main.py
The CLI command runs interactively, prompting for the operator (execution runtime), resource profile, and other settings. The generated manifest and Dockerfile are pre-configured based on those answers. The Dockerfile uses uv for fast, reproducible image builds.
See aichor local-repo init in the CLI reference for details.
Alternatively, the AIchor team maintains a repository of ready-to-use demos at https://github.com/instadeepai/aichor-demo.
Triggering an experiment
Experiments can be triggered in two ways.
Via a Git commit (webhook)
When a project is added to AIchor, a webhook is created on the repository. Once you push a commit, the webhook forwards the commit metadata to AIchor, which then parses the commit message.
The commit message must adhere to the following structure:
aichor[<manifest-path>]: <commit-message>
<manifest-path> is the path to the manifest relative to the aichor_manifests/ folder at the root of the repository. Subfolders are supported, so multiple manifests can be organised within the directory. When such a commit is pushed, the project webhook fires and AIchor automatically starts processing the experiment using the referenced manifest.
# Manifest at aichor_manifests/manifest.yaml
git commit -m "aichor[manifest.yaml]: train ResNet on ImageNet"
git push
# Manifest in a subfolder, e.g. aichor_manifests/resnet/manifest.yaml
git commit -m "aichor[resnet/manifest.yaml]: train ResNet on ImageNet"
git push
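To catch a malformed trigger message before pushing, the structure above can be checked locally. The following is a minimal sketch, not AIchor's actual parser: the regular expression is derived only from the documented structure, the function names are hypothetical, and it assumes that only the first line of the commit message is inspected.

```python
from __future__ import annotations

import re
from pathlib import PurePosixPath

# Assumed pattern for "aichor[<manifest-path>]: <commit-message>".
# AIchor's real parser may be stricter or more lenient than this.
TRIGGER_PATTERN = re.compile(r"^aichor\[(?P<manifest>[^\]]+)\]:\s*(?P<message>.+)$")


def parse_trigger(commit_message: str) -> tuple[str, str] | None:
    """Return (manifest_path, message) if the first line matches, else None."""
    lines = commit_message.strip().splitlines()
    if not lines:
        return None
    match = TRIGGER_PATTERN.match(lines[0].strip())
    if match is None:
        return None
    return match.group("manifest"), match.group("message").strip()


def manifest_location(manifest_path: str) -> str:
    """Resolve a manifest path to its location relative to the repository root."""
    return str(PurePosixPath("aichor_manifests") / manifest_path)


if __name__ == "__main__":
    parsed = parse_trigger("aichor[resnet/manifest.yaml]: train ResNet on ImageNet")
    if parsed is not None:
        manifest, message = parsed
        print(manifest_location(manifest), "|", message)
```

A check like this could run from a local Git hook, so that a message missing the aichor[...] prefix is noticed before the push rather than after an experiment fails to start.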
If your VCS provider (such as GitHub) is experiencing degraded service, your experiment may not trigger. In that case, you can use the AIchor CLI to trigger experiments instead.
Via the AIchor CLI
Experiments can also be submitted without a Git push using the AIchor CLI:
- The local submit option packages a local directory and sends it to AIchor directly.
- The commit-sha option triggers an experiment from a commit already present in the repository.
Both submit commands return a JSON object containing the new experiment_id, which can be captured in a script using a tool such as jq.
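When the calling script is written in Python rather than shell, the same value can be read with the standard json module. This is a minimal sketch under two assumptions: experiment_id is a top-level key of the returned object, and the sample payload below is hypothetical (the real object may carry additional fields). The function name extract_experiment_id is illustrative only.

```python
import json


def extract_experiment_id(cli_output: str) -> str:
    """Read experiment_id from the JSON printed by a submit command.

    Assumption: experiment_id is a top-level key in the JSON object.
    Raises KeyError if the key is absent and json.JSONDecodeError if
    the output is not valid JSON.
    """
    payload = json.loads(cli_output)
    return str(payload["experiment_id"])


if __name__ == "__main__":
    # Hypothetical output, used here in place of a real CLI call.
    sample_output = '{"experiment_id": "example-id-123"}'
    print(extract_experiment_id(sample_output))
```

In a real script, cli_output would be the standard output captured from the submit command, for example through subprocess.run with capture_output=True.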
See the CLI Reference for the aichor experiments submit local and aichor experiments submit commit-sha commands.
Resubmitting
Resubmitting re-executes the code associated with an experiment, which is useful for retrying a failed or cancelled experiment.
This can be done in the UI or through the AIchor CLI (see aichor experiments resubmit).