A quick hack to run Stable Diffusion on an Azure GPU Spot Instance.
This is an Azure Resource Manager template that automatically deploys a GPU-enabled spot instance running Ubuntu 20.04.
The template defaults to deploying NV6 Series VMs (Standard_NV6, Standard_NV6_Promo or, if you can get them, Standard_NV6ads_A10_v5) with the smallest possible managed SSD disk size (P4, 32GB). It also deploys (and mounts) an Azure File Share on the machine with (very) permissive access at /srv, which makes it quite easy to keep copies of your work between VM instantiations.
You will need to set a HUGGINGFACE_TOKEN environment variable when running the Makefile, and the machine will reboot after installing almost everything (it will automatically install GFPGAN and other auxiliary models when you run webui.sh --listen the first time).
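A minimal sketch of guarding a deployment on that variable. The helper name `require_token` and the token value are hypothetical, and which `make` target actually reads the variable is not stated here, so `make params` below is only an assumption:

```shell
# Hypothetical helper (not part of the repository): fail early if
# HUGGINGFACE_TOKEN is unset or empty, so make never runs without it.
require_token() {
    if [ -z "${HUGGINGFACE_TOKEN:-}" ]; then
        echo "HUGGINGFACE_TOKEN is not set" >&2
        return 1
    fi
}

# Typical use (placeholder token; substitute your own):
#   export HUGGINGFACE_TOKEN="hf_xxx"
#   require_token && make params   # assumed target; use whichever you run next
```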
I was getting a little bored with the notebook workflow in Google Colab and wanted access to a more persistent GPU setup without breaking the bank (hence spot instances, which I can run on demand in my personal subscription).
- Automatically set up Tailscale with --authkey to remove need for Gradio
- Built-in auto-shutdown (easy to set via the portal, but I will be adding it to the template)
- Experimental imaginAIry installation (just use experimental.yaml instead of cloud-init.yaml)
- Set up AUTOMATIC1111's pretty amazing Web UI
- change instance type to Spot for lower cost (also, removed availability set and changed SKU to be non-_Promo)
- Install NVIDIA drivers and CUDA toolkit
- remove unused packages from cloud-config
- remove unnecessary commands from Makefile
- remove unnecessary files from repo and trim history
- fork from azure-k3s-cluster, new README
Go to the Azure Resource Graph Explorer and enter this query to find the cheapest SKU/location combo for spot instances:
SpotResources
| where type =~ 'microsoft.compute/skuspotpricehistory/ostype/location'
| where sku.name in~ ('Standard_NV6','Standard_NV6ads_A10_v5')
| where properties.osType =~ 'linux'
| where location in~ ('westeurope','northeurope','eastus','eastus2')
| project skuName = tostring(sku.name), osType = tostring(properties.osType), location, latestSpotPriceUSD = todouble(properties.spotPrices[0].priceUSD)
| order by latestSpotPriceUSD asc
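If you prefer the command line to the portal, the same query can probably be run with the Azure CLI. This is a sketch, assuming the `resource-graph` CLI extension is installed (`az extension add --name resource-graph`) and that you are already logged in; the variable and function names are illustrative only:

```shell
# Same query as above, kept in a variable so it can be reused.
SPOT_QUERY="SpotResources
| where type =~ 'microsoft.compute/skuspotpricehistory/ostype/location'
| where sku.name in~ ('Standard_NV6','Standard_NV6ads_A10_v5')
| where properties.osType =~ 'linux'
| where location in~ ('westeurope','northeurope','eastus','eastus2')
| project skuName = tostring(sku.name), osType = tostring(properties.osType), location, latestSpotPriceUSD = todouble(properties.spotPrices[0].priceUSD)
| order by latestSpotPriceUSD asc"

# Wrapped in a function so nothing contacts Azure until you call it.
# --query data selects the result rows from the response for table output.
cheapest_spot() {
    az graph query -q "$SPOT_QUERY" --query data --output table
}
```

Calling `cheapest_spot` after `az login` should list the SKU/location pairs, cheapest first.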
- make keys - generates an SSH key for provisioning
- make deploy-storage - deploys shared storage
- make params - generates ARM template parameters
- make deploy-compute - deploys VM
- make view-deployment - view deployment status
- make watch-deployment - watch deployment progress
- make ssh - opens an SSH session to master0 and sets up TCP forwarding to localhost
- make tail-cloud-init - opens an SSH session and tails the cloud-init log
- make list-endpoints - list DNS aliases
- make destroy-environment - destroys the entire environment (should not be the default)
- make destroy-compute - destroys only the compute resources (should be the default if you want to save costs)
- make destroy-storage - destroys the storage (should be avoided)
az login
make keys
make deploy-storage
make params
make deploy-compute
make view-deployment
# Go to the Azure portal and check the deployment progress
# Clean up after we're done working for the day, to save costs (preserves storage)
make destroy-compute
# Clean up the whole thing (destroys storage as well)
make destroy-environment
Azure Cloud Shell (which includes all the below in bash mode) or:
- Python 3
- The Azure CLI (pip install -U -r requirements.txt will install it)
- GNU make (you can just read through the Makefile and type the commands yourself)
Although it is possible to run SKUs like Standard_NV6ads_A10_v5 as spot instances, this should be considered experimental.
Pro Tip: You can set STORAGE_ACCOUNT_GROUP and STORAGE_ACCOUNT_NAME inside an .env file if you want to use a pre-existing storage account. As long as you use make to do everything, the value will be automatically overridden.
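For illustration, such a file might look like the sketch below. Both values are hypothetical placeholders, and the plain KEY=value form is an assumption about how the Makefile reads the file:

```shell
# .env (hypothetical values; replace with your own resource group and account)
STORAGE_ACCOUNT_GROUP=my-existing-rg
STORAGE_ACCOUNT_NAME=mystorageaccount
```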
Keep in mind that this is not meant to be used as a production service.