## Requirements
- This codebase is largely based on [EDM](https://github.com/NVlabs/edm). To install the required packages, please follow the instructions in the [EDM](https://github.com/NVlabs/edm) codebase.
- This codebase supports the pre-trained diffusion models from [EDM](https://github.com/NVlabs/edm), [LDM](https://github.com/CompVis/latent-diffusion) and [Stable Diffusion](https://github.com/runwayml/stable-diffusion). To load pre-trained models from one of these codebases, please follow that codebase's instructions for package installation.

## Training, Sampling and Evaluation
Run the commands below to distill a pre-trained diffusion model with our method under the specified settings. Training can be parallelized across multiple GPUs by adjusting ```--nproc_per_node```. All parameters are described in the Description of Parameters section below. To train a 2-NFE SFD, run the following commands:
```.bash
# SFD on CIFAR10
SOLVER_FLAGS="--sampler_stu=euler --sampler_tea=dpmpp --num_steps=4 --M=3 --afs=True"
SCHEDULE_FLAGS="--schedule_type=polynomial --schedule_rho=7"
ADDITIONAL_FLAGS="--max_order=3 --predict_x0=True --lower_order_final=True"
GUIDANCE_FLAGS=""
torchrun --standalone --nproc_per_node=1 train.py \
--dataset_name="cifar10" \
--model_path="/path/to/your/pre-trained/model" \
--batch=128 \
--lr=5e-5 \
--total_kimg=200 \
--use_step_condition=False \
$SOLVER_FLAGS \
$SCHEDULE_FLAGS \
$ADDITIONAL_FLAGS \
$GUIDANCE_FLAGS
```

```.bash
# SFD on ImageNet64
SOLVER_FLAGS="--sampler_stu=euler --sampler_tea=dpmpp --num_steps=4 --M=3 --afs=True"
SCHEDULE_FLAGS="--schedule_type=polynomial --schedule_rho=7"
ADDITIONAL_FLAGS="--max_order=3 --predict_x0=True --lower_order_final=True"
GUIDANCE_FLAGS=""
torchrun --standalone --nproc_per_node=1 train.py \
--dataset_name="imagenet64" \
--model_path="/path/to/your/pre-trained/model" \
--batch=128 \
--lr=1e-5 \
--total_kimg=200 \
--use_step_condition=False \
$SOLVER_FLAGS \
$SCHEDULE_FLAGS \
$ADDITIONAL_FLAGS \
$GUIDANCE_FLAGS
```

```.bash
# SFD on latent-space LSUN-Bedroom
SOLVER_FLAGS="--sampler_stu=euler --sampler_tea=dpmpp --num_steps=4 --M=3 --afs=True"
SCHEDULE_FLAGS="--schedule_type=discrete --schedule_rho=1"
ADDITIONAL_FLAGS="--max_order=3 --predict_x0=False --lower_order_final=True"
GUIDANCE_FLAGS="--guidance_type=uncond --guidance_rate=1"
torchrun --standalone --nproc_per_node=1 train.py \
--dataset_name="lsun_bedroom_ldm" \
--model_path="/path/to/your/pre-trained/model" \
--batch=128 \
--lr=1e-5 \
--total_kimg=200 \
--use_step_condition=False \
$SOLVER_FLAGS \
$SCHEDULE_FLAGS \
$ADDITIONAL_FLAGS \
$GUIDANCE_FLAGS
```


```.bash
# SFD on Stable Diffusion v1.5
SOLVER_FLAGS="--sampler_stu=euler --sampler_tea=dpmpp --num_steps=3 --M=2 --afs=False"
SCHEDULE_FLAGS="--schedule_type=discrete --schedule_rho=1"
ADDITIONAL_FLAGS="--max_order=2 --predict_x0=False --lower_order_final=True"
GUIDANCE_FLAGS="--guidance_type=cfg --guidance_rate=7.5"
torchrun --standalone --nproc_per_node=4 train.py \
--dataset_name="ms_coco" \
--model_path="/path/to/your/pre-trained/model" \
--batch=16 \
--lr=5e-5 \
--total_kimg=100 \
--use_step_condition=False \
$SOLVER_FLAGS \
$SCHEDULE_FLAGS \
$ADDITIONAL_FLAGS \
$GUIDANCE_FLAGS
```
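
As a rough illustration of what `SCHEDULE_FLAGS` and `--M` control in the commands above, the sketch below builds an EDM-style polynomial schedule for the student and a finer one for the teacher. It assumes that `--schedule_type=polynomial` follows the EDM discretization and that the teacher grid is the student grid with `M` extra timestamps between adjacent steps; `polynomial_schedule` and `teacher_schedule` are hypothetical names, and the schedule code in this repository may differ in detail.

```python
def polynomial_schedule(num_steps, sigma_min=0.006, sigma_max=80.0, rho=7.0):
    """EDM-style polynomial discretization, decreasing from sigma_max to sigma_min.

    Defaults mirror the sigma_min / sigma_max / schedule_rho values in the
    parameter table; this is an illustrative sketch, not the repository code.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    hi = sigma_max ** (1.0 / rho)
    lo = sigma_min ** (1.0 / rho)
    return [(hi + i / (num_steps - 1) * (lo - hi)) ** rho for i in range(num_steps)]


def teacher_schedule(num_steps, M, **kwargs):
    """Assumed teacher grid: M extra timestamps between adjacent student timestamps."""
    return polynomial_schedule((num_steps - 1) * (M + 1) + 1, **kwargs)


student = polynomial_schedule(num_steps=4)       # 4 timestamps, 3 student steps
teacher = teacher_schedule(num_steps=4, M=3)     # 3 * (3 + 1) + 1 = 13 timestamps
print(len(student), len(teacher))
```

Under these assumptions every `(M + 1)`-th teacher timestamp coincides with a student timestamp, so the teacher takes `M + 1` small steps for each student step.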

To train SFD-v, run the following commands:
```.bash
# SFD-v on CIFAR10 for NFE from 2 to 5, scripts on other datasets are similar
SOLVER_FLAGS="--sampler_stu=euler --sampler_tea=dpmpp --M=3 --afs=True"
SCHEDULE_FLAGS="--schedule_type=polynomial --schedule_rho=7"
ADDITIONAL_FLAGS="--max_order=3 --predict_x0=True --lower_order_final=True"
GUIDANCE_FLAGS=""
torchrun --standalone --nproc_per_node=1 train.py \
--dataset_name="cifar10" \
--model_path="/path/to/your/pre-trained/model" \
--batch=128 \
--lr=5e-5 \
--total_kimg=800 \
--use_step_condition=True \
$SOLVER_FLAGS \
$SCHEDULE_FLAGS \
$ADDITIONAL_FLAGS \
$GUIDANCE_FLAGS
```
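
The relation between `--num_steps`, `--afs` and the NFE counts quoted above can be sketched as follows. This is inferred from the parameter descriptions (`num_steps` counts timestamps, and AFS saves the first model evaluation) and assumes one model evaluation per Euler step; `student_nfe` and `num_steps_for_nfe` are hypothetical helpers, not part of this codebase.

```python
def student_nfe(num_steps: int, afs: bool) -> int:
    """Model evaluations used by the Euler student (assumed accounting)."""
    steps = num_steps - 1              # num_steps timestamps give num_steps - 1 steps
    return steps - 1 if afs else steps  # AFS skips the first model evaluation


def num_steps_for_nfe(nfe: int, afs: bool) -> int:
    """Inverse mapping: the --num_steps value that yields a given NFE."""
    return nfe + 1 + (1 if afs else 0)


# Settings from the training commands above
print(student_nfe(num_steps=4, afs=True))    # CIFAR10 / ImageNet64 / LSUN-Bedroom
print(student_nfe(num_steps=3, afs=False))   # Stable Diffusion v1.5
print([num_steps_for_nfe(n, afs=True) for n in range(2, 6)])
```

Both SFD configurations come out at 2 NFE under this accounting, and an SFD-v model trained with `--afs=True` for NFE 2 to 5 would be sampled with `--num_steps` between 4 and 7.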


After training finishes, the distilled model is saved under "./exps" with a five-digit experiment number (e.g. 00001) by default. The training settings needed for sampling are stored with the distilled model. To sample with SFD/SFD-v, pass the path or the experiment number (e.g. 1) of the distilled model to ```--model_path```:
```.bash
# SFD/SFD-v sampling
torchrun --standalone --nproc_per_node=1 sample.py \
--model_path="/path/or/exp/number/of/SFD" \
--dataset_name="name of the dataset" \
--batch=128 \
--seeds="0-49999"       # 0-4999 for Stable Diffusion
# --num_steps={step}    # specify this for SFD-v
```
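
The two shorthand inputs used above, a range string for `--seeds` and an experiment number for `--model_path`, can be pictured with the sketch below. Both helpers are hypothetical and only illustrate the conventions described in this README (inclusive seed ranges, zero-padded five-digit experiment numbers); the parsing actually done in `sample.py` may differ.

```python
import re


def parse_seeds(spec: str) -> list:
    """Expand a spec such as '0-49999' or '1,3,5-7' into a list of seeds."""
    seeds = []
    for part in spec.split(","):
        part = part.strip()
        match = re.fullmatch(r"(\d+)-(\d+)", part)
        if match:
            # Ranges are treated as inclusive on both ends.
            seeds.extend(range(int(match.group(1)), int(match.group(2)) + 1))
        else:
            seeds.append(int(part))
    return seeds


def find_experiment(dir_names, exp_number: int) -> list:
    """Return the names that start with the zero-padded five-digit experiment number."""
    prefix = f"{exp_number:05d}"
    return [name for name in dir_names if name.startswith(prefix)]


print(len(parse_seeds("0-49999")))   # one seed per generated image
print(find_experiment(["00001-example", "00010-example"], 1))
```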

To compute the Fréchet Inception Distance (FID) for a distilled model, first generate 50,000 random images (5,000 for Stable Diffusion), then compare them against the dataset reference statistics:
```.bash
# FID evaluation
python fid.py calc --images=path/to/images --ref=path/to/fid/stat
```
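
For reference, the comparison against the reference statistics boils down to the Fréchet distance between two Gaussians fitted to Inception features. The sketch below shows that formula with NumPy only; `frechet_distance` is a hypothetical name, the inputs are assumed to be feature means and covariances that have already been computed, and `fid.py` may use a different numerical routine for the matrix square root.

```python
import numpy as np


def frechet_distance(mu, sigma, mu_ref, sigma_ref):
    """||mu - mu_ref||^2 + Tr(sigma + sigma_ref - 2 * sqrt(sigma @ sigma_ref))."""
    mu = np.asarray(mu, dtype=np.float64)
    mu_ref = np.asarray(mu_ref, dtype=np.float64)
    sigma = np.asarray(sigma, dtype=np.float64)
    sigma_ref = np.asarray(sigma_ref, dtype=np.float64)

    mean_term = np.sum((mu - mu_ref) ** 2)
    # The trace of the matrix square root equals the sum of the square roots of
    # the eigenvalues of sigma @ sigma_ref, which are non-negative for
    # covariance matrices; tiny negative or imaginary parts are numerical noise.
    eigvals = np.linalg.eigvals(sigma @ sigma_ref)
    trace_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    return float(mean_term + np.trace(sigma) + np.trace(sigma_ref) - 2.0 * trace_sqrt)


# Toy check: identical covariances, means shifted by (3, 4)
print(frechet_distance([0.0, 0.0], np.eye(2), [3.0, 4.0], np.eye(2)))
```

With identical covariances the trace term vanishes, so the toy example reduces to the squared distance between the two means.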
For Stable Diffusion, we use the CLIP-ViT-g-14 model pre-trained on LAION-2B to evaluate the CLIP Score:
```.bash
# CLIP Score evaluation
python clip-score.py calc --images=path/to/images
```

## Description of Parameters
| Name | Parameter | Default | Description |
|------|-----------|---------|-------------|
|General options|dataset_name|None|One in ['cifar10', 'imagenet64', 'lsun_bedroom_ldm', 'ms_coco']|
|               |model_path|None|Path to the pre-trained diffusion models or distilled SFD|
|               |batch|128|Total batch size|
|               |seeds|0-49999|Specify a different random seed for each image|
|               |total_kimg|200|Total training images (k)|
|               |use_step_condition|False|Use step condition for SFD-v training|
|SOLVER_FLAGS|sampler_stu|'euler'|Student solver|
|            |sampler_tea|'dpmpp'|Teacher solver. One in ['heun', 'dpm', 'dpmpp']|
|            |num_steps|4|Number of timestamps for the student solver|
|            |M (=K-1)|3|How many intermediate timestamps to insert between two adjacent steps for the teacher solver. We use K in our paper where M=K-1|
|            |afs|True|Whether to use AFS which saves the first model evaluation|
|SCHEDULE_FLAGS|sigma_min|0.006|Lowest noise level. Specified when loading the pre-trained models|
|              |sigma_max|80.|Highest noise level. Specified when loading the pre-trained models|
|              |schedule_type|'polynomial'|Time discretization schedule. One in ['polynomial', 'logsnr', 'time_uniform', 'discrete']|
|              |schedule_rho|7|Time step exponent. Must be specified when schedule_type is in ['polynomial', 'time_uniform', 'discrete']|
|ADDITIONAL_FLAGS|max_order|None|Option for DPM-Solver++. 1<=max_order<=3. 2 is recommended for Stable Diffusion and 3 otherwise|
|                |predict_x0|True|Option for DPM-Solver++. Whether to use the data prediction formulation. False is recommended for LDM models and True otherwise|
|                |lower_order_final|True|Option for DPM-Solver++. Whether to lower the order at the final stages of sampling.|
|GUIDANCE_FLAGS|guidance_type|None|One in ['cg', 'cfg', 'uncond', None]. 'cg' for classifier-guidance, 'cfg' for classifier-free-guidance used in Stable Diffusion, and 'uncond' for unconditional used in LDM|
|              |guidance_rate|None|Guidance scale|
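
To make `guidance_rate` more concrete for `guidance_type='cfg'`, the sketch below shows the standard classifier-free guidance combination of an unconditional and a conditional noise prediction. `cfg_combine` is a hypothetical helper operating on plain lists; whether this repository applies the guidance scale in exactly this form is an assumption.

```python
def cfg_combine(eps_uncond, eps_cond, guidance_rate):
    """Standard classifier-free guidance: uncond + rate * (cond - uncond), element-wise."""
    return [u + guidance_rate * (c - u) for u, c in zip(eps_uncond, eps_cond)]


eps_uncond = [0.0, 1.0]   # toy unconditional prediction
eps_cond = [1.0, 3.0]     # toy conditional prediction
print(cfg_combine(eps_uncond, eps_cond, guidance_rate=7.5))
print(cfg_combine(eps_uncond, eps_cond, guidance_rate=1.0))
```

In this formulation a rate of 1 returns the conditional prediction unchanged, and larger values push the output further away from the unconditional prediction.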


## Pre-trained Diffusion Models
We perform sampling on a variety of pre-trained diffusion models from different codebases, including
[EDM](https://github.com/NVlabs/edm), [LDM](https://github.com/CompVis/latent-diffusion) and [Stable Diffusion](https://github.com/runwayml/stable-diffusion). The tested pre-trained models are listed below:

| Codebase | Dataset | Resolution | Pre-trained Models | Description |
|----------|---------|------------|--------------------|-------------|
|EDM|CIFAR10|32|[edm-cifar10-32x32-uncond-vp.pkl](https://nvlabs-fi-cdn.nvidia.com/edm/pretrained/edm-cifar10-32x32-uncond-vp.pkl)| |
|EDM|ImageNet|64|[edm-imagenet-64x64-cond-adm.pkl](https://nvlabs-fi-cdn.nvidia.com/edm/pretrained/edm-imagenet-64x64-cond-adm.pkl)| |
|LDM|LSUN_bedroom|256|[lsun_bedroom.pt](https://openaipublic.blob.core.windows.net/diffusion/jul-2021/lsun_bedroom.pt) and [vq-f4 model](https://ommer-lab.com/files/latent-diffusion/vq-f4.zip)|Latent-space|
|Stable Diffusion|MS-COCO|512|[stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt)|Classifier-free-guidance|

Place the downloaded vq-f4 model at `./models/ldm_models/first_stage_models/vq-f4/model.ckpt`.

## Reference Statistics
To evaluate FID and CLIP Score, we use the FID statistics and MS-COCO prompts provided by the [AMED-Solver](https://github.com/zju-pi/diff-sampler) repository.
