## FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner<br><sub>Official PyTorch Implementation</sub>



This repo contains the PyTorch model definitions and the training/sampling code for our paper.


## Setup

We provide an [`environment.yml`](environment.yml) file that can be used to create a Conda environment. If you only want 
to run pre-trained models locally on CPU, you can remove the `cudatoolkit` and `pytorch-cuda` requirements from the file.

```bash
conda env create -f environment.yml
conda activate FlowTurbo
```
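
After activating the environment, a quick check like the one below can confirm that PyTorch imports and whether a GPU is visible. This is a minimal sketch and not part of the repo; the helper name `describe_torch_env` is hypothetical.

```python
import importlib.util


def describe_torch_env():
    """Report whether PyTorch is importable and whether CUDA is usable."""
    info = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_available": False,
        "gpu_count": 0,
    }
    # find_spec lets us detect a missing PyTorch without raising ImportError.
    if importlib.util.find_spec("torch") is None:
        return info

    import torch

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    info["gpu_count"] = torch.cuda.device_count()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

On a CPU-only install, `cuda_available` is expected to be `False` and `gpu_count` to be `0`.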

## Sampling 

**Pre-trained FlowTurbo checkpoints.** You can sample from our pre-trained FlowTurbo models with [`sample.py`](sample.py). Download the weights for the pre-trained FlowTurbo model before running the script.

```bash
python sample.py
```



## Training FlowTurbo

We provide a training script for FlowTurbo in [`train.py`](train.py). To launch FlowTurbo (256x256) training with `N` GPUs on 
one node:

```bash
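# Replace N with the number of GPUs. CUDA_VISIBLE_DEVICES must expose at least N devices;
# the value '1' below exposes only GPU 1, which matches N=1.
CUDA_VISIBLE_DEVICES='1' torchrun --nnodes=1 --nproc_per_node=N --master_port 12345 train.py \
    --data-path /data/ILSVRC2012/train --global-batch-size 18 \
    --note 'debug' --ckpt-every 5000 --lr 5e-5 --vae_ckpt vae-ema --model_teacher_ckpt /pretrained_models/predictor.ckpt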
```
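
`--global-batch-size` is the batch size summed over all processes. Assuming `train.py` follows the common DDP convention of splitting it evenly across the `N` processes (an assumption; check the script to confirm), the per-GPU batch size can be worked out as sketched below. `per_gpu_batch_size` is a hypothetical helper, not a function from this repo.

```python
def per_gpu_batch_size(global_batch_size: int, world_size: int) -> int:
    """Split a global batch evenly across `world_size` processes.

    Assumes one process per GPU, as launched by torchrun with
    --nproc_per_node=N on a single node.
    """
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    if global_batch_size % world_size != 0:
        raise ValueError(
            f"global batch size {global_batch_size} is not divisible "
            f"by world size {world_size}"
        )
    return global_batch_size // world_size


if __name__ == "__main__":
    # 18 is the global batch size used in the training command.
    for n in (1, 2, 3, 6):
        print(f"N={n}: {per_gpu_batch_size(18, n)} samples per GPU")
```

Under this assumption, a global batch size of 18 works with 1, 2, 3, 6, 9, or 18 GPUs, but not with 4 or 8.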



## Evaluation (FID, Inception Score, etc.)

We include a `sample_ddp_feature.py` script that samples a large number of images from a FlowTurbo model in parallel. The script 
generates a folder of samples as well as a `.npz` file that can be passed directly to the provided `evaluator.py` to compute FID, Inception Score, and other metrics. For example, to sample 50K images from our pre-trained FlowTurbo model over `N` GPUs with the default ODE sampler settings, run:

```bash
torchrun --nnodes=1 --nproc_per_node=N FlowTurbo/sample_ddp_feature.py
```
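
For reference, FID is the Fréchet distance between two Gaussians fitted to Inception features of real and generated images. The sketch below computes that distance from two feature arrays using NumPy only. It illustrates the formula and is not the code in `evaluator.py`; the function name `frechet_distance` and the `(num_samples, feature_dim)` array layout are assumptions.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is an array of shape (num_samples, feature_dim).
    """
    feats_a = np.asarray(feats_a, dtype=np.float64)
    feats_b = np.asarray(feats_b, dtype=np.float64)

    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    # atleast_2d keeps the covariance a matrix when feature_dim == 1.
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr(sqrt(cov_a @ cov_b)) equals the sum of the square roots of the
    # eigenvalues of the product. For positive semi-definite inputs these
    # are real and non-negative; clipping removes tiny numerical negatives.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(500, 4))
    fake = rng.normal(loc=0.5, size=(500, 4))
    print(f"Frechet distance: {frechet_distance(real, fake):.4f}")
```

Identical feature sets give a distance of zero up to floating-point error, and shifting every sample by a constant vector adds the squared length of that vector.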