# Code to run the AOT algorithm using the HuggingFace Alignment Handbook and TRL library

- List of files changed in the Alignment Handbook to enable AOT:
  - scripts/run_dpo.py
  - src/alignment/configs.py
  
- List of files changed in TRL to enable AOT:
  - trl/trainer/dpo_trainer.py

- To enable soft sorting in AOT, install softsort: `pip install softsort`

- Script to run AOT paired:
```shell
cd alignment-handbook/scripts
# Trailing backslashes join the lines into a single command
ACCELERATE_LOG_LEVEL=info accelerate launch \
    --config_file ../recipes/accelerate_configs/deepspeed_zero3.yaml \
    --num_processes=8 \
    run_dpo.py ../recipes/zephyr-7b-beta/dpo/config_qlora.yaml \
    --load_in_4bit=False \
    --push_to_hub=False \
    --per_device_train_batch_size=35 \
    --output_dir="../data/test" \
    --model_name_or_path="alignment-handbook/zephyr-7b-sft-full" \
    --dpo="aot" \
    --sort_type="soft" \
    --data_type="paired" \
    --loss_type="sigmoid"
```

- Script to run AOT unpaired (this still uses the paired UltraFeedback binarized dataset):
```shell
cd alignment-handbook/scripts
# Trailing backslashes join the lines into a single command
ACCELERATE_LOG_LEVEL=info accelerate launch \
    --config_file ../recipes/accelerate_configs/deepspeed_zero3.yaml \
    --num_processes=8 \
    run_dpo.py ../recipes/zephyr-7b-beta/dpo/config_qlora.yaml \
    --load_in_4bit=False \
    --push_to_hub=False \
    --per_device_train_batch_size=35 \
    --output_dir="../data/test" \
    --model_name_or_path="alignment-handbook/zephyr-7b-sft-full" \
    --dpo="aot" \
    --sort_type="soft" \
    --data_type="unpaired" \
    --loss_type="sigmoid"
```
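
- Optional wrapper for both runs. The two commands above differ only in `--data_type`, so they can share one helper. This is a minimal sketch and not part of the Handbook or TRL: the `run_aot` function, the `DRY_RUN` variable, and the per-run output directories (`../data/test_paired`, `../data/test_unpaired`) are hypothetical names introduced here for illustration. By default it only prints the commands, so they can be checked before any training starts:
```shell
#!/bin/sh
# Sketch only: run_aot, DRY_RUN and the per-run output directories are
# hypothetical helpers, not part of the Alignment Handbook or TRL.
# Intended to be run from alignment-handbook/scripts, like the commands above.
set -eu

run_aot() {
    data_type="$1"   # "paired" or "unpaired"

    # Same arguments as the commands above; only --data_type and the
    # (assumed) output directory change between runs.
    set -- accelerate launch \
        --config_file ../recipes/accelerate_configs/deepspeed_zero3.yaml \
        --num_processes=8 \
        run_dpo.py ../recipes/zephyr-7b-beta/dpo/config_qlora.yaml \
        --load_in_4bit=False \
        --push_to_hub=False \
        --per_device_train_batch_size=35 \
        --output_dir="../data/test_${data_type}" \
        --model_name_or_path="alignment-handbook/zephyr-7b-sft-full" \
        --dpo="aot" \
        --sort_type="soft" \
        --data_type="${data_type}" \
        --loss_type="sigmoid"

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: print the command instead of launching training.
        echo "$*"
    else
        ACCELERATE_LOG_LEVEL=info "$@"
    fi
}

for dt in paired unpaired; do
    run_aot "$dt"
done
```
  Setting `DRY_RUN=0` in the environment launches the two training runs one after the other instead of printing them.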