# Dependencies
- PyTorch (version 1.7.0 with CUDA 11.0 was used)
- [PyHessian](https://github.com/amirgholami/PyHessian)

# Note
All code assumes that a GPU is available.
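
Since the scripts expect a GPU, it can help to confirm that PyTorch sees a CUDA device before launching a long run. The following is a minimal sketch and not part of this repository; the helper name `describe_gpu_environment` is hypothetical, and it only uses standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.device_count()`).

```python
import importlib.util


def describe_gpu_environment():
    """Return a small dict describing whether PyTorch can see a CUDA device.

    Hypothetical helper for illustration; it is not provided by this repository.
    """
    info = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_build": None,
        "cuda_available": False,
        "device_count": 0,
    }
    # Avoid an ImportError if PyTorch is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return info

    import torch

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    info["cuda_build"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    info["device_count"] = torch.cuda.device_count()
    return info


print(describe_gpu_environment())
```

If `cuda_available` is `False`, the example commands in this README will most likely fail, because they pass `--gpu 0`.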

# For reparametrization invariance experiments (Section 3.2):
See `reparametrization_invariance_2layerNNs.ipynb`.

# For sharpness measures vs. the generalization gap experiments (Section 3.3):

### To train the models to investigate the correlation between sharpness measures and the generalization gap:
Run `main_cor.py` (please refer to the following example commands).

(Note: the required data will be downloaded and extracted automatically when running the commands below.)

- CIFAR-10 experiments: 
```
python3 main_cor.py --gpu 0 --no-label-smoothing --batch-size 64 --lr 0.05 --seeds 1
```
- MNIST experiments: 
```
python3 main_cor.py --gpu 0 --dataset 'mnist' --no-label-smoothing --momentum 0.0 --weight-decay 0.0 --batch-size 64 --lr 0.05 --stop_loss 0.01 --seeds 1
```

### For calculating IGS or m-IGS and the trace of the Hessian:
See `calculate_m_IGS_trH_example_CIFAR10.ipynb` or `calculate_m_IGS_trH_example_MNIST.ipynb`.
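
For background, PyHessian (listed under the dependencies) estimates the trace of the Hessian with Hutchinson's method, which only needs Hessian-vector products. The sketch below illustrates that idea on a small explicit matrix in pure Python. It is not the code from the notebooks, whose exact procedure may differ; the names `hutchinson_trace` and `toy_matvec` and the matrix `H` are made up for illustration.

```python
import random


def hutchinson_trace(matvec, dim, num_samples=100, seed=0):
    """Estimate trace(H) using only products H @ v (Hutchinson's method)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        # Rademacher probe vector: each entry is +1 or -1 with equal probability.
        v = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hv = matvec(v)
        # v^T H v is an unbiased estimate of trace(H).
        total += sum(v_i * hv_i for v_i, hv_i in zip(v, hv))
    return total / num_samples


# Toy symmetric matrix standing in for a Hessian (illustrative values only).
H = [
    [2.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 3.0],
]


def toy_matvec(v):
    """Multiply the toy matrix H by the vector v."""
    return [sum(h_ij * v_j for h_ij, v_j in zip(row, v)) for row in H]


estimate = hutchinson_trace(toy_matvec, dim=3, num_samples=2000)
exact = sum(H[i][i] for i in range(3))
print(f"estimate={estimate:.3f}, exact={exact:.3f}")
```

For a neural network, `matvec` would be replaced by a Hessian-vector product computed with automatic differentiation, so the full Hessian is never formed.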

### For calculating other sharpness measures (the Fisher-Rao norm, Rangamani et al.'s, and Petzka et al.'s measures):
See `calculate_other_measures_example_CIFAR10.ipynb` or `calculate_other_measures_example_MNIST.ipynb`.

# For toy examples (Section 4.1):
See `Toy_example_SGD.ipynb` and `Toy_example_IGSreg.ipynb`.

# For the regularization experiments using MNIST and CIFAR-10/100 (Section 4.2):
Run `main.py` (please refer to the following example commands).

(Note: the required data will be downloaded and extracted automatically when running the commands below.)

- SGD: 
```
python3 main.py --gpu 0 --dataset 'cifar100' --method 'SGD' --model 'resnet20' --epochs 200 --lr 0.1 --no-parallel --name cifar100_SGD_res20 --T_0 200 --lr-scheduler cosine_warmup --no-grad-normalize --seeds 1
```
- GR: 
```
python3 main.py --gpu 0 --dataset 'cifar100' --method 'GR' --model 'resnet20' --epochs 200 --lr 0.1 --no-parallel --name cifar100_GR_res20_rho0.02 --rho 0.02 --reg_start_epoch 10 --T_0 200 --lr-scheduler cosine_warmup --no-grad-normalize --seeds 1
```
- SAM: 
```
python3 main.py --gpu 0 --dataset 'cifar100' --method 'SAM' --model 'resnet20' --epochs 200 --lr 0.1 --no-parallel --name cifar100_SAM_res20_rho0.2 --rho 0.2 --reg_start_epoch 10 --T_0 200 --lr-scheduler cosine_warmup --no-grad-normalize --seeds 1
```
- ASAM: 
```
python3 main.py --gpu 0 --dataset 'cifar100' --method 'ASAM' --model 'resnet20' --epochs 200 --lr 0.1 --no-parallel --name cifar100_ASAM_res20_rho0.2 --rho 0.2 --reg_start_epoch 10 --T_0 200 --lr-scheduler cosine_warmup --no-grad-normalize --seeds 1
```
- Our method (IGSreg):
```
python3 main.py --gpu 0 --dataset 'cifar100' --method 'IGSreg' --model 'resnet20' --epochs 200 --lr 0.1 --no-parallel --name cifar100_IGS_res20_rho0.005_eta0.01_freq500 --rho 0.005 --eta 0.01 --update_freq 500 --save_freq 2 --reg_start_epoch 10 --T_0 200 --lr-scheduler cosine_warmup --no-grad-normalize --seeds 1
```
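
The five commands above differ only in `--method`, `--name`, and a few method-specific flags. As a convenience, the sketch below rebuilds them from one shared flag list so they can be printed or launched in a loop. It is not part of this repository: `COMMON`, `METHODS`, and `build_command` are hypothetical names, all flag values are copied from the commands above, and it assumes `main.py` accepts its flags in any order.

```python
import shlex

# Flags shared by all five example commands in this section.
COMMON = [
    "--gpu", "0", "--dataset", "cifar100", "--model", "resnet20",
    "--epochs", "200", "--lr", "0.1", "--no-parallel",
    "--T_0", "200", "--lr-scheduler", "cosine_warmup",
    "--no-grad-normalize", "--seeds", "1",
]

# Run name and method-specific flags, copied from the example commands.
METHODS = {
    "SGD": ("cifar100_SGD_res20", []),
    "GR": ("cifar100_GR_res20_rho0.02",
           ["--rho", "0.02", "--reg_start_epoch", "10"]),
    "SAM": ("cifar100_SAM_res20_rho0.2",
            ["--rho", "0.2", "--reg_start_epoch", "10"]),
    "ASAM": ("cifar100_ASAM_res20_rho0.2",
             ["--rho", "0.2", "--reg_start_epoch", "10"]),
    "IGSreg": ("cifar100_IGS_res20_rho0.005_eta0.01_freq500",
               ["--rho", "0.005", "--eta", "0.01", "--update_freq", "500",
                "--save_freq", "2", "--reg_start_epoch", "10"]),
}


def build_command(method):
    """Return the argument list for one method (flag order differs from the README)."""
    name, extra = METHODS[method]
    return ["python3", "main.py", "--method", method, "--name", name, *extra, *COMMON]


# Dry run: print each command instead of executing it.
for method in METHODS:
    print(shlex.join(build_command(method)))
```

To actually launch a run, the list returned by `build_command` can be passed to `subprocess.run(..., check=True)` from the repository root.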
