# IR-Benchmark

This repository contains the code for NeurIPS 2024 submission ID 11798.

## Installation

We provide an `environment.yml` file that creates a conda environment with all the dependencies. To create the environment, run:

```bash
conda env create -f environment.yml
```

This will create a conda environment named `itemrec`. To activate the environment, run:

```bash
conda activate itemrec
```

**Note:** The installation above may not work on all machines. If it fails, install the dependencies manually. The required packages are:
```
torch==2.2.1
tqdm==4.66.2
nni==2.10.1
numpy==1.26.4
pandas==2.2.1
scipy==1.12.0
cppimport==22.8.2
scikit-learn==1.4.1.post1
matplotlib==3.8.4
```
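
After a manual install, a quick check like the one below can confirm that the installed versions match the pins above. This is a minimal sketch and not part of the repository: `PINNED` and `check_versions` are hypothetical names, and the distribution names are assumed to be exactly those in the list above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pinned versions copied from the dependency list above.
PINNED = {
    "torch": "2.2.1",
    "tqdm": "4.66.2",
    "nni": "2.10.1",
    "numpy": "1.26.4",
    "pandas": "2.2.1",
    "scipy": "1.12.0",
    "cppimport": "22.8.2",
    "scikit-learn": "1.4.1.post1",
    "matplotlib": "3.8.4",
}


def check_versions(pinned):
    """Return {package: (expected, found)} for each mismatch.

    `found` is None when the package is not installed.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            # Drop a local build suffix such as "+cu121" before comparing.
            found = version(name).split("+")[0]
        except PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty output means every pinned package is installed at the expected version.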


## Code Structure

The code is organized as follows:

```
PSL-NIPS-code
│   README.md                           # This file
│   environment.yml                     # Conda environment file
│   run_nni.py                          # NNI hyperparameter tuning script
│   data                                # IID datasets
│   │   gowalla                         # Gowalla IID dataset
│   data_ood                            # OOD datasets
│   │   yelp2018-popularity             # Yelp OOD dataset
│   itemrec                             # Main package
│   │   __main__.py                     # Main script to run
│   │   cli.py                          # CLI
│   │   hyper.py                        # NNI hyperparameter tuning
│   │   args.py                         # Argument parsing
│   │   ... (other modules)             # Other modules
```

## CLI

IR-Benchmark provides a CLI to run the experiments. To see the available commands, run:

```bash
python -m itemrec --help
```

In general, the CLI has the following structure:

```bash
python -u -m itemrec [-h] [-v] --log LOG --save_dir SAVE_DIR --seed SEED 
model [--model_args ...] dataset [--dataset_args ...] optim [--optim_args ...]
```

where `model`, `dataset`, and `optim` are the subcommands to specify the model, dataset, and optimization algorithm, respectively. Each subcommand has its own set of arguments. Please see the help message or `itemrec/args.py` for more information.

For example, to run `PSL-relu` with the `MF` backbone on the IID `gowalla` dataset, you may run:

```bash
python -u -m itemrec \
    --log=/path/to/your/logs/gowalla/MF/PSL-relu/ir.log \
    --save_dir=/path/to/your/logs/gowalla/MF/PSL-relu \
    --seed=2024 \
    model --emb_size=64 --norm --num_epochs=200 MF \
    dataset --data_path=/path/to/your/data/gowalla/proc --batch_size=1024 --num_workers=16 \
    optim --lr=0.1 --weight_decay=0.0 PSL --neg_num=1000 --tau=2.0 --tau_star=0.05 --method=1 --activation=relu
```

You should replace the paths with the actual paths on your machine.
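
When launching several runs, it can be convenient to assemble this command in Python rather than by hand. The sketch below only rebuilds the example command above from its parts. `build_command` and its parameters are hypothetical helpers, not part of `itemrec`, and every flag is copied from the example rather than taken from `itemrec/args.py`.

```python
import shlex
import sys


def build_command(log_dir, data_path, seed=2024):
    """Assemble the PSL-relu / MF example command as an argv list."""
    return [
        sys.executable, "-u", "-m", "itemrec",
        f"--log={log_dir}/ir.log",
        f"--save_dir={log_dir}",
        f"--seed={seed}",
        # Model subcommand: shared options first, then the backbone name.
        "model", "--emb_size=64", "--norm", "--num_epochs=200", "MF",
        # Dataset subcommand.
        "dataset", f"--data_path={data_path}",
        "--batch_size=1024", "--num_workers=16",
        # Optimizer subcommand: shared options, then the loss and its options.
        "optim", "--lr=0.1", "--weight_decay=0.0",
        "PSL", "--neg_num=1000", "--tau=2.0", "--tau_star=0.05",
        "--method=1", "--activation=relu",
    ]


if __name__ == "__main__":
    cmd = build_command(
        "/path/to/your/logs/gowalla/MF/PSL-relu",
        "/path/to/your/data/gowalla/proc",
    )
    # Print the command for inspection; to launch it, pass `cmd` to
    # subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.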


## NNI Hyperparameter Tuning

An easier way to run the code is to use our hyperparameter tuning script, `./run_nni.py`, which uses the NNI framework to run hyperparameter tuning experiments. You only need to modify the following paths and settings in the script:

```python
# main function -----------------------------------------------------
def main():
    args = parse_args()
    # TODO: /path/to/your/ must be replaced with the actual paths
    save_dir = f"/path/to/your/logs/{args.dataset}/{args.model}/{args.optim}"
    if not args.ood:
        dataset_path = f"/path/to/your/data/{args.dataset}/proc"
    else:
        dataset_path = f"/path/to/your/data_ood/{args.dataset}/proc"
    ...
    # NNI experiment
    ...
    # TODO: /path/to/your/code must be replaced with the actual path
    experiment.config.trial_code_directory = '/path/to/your/code'
    # TODO: specify the port and GPU
    experiment.config.training_service.platform = 'local'
    experiment.config.training_service.use_active_gpu = True
    experiment.config.training_service.max_trial_number_per_gpu = 2
    experiment.config.training_service.gpu_indices = [0, 1, 2, 3]
    ...
    experiment.run(args.port)
```
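
If you prefer not to edit the hard-coded paths, one option is to read the path roots from environment variables. The following is a sketch under assumptions, not code from `run_nni.py`: the `resolve_paths` helper and the variable names `ITEMREC_LOG_ROOT` and `ITEMREC_DATA_ROOT` are hypothetical, and the directory layout simply mirrors the excerpt above.

```python
import os


def resolve_paths(dataset, model, optim, ood=False, env=os.environ):
    """Return (save_dir, dataset_path) using the layout from the excerpt above.

    The roots come from two hypothetical environment variables and fall
    back to the README placeholders when the variables are unset.
    """
    log_root = env.get("ITEMREC_LOG_ROOT", "/path/to/your/logs")
    data_root = env.get("ITEMREC_DATA_ROOT", "/path/to/your")
    save_dir = f"{log_root}/{dataset}/{model}/{optim}"
    # IID datasets live under data/, OOD datasets under data_ood/.
    subdir = "data_ood" if ood else "data"
    dataset_path = f"{data_root}/{subdir}/{dataset}/proc"
    return save_dir, dataset_path
```

For example, `resolve_paths("yelp2018-popularity", "MF", "PSL", ood=True)` selects the `data_ood` directory, matching the `args.ood` branch in the excerpt.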

`run_nni.py` also provides its own CLI, which is more user-friendly than the main CLI because the script sets most of the arguments automatically. For example, to run `LightGCN` on the IID `gowalla` dataset with `Softmax`, you may run:

```bash
python run_nni.py --model=LightGCN --dataset=gowalla --optim=Softmax --norm --num_layers=2 --port=10032
```

To run `MF` on the OOD `yelp2018-popularity` dataset with `PSL-relu`, you may run:
```bash
python run_nni.py --model=MF --dataset=yelp2018-popularity --optim=PSL --norm --ood --method=1 --activation=relu --port=10033
```

Either command runs the hyperparameter tuning experiments defined in `./itemrec/hyper.py` and saves the results in the `save_dir` set in the script.

