The supplementary material contains:
1. the proof of the main result
2. software to reproduce the experiments
3. pre-processed Pokec network data (for convenience)

This software builds on relational ERM: https://github.com/wooden-spoon/relational-ERM

# Requirements
1. Python 3.6 with numpy and pandas
2. TensorFlow *1.11*
3. gcc
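
Before building, it can help to confirm that the active environment matches the versions listed above. The following is a minimal sketch and not part of the supplied software: the helper names `version_matches` and `check_environment` are hypothetical, and the check assumes TensorFlow reports its version through `tensorflow.__version__`. It does not check for gcc.

```python
import sys


def version_matches(found, required):
    """Return True if `found` begins with the dotted components of `required`.

    For example, found="1.11.0" matches required="1.11",
    while found="1.12.0" does not.
    """
    found_parts = found.split(".")
    required_parts = required.split(".")
    return found_parts[:len(required_parts)] == required_parts


def check_environment():
    """Collect human-readable problems with the current environment."""
    problems = []

    # Python interpreter version (major.minor only).
    python_version = "{}.{}".format(*sys.version_info[:2])
    if not version_matches(python_version, "3.6"):
        problems.append("Python 3.6 expected, found " + python_version)

    # TensorFlow version; a missing install is reported, not raised.
    try:
        import tensorflow as tf
        if not version_matches(tf.__version__, "1.11"):
            problems.append("TensorFlow 1.11 expected, found " + tf.__version__)
    except ImportError:
        problems.append("TensorFlow is not installed")

    # numpy and pandas only need to be importable.
    for module_name in ("numpy", "pandas"):
        try:
            __import__(module_name)
        except ImportError:
            problems.append(module_name + " is not installed")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script prints one warning per mismatch and nothing when the environment matches.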


# Setup
Run the following command in `src` to build the graph sampler TensorFlow ops:

`python setup.py build_ext --inplace`


# Reproducing the experiments
The default settings for the code match the settings used in the paper.
These are also the default settings of relational ERM (i.e., we didn't tune anything).

Run the code with
`./relational_ERM/submit_scripts/run_model.sh`
Change the flags in this file to reproduce the different experiments.
The simulation setting is controlled by the `--simulated` flag.
The options are attribute-based (`attribute`) or propensity-based (`propensity`) simulation.
The latter can be used to reproduce the exogeneity experiments.
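
To make the two simulation options concrete, here is a minimal sketch of how a `--simulated` flag restricted to these two values could be parsed. It uses the standard-library `argparse` purely for illustration. The supplied code may define and consume the flag differently, and the `build_parser` name and the `None` default are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that accepts only the two documented simulation modes."""
    parser = argparse.ArgumentParser(description="Illustrative flag parsing only.")
    parser.add_argument(
        "--simulated",
        choices=["attribute", "propensity"],
        default=None,  # assumption: no simulation mode unless the flag is given
        help="Simulation setting: attribute-based or propensity-based.",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the example is reproducible.
args = build_parser().parse_args(["--simulated", "propensity"])
print(args.simulated)
```

With `choices` set, any value other than `attribute` or `propensity` is rejected when the arguments are parsed.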

To reproduce the two-stage training, run with `embedding_trainable=false`.

# Misc.
The experiments in the paper initialize from node embeddings that were pre-trained using a purely unsupervised objective.
To recreate the initialization embeddings, run `run_unsupervised.sh`. Then uncomment `--init_checkpoint=$INIT_FILE` in `run_classifer.sh`.

