# ITSELF and GUIDE for *any-MPE* Inference Task

We propose a novel neural-network-based approach for efficiently answering arbitrary Most Probable Explanation (MPE) queries, a well-known NP-hard task, in large probabilistic models such as Markov networks, probabilistic circuits, and neural auto-regressive models. By arbitrary MPE queries, we mean that there is no predefined partition of the variables into evidence and non-evidence variables. The key idea is to distill all MPE queries over a given probabilistic model into a neural network, eliminating the need to run time-consuming inference algorithms on the probabilistic model. We improve upon this idea by incorporating inference-time optimization with a self-supervised loss to iteratively refine the solutions, and by employing a teacher-student framework that provides a better initial network and thereby reduces the number of inference-time optimization steps. The teacher network uses a self-supervised loss function optimized for obtaining the exact MPE solution, while the student network learns from the teacher's near-optimal outputs through a supervised loss. We demonstrate the efficacy and scalability of our approach on various datasets and a broad class of probabilistic models.
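To make the notion of an arbitrary MPE query concrete, here is a toy illustration. It is only a sketch: the three-variable `score` function and the `brute_force_mpe` helper are hypothetical and are not part of this repository, and the exhaustive enumeration shown here is exactly the exponential-time computation that the neural approach is meant to avoid. The point is that the same model is queried with different evidence sets, with no fixed split between evidence and non-evidence variables.

```python
from itertools import product


def score(x):
    """Hypothetical unnormalized score over three binary variables (a, b, c)."""
    a, b, c = x
    return (2.0 if a == b else 1.0) * (3.0 if b != c else 1.0) * (1.5 if a == 1 else 1.0)


def brute_force_mpe(score_fn, num_vars, evidence):
    """Return the highest-scoring full assignment that agrees with `evidence`.

    `evidence` maps a variable index to its observed value; any subset of the
    variables may appear in it.
    """
    best, best_score = None, float("-inf")
    for x in product((0, 1), repeat=num_vars):
        # Skip assignments that contradict the observed values.
        if any(x[i] != v for i, v in evidence.items()):
            continue
        s = score_fn(x)
        if s > best_score:
            best, best_score = x, s
    return best, best_score


# Two queries on the same model with different evidence variables.
print(brute_force_mpe(score, 3, {0: 1}))  # evidence on variable 0 -> ((1, 1, 0), 9.0)
print(brute_force_mpe(score, 3, {2: 1}))  # evidence on variable 2 -> ((0, 0, 1), 6.0)
```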

# Table of Contents

- [Installation](#installation)
- [Datasets](#datasets)
- [Usage](#usage)

## Installation

Two dependency files are included: `requirements.txt` for pip and `any-mpe.yml` for conda. Either one installs the packages needed to run the Python scripts in this project.

The `./anympe` directory must be on the Python path. Either of the following approaches works:

1. **Using `sys.path.append`:** Add the directory to the Python path from within your Python code:

   ```python
   import sys
   sys.path.append("./anympe")
   ```
2. **Setting the `PYTHONPATH` environment variable:** Add the directory to `PYTHONPATH` before launching Python. On Unix-like systems, use:

   ```bash
   export PYTHONPATH="./anympe":$PYTHONPATH
   ```
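Both approaches above use a relative path, so they only work when Python is started from the repository root. As a minimal sketch (the helper name `add_to_python_path` is hypothetical and not part of this repository), the directory can be resolved to an absolute path before it is added, which also avoids duplicate entries if the code runs more than once:

```python
import sys
from pathlib import Path


def add_to_python_path(directory):
    """Append the absolute form of `directory` to sys.path if it is not already there."""
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.append(resolved)
    return resolved


# Assumes the current working directory is the repository root.
added = add_to_python_path("./anympe")
print(added)
```

To make this independent of the working directory, the path can instead be built from the location of the calling script, for example `Path(__file__).parent / "anympe"`, assuming the script sits in the repository root.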

## Datasets

- The TPM datasets can be downloaded from `https://github.com/UCLA-StarAI/Density-Estimation-Datasets/blob/master/README.md`. These are used for the experiments with PCs and MADE models.
- The PGM datasets can be downloaded from `https://sli.ics.uci.edu/~ihler/uai-data/index.html`. These are used for the experiments with PGMs.

Note that these are cited/anonymous dataset repository links. The datasets are publicly available and can be downloaded from the respective links.

## Usage

Usage instructions for each set of scripts are provided in the corresponding folder:

1. [ITSELF, GUIDE, SSMP](methods/ITSELF-GUIDE/readme.md)
2. [Hill Climbing Search](methods/hill_climbing_search/readme.md)

### Steps to Reproduce Results

1. Download the datasets and models using the links in the [Datasets](#datasets) section.
2. Install the required packages with conda using `any-mpe.yml` (for example, `conda env create -f any-mpe.yml`).
3. Use the trained PC, MADE, and MN models located in the `trained_models` directory.
4. To train the neural models with SSMP (baseline), GUIDE, and ITSELF, use the scripts in `methods/ITSELF-GUIDE/`, providing the dataset and trained model paths. A more thorough explanation is given in the README in the `methods/ITSELF-GUIDE/` directory.
5. To obtain log-likelihood scores for the polytime baseline methods and for Stochastic Hill Climbing Search, use the scripts in `methods/hill_climbing_search`.
6. Before running any script, set the paths to the trained models and datasets inside it.
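Because the scripts depend on paths that must be set by hand, a quick check before launching a long run can catch a missing directory early. The following is a minimal sketch, not part of the repository: the helper `missing_paths` is hypothetical, and the `datasets` entry is an assumed location that should be replaced with wherever the downloaded data actually lives.

```python
from pathlib import Path


def missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk, as strings."""
    return [str(p) for p in map(Path, paths) if not p.exists()]


# `trained_models` comes from the steps above; `datasets` is a hypothetical location.
required = ["trained_models", "datasets"]
missing = missing_paths(required)
if missing:
    print("Missing paths:", ", ".join(missing))
else:
    print("All required paths found.")
```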
