# Code for: Diffeomorphic interpolation for efficient persistence-based topological optimization

Submission at NeurIPS 2024. Do not distribute.

# Repo organisation

The core implementation lives in the `./utils/` directory.
The `./autoencoder/` directory contains the code to reproduce the main experiment of Section 4;
it requires downloading the COIL-20 dataset, available for instance [here](https://www.cs.columbia.edu/CAVE/software/softlib/coil-20.php).
Note that this experiment also needs more substantial hardware (typically a GPU, with CUDA properly set up).

## Dependencies

The implementation is based on:
- `tensorflow 2.4.1`
- `Gudhi 3.8.0`
- `scikit-learn 1.3.0`
- `numpy 1.24.2`
- `matplotlib 3.5.1`
- `oineus`, built from source following https://github.com/anigmetov/oineus/tree/master .
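
To compare an existing environment against the versions listed above, a small check along the following lines may help. It is only a sketch: `check_versions` and `EXPECTED` are hypothetical names that are not part of this repository, and `oineus` is left out because it is built from source rather than installed as a versioned package.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this README, keyed by distribution name.
EXPECTED = {
    "tensorflow": "2.4.1",
    "gudhi": "3.8.0",
    "scikit-learn": "1.3.0",
    "numpy": "1.24.2",
    "matplotlib": "3.5.1",
}

def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, version_in_README)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[name] = (found, wanted)
    return report

for name, (found, wanted) in check_versions().items():
    status = "ok" if found == wanted else "differs"
    print(f"{name}: installed={found}, README={wanted} ({status})")
```

A mismatch does not necessarily mean the code will fail; the versions above are simply the ones the implementation was developed with.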


_Remark:_ For `oineus`, we cloned and built commit `aa8016b0d58aaec246901df7f51e6abda191224b` (6 Sept 2023).
The current version of `oineus` may produce different results (hopefully with improved running times!).
When making our code publicly available, we will take care to
integrate an up-to-date version of `oineus`.

_Remark:_ Using our code without `oineus` should work:
you will only get a warning message telling you that `oineus` is not installed,
and the option `use_oineus` must of course be kept as `False`.
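
For readers unfamiliar with optional dependencies, the behaviour described above usually follows a pattern like the one sketched here. This is not the code in `./utils/`: `OINEUS_AVAILABLE` and `resolve_use_oineus` are hypothetical names used only for illustration.

```python
import importlib.util
import warnings

# True only if an `oineus` module can be found in the current environment.
OINEUS_AVAILABLE = importlib.util.find_spec("oineus") is not None

if not OINEUS_AVAILABLE:
    warnings.warn("oineus is not installed; keep use_oineus=False.")

def resolve_use_oineus(requested):
    """Hypothetical helper: fail early if oineus is requested but missing."""
    if requested and not OINEUS_AVAILABLE:
        raise ImportError("use_oineus=True was requested, but oineus is not installed.")
    return bool(requested)
```

Failing at construction time, rather than deep inside a persistence computation, makes a missing `oineus` build easier to diagnose.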

# Quick start

The following quick start showcases the main use of the `DeformRipsLayer` class.
In this example, we build two `DeformRipsLayer` objects; more generally, four variants are possible,
depending on the boolean options `use_deformations` and `use_oineus`.
In every case, once instantiated, calling the `DeformRipsLayer` on a point cloud `X` computes its usual Rips
persistence diagram (in this example, for homology dimension 1).
The difference lies in what we define to be its _gradient_ (which is automatically computed and incorporated in
autodiff pipelines).

- What we call `vanilla` refers to `use_deformations, use_oineus = False, False`; the gradient is then the usual one, which is typically very sparse.
- Setting `use_deformations=True` replaces the vanilla gradient by its diffeomorphic counterpart, a less sparse and much smoother object (essentially a convolution with a Gaussian kernel).
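
To give some intuition for the second item, here is a minimal NumPy sketch of smoothing a sparse gradient with a Gaussian kernel. It assumes a kernel `exp(-||x - y||^2 / (2 sigma^2))` and an interpolation step that makes the smooth field agree with the sparse gradient wherever the latter is nonzero. The names `gaussian_kernel` and `smooth_gradient` and the small `ridge` term are illustrative choices, and the actual implementation in `./utils/` may differ in its conventions.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Matrix of exp(-||a - b||^2 / (2 sigma^2)) for rows a of A and b of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def smooth_gradient(X, grad, sigma, ridge=1e-10):
    """Extend a sparse gradient to all points of X by Gaussian kernel interpolation."""
    # Indices of the points on which the sparse gradient is nonzero.
    support = np.flatnonzero(np.linalg.norm(grad, axis=1) > 0)
    if support.size == 0:
        return np.zeros_like(grad)
    K_ss = gaussian_kernel(X[support], X[support], sigma)
    # Coefficients chosen so that the smooth field matches grad on its support
    # (the ridge term is an assumption, added only for numerical stability).
    coeffs = np.linalg.solve(K_ss + ridge * np.eye(support.size), grad[support])
    return gaussian_kernel(X, X[support], sigma) @ coeffs

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 2))
sparse_grad = np.zeros_like(X)
sparse_grad[[3, 17]] = [[1.0, 0.0], [0.0, -1.0]]  # only two points would move

dense_grad = smooth_gradient(X, sparse_grad, sigma=0.5)
```

In this toy example only two points carry a nonzero vanilla gradient, whereas `dense_grad` assigns a (smoothly decaying) displacement to every point of the cloud.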

Once a layer `DRL` is instantiated, you can compute the diagram of a point cloud `X` (represented by a `np.array`)
simply by setting `dgm = DRL(X)`.
Note that the diagram you compute should be the same whatever values you assign to `use_deformations` and `use_oineus`.


```python
from utils.DRL import DeformRipsLayer
from utils.expe import sample_circle

import matplotlib.pyplot as plt
import tensorflow as tf
from time import time

Xinit = sample_circle(n_points=100)
X = tf.Variable(initial_value = Xinit, trainable=True)

DRL_vanilla = DeformRipsLayer(homology_dimension=1, max_edge_length=3, use_deformations=False, sigma=None, input_dimension=2, subsample_size=None, use_oineus=False)
DRL_deform  = DeformRipsLayer(homology_dimension=1, max_edge_length=3, use_deformations=True , sigma=0.5, input_dimension=2, use_oineus=False)

def topoloss(dgm):
    """
    An example of topological loss function. 
    """
    return tf.math.reduce_sum(tf.square(0.5 * (dgm[:, 1] - dgm[:, 0])))

# Vanilla gradient: record the forward pass, then differentiate the loss w.r.t. X.
t1 = time()
with tf.GradientTape() as tape:
    dgm_vanilla = DRL_vanilla(X)
    loss_vanilla = topoloss(dgm_vanilla)
grad_vanilla = tape.gradient(loss_vanilla, [X])
t2 = time()
print("time to compute vanilla: %.3f sec." % (t2 - t1))

# Diffeomorphic gradient: same diagram and loss, different backward pass.
with tf.GradientTape() as tape:
    dgm_deform = DRL_deform(X)
    loss_deform = topoloss(dgm_deform)
grad_deform = tape.gradient(loss_deform, [X])
t3 = time()
print("time to compute deform: %.3f sec." % (t3 - t2))

gradients = [grad_vanilla[0], grad_deform[0]]
names = ["Vanilla gradient", "Diffeo gradient"]

fig, axs = plt.subplots(1, 2, figsize=(20, 5))
for ax, grad, name in zip(axs, gradients, names):
    ax.scatter(Xinit[:,0], Xinit[:,1])
    # Arrows show the descent direction (minus the gradient) at each point.
    ax.quiver(Xinit[:,0], Xinit[:,1], -grad[:,0], -grad[:,1], angles='xy', scale_units='xy', scale=3, color='red')
    ax.set_title(name)
    ax.grid()
plt.show()
```


# Reproduce the subsampling experiments

To reproduce the experiment displayed in Appendix B (a scaled-up version of the experiment of Carriere et al., 2021),
you can run the following code.

This proof-of-concept experiment showcases how combining subsampling with diffeomorphic interpolation
makes topological optimization scale to larger point clouds.

```python
import numpy as np
import gudhi as gd
import matplotlib.pyplot as plt

import utils.expe as u
import utils.losses as ul

print("Version of Gudhi:", gd.__version__, "(should be 3.8 or later)")

N = 2000
np.random.seed(1)
X_init = np.random.uniform(low=-1, high=1, size=(N, 2))

fig, logs = u.benchmark(Xinit=X_init,
                        loss_function=ul.maxpers,
                        sigma=0.1,
                        subsample_size=100,
                        learning_rate=0.05,
                        threshold_loss=None,
                        n_epoch=750)
```

The Stanford bunny experiment can be reproduced by running the following code.
The bunny is stored as `stanford-bunny.npy`; this file contains only the vertices of the original `obj` file.

**Beware:** if you run it for 1000 epochs (as done in the paper), it takes about 10 hours to complete on a standard laptop CPU.
You can set the number of epochs to 100 and already see the drastic difference between the vanilla flow and the diffeomorphic one.

```python
import utils.expe as u
import matplotlib.pyplot as plt
import numpy as np

logs_vanilla, logs_diffeo = u.bunny_expe('stanford-bunny.npy', 
             n_epoch=1000,  # reduce this to decrease running time!
             sigma=0.05, 
             subsample_size=100)


### PLOT 
X_init = np.load('stanford-bunny.npy')

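i = -1  # last logged state
Y = logs_vanilla['evolution_X'][i]
Y2 = logs_diffeo['evolution_X'][i]
Y3 = logs_diffeo['evolution_X'][200]  # adjust this index if you reduce n_epoch

# Use plt.figure (not the plt.Figure class) so that pyplot manages and displays the figure.
fig = plt.figure(figsize=(20, 10))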

ax = fig.add_subplot(141, projection='3d')
ax.scatter(X_init[:,0], X_init[:,1], X_init[:,2], alpha=0.01, c='blue')
ax.set_title("Initialization")
ax.set_axis_off()
ax.view_init(elev=90, azim=-90)

ax = fig.add_subplot(142, projection='3d')
ax.set_title("Vanilla (epoch 1000)")
ax.scatter(Y[:,0], Y[:,1], Y[:,2], alpha=0.01, c='black', label='Vanilla')
ax.set_axis_off()
ax.view_init(elev=90, azim=-90)

ax = fig.add_subplot(143, projection='3d')
ax.scatter(Y3[:,0], Y3[:,1], Y3[:,2], alpha=0.01, c='orange', label='Diffeo')
ax.set_title("Diffeo (epoch 200)")
ax.set_axis_off()
ax.view_init(elev=90, azim=-90)

ax = fig.add_subplot(144, projection='3d')
ax.scatter(Y2[:,0], Y2[:,1], Y2[:,2], alpha=0.01, c='orange', label='Diffeo')
ax.set_title("Diffeo (epoch 1000)")
ax.set_axis_off()
ax.view_init(elev=90, azim=-90)

plt.show()
```

# Bonus 

Two animated `gif` files are provided, showcasing the vanilla flow and the diffeomorphic flow across iterations when collapsing a circle.