<h2 align="center">
  Separations in the Representational Capabilities of Transformers and Recurrent Architectures
</h2>


#### Dependencies

- Compatible with Python 3
- Set up an environment with PyTorch 2.1.2
- Install the dependencies listed in `approx-seq-code/requirements.txt`

#### Usage

The relevant source code is in the `approx-seq-code/src` directory. The available command-line arguments are defined in `approx-seq-code/src/args.py`, and the list of tasks and architectures appears at the beginning of that file.

**Run Transformer on Positional Retrieval**

To run a two-layer Transformer model on the Positional Retrieval task with input length 50, run the following from `approx-seq-code/`:

```shell
$ python -m src.main -name pos_ret_run -model san -task index -length 50 -train_steps 10000 -n_layer 2 -n_embd 1024 -batch_size 64 -learning_rate 0.0001 -gpu 0
```

To run the Mamba model on the Dyck-2 task, replace the `-task` argument with `dyck2` and the `-model` argument with `mamba`.
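
As a rough sketch of that substitution, the snippet below assembles the Mamba/Dyck-2 command and prints it as a dry run. It assumes every other flag is carried over unchanged from the Transformer example; those values may not be the best choice for Mamba. The run name `dyck2_mamba_run` is a hypothetical label, not one taken from the repository.

```shell
# Assumption: only -model and -task change; the remaining flags are copied
# from the Transformer example and may need tuning for Mamba.
MODEL=mamba
TASK=dyck2

# `dyck2_mamba_run` is a hypothetical run name chosen for illustration.
CMD="python -m src.main -name dyck2_mamba_run -model $MODEL -task $TASK -length 50 -train_steps 10000 -n_layer 2 -n_embd 1024 -batch_size 64 -learning_rate 0.0001 -gpu 0"

# Dry run: print the command instead of executing it.
echo "$CMD"
```

To launch the run, execute the printed command from `approx-seq-code/`.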


