
Commit 71ba32c

committed
dataset updates
1 parent 03ef8de commit 71ba32c

9 files changed: +1013 −828 lines changed

README.md

+3-1
@@ -14,7 +14,9 @@ The ID-NF method estimates the ID by analyzing how the singular values of the fl
 To estimate the ID of your own data, you need to train $N$ NFs and calculate the singular values on $K$ training examples. Any NF implementation can be used. For instructions on using the same NFs as in the original paper, see the README.md descriptions within the folders [images](images) or [vector data](vector_data), respectively. Once $N$ NFs are trained for $\sigma_1,\dots,\sigma_N$ and the singular values are calculated on $x_{1},\dots,x_{K}$, the ID can be estimated using the estimate_d function in [estimate_d/utils.py](estimate_d/utils.py); see the documentation of that function for details. We provide dummy code which can serve as a blueprint for your data in [estimate_d/estimate_d.py](estimate_d/estimate_d.py).
 
 ### Reproducibility
-In [estimate_d/toy_experiements](estimate_d/toy_experiements) and [estimate_d/OOD_experiements](estimate_d/OOD_experiements) we provide code and instructions for the toy experiments and OOD_experiments of the original paper.
+In the paper, we tested our method on various toy and real datasets. For the image experiments on CelebA and on the StyleGAN image manifolds, please see the instructions in [images](images). Once the singular values are given, the out-of-distribution experiments can be reproduced using [estimate_d/OOD_experiements](estimate_d/OOD_experiements).
+
+For the manifolds and distributions displayed in Table 1, see [vector_data/toy_experiments](vector_data/toy_experiments). There you will also find the instructions for reproducing the experiments on the lollipop dataset and on S(D/2).
 
 ### Acknowledgement
 For image data, we used the [Manifold Flow repository](https://github.com/johannbrehmer/manifold-flow) which is based on the [Neural Spline code base](https://github.com/bayesiains/nsf). For vector data, we used the [Inflation-Deflation repository](https://github.com/chrvt/Inflation-Deflation) which is based on this [Block Neural Autoregressive Flow implementation](https://github.com/kamenbliznashki/normalizing_flows).
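The updated README above describes the intended workflow: train $N$ NFs at different noise levels, compute the singular values on $K$ samples, then call the estimate_d function from estimate_d/utils.py. The snippet below is only a minimal sketch of how those pieces could be wired together; the argument order, the (N, K, D) array layout, the file name my_singular_values.npy, and the importability of estimate_d as a package are assumptions rather than the repository's documented interface — see estimate_d/estimate_d.py for the authoritative blueprint.

```python
# Minimal sketch only; the import path, argument order, and array layout are assumptions.
import numpy as np

from estimate_d.utils import estimate_d  # assumes the repository root is on PYTHONPATH

# Noise levels sigma_1, ..., sigma_N used to train the N flows (example values).
sigmas = np.array([0.01, 0.1, 1.0])

# singular_values[n, k, :] = D singular values of the flow trained at sigmas[n],
# evaluated at training example x_k; the (N, K, D) layout is an assumption.
singular_values = np.load("my_singular_values.npy")

d_hat = estimate_d(sigmas, singular_values)  # hypothetical call; check the docstring in estimate_d/utils.py
print("Estimated intrinsic dimension:", d_hat)
```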

images/experiments/datasets/images.py

+3-3
@@ -113,7 +113,7 @@ def __init__(self, noise_type, sig2, scale_factor):
     scale_factor=scale_factor,
     n_bits=8,
     random_horizontal_flips=False,
-    gdrive_file_ids={"grid": "12QvzFg9ln9bXvdP1nUGPWqHVqGCBFodR", "train": "1Plel_nOIYUu3E-KKDJ9-yVWPp5HcaGFo", "test": "17NOhkhctMkPWvLOzR5L0WOYxAFlUebjd"},
+    gdrive_file_ids={"grid": "1HZvfuOkaXqIMtPX_hpCp3zzHrL3JtEUb", "train": "1Plel_nOIYUu3E-KKDJ9-yVWPp5HcaGFo", "test": "17NOhkhctMkPWvLOzR5L0WOYxAFlUebjd"},
 ) # For the 2D demo we don't want random flips, as they would essentially create a second disjoint manifold
 
 def latent_dim(self):
@@ -130,8 +130,8 @@ def __init__(self, noise_type, sig2, scale_factor):
     n_bits=8,
     random_horizontal_flips=False,
     gdrive_file_ids={
-        "x_train": "1DayM2MLczvmck9Nfdv_c5oYOD6GbRkGj",
-        "x_test": "1gJuOOm9x5sNytuKmYqZ83-ieicfJsXj5",
+        "train": "1TsrUWSCcuRjOsUSpCn_HvlzC-Cz3HOKS",
+        "test": "1gJuOOm9x5sNytuKmYqZ83-ieicfJsXj5",
         "params_train": "1MmIAfT2uvAC7fuC92KxNRQJUAUxsnXZr",
         "params_test": "1day5UUZBRxAfvQsmKxir8KL1RAYbsIY9",
     },

vector_data/README.md

+11-1
@@ -1,6 +1,6 @@
11
### Instructions for vector data
22

3-
To estimte the ID of your vector dataset, use
3+
To estimate the ID of your vector dataset, use
44

55
1. [my_vector_data_cluster.py](my_vector_data_cluster.py) which trains N NFs and calculates the singular values on K samples,
66
2. [estimate_d/estimate_d.py](estimate_d/estimate_d.py) which estimates the ID given the singular value evolution.
@@ -29,3 +29,13 @@ To estimte the ID of your vector dataset, use
2929
+ Follow the instructions in the file [estimate_d/estimate_d.py](estimate_d/estimate_d.py)
3030

3131
In case you have problems with setting up your data, do not hesitate to contact us.
32+
33+
34+
## Details for toy data
35+
36+
+ to train the toy examples of Table 1, we use the [Inflation-Deflation repository](https://github.com/chrvt/Inflation-Deflation). For convenience, we added the datasets into this repository. Use [vector_data/train_flow_toy.sh](vector_data/train_flow_toy.sh). Note, however, in the original paper we have used a non-equidistant sigma range as we did not have to re-train the models. In fact, we used sigmas = [0,1e-09, 5e-09, 1e-08, 5e-08, 1e-07, 5e-07,1e-06,5e-06,0.00001,0.00005,0.0001,0.0005,0.001,0.005,0.01,0.05,0.1,0.25,0.5,1.0,2.0, 3.0, 4.0, 6.0 , 8.0, 10.0]. This should, however, not lead to different results.
37+
38+
+ for the lollipop and S(D/2) experiments, please use [vector_data/train_flow_lolipop.sh](vector_data/train_flow_lolipop.sh) and [vector_data/train_flow_sphere.sh](vector_data/train_flow_sphere.sh), respectively
39+
40+
41+
In case you have problems with reproducing our results, do not hesitate to contact us.
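Since the non-equidistant sigma grid quoted in the diff above is easy to mistype, it may help to keep it in one place and drive the training runs from a small script. The loop below is only a sketch: whether vector_data/train_flow_toy.sh accepts the noise level as a positional argument is an assumption, so adapt the call to the script's actual interface (the same caveat applies to train_flow_lolipop.sh and train_flow_sphere.sh).

```python
import subprocess

# Non-equidistant noise levels used in the paper (copied from the README addition above).
paper_sigmas = [0, 1e-09, 5e-09, 1e-08, 5e-08, 1e-07, 5e-07, 1e-06, 5e-06,
                0.00001, 0.00005, 0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05,
                0.1, 0.25, 0.5, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0]

for sigma in paper_sigmas:
    # Hypothetical invocation: passing sigma as a positional argument is an assumption;
    # check vector_data/train_flow_toy.sh for the actual arguments it expects.
    subprocess.run(["bash", "vector_data/train_flow_toy.sh", str(sigma)], check=True)
```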
