This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

Generate Synthetic dataset. #88

Open: wants to merge 36 commits into base: main

Conversation

RishabGoel (Contributor)

No description provided.

@@ -5,6 +5,8 @@ git clone https://[email protected]/googleprivate/compressive-ip

Contributor:

Don't merge this file.

@@ -53,19 +57,31 @@ def generate_dataset(
    else:
      test_file_writer.write(record_bytes)

def get_target_index(target, keep_errors_only):
  error_idx_offset = 1 if keep_errors_only else 1000
Contributor:

Q: Is it OK having error indexes hardcoded here, or will it make maintenance hard later?
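One way to address this, sketched below: lift the two magic numbers into named module-level constants so there is a single place to update them. The constant names and the return statement are illustrative guesses; only the offset values (1 and 1000) come from the diff.

```python
# Hypothetical module-level constants for the two hardcoded offsets.
ERROR_IDX_OFFSET_ERRORS_ONLY = 1
ERROR_IDX_OFFSET_ALL = 1000


def get_target_index(target, keep_errors_only):
  error_idx_offset = (ERROR_IDX_OFFSET_ERRORS_ONLY if keep_errors_only
                      else ERROR_IDX_OFFSET_ALL)
  # The rest of the function is not shown in the diff; shifting the target
  # by the offset is only a guess at its shape.
  return target + error_idx_offset
```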

@@ -180,18 +180,19 @@ def main(experiment_id=None, study_id=None, dataset_path=None, skip_create=False
  if experiment_id is None:
Contributor:

Merge conflict. Don't check in.

RishabGoel and others added 18 commits January 3, 2022 09:44
- Adds edge_* features to dataset, 6 edge types
- Sweeps
- GGNN implementation (tests still timing out)
- Adds run_test (with an option for no subsampling) to eval for 1 epoch
- Adds inspect_edges to analyze_data
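For concreteness, a minimal sketch of what per-edge features like these typically look like as parallel arrays; the field names follow the edge_* naming above, but the exact schema is an assumption:

```python
import numpy as np

# Six edge types, per the commit message above.
NUM_EDGE_TYPES = 6

edge_sources = np.array([0, 1, 2])  # source node index of each edge
edge_dests = np.array([1, 2, 0])    # destination node index of each edge
edge_types = np.array([0, 3, 5])    # edge type id in [0, NUM_EDGE_TYPES)
```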
A dry-run sweep generates the commands for the sweep without running them. This is useful for resuming old runs on different machines than they were originally run on, or for resuming just a subset of old runs.
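A minimal sketch of the dry-run behavior described above; run_sweep and train.py are hypothetical stand-ins, not the repository's actual sweep runner:

```python
import subprocess


def run_sweep(sweep_flags, dry_run=False):
  """Runs (or, in dry-run mode, just prints) one command per flag setting."""
  for flags in sweep_flags:
    cmd = ['python', 'train.py'] + flags
    if dry_run:
      # Print instead of executing, so the commands can be replayed later,
      # e.g. to resume a subset of old runs on a different machine.
      print(' '.join(cmd))
    else:
      subprocess.run(cmd, check=True)


run_sweep([['--seed=0'], ['--seed=1']], dry_run=True)
```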
Allows setting the number of training steps and the seed, so we can run multiple runs of a single model and compute the variance of the metrics. Colab for generating commands is here: https://colab.research.google.com/drive/1axwI8dGJ1_wTLIKJsLEx0FazHaPXEu72#scrollTo=IroRFMZyl6kR&uniqifier=3
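A minimal sketch of the variance computation this enables; train_and_eval is a hypothetical stand-in for a single training run that returns one metric value:

```python
import statistics


def metric_variance(train_and_eval, num_train_steps, seeds=(0, 1, 2, 3, 4)):
  """Trains one model per seed and returns the mean and variance of the metric."""
  metrics = [train_and_eval(num_train_steps=num_train_steps, seed=seed)
             for seed in seeds]
  return statistics.mean(metrics), statistics.variance(metrics)
```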
- Configs for GGNN:
  - config.ggnn_use_fixed_num_layers = True
  - config.ggnn_layers = 3
- New dataset with edge info for GGNN
  - generates edge_sources_shape on the fly
- We're filtering the same examples as before
- Sweep for GGNN experiments
- Overwriting top checkpoints after preemption (a new checkpoints dir would be better) to avoid failure on restart
- Supports both fixed num layers and num_steps num layers for GGNNs
- Code for generating a sampled test set with roughly equal error and no-error examples (see the sketch below)
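A minimal sketch of the balanced sampling in the last item; has_error is a hypothetical predicate supplied by the caller, and the 50/50 split is the "roughly equal" target from the commit message:

```python
import random


def sample_balanced_test_set(examples, has_error, n, seed=0):
  """Samples n examples, roughly half with errors and half without."""
  rng = random.Random(seed)
  with_errors = [ex for ex in examples if has_error(ex)]
  without_errors = [ex for ex in examples if not has_error(ex)]
  sampled = (rng.sample(with_errors, n // 2)
             + rng.sample(without_errors, n - n // 2))
  rng.shuffle(sampled)
  return sampled
```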
The SGD optimizer state has changed, so our naive existing method of loading old checkpoints doesn't always work. This works around that for test.
The restore logic now skips init (which was unnecessary and slow anyway), loads the old checkpoint state, and then keeps only the params, dropping opt_state.

Also in this commit: the ability to restore from an LSTM into an Exception IPA-GNN or regular IPA-GNN.
To do this, set --config.finetune=LSTM.
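A minimal sketch of the params-only restore described above, assuming a flax TrainState-style state and flax.training.checkpoints; the function name is illustrative, not the repository's actual restore code:

```python
from flax.training import checkpoints


def restore_params_only(ckpt_dir, state):
  """Loads an old checkpoint but keeps only its params."""
  # Restore the raw checkpoint contents (target=None returns a plain
  # nested dict, so no init is needed first).
  restored = checkpoints.restore_checkpoint(ckpt_dir, target=None)
  # Keep only the params; drop the stale opt_state so the current
  # optimizer initializes fresh.
  return state.replace(params=restored['params'])
```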
Merged assert error generation