print selene_sdk version, add config and model file to output, add ra… #191

Open
wants to merge 2 commits into master
42 changes: 38 additions & 4 deletions selene_sdk/utils/config_utils.py
@@ -9,13 +9,16 @@
from time import strftime
import types
import random
import shutil, yaml
ygliu2016 marked this conversation as resolved.

import numpy as np
import torch

from . import _is_lua_trained_model
from . import instantiate
from . import load_path

from selene_sdk import version

def class_instantiate(classobj):
"""Not used currently, but might be useful later for recursive
@@ -111,6 +114,7 @@ def initialize_model(model_configs, train=True, lr=None):

    module = None
    if os.path.isdir(import_model_from):
        import_model_from = import_model_from.rstrip(os.sep)
        module = module_from_dir(import_model_from)
    else:
        module = module_from_file(import_model_from)
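
For context on the added rstrip call: a trailing path separator makes os.path.basename return an empty string, which would break any logic in module_from_dir that derives the module name from the directory's basename (an assumption about that helper). A minimal illustration:

    import os

    os.path.basename("/models/deepsea/")                 # "" -- the trailing separator hides the name
    os.path.basename("/models/deepsea/".rstrip(os.sep))  # "deepsea"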
@@ -260,9 +264,11 @@ def parse_configs_and_run(configs,

    Parameters
    ----------
    configs : dict
        The dictionary of nested configuration parameters. Will look
        for the following top-level parameters:
    configs : str or dict
        If a str, `configs` is the path to the configuration YAML file
        from which the nested configuration parameters will be read.
        If a dict, `configs` holds the nested configuration parameters
        directly. Will look for the following top-level parameters:

        * `ops`: A list of 1 or more of the values \
        {"train", "evaluate", "analyze"}. The operations specified\
@@ -305,8 +311,19 @@ def parse_configs_and_run(configs,
        to the dirs specified in each operation's configuration.

    """
    if isinstance(configs, str):
        configs_file = configs
        if not os.path.isfile(configs_file):
            print("The configuration YAML file {} does not exist!".format(configs_file))
            return
        configs = load_path(configs_file, instantiate=False)
    operations = configs["ops"]
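
With this change, parse_configs_and_run accepts either a pre-loaded dict or simply the path to the YAML file. A hypothetical usage sketch, assuming both helpers are importable from selene_sdk.utils as in the tutorials (the file name is illustrative):

    from selene_sdk.utils import load_path, parse_configs_and_run

    # previous behavior: load the YAML yourself, then pass the dict
    configs = load_path("simple_train.yml", instantiate=False)
    parse_configs_and_run(configs)

    # with this PR: pass the path directly and let the function load it
    parse_configs_and_run("simple_train.yml")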

    #print selene_sdk version
Collaborator: Remove the commented-out print statement.

if "selene_sdk_version" not in configs:
Collaborator: Maybe simplify this to "version". I think it's also OK to assume that the version is never specified in the config and to add it automatically, e.g. moving "from selene_sdk import version" to the top of the file. I think you can also directly do "import selene_sdk" and use selene_sdk.__version__ -- not sure which is better practice.

Author: Not sure about the name "selene_sdk_version" vs. "version". May want to keep the longer name at this time.

Collaborator: OK, I thought about it more, and I think it's better to have this be automatically populated. Let's just remove lines 322 and 323 outright.

Collaborator: Please adjust based on the comments above; I think it makes sense to just auto-populate, since almost no one will try to specify it manually in their config.

configs["selene_sdk_version"] = version.__version__
print("Running with selene_sdk version {0}".format(version.__version__))

if "train" in operations and "lr" not in configs and lr != None:
configs["lr"] = float(lr)
elif "train" in operations and "lr" in configs and lr != None:
@@ -331,8 +348,9 @@
if "create_subdirectory" in configs:
create_subdirectory = configs["create_subdirectory"]
if create_subdirectory:
rand_str = str(random.random())[2:]
current_run_output_dir = os.path.join(
current_run_output_dir, strftime("%Y-%m-%d-%H-%M-%S"))
current_run_output_dir, '{}-{}'.format(strftime("%Y-%m-%d-%H-%M-%S"), rand_str))
os.makedirs(current_run_output_dir)
print("Outputs and logs saved to {0}".format(
current_run_output_dir))
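
The random suffix guards against collisions when two runs start within the same second and would otherwise receive the same timestamped directory. A small sketch of the resulting naming scheme (values illustrative):

    import os
    import random
    from time import strftime

    rand_str = str(random.random())[2:]   # digits after "0.", e.g. "6394267984578837"
    run_dir = os.path.join("./training_outputs",
                           "{}-{}".format(strftime("%Y-%m-%d-%H-%M-%S"), rand_str))
    # e.g. ./training_outputs/2024-01-01-12-00-00-6394267984578837
    os.makedirs(run_dir)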
@@ -343,9 +361,25 @@
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        #torch.backends.cudnn.deterministic = True
Collaborator: Why are these commented out?

Author: This is your code, which is not included in the main selene repository. I was not sure whether the code was necessary, so I copied it here but left it commented out. It seems to me that setting a seed already makes the run deterministic.

Collaborator: Thanks for the clarification. Please un-comment this, as we need it to ensure all aspects are deterministic.

Collaborator: Same with the commented-out line in 365, if that is my code.

Collaborator: Can you remove the commented-out lines in 364 and 365?

        #torch.backends.cudnn.benchmark = False
        print("Setting random seed = {0}".format(seed))
    else:
        print("Warning: no random seed specified in config file. "
              "Using a random seed ensures results are reproducible.")

    if current_run_output_dir:
        # write configs to output directory
        with open('{}/{}'.format(current_run_output_dir,'configs.yaml'), 'w') as f:
            yaml.dump(configs, f, default_flow_style=None)
        # copy model file or directory to output
        model_input = configs['model']['path']
        if os.path.isdir(model_input): # copy the directory
            shutil.copytree (model_input,
Collaborator: The spacing here still has not been fixed. Please adjust it so it adheres to the current style in the codebase, thank you!

                os.path.join(current_run_output_dir, os.path.basename(import_model_from)),
                dirs_exist_ok=True)
        else:
            shutil.copy (model_input, current_run_output_dir)
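
In response to the spacing comment, one possible restyling of this block -- a sketch only, not the committed diff. It also substitutes model_input for import_model_from in the copytree destination, since import_model_from does not appear to be defined in parse_configs_and_run, and note that dirs_exist_ok requires Python 3.8 or newer:

    # (inside parse_configs_and_run, after the output directory has been created)
    if current_run_output_dir:
        # write the fully resolved configs next to the run outputs
        with open(os.path.join(current_run_output_dir, "configs.yaml"), 'w') as config_out:
            yaml.dump(configs, config_out, default_flow_style=None)
        # copy the model file or directory into the output directory
        model_input = configs["model"]["path"]
        if os.path.isdir(model_input):
            shutil.copytree(
                model_input,
                os.path.join(current_run_output_dir, os.path.basename(model_input)),
                dirs_exist_ok=True)
        else:
            shutil.copy(model_input, current_run_output_dir)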
ygliu2016 marked this conversation as resolved.


    execute(operations, configs, current_run_output_dir)
2 changes: 1 addition & 1 deletion setup.py
@@ -63,7 +63,7 @@
"scipy",
"seaborn",
"statsmodels",
"torch>=0.4.1, <=1.9",
"torch>=0.4.1, <=1.11",
],
entry_points={
'console_scripts': [
2 changes: 1 addition & 1 deletion tutorials/getting_started_with_selene/simple_train.yml
@@ -47,6 +47,6 @@ train_model: !obj:selene_sdk.TrainModel {
}
random_seed: 1447
output_dir: ./training_outputs
create_subdirectory: False
create_subdirectory: True
Collaborator: I assume this was changed because of testing - can this be changed back, since it's not relevant to the PR?

load_test_set: False
...