Reads filtration change seq length + RF model update #5
base: controlled_shuffles
IgOmeProfiling_pipeline.py
Outdated
@@ -42,7 +42,7 @@ def run_pipeline(fastq_path, barcode2samplename_path, samplename2biologicalcondi
 
     module_parameters = [fastq_path, first_phase_output_path, first_phase_logs_path,
                          barcode2samplename_path, left_construct, right_construct,
-                         max_mismatches_allowed, min_sequencing_quality, first_phase_done_path,
+                         max_mismatches_allowed, min_sequencing_quality, minimal_length_required,first_phase_done_path,
This does not match the definition in read_filtration/module_wrapper.py.
There you define it as a named parameter (one that starts with --), but here you pass it as a positional parameter.
The order is also wrong: the minimal length ends up in the position of the done path.
To summarize, this change is wrong and doesn't work.
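The mismatch described in this comment can be reproduced with a minimal sketch, assuming the wrapper declares its options with argparse. The parser below is a hypothetical stand-in, not the actual definition in read_filtration/module_wrapper.py; the positional names, the default value, and the file names are illustrative only.

```python
import argparse

def build_parser():
    # Hypothetical stand-in for the wrapper's parser (assumption, not the real one).
    parser = argparse.ArgumentParser()
    parser.add_argument('fastq_path')
    parser.add_argument('done_file_path')
    # Named option: it must be passed together with its flag.
    parser.add_argument('--minimal_length_required', type=int, default=3)
    return parser

parser = build_parser()

# Passing the value positionally: it is consumed as done_file_path,
# and the real done path is left over as an unrecognized argument.
wrong, leftovers = parser.parse_known_args(['reads.fastq', '9', 'done.txt'])

# Passing the value with its flag: every argument lands where it is expected.
right = parser.parse_args(
    ['reads.fastq', 'done.txt', '--minimal_length_required', '9'])

print(wrong.done_file_path, wrong.minimal_length_required, leftovers)
print(right.done_file_path, right.minimal_length_required)
```

Under these assumptions, the fix on the caller's side would be to append the flag and its value as two separate items (for example '--minimal_length_required' followed by the value as a string) rather than inserting the bare value between the positional parameters.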
model_fitting/random_forest.py
Outdated
def get_hyperparameters_grid(seed):
    # Number of trees in random forest
    n_estimators = [int(x) for x in np.linspace(start=100, stop=2000, num=20)]
Set these parameters through command-line arguments; they shouldn't be hardcoded.
This remark applies to the entire file, not just this line.
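One way to follow this remark is sketched below, assuming argparse is used for the script's options. The flag names and the helper functions are hypothetical; the defaults simply mirror the hardcoded values in the snippet above. The pure-Python spacing helper is a dependency-free stand-in for the np.linspace call, not a claim about how the file should compute the grid.

```python
import argparse

def parse_args(argv=None):
    # Hypothetical flags; defaults mirror the values hardcoded in the reviewed line.
    parser = argparse.ArgumentParser(description='Random forest grid (sketch)')
    parser.add_argument('--n_estimators_start', type=int, default=100)
    parser.add_argument('--n_estimators_stop', type=int, default=2000)
    parser.add_argument('--n_estimators_num', type=int, default=20)
    return parser.parse_args(argv)

def evenly_spaced_ints(start, stop, num):
    # Evenly spaced values from start to stop (inclusive), truncated to int.
    if num == 1:
        return [int(start)]
    step = (stop - start) / (num - 1)
    values = [start + i * step for i in range(num)]
    values[-1] = stop  # pin the endpoint to avoid floating-point drift
    return [int(v) for v in values]

def build_grid(args):
    # Only the number-of-trees axis is shown; other axes would follow the same pattern.
    return {
        'n_estimators': evenly_spaced_ints(
            args.n_estimators_start, args.n_estimators_stop, args.n_estimators_num),
    }

default_grid = build_grid(parse_args([]))
small_grid = build_grid(parse_args(
    ['--n_estimators_stop', '300', '--n_estimators_num', '3']))
print(default_grid['n_estimators'])
print(small_grid['n_estimators'])
```

Keeping the old values as defaults means existing runs behave the same, while an experiment can narrow or widen the grid without editing the file.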
model_fitting/random_forest.py
Outdated
for i in range(num_of_configurations_to_sample):
    configuration = {}
    for key in hyperparameters_grid:
        configuration[key] = np.random.choice(hyperparameters_grid[key], size=1)[0]
Is this seeded? Would we get the same results on every experiment run?
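A minimal sketch of seeded sampling follows, to show what a reproducible version of this loop could look like. It uses the standard library's random.Random instead of NumPy to stay dependency-free; the same idea applies to a NumPy generator created once from the seed. The function name, the example grid, and the seed value are hypothetical.

```python
import random

def sample_configurations(hyperparameters_grid, num_of_configurations_to_sample, seed):
    # One generator, seeded once, so the whole sequence of draws is repeatable.
    rng = random.Random(seed)
    configurations = []
    for _ in range(num_of_configurations_to_sample):
        configuration = {}
        # Iterate keys in a fixed order so the draws line up across runs.
        for key in sorted(hyperparameters_grid):
            configuration[key] = rng.choice(hyperparameters_grid[key])
        configurations.append(configuration)
    return configurations

# Illustrative grid (assumption): not the grid used in the reviewed file.
grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [None, 5, 10],
}

first_run = sample_configurations(grid, 5, seed=123)
second_run = sample_configurations(grid, 5, seed=123)
print(first_run == second_run)
```

Using a local generator rather than the module-level random state also keeps the sampling unaffected by other code that draws random numbers in the same process.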
model_fitting/random_forest.py
Outdated
data.drop(['sample_name', 'label'], axis=1, inplace=True)
# a matrix of the actual feature values
X_train = data[train_rows_mask].values
X_test = data[test_rows_mask].values
Neither X_test nor Y_test is used anywhere in the (modified) code.
Tool all data rf
Rf parallel
Validation files
…s/IgomeProfiling into change_place_stop_machines
change the place of file AWS stop machines from tools to auxiliaries
fix the script by the new requests
change the path of the wsl tutorial
Order of sort motifs bc
Connect positive motifs to pipeline
Flag num sample build cluster
add new script for summary reads in one csv file
Join samples to groups
The motif samples were sorted by sort_by_num_samples, sort_by_unique_memebers, sort_by_cluster_size; now they are sorted by sort_by_num_samples, sort_by_cluster_size, sort_by_unique_memebers, with unique members going from low to high
changed the order of the samples
fixed bug of biological condition type value
Fix unite motifs
No description provided.