lipid_classifier

Lipid classification tool

#Usage#

##Installation

`bundle install` should do most of the work for you. However, it currently references unreleased code in Rubabel, which I use as a local git repository. Because the DB file is large, some additional preparatory work is needed before it can be used. I'll post instructions on Rubabel once I figure that step out. I expect to make the file available via Dropbox for local caching.

##Generating analysis WEKA files

From your root directory, run `./bin/write_arffs` to see the options.

Typical usage might include these options:

* `-m N`, where N is the number of threads
* `-d FOLDER`, where FOLDER is the directory in which to generate the files (it will be created for you)
* `-r` cleans out that folder first, e.g. when you are iterating in place
* `-f input.yml`, where `input.yml` is the input file in YAML format containing the classifications

So, if I have `all_lmids.yml` in the root and have 6 cores, I might run: `./bin/write_arffs -f all_lmids.yml -d all -m 6`

Or I might want to rerun that analysis on one core: `./bin/write_arffs -f all_lmids.yml -d all2`

Corrected LMIDs don't change the LMID in the output, but they do change the classification used by WEKA. The corrections are loaded from the hash contained in `corrections.yml`.
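As a rough illustration of that behaviour, here is a minimal sketch of a correction lookup. The shape of the hash (LMID mapped to a corrected classification string), the placeholder class labels, and the helper name `classification_for` are all assumptions made for this example; the real layout of `corrections.yml` may differ.

```ruby
require 'yaml'

# Assumed shape for corrections.yml: LMID => corrected classification.
# The class labels are placeholders, not real classifications.
CORRECTIONS_YAML = <<~YAML
  LMFA01010001: "Corrected class A"
YAML

# Hypothetical helper: returns the classification to hand to WEKA.
# The LMID itself is never rewritten; only the classification is swapped.
def classification_for(lmid, original, corrections)
  corrections.fetch(lmid, original)
end

corrections = YAML.safe_load(CORRECTIONS_YAML)

records = [
  { lmid: 'LMFA01010001', classification: 'Original class A' },
  { lmid: 'LMST02040023', classification: 'Original class B' }
]

# Apply corrections where one exists, leave everything else untouched.
corrected = records.map do |r|
  r.merge(classification: classification_for(r[:lmid], r[:classification], corrections))
end

corrected.each { |r| puts "#{r[:lmid]}\t#{r[:classification]}" }
```

Only the first record has an entry in the hash, so only its classification changes; both LMIDs come out exactly as they went in.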

##Analyzing files with WEKA

Again, from the root directory, `./bin/classify_lipids` will show you the options.

Typical usage examples:

* `-d FOLDER`, where FOLDER is the directory in which `write_arffs` placed its output files
* `-l list.txt`, where `list.txt` is a file containing one LMID per line, for batch analysis of LMIDs in the classifier
* `--lmids LMFA01010001,LMST02040023,LMSL05010014`, where you provide a comma-separated list of LMIDs for classification
* `-t` adds timing output to the analysis
* `--run_weka` is required to generate a new WEKA analysis. Otherwise, the analysis just pulls existing classification information from the directory. You must pass this option the first time you run `classify_lipids` on a new directory.
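To make the two LMID input formats concrete, the sketch below parses a one-per-line list (as for `-l`) and a comma-separated string (as for `--lmids`). The helper `collect_lmids`, the merging of both sources, and the de-duplication are assumptions for illustration only; they are not taken from the `classify_lipids` source.

```ruby
# Hypothetical helper: gather LMIDs from a one-per-line list and from a
# comma-separated string, dropping blank entries and duplicates.
def collect_lmids(list_text: '', csv: '')
  from_list = list_text.each_line.map(&:strip)
  from_csv  = csv.split(',').map(&:strip)
  (from_list + from_csv).reject(&:empty?).uniq
end

# Stand-in for the contents of list.txt (note the trailing blank line).
list_text = "LMFA01010001\nLMST02040023\n\n"

lmids = collect_lmids(list_text: list_text, csv: 'LMST02040023,LMSL05010014')
puts lmids
```

In practice the list text would come from something like `File.read('list.txt')`; an inline string is used here so the example runs without any files on disk.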

So, to analyze a list of LMIDs in `lmids.txt`, working off the generated analyses (`all`, `all2`) from the previous section, run: `./bin/classify_lipids -d all -l lmids.txt -t --run_weka`

Repeat that for the other analysis with: `./bin/classify_lipids -d all2 -l lmids.txt -t --run_weka`

Now, if you want to run a new list of LMIDs against a directory that has already been analyzed, you can skip the `--run_weka` option: `./bin/classify_lipids -d all -l lmids.txt -t`
