line_break

This package takes electronic health records (EHRs) from Epic that have lost (through no fault of their own) all of the newlines that indicate the structure of the record, and predicts the best places for those newlines to go.

It is designed to run with a single call to line_break.sh:

Usage: $ ./line_break.sh FOLDER_WHERE_RECORDS_LIVE NAME_OF_MODEL_YOU_WILL_CREATE TEMPLATE_YOU_CREATED

The script begins by taking that set of text-only files, extracting word features from them, and writing those features in the format required by CRF++ (more on CRF++ here: http://taku910.github.io/crfpp/). It saves the training features in a file it creates called train_features. The test feature files are saved separately, with a naming scheme based on the original (hopefully unique) names of those files.
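
As a rough illustration of the CRF++ table format, the sketch below turns a record that still has its line breaks into training rows. CRF++ expects one token per line, whitespace-separated columns with the label last, and a blank line between sequences. The function name, the two features (capitalization, trailing colon), and the labels NL and O are assumptions made for this example; the package's own scripts may use a different feature set.

```python
def text_to_crf_rows(text):
    """Turn text that still has its line breaks into CRF++-style rows.

    Each row is "token<TAB>feature<TAB>feature<TAB>label"; the label is
    always the last column.
    """
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        for i, token in enumerate(tokens):
            # Hypothetical labels: NL = a newline follows this token, O = none.
            label = "NL" if i == len(tokens) - 1 else "O"
            # Hypothetical features, chosen only for illustration.
            is_cap = "CAP" if token[:1].isupper() else "NOCAP"
            has_colon = "COLON" if token.endswith(":") else "NOCOLON"
            rows.append("\t".join([token, is_cap, has_colon, label]))
    return rows


def rows_to_sequence(rows):
    """Join rows into one CRF++ sequence, terminated by a blank line."""
    return "\n".join(rows) + "\n\n"


if __name__ == "__main__":
    sample = "Chief Complaint:\nChest pain"
    print(rows_to_sequence(text_to_crf_rows(sample)), end="")
```

Here each record is treated as a single sequence, so the model can see context on both sides of every candidate break.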

Next, it takes those training features and the template you passed in, calls CRF++'s training command (crf_learn), and creates a model. It shows its progress as it goes along.
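
For reference, crf_learn takes the template file, the training file, and the output model file, in that order. The sketch below wraps that call in Python. The helper names are hypothetical, and how line_break.sh actually invokes CRF++ is not shown here, so treat this as an approximation of the step rather than the script's own code.

```python
import subprocess


def build_train_command(template_path, features_path, model_path):
    """Build the crf_learn argument list: template, training data, model."""
    return ["crf_learn", template_path, features_path, model_path]


def train_model(template_path, features_path, model_path):
    """Run crf_learn; its progress output goes straight to the terminal."""
    cmd = build_train_command(template_path, features_path, model_path)
    # check=True raises CalledProcessError if crf_learn exits with an error.
    subprocess.run(cmd, check=True)
```

Calling train_model("word_template", "train_features", "word_model") would require crf_learn to be installed and on your PATH.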

After the model has been created, the script uses crf_test and the model to predict new labels. crf_test outputs the same table format that it requires as input, with an additional column for the predicted label.
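
The output of a call such as crf_test -m MODEL TEST_FILE can be read back with a few lines of Python. This sketch assumes the default output (no verbose probability columns) and assumes the test file carries a gold label as its last input column, so that the second-to-last output column is the gold label and the last is the prediction. The function name and the labels in the example are hypothetical.

```python
def parse_crf_test_output(output):
    """Split crf_test output into sequences of (token, gold, predicted)."""
    sequences, current = [], []
    for line in output.splitlines():
        if not line.strip():
            # A blank line ends the current sequence.
            if current:
                sequences.append(current)
                current = []
            continue
        columns = line.split()
        # First column: token. Last two columns: gold label, predicted label.
        current.append((columns[0], columns[-2], columns[-1]))
    if current:
        sequences.append(current)
    return sequences


if __name__ == "__main__":
    sample = "Chief\tCAP\tO\tO\nComplaint:\tCAP\tNL\tNL\n\nChest\tCAP\tO\tNL\n"
    for sequence in parse_crf_test_output(sample):
        print(sequence)
```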

Finally, it takes those predicted labels and uses them to recreate text files that will hopefully look like the EHR's original format, and it tabulates the results. It can optionally also create HTML versions of those files, with highlighting to show where the model succeeded and failed.
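
A minimal sketch of this last step is shown below, assuming each token carries a hypothetical label NL (a newline follows) or O (no newline). It rebuilds the text from predicted labels and counts how the predicted breaks compare with the gold ones. The function names and the choice of precision and recall as the summary are assumptions for illustration, not necessarily what the package's own tabulation reports.

```python
def rebuild_text(rows):
    """Rebuild text from (token, predicted_label) pairs."""
    lines, current = [], []
    for token, label in rows:
        current.append(token)
        if label == "NL":
            # The model predicts a line break after this token.
            lines.append(" ".join(current))
            current = []
    if current:
        lines.append(" ".join(current))
    return "\n".join(lines)


def tabulate_results(gold, predicted, positive="NL"):
    """Count correct, spurious, and missed line breaks."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}
```

A spurious break (fp) and a missed break (fn) are the two kinds of failure that HTML highlighting would need to distinguish.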
