Complex annotation #305
@cheng-yu-zhang For each pull request, please summarize in detail the work you have done so that it is easier for other people to review.
@hongzhonglu I have added more details in the comments.
Hi @cheng-yu-zhang, Thanks for this update! Nice work! The growth test results for the updated model are essentially the same as for the model in the devel branch. The accuracy of the gene essentiality test also remains the same (0.89). However, two genes: You mentioned you added 7 new genes, but according to the README file, the gene number has changed from 1150 to 1161. Please check this. It would be better to have a literature or database reference for every change so that we can trace each change back to its annotation. This could either be an extra column in "databasenewGPR.tsv" or be summarized as a table here (see below for example). It would improve the transparency of the model curation. @edkerk @hongzhonglu, what do you think? For example:
@feiranl There should indeed be an explanation of why these curations were performed. The PR text mentions that these were manually curated by looking at different databases, but which database is then suggesting which change? Do the databases agree? Is there a conflict? Also some genes are removed, how confident are we of this? I have rebased this PR onto the latest Instead of modifying existing files that were used for previous curations (
@edkerk Instead of making a new file "DBnewRxnsGenes.tsv", which details the new genes, could I add another file, maybe named "databasenewGPR_proof.tsv", to explain why these curations were performed?
# Conflicts: # model/dependencies.txt # model/yeast-GEM.txt # model/yeast-GEM.yml
(cherry picked from commit bbaefdb) # Conflicts: # code/modelCuration/addTransNewGPR.m # data/modelCuration/TransRxnNewGPR.tsv
I've also run [accuracy,tp,tn,fn,fp] = essentialGenes(model); and compared the results with yeast-GEM 8.6.0:
Metric | yeast-GEM 8.6.0 | this PR |
---|---|---|
Number of genes | 1151 | 1162 |
Accuracy | 0.8801 | 0.8802 |
TP | 923 | 930 |
TN | 61 | 62 |
FP | 98 | 97 |
FN | 36 | 38 |
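The accuracy row can be reproduced from the four counts in the table. The snippet below is a minimal sketch in Python rather than the MATLAB used by essentialGenes; the helper name `accuracy` is made up for illustration, and it assumes accuracy is defined in the usual way as (TP + TN) divided by the total number of scored genes.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of scored genes whose prediction matches the experiment."""
    return (tp + tn) / (tp + tn + fp + fn)


# Counts copied from the table above.
baseline = accuracy(tp=923, tn=61, fp=98, fn=36)   # yeast-GEM 8.6.0
this_pr = accuracy(tp=930, tn=62, fp=97, fn=38)    # this PR

print(f"{baseline:.4f} {this_pr:.4f}")  # prints "0.8801 0.8802"
```

Note that the four counts sum to 1118 and 1127, which is fewer than the number of genes in each model, so not every gene in the model is scored.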
Two new false negatives (model predicts their essentiality, but experimental data indicates that they are not essential): YKR072C and YOR054C. Both are in reaction r_0906: H+[c] + N-[(R)-4-phosphonopantothenoyl]-L-cysteine[c] => carbon dioxide[c] + pantetheine 4'-phosphate[c] (part of coenzyme A biosynthesis).
The old grRule: (YKL088W and YKR072C and YOR054C) or (YKL088W and YKR072C) or (YKL088W and YOR054C) or YKL088W
The new grRule: YKL088W and YKR072C and YOR054C
SGD (YKL088W, YKR072C, YOR054C), Complex Portal, and UniProt (YKL088W, YKR072C, YOR054C) all tend to agree that these three proteins form a complex, though, so this change to the model should be approved, even though the FN count goes up slightly.
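To see why the new grRule produces the two extra false negatives, the two rules can be evaluated under single-gene knockouts. This is a minimal sketch in Python, not the code that essentialGenes uses: the function name `reaction_active` is hypothetical, and it assumes a grRule contains only gene IDs, `and`, `or`, and parentheses.

```python
import re

OLD_RULE = ("(YKL088W and YKR072C and YOR054C) or (YKL088W and YKR072C) "
            "or (YKL088W and YOR054C) or YKL088W")
NEW_RULE = "YKL088W and YKR072C and YOR054C"


def reaction_active(gr_rule, knocked_out):
    """Return True if the rule is still satisfied after deleting `knocked_out`."""
    def substitute(match):
        token = match.group(0)
        if token in ("and", "or"):
            return token
        # A gene is "present" (True) unless it has been knocked out.
        return str(token not in knocked_out)

    expression = re.sub(r"[A-Za-z0-9_-]+", substitute, gr_rule)
    # The expression now only contains True/False, and/or, and parentheses.
    return eval(expression, {"__builtins__": {}})


for gene in ("YKL088W", "YKR072C", "YOR054C"):
    print(gene, reaction_active(OLD_RULE, {gene}), reaction_active(NEW_RULE, {gene}))
# YKL088W False False
# YKR072C True False
# YOR054C True False
```

Because the old rule ends in `or YKL088W`, it reduces to YKL088W alone, so deleting YKR072C or YOR054C left r_0906 active. Under the new rule, either deletion disables r_0906, so the model now predicts both genes to be essential.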
Note: essentialGenes currently needs RAVEN from SysBioChalmers/RAVEN#421
Could you also run the growth tests? They normally run successfully, but this is just to make sure that we have a functional model. @hongzhonglu @edkerk @cheng-yu-zhang I think it may be time to add some more tests after each update to ensure quality. Now we have
Great! I can see you have fixed some long-standing issues with complex annotations, especially the wrong gene annotation for r_0943. Nice work! Please resolve the questions and then the pull request is ready to go.
That is a very nice suggestion. More tests will help ensure that the model's prediction quality improves consistently. @cheng-yu-zhang @feiranl
# Conflicts: # README.md # code/modelCuration/addDBNewGeneAnnotation.m # model/yeast-GEM.xml # model/yeast-GEM.yml
Main improvements in this PR:
Manually checked all 209 complex annotations in yeast8.5 against UniProt, SGD and Complex Portal. I applied "addDBNewGeneAnnotation.m" to correct 45 complex annotations that were wrong or incomplete.
The explanation is in the file "explanation.docx".
Yeast_complex_portal_2022.tsv is the latest complex information downloaded from Complex Portal. This file and the Complex Portal website are the most important references; UniProt and SGD are supplementary.
I hereby confirm that I have:
Selected develop as a target branch (top left drop-down menu)