fix: incorrect metabolite annotation #108
Comments
Due to a bug in yeast-GEM (SysBioChalmers/yeast-GEM#108), incorrect metabolite annotations were introduced. Yeast-GEM derived annotations are removed, while KEGG-derived annotations are kept. Will be updated once bug in yeast-GEM is fixed.
@edkerk Hi, which version of the model did you use for the check? I just had a quick look, and these are the annotations for s_2807 and s_0045, which are right in the latest version:
|
@hongzhonglu As specified in the bottom of the first post, this was in the Excerpt from the
<species metaid="s_2807__91__erm__93__" id="s_2807__91__erm__93__" name="(S)-3-hydroxyhexacosanoyl-CoA [endoplasmic reticulum membrane]" compartment="erm" hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false" fbc:charge="0" fbc:chemicalFormula="C47H86N7O18P3S">
<annotation xmlns:sbml="http://www.sbml.org/sbml/level3/version1/core">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#" xmlns:vCard4="http://www.w3.org/2006/vcard/ns#" xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
<rdf:Description rdf:about="#s_2807__91__erm__93__">
<bqbiol:is>
<rdf:Bag>
<rdf:li rdf:resource="http://identifiers.org/chebi/CHEBI:15354"/>
<rdf:li rdf:resource="http://identifiers.org/kegg.compound/C00114"/>
</rdf:Bag>
</bqbiol:is>
</rdf:Description>
</rdf:RDF>
</annotation>
</species> Even in the |
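For readers who want to inspect such an excerpt programmatically, here is a minimal Python sketch that pulls the identifiers.org annotations out of one species element. It is not part of RAVEN or yeast-GEM; the function name `annotation_ids` is hypothetical, and the XML is a trimmed copy of the excerpt above with only the namespaces needed for parsing.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed copy of the excerpt above (attributes not needed for parsing are dropped).
SPECIES_XML = """
<species id="s_2807__91__erm__93__" name="(S)-3-hydroxyhexacosanoyl-CoA [endoplasmic reticulum membrane]">
  <annotation>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
      <rdf:Description rdf:about="#s_2807__91__erm__93__">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="http://identifiers.org/chebi/CHEBI:15354"/>
            <rdf:li rdf:resource="http://identifiers.org/kegg.compound/C00114"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
"""


def annotation_ids(species_xml):
    """Return {database: identifier} parsed from the identifiers.org URIs of one species."""
    root = ET.fromstring(species_xml)
    ids = {}
    for li in root.iter(f"{{{RDF_NS}}}li"):
        uri = li.get(f"{{{RDF_NS}}}resource")
        # URI layout: http://identifiers.org/<database>/<identifier>
        database, identifier = uri.rstrip("/").split("/")[-2:]
        ids[database] = identifier
    return ids


print(annotation_ids(SPECIES_XML))
```

For this excerpt the result pairs s_2807 with the choline identifiers (CHEBI:15354, KEGG C00114), which is the mismatch reported in the issue.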
@edkerk I will check where we made the mistake. In the Excel file we used for the correction, the metabolite information is right. |
@edkerk @BenjaSanchez The error is in the metabolite data file used to update the metabolite annotation, which missed some changes. I will recheck and update the metabolite data file on GitHub, and then you can check it again. |
@edkerk @BenjaSanchez After checking, I found that during the update we should refresh all the information based on the whole list of metabolites in our model. In our previous update, we only uploaded the metabolite annotations that had been corrected. If a metabolite in the previous model had both right and wrong annotations, its information was not included in the uploaded data. As a result, the wrong annotation was kept and not updated. @BenjaSanchez so we should change the way we update the metabolites. |
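The difference between the two update strategies can be illustrated with a small Python sketch. This is not the actual yeast-GEM update script: the function names and dictionary layout are assumptions made for illustration. The identifiers come from the issue description, where the same compound in the peroxisome (s_0045) carries CHEBI:52976.

```python
def update_from_partial(model, corrected):
    """Overwrite only the metabolites that appear in the table of corrected entries."""
    updated = dict(model)
    updated.update(corrected)
    return updated


def update_from_full(model, full_list):
    """Rebuild the annotation of every metabolite in the model from the complete curated list."""
    return {met: full_list.get(met, {}) for met in model}


model = {
    "s_0511": {"chebi": "CHEBI:15354", "kegg": "C00114"},  # choline, correct
    "s_2807": {"chebi": "CHEBI:15354", "kegg": "C00114"},  # wrong for this metabolite
}

# Hypothetical partial table in which s_2807 was left out.
corrected_only = {}

# Hypothetical full table covering every metabolite in the model.
full_list = {
    "s_0511": {"chebi": "CHEBI:15354", "kegg": "C00114"},
    "s_2807": {"chebi": "CHEBI:52976"},
}

print(update_from_partial(model, corrected_only)["s_2807"])  # stale annotation survives
print(update_from_full(model, full_list)["s_2807"])          # stale annotation is replaced
```

With the partial table, a metabolite that is missing from the table keeps whatever it had before, including wrong identifiers. With the full table, every entry is rewritten, so removed annotations actually disappear.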
@hongzhonglu this is very easily solvable if you just update the table |
@edkerk I just updated the metabolite annotation: https://github.com/SysBioChalmers/yeast-GEM/tree/curation/metabolites |
@hongzhonglu are you running the latest (2.0.0) version of RAVEN? If this does not fix the error message, can you please give the full error message? The yml file is very useful for tracking changes. A quick manual look at the XML file in your latest commit on that branch, 59a9e2c, suggests that it indeed fixes some of the problems, but it also seems to revert some other recent changes, such as the updated metabolite formulae. It seems like @BenjaSanchez might be on to this as well: #113? |
@edkerk no, that was just a specific conflict that arose on dependencies between @hongzhonglu as @edkerk points out, some changes are being reverted, also the |
@BenjaSanchez |
@hongzhonglu I don't see the difference: shouldn't the data used for the update record all information that changed as well? How can something be updated if it's not changing anything? |
@BenjaSanchez "In our previous update, we only uploaded the metabolite annotations that had been corrected. If a metabolite in the previous model had both right and wrong annotations, its information was not included in the uploaded data. As a result, the wrong annotation was kept and not updated. @BenjaSanchez so we should change the way we update the metabolites." |
@hongzhonglu I understand that, my point is that a metabolite annotation that was wrong before and now has been removed is a change that can be easily included in |
@BenjaSanchez So it is better to just delete metabolite_manual_curation.tsv and keep metabolite_manual_curation_full_list.tsv, which is a fast and safe way to do the update. The change information can also be found in the latter file. |
@edkerk As you suggested, I ran RAVEN 2, but the error still occurred when I ran saveYeastModel. The following is the error information:
Error in saveYeastModel (line 17)
I think it is better to make this process simple so that everyone can save the model normally. |
@hongzhonglu can you let me know what is in the |
@edkerk This is my version.txt: 2.0.0-rc.2 |
@hongzhonglu This is not version 2.0.0, but a release candidate. Please install the latest release from here, or update your |
@hongzhonglu I see that the new file has 2 extra columns with formulas, are you updating any formulas? This is probably why some formulas are being reverted to the non-SBML compliant format. Also, what is |
@BenjaSanchez Yes, I also updated the formulas this time. I just remembered that Feiran has done it before. The general_chebiID means that we couldn't find a specific ID for the metabolite. |
@edkerk Thanks for your help. Now it works. |
@hongzhonglu sounds then like that information should just be in
|
@BenjaSanchez I agree to update it again. I suggest that we keep the two columns, even though we don't use them this time, so that we can record all the changes we have made. |
@hongzhonglu the problem with those columns is that many of the formulas are not SBML-compliant, so they had to be removed from the model. If you indicate this for the corresponding rows in |
@BenjaSanchez Which formulas have you changed to make them SBML-compliant? Can you send me the list? |
@hongzhonglu any that has the characters |
@BenjaSanchez I see. Then I will add a remark for them. |
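A rough way to flag formulas that need such a remark is sketched below in Python. The thread does not list the offending characters, so the rule used here is an assumption: a formula passes only if it is a plain sequence of element-like symbols (one capital letter, optionally one lowercase letter), each optionally followed by an integer count. The polymer entry is a made-up example, and `is_plain_formula` is a hypothetical helper, not a RAVEN function.

```python
import re

# Assumed rule: element-like symbols with optional integer counts, nothing else.
# This does not check that a symbol is a real element, only the overall shape.
PLAIN_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")


def is_plain_formula(formula):
    """Return True if the formula contains only symbol/count pairs."""
    return bool(formula) and PLAIN_FORMULA.fullmatch(formula) is not None


formulas = {
    "s_2807": "C47H86N7O18P3S",            # from the SBML excerpt in this thread
    "hypothetical_polymer": "(C6H10O5)n",  # made-up example with brackets and a repeat unit
}

for met_id, formula in formulas.items():
    status = "ok" if is_plain_formula(formula) else "needs remark"
    print(f"{met_id}\t{formula}\t{status}")
```

Rows that fail the check could then be marked in the curation table rather than written to the model.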
update: after including all missing changes from the manual curation (PR #119), a total of 68 warnings for repeated CHEBI ids and 65 warnings for repeated KEGG ids were solved. However, 19 warnings for repeated ids are still present when
|
@BenjaSanchez
For each individual form of triglyceride it is difficult to find the matching chebiID, so we gave the same chebiID to all of these metabolites. The other repeated chebiIDs are similar cases. |
@hongzhonglu ok for me as long as all cases are like that. Remember to review PR #119 so we can merge changes to |
I'm not so sure, it might be better to then annotate all TAGs with CHEBI:17855, the generic CHEBI for all triacylglycerols. CHEBI:88980 specifies the exact chemical structure in InChI, so this would be incorrect for most of the metabolites it is annotated to. Using CHEBI:17855 doesn't get rid of the warnings of Instead, include annotations to SwissLipids for precise identification. Note that SwissLipids is also included in MetaNetX. When annotating this, make sure that the location of the desaturation is correct (now I don't remember whether it should be 9Z or 11Z in yeast?). |
@edkerk SwissLipids are a good idea, as MetaNetX ids for metabolites can be stored in the COBRA field |
@edkerk All the repeated chebiIDs and keggIDs have now been removed. Only correct IDs are kept in our model. |
Description of the issue:
There are multiple metabolites with incorrect annotations (CHEBI, KEGG).
Expected feature/value/output:
s_0511, s_0512 and s_0513 are all choline in different compartments, correctly annotated with CHEBI:15354 and KEGG C00114.
In addition, s_2807 is also annotated with those two CHEBI and KEGG IDs, even though it's (S)-3-hydroxyhexacosanoyl-CoA. Meanwhile, s_0045, which is the same compound but now located in the peroxisome instead of ER membrane, is correctly annotated with CHEBI:52976.
Current feature/value/output:
The RAVEN function checkModelStruct indicates that the following annotations are repeated for metabolites with different names:
Reproducing these results:
(note that the current checkModelStruct version has a bug that limits the output to the first 10 mistakes)
I hereby confirm that I have:
master branch of the repository
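The kind of check described in this issue (the same annotation attached to metabolites with different names) can be sketched in Python as follows. This is not the checkModelStruct implementation from RAVEN. The function names and data layout are assumptions, and the compartment labels on the choline entries are hypothetical, since the issue only says they sit in different compartments.

```python
from collections import defaultdict


def strip_compartment(name):
    """Drop a trailing ' [compartment]' suffix from a metabolite name."""
    return name.rsplit(" [", 1)[0] if name.endswith("]") else name


def repeated_annotations(metabolites):
    """Map each annotation ID to the distinct names sharing it (only IDs with more than one name)."""
    names_by_id = defaultdict(set)
    for met in metabolites:
        for ann in met["annotations"]:
            names_by_id[ann].add(strip_compartment(met["name"]))
    return {ann: names for ann, names in names_by_id.items() if len(names) > 1}


metabolites = [
    # Compartment labels for choline are hypothetical placeholders.
    {"id": "s_0511", "name": "choline [compartment A]",
     "annotations": ["CHEBI:15354", "KEGG:C00114"]},
    {"id": "s_0512", "name": "choline [compartment B]",
     "annotations": ["CHEBI:15354", "KEGG:C00114"]},
    {"id": "s_2807", "name": "(S)-3-hydroxyhexacosanoyl-CoA [endoplasmic reticulum membrane]",
     "annotations": ["CHEBI:15354", "KEGG:C00114"]},
    {"id": "s_0045", "name": "(S)-3-hydroxyhexacosanoyl-CoA [peroxisome]",
     "annotations": ["CHEBI:52976"]},
]

for ann, names in sorted(repeated_annotations(metabolites).items()):
    print(ann, sorted(names))
```

Stripping the compartment suffix before comparing names means the same compound in several compartments does not trigger a warning; only genuinely different compounds that share an ID are reported.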