fix: chemical formulas field #19

Closed
BenjaSanchez opened this issue Oct 27, 2017 · 8 comments · Fixed by #115
Labels: fixed in devel, format

Comments

@BenjaSanchez
Contributor

Running http://sbml.org/validator/ on the ecYeast7 SBML file reports 50 errors of the sort:

Error (SBML Validation Rule #fbc-20303): The value of attribute 'fbc:chemicalFormula' on the SBML object must be set to a string consisting only of atomic names or user defined compounds and their occurrence. Reference: L3V1 Fbc V2 Section 3.4 Encountered '(' when expecting a capital letter.

so these formulas should be fixed
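A rough local pre-check for this kind of error could look like the sketch below. It is only an approximation of the rule quoted above (capitalised symbols, each optionally followed by a count) and is not libSBML's or the online validator's implementation; the function name `looks_like_fbc_formula` and the sample list are illustrative assumptions.

```python
import re

# Simplified approximation of the fbc:chemicalFormula rule: one or more
# capitalised symbols, each optionally followed by a count.
# Assumption: this is NOT the validator's own logic, only a local pre-check.
_FORMULA_RE = re.compile(r"(?:[A-Z][a-z]*\d*)+")


def looks_like_fbc_formula(formula: str) -> bool:
    """Return True if the string contains only symbols and counts."""
    return _FORMULA_RE.fullmatch(formula) is not None


# Illustrative sample; in practice the strings would come from the model file.
samples = ["C15H22N2O17P2", "(C6H10O5)n", "C20H36O(C5H8)n", "R"]
flagged = [f for f in samples if not looks_like_fbc_formula(f)]
print(flagged)
```

Anything containing brackets, spaces or a trailing lowercase `n` is flagged, which matches the "Encountered '(' when expecting a capital letter" message.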

@BenjaSanchez BenjaSanchez added the "format" label (fix things associated to format of any of the model/data/script files) Oct 27, 2017
@BenjaSanchez BenjaSanchez self-assigned this Oct 27, 2017
@simas232

Well, this doesn't seem to be a severe problem, since libSBML considers it a warning, not a fatal error... We are still able to read and write the model without any information loss anyway.

Brackets are not compatible with the fbc:chemicalFormula field, which is why there are these 50 warnings. I think this is a problem in libSBML itself: if brackets are incompatible with the fbc:chemicalFormula field, then OutputSBML should detect and fix this when the model is exported, but it does not.

@BenjaSanchez
Contributor Author

BenjaSanchez commented Oct 27, 2017

@simas232 yes, I agree: as you say, it is not severe at all, and I can indeed still load the model into Matlab without any losses. It should still be fixed eventually, as according to collaborators it causes problems when loading the model in Python. I tried changing the formulas to use (), [] and {}, but none of them work. I will therefore keep them as they were originally, with (), close this issue and hope that libSBML fixes this problem soon. Should we contact them somehow?

@edkerk
Member

edkerk commented Oct 27, 2017

This is only an issue with metabolites that are (a) polymers with unspecified chain-length (glucan, chitin, etc.), or (b) tRNAs for which the exact chemical formula is unknown (depending on exact sequence of tRNA). In the chemical formula, the repeating subunit is placed in between brackets, followed by n. In such cases, the software won't be able to check whether the reaction is chemically balanced, as n is not specified.

So I doubt that libSBML will change this. I would argue that the brackets and n can be removed from glycan etc., because the metabolites in the model really represent only one subunit. For tRNA, either specify the chemical formula based on the organism/strain-specific sequence (I suppose this is done in ME models?), or represent the tRNA part as R.

@BenjaSanchez
Contributor Author

@edkerk thanks for your input! @hongzhonglu is working on curating metabolite information, and I'm sure this will be useful for him ;)

@BenjaSanchez
Contributor Author

update: in curation/metabolites all tRNA formulas were replaced with an R, as suggested by @edkerk, since their exact composition is not clear. To maintain the mass balance of reactions, all tRNA(aa) formulas were replaced with the corresponding aa formula + R. This brought the number of errors down to 18 using http://sbml.org/validator/

@BenjaSanchez BenjaSanchez changed the title from "fix chemical formulas field" to "fix: chemical formulas field" Mar 21, 2018
@willigott

I agree with @edkerk: the parentheses (and the n) can probably be removed in most cases, which would then also make it possible to check elemental balances. Currently there are 18 species in the model that show this issue:

{'s_0001__91__ce__93__': '(C6H10O5)n',
 's_0002__91__c__93__': '(C6H10O5)n',
 's_0003__91__e__93__': '(C6H10O5)n',
 's_0004__91__ce__93__': 'C12H22O11(C6H10O5)n',
 's_0332__91__er__93__': '(GlcN)1 (Ino(acyl)-P)1 (Man)3 (EtN)2 (P)2',
 's_0414__91__g__93__': 'C38H68N2O27P2(C5H8)n',
 's_0443__91__c__93__': 'C32H58N2O22P2(C5H8)n',
 's_0444__91__g__93__': 'C32H58N2O22P2(C5H8)n',
 's_0509__91__c__93__': 'H2O(C8H13NO5)n',
 's_0510__91__ce__93__': 'H2O(C6H12NO4)n',
 's_0642__91__c__93__': 'C20H36O(C5H8)n',
 's_0643__91__lp__93__': 'C20H36O(C5H8)n',
 's_0644__91__er__93__': 'C26H47O9P(C5H8)n',
 's_0645__91__c__93__': 'C20H37O4P(C5H8)n',
 's_0646__91__er__93__': 'C20H37O4P(C5H8)n',
 's_1098__91__m__93__': 'C12H18N2O4S2R2(C2H2NOR)n',
 's_1184__91__c__93__': 'C36H64N2O17P2(C5H8)n',
 's_1309__91__e__93__': 'H2O(C6H8O6)n'}
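As a sketch of what removing the parentheses and n could mean for formulas of the form `X(Y)n`, the snippet below takes the repeating unit once (n = 1) and merges it into a flat formula. The helper names `count_atoms` and `collapse_polymer` are hypothetical, the output order (C, H, then the rest alphabetically) is an assumption, and this is not necessarily how the formulas were eventually curated.

```python
import re
from collections import Counter

_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")
_POLYMER = re.compile(r"([A-Za-z0-9]*)\(([A-Za-z0-9]+)\)n")


def count_atoms(flat: str) -> Counter:
    """Count atoms in a flat formula such as 'C6H10O5' (no brackets)."""
    counts = Counter()
    for symbol, num in _TOKEN.findall(flat):
        counts[symbol] += int(num) if num else 1
    return counts


def collapse_polymer(formula: str) -> str:
    """Rewrite 'X(Y)n' as the flat formula of X plus one Y (n = 1)."""
    match = _POLYMER.fullmatch(formula)
    if match is None:
        return formula  # anything else is left untouched
    total = count_atoms(match.group(1)) + count_atoms(match.group(2))
    # Assumed ordering: C and H first, remaining symbols alphabetically.
    rest = sorted(el for el in total if el not in ("C", "H"))
    order = [el for el in ("C", "H") if el in total] + rest
    return "".join(f"{el}{total[el] if total[el] > 1 else ''}" for el in order)


for formula in ["(C6H10O5)n", "C12H22O11(C6H10O5)n", "H2O(C8H13NO5)n"]:
    print(formula, "->", collapse_polymer(formula))
```

Strings that do not fit the `X(Y)n` pattern, such as the KEGG-style entry for s_0332, are returned unchanged and would still need manual curation.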

If we look at "s_0001__91__ce__93__", it appears e.g. in reaction 'r_0005' with the substrates

['s_1543__91__c__93__']

and products

['s_0794__91__c__93__', 's_1538__91__c__93__', 's_0001__91__ce__93__']

If we now check chemical formulas of those species we find:

'C15H22N2O17P2'

and

'H', 'C9H11N2O12P2', '(C6H10O5)n'

One can easily see that this reaction is only balanced for n=1, so one can also just remove n and the parentheses from the chemical formula. I guess the same holds for most other compounds listed above.
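The balance argument for r_0005 can be checked with a small sketch like the following. It assumes a stoichiometric coefficient of 1 for every species (the lists above give no coefficients), ignores charge, and uses hypothetical helper names `count_atoms` and `imbalance`.

```python
import re
from collections import Counter

_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def count_atoms(flat: str) -> Counter:
    """Count atoms in a flat formula such as 'C6H10O5' (no brackets)."""
    counts = Counter()
    for symbol, num in _TOKEN.findall(flat):
        counts[symbol] += int(num) if num else 1
    return counts


def imbalance(substrates, products) -> dict:
    """Return element -> (products - substrates); empty when balanced."""
    net = Counter()
    for formula in products:
        net.update(count_atoms(formula))
    for formula in substrates:
        net.subtract(count_atoms(formula))
    return {el: n for el, n in net.items() if n != 0}


# r_0005, assuming coefficient 1 for each species and ignoring charge.
substrates = ["C15H22N2O17P2"]
n1 = imbalance(substrates, ["H", "C9H11N2O12P2", "C6H10O5"])    # unit once
n2 = imbalance(substrates, ["H", "C9H11N2O12P2", "C12H20O10"])  # unit twice
print(n1)  # {} : balanced for n = 1
print(n2)  # surplus of one extra C6H10O5 unit on the product side
```

With n = 2 the products carry one extra C6H10O5 unit, so only n = 1 leaves every element balanced.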

The exception is probably '(GlcN)1 (Ino(acyl)-P)1 (Man)3 (EtN)2 (P)2', which I guess was just pulled from KEGG.

@BenjaSanchez BenjaSanchez mentioned this issue May 17, 2018
@BenjaSanchez
Contributor Author

Thanks @willigott for the input. I've created PR #111, which solves all 18 problems. Looking forward to anyone's additional comments on it :)

@BenjaSanchez
Contributor Author

PR #111 has been merged: the model in devel is now valid SBML. This issue will be closed on the next release. Thank you all!

@BenjaSanchez BenjaSanchez added the "fixed in devel" label (this issue is already fixed in devel and will be closed after the next release) May 18, 2018