fix: lipid biomass composition #21

Closed
edkerk opened this issue Oct 27, 2017 · 10 comments · Fixed by #115
Assignees
Labels
enhancement new field/feature fixed in devel this issue is already fixed in devel and will be closed after the next release

Comments

@edkerk
Member

edkerk commented Oct 27, 2017

The expansive description of lipid metabolism is confusing, so I might be wrong in the following. To my understanding, the model doesn't specify a distribution of FA chain lengths, but demands all FA chains in equal amounts (for the majority of lipid metabolism; sterol esters seem to be specific):

There are all these individual reactions that build up, for instance, TAG(16:0, 18:1, 18:1).

oleoyl-CoA[erm] + diglyceride (1-16:0, 2-18:1)[erm] => coenzyme A[erm] + triglyceride (1-16:0, 2-18:1, 3-18:1)[erm]

They use specific acyl-CoAs (so no pooled pseudometabolite), and nowhere in those reactions is there any specification of how abundant each fatty acid is. With that in mind, it would be cheapest to make TAG(16:0, 16:1, 16:0), and this is what I actually see when I run FBA and minimize the number of fluxes.

The model has so-called 'ISA' reactions that 'convert' FA-chain-specific TAG species into a generic TAG species:

triglyceride (1-16:0, 2-18:1, 3-18:1)[erm] => 0.67901 triglyceride[erm]

I don't understand what these coefficients mean, but they seem to be connected to the chain length (e.g. triglyceride (1-18:0, 2-18:1, 3-16:0) gets the same coefficient, even though it has a different number of saturations).

These generic lipid species are then used in the lipid pseudoreaction:

[...] + 0.000206 fatty acid[c] + [...] + 0.000781 triglyceride[c] + 1.5e-05 zymosterol[c] => lipid[c]

So nowhere along that path is there any specification of the distribution of FA chain lengths and saturation: all TAG species are equally likely to be made, with some correction for the number of carbons (but not hydrogens, as the two species mentioned above get the same coefficient even though they have different molecular weights).
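To make the point about hydrogens concrete, here is a minimal sketch that computes the formulas and molecular weights of the two TAG species mentioned above. It assumes the usual condensation chemistry (glycerol + 3 free fatty acids - 3 water) and rounded standard atomic weights; the function names are only illustrative and are not taken from the model or any toolbox.

```python
# Rounded standard atomic weights (g/mol).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}

# Free fatty acid formulas as (C, H, O) atom counts.
FATTY_ACID = {
    "16:0": (16, 32, 2),  # palmitic acid
    "16:1": (16, 30, 2),  # palmitoleic acid
    "18:0": (18, 36, 2),  # stearic acid
    "18:1": (18, 34, 2),  # oleic acid
}

GLYCEROL = (3, 8, 3)
WATER = (0, 2, 1)


def tag_formula(chains):
    """Return (C, H, O) of a triglyceride: glycerol + 3 fatty acids - 3 water."""
    if len(chains) != 3:
        raise ValueError("a triglyceride has exactly three acyl chains")
    c, h, o = GLYCEROL
    for chain in chains:
        fc, fh, fo = FATTY_ACID[chain]
        c, h, o = c + fc, h + fh, o + fo
    # One water is released per ester bond.
    return (c - 3 * WATER[0], h - 3 * WATER[1], o - 3 * WATER[2])


def molecular_weight(formula):
    """Molecular weight in g/mol for a (C, H, O) tuple."""
    c, h, o = formula
    return c * ATOMIC_WEIGHT["C"] + h * ATOMIC_WEIGHT["H"] + o * ATOMIC_WEIGHT["O"]


tag_a = tag_formula(["16:0", "18:1", "18:1"])  # C55H102O6
tag_b = tag_formula(["18:0", "18:1", "16:0"])  # C55H104O6
mw_a = molecular_weight(tag_a)
mw_b = molecular_weight(tag_b)
print(tag_a, round(mw_a, 3))
print(tag_b, round(mw_b, 3))
```

Both species have 55 carbons, so a carbon-based coefficient cannot tell them apart, while their molecular weights differ by two hydrogens.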

The ISA reactions for fatty acids have no influence in this for two reasons:

palmitate[c] => 0.61538 fatty acid[c]

1: the coefficients are again just representing the number of carbons, not any measured abundance
2: fatty acid[c] is only used in the lipid pseudoreaction to represent free fatty acids, it is not used to be incorporated in any other lipid species.

@edkerk
Member Author

edkerk commented Oct 27, 2017

Possible solution:

  • choose a reference condition with measured fatty acid chain length and saturation to be included in the model (can of course be adjusted later, but the repository should be in some reference state).
  • as we typically only know the FA profile after hydrolysis, we don't know how the fatty acid chains are distributed over, for instance, the 3 positions on TAG. So assume that each position is just as likely (or is the middle one always saturated?).
  • with some calculations, adjust the coefficients of all ISA reactions to now represent the measured distribution.

Would be most versatile when implemented in either MATLAB or Excel.
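A rough sketch of the calculation proposed above, written in Python rather than MATLAB or Excel. It assumes that every TAG position is filled independently with the same fatty acid probabilities, which is exactly the open question raised in the second bullet. The fatty acid fractions below are HYPOTHETICAL placeholders, not measured data, and the function names are only illustrative.

```python
from itertools import product

# HYPOTHETICAL fatty acid mole fractions (placeholder values, not measurements).
fa_fraction = {"16:0": 0.2, "16:1": 0.4, "18:0": 0.1, "18:1": 0.3}


def tag_distribution(fa_fraction):
    """Mole fraction of each positional TAG species (sn-1, sn-2, sn-3).

    Assumes each position is filled independently with the same probabilities.
    """
    total = sum(fa_fraction.values())
    dist = {}
    for chains in product(fa_fraction, repeat=3):
        p = 1.0
        for chain in chains:
            p *= fa_fraction[chain] / total
        dist[chains] = p
    return dist


def restrict_to_model(dist, model_species):
    """Keep only species that exist in the model and renormalize to 1."""
    kept = {s: p for s, p in dist.items() if s in model_species}
    norm = sum(kept.values())
    return {s: p / norm for s, p in kept.items()}


dist = tag_distribution(fa_fraction)
print(len(dist), round(sum(dist.values()), 6))
```

Four fatty acids give 64 positional combinations, more than the number of TAG species in the model, so something like `restrict_to_model` would be needed before the fractions are written into ISA coefficients.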

@BenjaSanchez BenjaSanchez added curation question ideas/feedback from other people would be appreciated labels Oct 27, 2017
@BenjaSanchez BenjaSanchez self-assigned this Oct 27, 2017
@edkerk edkerk closed this as completed Oct 31, 2017
@edkerk edkerk reopened this Oct 31, 2017
@edkerk
Member Author

edkerk commented Oct 31, 2017

Pushed the wrong button..
Just wanted to add/clarify that we should probably end up with one ISA reaction per lipid species, so:

0.25 TAG(16:0,16:1,16:0) + 0.10 TAG(18:0,16:1,16:0) + [...] = triglyceride

where the coefficients represent the measured composition of fatty acid chains, instead of having individual ISA reactions:

TAG(16:0,16:1,16:0) = triglyceride
TAG(18:0,16:1,16:0) = triglyceride

As is the case now.
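The difference between the two formulations can be sketched as follows: individual ISA reactions act as a logical 'OR' (any mix of species can fill the triglyceride pool), whereas one lumped reaction fixes the ratios. This is only an illustration with coefficients of 1 for the individual reactions, as in the simplified example above. The `TAG(other)` entry is a hypothetical placeholder for the elided `[...]` terms, and all function names are made up for this sketch.

```python
def lumped_requirement(coefficients, triglyceride_demand):
    """Amount of each TAG species consumed by a single lumped ISA reaction."""
    return {s: c * triglyceride_demand for s, c in coefficients.items()}


def satisfies_individual_isa(production, triglyceride_demand, tol=1e-9):
    """Individual ISA reactions: any mix works as long as the total matches."""
    return abs(sum(production.values()) - triglyceride_demand) <= tol


def satisfies_lumped_isa(production, coefficients, triglyceride_demand, tol=1e-9):
    """Lumped ISA reaction: every species must be made in its fixed proportion."""
    required = lumped_requirement(coefficients, triglyceride_demand)
    species = set(required) | set(production)
    return all(
        abs(production.get(s, 0.0) - required.get(s, 0.0)) <= tol for s in species
    )


coefficients = {
    "TAG(16:0,16:1,16:0)": 0.25,
    "TAG(18:0,16:1,16:0)": 0.10,
    "TAG(other)": 0.65,  # hypothetical placeholder for the elided species
}

only_cheapest = {"TAG(16:0,16:1,16:0)": 1.0}
print(satisfies_individual_isa(only_cheapest, 1.0))            # True
print(satisfies_lumped_isa(only_cheapest, coefficients, 1.0))  # False
```

Producing only the cheapest species satisfies the current formulation but not the lumped one, which is the behaviour seen in the FBA result described earlier.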

@hongzhonglu
Collaborator

There are about 176 ISA reactions in total in the present yeast model. These reactions come from Yeast 5. In general, they are difficult to understand, as there is no evidence for them in the databases (although I have not searched every reaction database).
@edkerk we are now collecting the latest annotation information for each metabolite in the yeast model. Based on this, we can correct the coefficients as you suggested.

@BenjaSanchez
Contributor

@edkerk as you point out, all these rxns convert chain-specific species to general species. Also, you are correct in the observation that the stoich. coeff. that these rxns get is based on chain length. The relationship is actually linear; as an example, I will focus on triglycerides (even though it should equally apply to all other species). Here are the 4 possible stoich. coeff. for all 32 triglyceride ISA rxns, based on total chain length from all three F.A. tails:

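[image not available: the 4 stoich. coeff. of the triglyceride ISA rxns as a function of total chain length]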

So the more carbon the higher the assigned stoich. coeff. The idea then currently in Yeast7 is to allow the model to choose any triglyceride from the 32 options (through the ISA rxns), and to correct for the chain length, so it is "equally attractive" carbon-wise for the cell to produce any of them. However as you say, we should adjust these coefficients to account for saturations as well, so making them proportional to the molecular weight would solve the issue; in that sense it's good that @hongzhonglu is working on including chemical formulas for all these species.

That being said, I think there are still mistakes in how the total abundance for each lipid is calculated: We know that we can get the abundance of each species in the model if we take the stoich. coeff. that is in the lipid pseudo-rxn, because the stoich. coeff. of the species lipid is = 1 in the biomass pseudo-rxn. For instance, for triglycerides this is = 0.000781 mmol/gDW. However, even if we would use the "cheapest" TAG (16:16:16) to produce the totality of this species, we would need a total of:

0.000781/0.62963 = 0.0012 mmol/gDW

of that specific TAG (as 0.62963 is the stoich. coeff. assigned for 16:16:16 species in the ISA rxns). However, we know from literature that the amount of TAGs in a cell is around 0.007 mmol/gDW, so the TAG abundance is largely underestimated. In order to simplify this mess, I would instead directly redefine the triglyceride abundance in the lipid rxn as 0.007 mmol/gDW, assume that this corresponds to a specific TAG distribution, and work back from there. As an example, if we use 16:16:16 species (= 48 carbons) as a baseline, then the coeff. in the ISA rxns for any of those would be = 1, and then we can re-scale the rest. For instance, any 16:16:18 species would get a stoich. coeff. of:

50/48 = 1.0417

This of course should be done with molecular weights, as stated before, to improve precision, but the idea is the same. I hope this clears up some of the confusion.
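The arithmetic above can be reproduced with a few lines of Python. The three input numbers are the ones quoted in this comment; the set of total chain lengths (48, 50, 52, 54) is an assumption based on TAGs built from C16 and C18 chains only, and the names are only illustrative.

```python
LIPID_RXN_TAG_COEFF = 0.000781  # mmol/gDW, triglyceride in the lipid pseudo-rxn
ISA_COEFF_16_16_16 = 0.62963    # ISA coeff. quoted for 16:16:16 species
LITERATURE_TAG = 0.007          # mmol/gDW, approximate literature value quoted above

# Amount of the "cheapest" TAG needed to fill the current triglyceride demand.
required = LIPID_RXN_TAG_COEFF / ISA_COEFF_16_16_16

# How far the current value is from the literature value.
fold = LITERATURE_TAG / required


def rescaled_coeff(total_carbons, baseline=48):
    """ISA coefficient relative to the 16:16:16 baseline (carbon-based)."""
    return total_carbons / baseline


# Assumed total chain lengths for TAGs made of C16 and C18 chains.
coeffs = {n: round(rescaled_coeff(n), 4) for n in (48, 50, 52, 54)}

print(round(required, 4))  # 0.0012
print(coeffs[50])          # 1.0417
```

With these inputs the current definition falls short of the literature value by a factor of roughly 5.6. Swapping `total_carbons` for molecular weights would give the more precise version mentioned above.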

@BenjaSanchez
Contributor

Answering to some other comments in the discussion:

@edkerk

we should probably end up with one ISA reaction per lipid species, so:
0.25 TAG(16:0,16:1,16:0) + 0.10 TAG(18:0,16:1,16:0) + [...] = triglyceride

Here I am not so sure. The TAG distribution can vary considerably between strains, so I think it might be safer to leave it up to the modeler to constrain it if he/she has specific data, and otherwise allow the model to choose any TAG, making of course the corrections that I mentioned in my previous post. Or at the very least, let's first fix the mistakes in the composition, and then maybe we can try forcing the model to specific TAG distributions. How well are these distributions studied btw? For TAGs probably well enough, but for phospholipids?

@hongzhonglu

it is difficult to understand these reactions as they lack evidences in the database

Note that ISA rxns are actually pseudo-rxns, so they are not expected to appear in any database. Let me know what you think about the solution that I presented in the previous post :)

@BenjaSanchez
Contributor

BenjaSanchez commented Nov 3, 2017

Finally, to add some more to the discussion, here is a breakdown for all lipids created through the 176 ISA rxns:

| Compound | In lipid pseudo-rxn? | # ISA rxns that can create it |
|---|---|---|
| complex sphingolipid | yes | 3 in Golgi + 3 in mitochondrion |
| dolichol | no | 9 in lipid particle |
| inositol-P-ceramide | yes* | 10 in Golgi + 10 in ER + 10 in mitochondrion |
| inositol phosphomannosylinositol phosphoceramide | yes* | 10 in Golgi + 10 in ER + 10 in mitochondrion |
| mannosylinositol phosphorylceramide | yes* | 10 in Golgi + 10 in ER + 10 in mitochondrion |
| 1-phosphatidyl-1D-myo-inositol | yes | 8 in cytoplasm |
| ergosterol ester | yes | 2 in ER membrane |
| fatty acid | yes | 5 in cytoplasm |
| phosphatidyl-L-serine | yes | 8 in ER membrane |
| phosphatidylcholine | yes | 8 in ER membrane |
| phosphatidylethanolamine | yes | 8 in ER membrane |
| triglyceride | yes | 32 in ER membrane |

*partially: The only pseudo-metabolites that go into the lipid pseudo-rxn are the 3 in Golgi, as they get pooled in the complex sphingolipid pseudo-metabolite through isa rxns:
inositol-P-ceramide [Golgi] -> complex sphingolipid [Golgi]
inositol phosphomannosylinositol phosphoceramide [Golgi] -> complex sphingolipid [Golgi]
mannosylinositol phosphorylceramide [Golgi] -> complex sphingolipid [Golgi]

And later transported to the cytoplasm (where they are used in the lipid pseudo-rxn):
complex sphingolipid [Golgi] -> complex sphingolipid [cytoplasm]

In the mitochondrion, even though equivalent isa rxns also exist, there is no transport to the cytoplasm, so those 3 pseudo-metabolites are dead-end metabolites.

In the ER, there are no such pooling isa rxns to begin with, so all 30 pseudo-metabolites are dead ends as well.
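As an illustration of how such dead ends can be spotted, here is a minimal sketch on a toy network that only mirrors the topology described above (one sphingolipid per compartment instead of three). The reaction IDs, the `lipid_sink` reaction standing in for biomass, and the "(chain-specific)" species names are hypothetical. Every reaction is treated as irreversible.

```python
def find_dead_ends(reactions):
    """Return metabolites that are produced by some reaction but consumed by none.

    `reactions` maps a reaction id to (substrates, products); all reactions are
    treated as irreversible in this sketch.
    """
    produced, consumed = set(), set()
    for substrates, products in reactions.values():
        consumed.update(substrates)
        produced.update(products)
    return sorted(produced - consumed)


# Toy network with hypothetical ids, mirroring the three compartments.
reactions = {
    "isa_IPC_golgi": (
        ["inositol-P-ceramide [Golgi]"],
        ["complex sphingolipid [Golgi]"],
    ),
    "transport_golgi_cyt": (
        ["complex sphingolipid [Golgi]"],
        ["complex sphingolipid [cytoplasm]"],
    ),
    "lipid_pseudo_rxn": (
        ["complex sphingolipid [cytoplasm]"],
        ["lipid [cytoplasm]"],
    ),
    "lipid_sink": (["lipid [cytoplasm]"], []),  # stands in for biomass
    "isa_IPC_mito": (
        ["inositol-P-ceramide [mitochondrion]"],
        ["complex sphingolipid [mitochondrion]"],
    ),
    "isa_IPC_specific_er": (
        ["inositol-P-ceramide (chain-specific) [ER]"],
        ["inositol-P-ceramide [ER]"],
    ),
}

dead_ends = find_dead_ends(reactions)
print(dead_ends)
```

In this toy network the mitochondrial pool and the ER pseudo-metabolite are reported as dead ends, while the Golgi route reaches the lipid pseudo-rxn. A real check would also have to account for reversible reactions.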

@BenjaSanchez BenjaSanchez changed the title from "Confirm/correct lipid biomass composition" to "Correct lipid biomass composition" Nov 3, 2017
@BenjaSanchez
Contributor

@edkerk maybe this is something to fix? should complex sphingolipids be only produced in Golgi or can they also be produced in mitochondrion and ER?

@edkerk
Member Author

edkerk commented Nov 21, 2017

@BenjaSanchez I assume you're referring to the sphingolipids. There are two issues here:

  • Localization: if there is no proof in literature that they are produced in the mitochondrion and/or ER, these pathways should be deleted; or, if it is known in what amounts they are produced in each compartment, they should be connected to the lipid pseudoreaction. My gut feeling tells me that this is not known, so we will likely end up just removing these dead ends.

  • Stoichiometry / logical 'OR' (I think this is the trickier problem): the 'isa' reactions allow alternative sphingolipids to be labelled 'complex sphingolipid'. Have a look at Figure 1 of the Yeast 5.0 paper, where they discuss 'isa' reactions. They specify:

A model user is free to constrain the fluxes which produce specific complex sphingolipids to model an observed lipid composition, or may leave the model unconstrained if the more general biomass definition is sufficient for their needs.

    but this is a very bad solution, as you'd have to adjust these boundaries for every slight change in growth rate. For most of the lipids we will have detailed information, so we can truly specify the lipid component of biomass. But can we find itemized quantities of complex sphingolipids (mmol/gDCW)?

One strength of these ISA reactions is that gene essentiality simulations will have fewer false positives, as the cell will have the choice to make different (complex sphingo)lipids, which apparently is the case in reality. So, instead of deleting ISA reactions, perhaps we should leave them in but set their boundaries to 0. If one wants to do gene essentiality simulations, one has to switch those reactions on.
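A minimal sketch of what "leave them in but set boundaries to 0" could look like, using a plain dictionary of bounds instead of a real model object. The reaction IDs, the choice of which reactions count as alternatives, and the upper bound of 1000 are all hypothetical; with an actual toolbox the same idea would be applied to that toolbox's reaction bounds.

```python
from contextlib import contextmanager

CLOSED = (0.0, 0.0)


def close_reactions(bounds, reaction_ids):
    """Block the given reactions by setting lower and upper bound to 0."""
    for rxn_id in reaction_ids:
        bounds[rxn_id] = CLOSED


@contextmanager
def temporarily_open(bounds, reaction_ids, open_bounds=(0.0, 1000.0)):
    """Re-open reactions for the duration of a simulation, then restore them."""
    saved = {rxn_id: bounds[rxn_id] for rxn_id in reaction_ids}
    for rxn_id in reaction_ids:
        bounds[rxn_id] = open_bounds
    try:
        yield bounds
    finally:
        bounds.update(saved)


# Hypothetical ids for three alternative ISA reactions.
bounds = {
    "isa_IPC": (0.0, 1000.0),
    "isa_MIPC": (0.0, 1000.0),
    "isa_MIP2C": (0.0, 1000.0),
}
alternatives = ["isa_MIPC", "isa_MIP2C"]

# Reference state of the repository: alternatives are present but blocked.
close_reactions(bounds, alternatives)

# For a gene essentiality simulation, switch them on only inside this block.
with temporarily_open(bounds, alternatives) as open_model_bounds:
    opened_inside = open_model_bounds["isa_MIPC"]

print(opened_inside, bounds["isa_MIPC"])
```

The context manager restores the blocked state automatically, so the reference state of the model cannot be left open by accident.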

@BenjaSanchez
Contributor

@edkerk thanks for your feedback :)

  • Regarding localization:
    For now I will leave them in the Golgi then.

  • Regarding stoichiometry:
    I agree with you on this. The 3 rxns I showed above were actually the case for the sphingolipids:

inositol-P-ceramide [Golgi] -> complex sphingolipid [Golgi]
inositol phosphomannosylinositol phosphoceramide [Golgi] -> complex sphingolipid [Golgi]
mannosylinositol phosphorylceramide [Golgi] -> complex sphingolipid [Golgi]

However, in literature data (PJ Lahtvee et al. 2016) we don't see sphingolipids at all, so I'm not sure what to do with them for now. Keep the original abundance values? Or remove them entirely?

@BenjaSanchez BenjaSanchez changed the title from "Correct lipid biomass composition" to "correct lipid biomass composition" Jan 31, 2018
@BenjaSanchez BenjaSanchez changed the title from "correct lipid biomass composition" to "fix: lipid biomass composition" Mar 21, 2018
@BenjaSanchez BenjaSanchez removed the question ideas/feedback from other people would be appreciated label Apr 12, 2018
@BenjaSanchez BenjaSanchez mentioned this issue May 20, 2018
@BenjaSanchez BenjaSanchez added the enhancement new field/feature label May 21, 2018
@BenjaSanchez
Contributor

BenjaSanchez commented May 21, 2018

update: PR #112 fixes this issue by using the newly defined SLIME reactions: lipids are now split into their 2 basic components, backbone and acyl-chains:

lipid -> sB backbone + sC1 acyl-chain1 + sC2 acyl-chain2 + ...

With this, separate lipid pseudoreactions are then defined for backbones and for acyl-chains. The stoichiometric coefficients represent molecular weights, as the data used comes in g/gDW.

More info can be found at SysBioChalmers/SLIMEr. This issue will be closed when the changes are merged to master.
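To illustrate why molecular weights make the units work out, here is a rough sketch of the right-hand side of a SLIME-like reaction. It is not the SLIMEr implementation: the function names are made up, and taking the backbone coefficient as the MW of the whole lipid species is an assumption of this sketch (see SysBioChalmers/SLIMEr for the actual definition). The MW values are computed from chemical formulas with rounded atomic weights and converted to g/mmol.

```python
from collections import Counter


def slime_rhs(backbone, backbone_coeff, chains, chain_coeff):
    """Products of a SLIME-like reaction as {pseudo-metabolite: coefficient}.

    Coefficients are in g/mmol, so a flux in mmol/gDW yields g/gDW.
    Repeated chains are summed into one coefficient.
    """
    rhs = {backbone + " backbone": backbone_coeff}
    for chain, count in Counter(chains).items():
        rhs[chain + " chain"] = count * chain_coeff[chain]
    return rhs


def mass_produced(rhs, flux_mmol_per_gdw):
    """g/gDW of each pseudo-metabolite produced at the given flux."""
    return {met: coeff * flux_mmol_per_gdw for met, coeff in rhs.items()}


# MW of free fatty acids in g/mmol (C16H32O2 and C18H34O2).
chain_coeff = {"16:0": 0.256430, "18:1": 0.282468}

# ASSUMPTION: backbone coefficient = MW of the whole TAG (C55H102O6) in g/mmol.
rhs = slime_rhs("triglyceride", 0.859415, ["16:0", "18:1", "18:1"], chain_coeff)

# A flux of 0.001 mmol/gDW through this reaction, expressed in g/gDW.
mass = mass_produced(rhs, 0.001)
print(rhs)
print(mass)
```

Because every coefficient carries g/mmol, the pseudo-metabolite pools come out directly in g/gDW and can be matched against lipid class and chain measurements without a separate conversion step.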

@BenjaSanchez BenjaSanchez added the fixed in devel this issue is already fixed in devel and will be closed after the next release label May 29, 2018
BenjaSanchez pushed a commit that referenced this issue Sep 19, 2018