Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: subSystems field #11

Closed
BenjaSanchez opened this issue Oct 25, 2017 · 35 comments · Fixed by #321
Closed

fix: subSystems field #11

BenjaSanchez opened this issue Oct 25, 2017 · 35 comments · Fixed by #321
Assignees
Labels
fixed in devel this issue is already fixed in devel and will be closed after the next release

Comments

@BenjaSanchez
Copy link
Contributor

Currently the fields rxnECNumbers & subSystems are retrieved from KEGG & Swissprot, using an automatic script. However, this leads to many cases in which there is more than one match, undesired in the case of subsystems. At some previous version of the Yeast model, this information was indeed present, but someone deleted those fields. @hongzhonglu could you look further into this? The desired data should be available in the original sourceForge repository

@BenjaSanchez
Copy link
Contributor Author

@demilappa maybe you can shed some light here? I remember you told me you managed to do this once, if so which version of Yeast did you use for it?

@demilappa
Copy link
Contributor

@BenjaSanchez I have added the rxnECNumbers & subSystems to other GEMs of bacterial species with semi-automated scripts, not the yeast one. Unfortunately, for multiple EC Numbers per reaction it involved manual input in case the Subsytems were different.

In my case the source of reconstruction was http://kbase.us/metabolic-modeling-in-kbase/. In their Git repository they already had a preliminary mapping among rxns and external DBs, where for the vast majority of the cases they had soecified compartments. As for the EC Numbers, you can still have multiple ones in your structure.

@BenjaSanchez
Copy link
Contributor Author

@demilappa thanks for the info! @hongzhonglu as you mentioned that you checked previous versions of the model with no luck, I will close this issue.

@hongzhonglu
Copy link
Collaborator

@BenjaSanchez, about two weeks ago, I did some mapping according to the reaction name to obtain the subsystem for each reaction. The model used in the mapping are iMM904, iTO980 and the latest human model Recon3D. All the transport reaction are clarified as transport+compartment. So based on such mapping and the present subsystem in the yeast7.6, I hope each reaction can has a unique subsystem, just like the latest human and E.coli GEMs.

@BenjaSanchez BenjaSanchez changed the title regain lost information: EC numbers & subsystems Subsystems field: remove duplicates Oct 26, 2017
@BenjaSanchez BenjaSanchez changed the title Subsystems field: remove duplicates subsystems field: remove duplicates Oct 26, 2017
@BenjaSanchez
Copy link
Contributor Author

@hongzhonglu interesting! how much coverage did you achieve with said mapping? which pathway names are you getting back? from KEGG or another database? I have now re-opened this issue with a changed goal: to eliminate the duplicates in subsystems. Looking forward to your contribution!

@BenjaSanchez BenjaSanchez reopened this Oct 26, 2017
@BenjaSanchez BenjaSanchez removed their assignment Oct 27, 2017
@BenjaSanchez BenjaSanchez changed the title subsystems field: remove duplicates correct subsystems field Jan 31, 2018
@BenjaSanchez BenjaSanchez changed the title correct subsystems field correct subSystems field Feb 18, 2018
@hongzhonglu
Copy link
Collaborator

@BenjaSanchez I have updated the subsystem. You can check it now!

@BenjaSanchez
Copy link
Contributor Author

as discussed in person, a new branch curation/reactions was created for adding these changes

@BenjaSanchez BenjaSanchez changed the title correct subSystems field fix-rxn.annot: subSystems Mar 21, 2018
@BenjaSanchez BenjaSanchez changed the title fix-rxn.annot: subSystems fix: subSystems field Mar 21, 2018
@edkerk
Copy link
Member

edkerk commented Jun 21, 2018

Discussed today:

  • Reactions are allowed to have multiple subsystems
  • Subsystems should primarily be KEGG pathways
  • Subsystems might also be something like "Transport ER to Golgi", "Exchange" and "Pseudoreaction"
  • All reactions (with perhaps some exceptions), should have at least one subsystem
  • If reactions cannot be annotated to any existing (KEGG) pathway, a new subsystem may be defined, but should not be too specific and only include a few genes (rather "Alternative carbon source" than "Arabinose catabolism")
  • Appendices can contain additional information, such as hierarchy of subsystems.

@edkerk
Copy link
Member

edkerk commented Jun 22, 2020

I would like to reignite this issue.

In yeast-GEM, reactions can be annotated to multiple subsystems, and these are predominantly KEGG pathways. Is this desired? This was discussed offline (reported in #11 (comment)), but arguments for this were not recorded.

From the comments above, it seemed there was consensus that single subsystems are preferred, but we ended up with multiple subystems anyway.

A few points to consider:

  • Single subsystems makes it easier to uniquely group reactions together, This can be useful in various scenarios, e.g.
    • If you want to present data/results from the model and you want each reaction present in only one subsystem.
    • If you want to uniquely associate reactions to specific maps (e.g. in MetabolicAtlas).
  • We can easily define our own subSystems if we're not sticking to KEGG pathways. The current subSystems based on KEGG pathways might not always be the most suitable. This also prevents the odd mixture of subSystems starting with a KEGG Pathway ID and those without, while still retaining the KEGG Pathway IDs (see next bulletpoint).
  • Single subSystems does not mean completely discarding the current KEGG pathway annotations. KEGG pathways are on identifiers.org and can therefore be handled as reaction identifiers (we can propose rxnKEGGpathway to be included in COBRA Toolbox?).

Having single subSystems and rxnKEGGpathway annotations is the best of both worlds.

Expected feature/value/output:

Single subSystems, e.g. in the case of r_0103, acetyl-CoA C-acetyltransferase; Fatty acid degradation

Current feature/value/output:

Many reactions have multiple subSystems, e.g. in the case of r_0103, acetyl-CoA C-acetyltransferase; sce00071  Fatty acid degradation;sce00072  Synthesis and degradation of ketone bodies;sce00280  Valine, leucine and isoleucine degradation;sce00310  Lysine degradation;sce00380  Tryptophan metabolism;sce00620  Pyruvate metabolism;sce00630  Glyoxylate and dicarboxylate metabolism;sce00640  Propanoate metabolism;sce00650  Butanoate metabolism;sce00900  Terpenoid backbone biosynthesis;sce01110  Biosynthesis of secondary metabolites;sce01130  Biosynthesis of antibiotics;sce01200  Carbon metabolism;sce01212  Fatty acid metabolism

@hongzhonglu
Copy link
Collaborator

Hi Ed, @edkerk actually, we now already has single subsystem for each rxns. I like your idea to have "both single subSystems and rxnKEGGpathway annotations".

@edkerk
Copy link
Member

edkerk commented Jun 23, 2020

@hongzhonglu, that's great, but where can I find those single subsystems? Both in master and devel reactions still have multiple subsystems.

@hongzhonglu
Copy link
Collaborator

Hi Ed, @edkerk, i don't yet upload them as we did not know how to handle it before.

@BenjaSanchez
Copy link
Contributor Author

Update: I have opened #253 moving the KEGG pathway ids to the proper field. After it is merged, model.subSystems will be ready to get populated with the new group classifications :)

@hongzhonglu
Copy link
Collaborator

Nice! I will try to further update the group classifications.

edkerk added a commit that referenced this issue Jun 15, 2021
- run model=rmfield(model,'subSystems') before running saveYeastModel, until subSystems are added (see #11)
@edkerk
Copy link
Member

edkerk commented Jun 24, 2021

What is the status of this? I was about to make a few curations to Hongzhong's proposed list of subsytems, but then I noticed that MetabolicAtlas also has subsystems defined . @mihai-sysbio should we try to adhere to the MetabolicAtlas maps whenever possible/suitable? Or will these just be re-drawn when we here have settled the subsystems in yeast-GEM?

@edkerk edkerk mentioned this issue Jun 24, 2021
3 tasks
@edkerk
Copy link
Member

edkerk commented Jun 29, 2021

I now noticed the maps on https://github.com/SysBioChalmers/Yeast-maps/, should we try to adhere to these? How have they been defined?

@mihai-sysbio
Copy link
Member

What is the status of this?

This is a bit of a thorny issue. I don't think I can remember all the details accurately (please correct me @edkerk @pecholleyc @hongzhonglu):

  • in the last release yeast-GEM 8.4.2, a reaction could be mapped to multiple subsystems
  • it was desired to switch to one reaction mapped to a single subsystem
  • it looks like the maps have been created based on the subsystems in the yml file in master, ie 8.4.2
  • @hongzhonglu's definition of subsystems has not been merged in devel
  • the subsystem field is missing in the yml file in devel
  • the old subsystems are under the annotation section as kegg.pathway
  • the svg files are mapped to subsystem IDs in this file
  • even if the mapping is not perfect, ie. not all reactions of a subsystem are on the map, and/or not all reactions on the map are part of the subsystem, that's something we can live with I think
  • the Yeast-maps repository is mentioned in the Yeast8 paper
  • the Yeast-maps repository does not contain all the maps, eg not the SVG maps shown on Metabolic Atlas
  • Metabolic Atlas has added support for custom maps, eg for maps that combine parts of different subsystems
  • Metabolic Atlas has added support for a single reaction to be mapped to multiple subsystems, here is an example
  • the maps are time-consuming to redraw - at the moment there is no plan to redraw any maps

@mihai-sysbio should we try to adhere to the MetabolicAtlas maps whenever possible/suitable?

With all of the above in mind my suggestion going forward is to add 1 subsystem for each reaction in the yml file. The next step would then be to decide which SVG is best suited as a map for the respective subsystem.

@mihai-sysbio mihai-sysbio mentioned this issue Jul 1, 2021
3 tasks
@mihai-sysbio
Copy link
Member

I should have mentioned this, but Metabolic Atlas is relying on the subsystem field in the yml populate the reaction-subsystem association. We will likely have to postpone updating the model until this gets sorted out.

@edkerk
Copy link
Member

edkerk commented Jul 1, 2021

I should have mentioned this, but Metabolic Atlas is relying on the subsystem field in the yml populate the reaction-subsystem association. We will likely have to postpone updating the model until this gets sorted out.

But it makes sense that this information is sourced from subsystem in the yml? And this doesn't affect the maps, right, as they are already defined? I'm not sure what changes there should be made for "this gets sorted out", and how this would affect the subsystem definition?

@hongzhonglu
Copy link
Collaborator

Hi all, I am sorry that in past months, i have no time to further curate subsystem used for our model and maps. But I will try to update it in following months.
The maps are drawn based on the subsystems, so if the subsystems were updated, the map will be need to be updated also. @edkerk

@mihai-sysbio
Copy link
Member

@edkerk I'm following up on your questions below.

But it makes sense that this information is sourced from subsystem in the yml?

I guess we can also change how the reaction to subsytem association is parsed from the yml file, but that's how things work like for the rest of the models on Metabolic Atlas.

And this doesn't affect the maps, right, as they are already defined?

Indeed - the SVG maps are already created and they won't be changed. They would need to be updated, as @hongzhonglu says above, but it will take a long time until that happens, so for the near future we should think that they won't be changed.

I'm not sure what changes there should be made for "this gets sorted out", and how this would affect the subsystem definition?

What is expected is that in the yml file for each reaction to have 1 subsystem, for example:

- reactions:
    - !!omap
      - id: "MAR03905"
      ...
      - subsystem:
          - "Glycolysis / Gluconeogenesis"

To my knowledge, this is part of the normal export to yaml that has been recently included in Raven.

@edkerk
Copy link
Member

edkerk commented Jul 5, 2021

But it makes sense that this information is sourced from subsystem in the yml?

I guess we can also change how the reaction to subsytem association is parsed from the yml file, but that's how things work like for the rest of the models on Metabolic Atlas.

My intonation probably didn't come across as I intended, I do think it makes sense to do it that way, so I do no propose any changes.

My confusion was based on whether subsystem assignment is actively used to assign reactions to different maps, so that if subsystems would be changed this would directly alter the maps. But I now understand that this is not the case, the maps were once assigned based on subsystems that were specified in that version of the yeast model. Any changes to the subsystems in new model versions do not automatically translate in changes in maps on Metabolic Atlas (the same reactions will still be linked) but if we'd want to have the maps truly reflect the subsystems then it would have to be done manually.

Alright, so just to make sure I understand it correctly, we should:

  1. Assign 1 subsystem per reaction.
  2. If possible/feasible, follow the existing assignments to Metabolic Atlas maps. It will not break anything at Metabolic Atlas if this is not matched, but trying to adhere to Metabolic Atlas maps would make it easier to modify those maps if desired in the future.

And to see the existing mapping of reactions to Metabolic Atlas maps, one should look at the yml of yeast-GEM 8.4.2, and sort out subsytems without a map (for instance prefixed with # here)?

@hongzhonglu
Copy link
Collaborator

I agree with the two points @edkerk

@mihai-sysbio
Copy link
Member

Very well laid out @edkerk!

modify those maps if desired in the future

On Metabolic Atlas will will look into automatic generation of 2D maps, so perhaps don't put too much weight on the current assignment, as we intend to find a way of always having fully accurate maps.

And to see the existing mapping of reactions to Metabolic Atlas maps, one should look at the yml of yeast-GEM 8.4.2, and sort out subsytems without a map (for instance prefixed with # here)?

That's right. In the case of some existing maps not mapping well with a definition of a subsystem (ie many reactions from other subsystems), there is already support for Custom maps defined in customSVG.tsv. These are maps that will not be matched to a subsystem in the yml file.

@edkerk edkerk added this to the 9.0.0 milestone Dec 10, 2021
@edkerk edkerk removed this from the 9.0.0 milestone Dec 10, 2021
@hongzhonglu
Copy link
Collaborator

@edkerk Hi Ed, now I added miss subsystem information for the remaining rxns. It will be better to combine this into dev version for further update. In the following, the newly added reactions should have a unique subsystem definition.

@edkerk
Copy link
Member

edkerk commented Jan 11, 2022

I rebased it to the devel9 branch, is there a function that should apply these changes?

For the subsystems that are based on KEGG pathways maps, this should be removed from the end of the subsystem name (citrate cycle (tca cycle) ( sce00020 ) should be citrate cycle (tca cycle)), as the KEGG pathway IDs (e.g. sce00020) are already be included as kegg.pathway annotation in the model.

@edkerk edkerk mentioned this issue Jun 16, 2022
3 tasks
@edkerk edkerk added the fixed in devel this issue is already fixed in devel and will be closed after the next release label Jun 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fixed in devel this issue is already fixed in devel and will be closed after the next release
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants