Non-deterministic thermo for polycyclics #1027

@nyee

As referenced in #638 and #639, getSmallestSetOfSmallestRings returns non-deterministic results, which often leads to non-deterministic thermo estimates under the new polycyclic heuristic.

Running this code snippet reproduces the problem:

from rmgpy.data.rmg import RMGDatabase
from rmgpy import settings
from rmgpy.species import Species

smiles = "CC1CC12C1C3CC1C32"
spec_in = Species().fromSMILES(smiles)

database = RMGDatabase()

# Load the database with no thermo libraries, kinetics, or reaction libraries.
database.load(settings['database.directory'], thermoLibraries=[],
              kineticsFamilies='none', kineticsDepositories='none', reactionLibraries=[])
thermoDatabase = database.thermo

print("When molecule created multiple times")
for i in range(5):
    spec_in = Species().fromSMILES(smiles)
    thermo_gav = thermoDatabase.getThermoDataFromGroups(spec_in)
    print(thermo_gav.comment)

print("\n\n")
    
print("When molecule created once:")
spec_in = Species().fromSMILES(smiles)
for i in range(5):
    thermo_gav = thermoDatabase.getThermoDataFromGroups(spec_in)
    print(thermo_gav.comment)

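The first loop can give two different results. They differ only in the polycyclic corrections: the first has two polycyclic(s2_4_4_ane) terms and one polycyclic(s3_4_4_ane) term, while the second has one polycyclic(s2_4_4_ane) term and two polycyclic(s3_4_4_ane) terms.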

Thermo group additivity estimation: group(Cs-CsCsCsCs) + other(R) + group(Cs-CsCsCsH) + other(R) + group(Cs-CsCsCsH) + other(R) + group(Cs-CsCsCsH) + other(R) + group(Cs-CsCsCsH) + other(R) + group(Cs-CsCsCsH) + other(R) + group(Cs-CsCsHH) + other(R) + group(Cs-CsCsHH) + other(R) + group(Cs-CsHHH) + other(R) + polycyclic(s1_3_4_ane) + polycyclic(s2_4_4_ane) + polycyclic(s2_4_4_ane) + polycyclic(s3_4_4_ane) - ring(Cyclobutane) - ring(Cyclobutane) - ring(Cyclobutane) - ring(Cyclobutane)

Thermo group additivity estimation: group(Cs-CsCsCsCs) + other(R) + group(Cs-CsCsCsH) + other(R) + group(Cs-CsCsCsH) + other(R) + group(Cs-CsCsCsH) + other(R) + group(Cs-CsCsCsH) + other(R) + group(Cs-CsCsCsH) + other(R) + group(Cs-CsCsHH) + other(R) + group(Cs-CsCsHH) + other(R) + group(Cs-CsHHH) + other(R) + polycyclic(s3_4_4_ane) + polycyclic(s2_4_4_ane) + polycyclic(s3_4_4_ane) + polycyclic(s1_3_4_ane) - ring(Cyclobutane) - ring(Cyclobutane) - ring(Cyclobutane) - ring(Cyclobutane)
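Because the comment strings are long and the terms appear in a different order, it can help to compare them as multisets instead of by eye. The following is a minimal sketch, assuming only that each correction appears in the comment as `polycyclic(<name>)`; the helper name `polycyclic_terms` is hypothetical and not part of RMG-Py, and only the polycyclic portions of the two comments are pasted in.

```python
import re
from collections import Counter


def polycyclic_terms(comment):
    """Count each polycyclic(...) correction named in a thermo comment string.

    Hypothetical helper for illustration; not part of RMG-Py.
    """
    return Counter(re.findall(r"polycyclic\(([^)]*)\)", comment))


# Polycyclic portions of the two comments reported above.
comment_a = ("polycyclic(s1_3_4_ane) + polycyclic(s2_4_4_ane) + "
             "polycyclic(s2_4_4_ane) + polycyclic(s3_4_4_ane)")
comment_b = ("polycyclic(s3_4_4_ane) + polycyclic(s2_4_4_ane) + "
             "polycyclic(s3_4_4_ane) + polycyclic(s1_3_4_ane)")

terms_a = polycyclic_terms(comment_a)
terms_b = polycyclic_terms(comment_b)

# Counter subtraction keeps only positive counts, so these are the terms
# that one estimate has in excess of the other.
only_in_a = terms_a - terms_b
only_in_b = terms_b - terms_a

print(dict(only_in_a))  # {'s2_4_4_ane': 1}
print(dict(only_in_b))  # {'s3_4_4_ane': 1}
```

An order-insensitive comparison like this separates a harmless reordering of terms from a real change in which corrections were applied, which is the case here.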

The second loop, on the other hand, picks one of the two results above and returns it every time. This implies that once a molecule object has been created, getSSSR and the polycyclic thermo estimate give the same result for that object on every call. If a new object is created, however, you should not expect the same result. I also tried the same code with fromSMILES() replaced by fromAdjacencyList() and saw the same behavior, so it does not depend on which method is used to create the molecule.
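The fresh-object versus reused-object comparison can be turned into a small tally that is easier to run many times. This is only a sketch: `tally_outcomes`, `make_input`, and `compute` are hypothetical names, and the stand-in functions below merely imitate a result that depends on which object was constructed. They do not call RMG-Py.

```python
import itertools
from collections import Counter


def tally_outcomes(make_input, compute, repeats=20, reuse_input=False):
    """Count how often each distinct result of compute() appears.

    make_input and compute are caller-supplied callables (hypothetical names).
    With reuse_input=True the input object is built once and shared by all runs.
    """
    shared = make_input() if reuse_input else None
    outcomes = Counter()
    for _ in range(repeats):
        obj = shared if reuse_input else make_input()
        outcomes[compute(obj)] += 1
    return outcomes


# Stand-ins for illustration only: each "object" is a serial number, and the
# result depends on that number, imitating a per-object outcome.
_serial = itertools.count()


def make_fake():
    return next(_serial)


def fake_compute(obj):
    return "A" if obj % 2 == 0 else "B"


fresh = tally_outcomes(make_fake, fake_compute, repeats=10)
reused = tally_outcomes(make_fake, fake_compute, repeats=10, reuse_input=True)

print(len(fresh), len(reused))
```

With the snippet from this issue, `make_input` would presumably be `lambda: Species().fromSMILES(smiles)` and `compute` would be `lambda s: thermoDatabase.getThermoDataFromGroups(s).comment`. More than one key in the fresh tally and exactly one in the reused tally would match the behavior described above.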
