-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simulation explodes with 2 TorchForces #98
Comments
When you say it explodes, what exactly does that mean? What behavior do you observe? Is it possible the MACE model uses some kind of global state that gets incorrectly shared between the models? |
That does suggest some kind of state is getting shared between the two simulations. Try this experiment. First, run two independent simulations at the same time on the same GPU. Each one should be an independent script launched separately. That shouldn't allow any communication between them, so you shouldn't see the artifacts. Now have a single Python script run two simulations at once. First create a single System object. Now use |
Hi @jharrymoore I haven't been able to run your exact script due to some versioning issues of the other packages, but using my own openmm-ml fork with mace in I have been able to reproduce this, and then maybe correct it. The problem occurs for me when the torchscripted models saved and then loaded by # Convert it to TorchScript and save it.
module = torch.jit.script(maceforce)
module.save(filename)
# Create the TorchForce and add it to the System.
force = openmmtorch.TorchForce(filename)
force.setForceGroup(forceGroup)
force.setUsesPeriodicBoundaryConditions(is_periodic)
#force.setOutputsForces(True)
system.addForce(force) If they have different names the minimization appears to work correctly. Here is an example script with both behaviours: To run it you will need this version of openmm-ml: https://github.com/sef43/openmm-ml/tree/mace/examples/mace
from openmm.app import PDBFile, Modeller, PME
from openmm import app
from openmm import unit
import sys
from openmm import LangevinMiddleIntegrator
from openmm.openmm import Platform
from openff.toolkit.topology import Molecule
from ase.io import read
from tempfile import mkstemp
from openmmforcefields.generators import SMIRNOFFTemplateGenerator
from openmmml import MLPotential
for torchscript_filename1,torchscript_filename2, version in zip(['maceforce.pt', 'maceforce1.pt' ],
['maceforce.pt', 'maceforce2.pt'],
['same_name', 'different_name']):
mol1_file = "test_files/butane.sdf"
mol2_file = "test_files/hexane.sdf"
mol1 = Molecule.from_file(mol1_file)
mol2 = Molecule.from_file(mol2_file)
_, tmpfile = mkstemp(suffix=".xyz")
mol1._to_xyz_file(tmpfile)
atoms_mol1 = read(tmpfile)
_, tmpfile = mkstemp(suffix=".xyz")
mol2._to_xyz_file(tmpfile)
atoms_mol2 = read(tmpfile)
modeller = Modeller(mol1.to_topology().to_openmm(), (atoms_mol1.positions+10.0)*unit.angstrom)
modeller_mol2 = Modeller(mol2.to_topology().to_openmm(), atoms_mol2.positions*unit.angstrom)
modeller.add(modeller_mol2.topology, modeller_mol2.positions)
print(modeller.topology)
mol1_indexs=[i for i in range(0,14)]
mol2_indexs=[i for i in range(14,34)]
ff = app.ForceField("amber/protein.ff14SB.xml",
"amber/tip3p_standard.xml",
"amber14/DNA.OL15.xml",)
smirnoff = SMIRNOFFTemplateGenerator(molecules=[mol1, mol2])
ff.registerTemplateGenerator(smirnoff.generator)
modeller.addSolvent(forcefield=ff, padding=1.2 * unit.nanometers)
mm_system = ff.createSystem(modeller.topology,
nonbondedMethod=PME,
nonbondedCutoff=1.0*unit.nanometer,
constraints=None,
)
potential = MLPotential('mace', model_path='test_files/mace-model.model')
# model mol1 with MACE
print("adding mol1 mace force, saving to torchscript file: ", torchscript_filename1)
temp_system = potential.createMixedSystem(modeller.topology, mm_system, mol1_indexs, filename=torchscript_filename1)
# model mol2 with MACE
print("adding mol2 mace force, saving to torchscript file: ", torchscript_filename2)
mm_ml_system = potential.createMixedSystem(modeller.topology, temp_system, mol2_indexs, filename=torchscript_filename2)
# write system to file
with open("test_files/created_system_"+version+".pdb", "w") as f:
PDBFile.writeFile(modeller.topology, modeller.positions, f)
# create simulation
integrator = LangevinMiddleIntegrator(300 * unit.kelvin, 10 / unit.picosecond, 1 * unit.femtosecond)
simulation = app.Simulation(modeller.topology, mm_ml_system, integrator, platform=Platform.getPlatformByName("CUDA"))
simulation.context.setPositions(modeller.positions)
# Comute the potential energy
state = simulation.context.getState(getEnergy=True)
print("initial E = ", state.getPotentialEnergy())
print("Minimising system")
simulation.minimizeEnergy()
print("Minimisation complete")
state = simulation.context.getState(getEnergy=True)
print("Minized E = ", state.getPotentialEnergy())
# write out minimised system
with open("test_files/minimised_system"+version+".pdb", "w") as f:
PDBFile.writeFile(modeller.topology, simulation.context.getState(getPositions=True).getPositions(), f) The output is this:
And the minimized structures look like this: Can you try repeating your test but with different filenames for the intermediate torchscript files? |
Hi Stephen, this has indeed done the trick, thank you very much for you help! |
Great! hopefully this should fix your other issues of running multiple simulations at the same time, I believe the openmmtorch.TorchForce loader just stores the filename of the torchscript file, in then loads the torchscript module when OpenMM makes the context, which happens when you start actually running the simulation, so when another file is saved with the same name in between calling MLPotential and starting the run the first torchscript module will have been overwritten by the second one. Hence mol1 is the one that ends up distorted. |
The will be fixed when #97 is released. |
Hi,
I'm seeing simulations explode when I add two separate TorchForces for different small molecules to the system - specifically the molecule corresponding to the first force explodes in the minimiser. This seems to happen only when I use MACE models for both components - switching to the ani2x potential for one or both TorchForce works fine. The build script for the conda environment is in my fork of openmmtools https://github.com/jharrymoore/openmmtools/blob/dual_topology/mace-install.sh
...and here is the run script that should reproduce the problem
Any insight into what might be going wrong would be greatly appreciated!
Many thanks,
Harry
test_files.zip
The text was updated successfully, but these errors were encountered: