Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The residue_index_map in closeContactGNMAnalysis has an error in selecting "name CA" #4924

Open
BHM-Bob opened this issue Feb 21, 2025 · 2 comments

Comments

@BHM-Bob
Copy link

BHM-Bob commented Feb 21, 2025

Expected behavior

The residue_index_map should be consistent with the output of the neighbour_generator, which returns the order of atoms selected. So when select is "name CA", residue_index_map selects the atoms and expands to all atoms in the residue where they belong, but neighbour_generator returns [n_sele_atom, ], while residue_index_map should be [n_res_atoms, ], which does not match.

Actual behavior

when select is "name CA", residue_index_map selects the atoms and expands to all atoms in the residue where they belong, but neighbour_generator returns [n_sele_atom, ], while residue_index_map should be [n_res_atoms, ], which does not match.

It only matches when you select "protein"!

Code to reproduce the behavior

the source code of the closeContactGNMAnalysis

    def __init__(self,
                 universe,
                 select='protein',
                 cutoff=4.5,
                 ReportVector=None,
                 weights="size"):
        super(closeContactGNMAnalysis, self).__init__(universe,
                                                      select,
                                                      cutoff,
                                                      ReportVector)
        self.weights = weights

    def generate_kirchoff(self):
        nresidues = self.ca.n_residues
        positions = self.ca.positions
        residue_index_map = [
            resnum
            for [resnum, residue] in enumerate(self.ca.residues)
            for atom in residue.atoms
        ]

the docs use select "name CA" which leads a wrong residue_index_map

Current version of MDAnalysis

2.8.0

@orbeckst
Copy link
Member

Could you provide a minimal example (ideally with some of the examples from the tests, see

from MDAnalysisTests.datafiles import GRO, XTC
) that demonstrates the erroneous behavior?

I will also say that the GNM code is some of the most ancient analysis code in MDAnalysis and has not seen a lot of maintenance over the years. If you (or someone else) can contribute a way to fix it then that would be very welcome. Otherwise if issues with the code are not being addressed (because developers only have limited amount of time) we may decide to deprecate it and remove in 3.0.

@BHM-Bob
Copy link
Author

BHM-Bob commented Mar 12, 2025

I can't provide a test case because I think there do not have a RIGHT result. However, I can give a demo code to prove that it indeed has a BUG in closeContactGNMAnalysis 's code.

import numpy as np
from MDAnalysis import Universe
from MDAnalysisTests.datafiles import GRO,XTC

u = Universe(GRO,XTC)

gnm = closeContactGNMAnalysis(u, select='name CA', weights='size')
matrix = gnm.generate_kirchoff()
n_residue = matrix.shape[0]
n_accessible_residue = np.array([np.isin(n_residue, res_ids) for res_ids in gnm.ca.residues.ids]).argmax()
print(n_accessible_residue, np.abs(matrix[n_accessible_residue+1:]).sum())

gnm = closeContactGNMAnalysis(u, select='protein', weights='size')
matrix = gnm.generate_kirchoff()
print(np.abs(matrix[n_accessible_residue+1:]).sum())

It will output:

14 0.0
6894.190527601791

Where 14 is the max residue the closeContactGNMAnalysis can use with select 'name CA', so that you can see that, after residue 14, the matrix ONLY has ZERO score. However, it do not make sense as the 'protein' selection has 6894.190527601791 score after residue 14.

If you want to make a fix, I can give my fix code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants