Fix LISI warning error #302

Merged
merged 1 commit into from
Apr 29, 2022
31 changes: 15 additions & 16 deletions scib/metrics/lisi.py
@@ -430,43 +430,42 @@ def compute_simpson_index_graph(
simpson = np.zeros(len(chunk_ids))

# loop over all cells in chunk
- for i in enumerate(chunk_ids):
+ for i, chunk_id in enumerate(chunk_ids):
Collaborator Author
Changed notation for better readability

Contributor
Yes, good idea!
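
The notation change discussed here can be seen in isolation. The sketch below uses a made-up `chunk_ids` list (an assumption for illustration, not data from the PR) to show that indexing the tuple returned by `enumerate` and unpacking it in the loop header yield the same values.

```python
# Hypothetical cell ids, for illustration only.
chunk_ids = [7, 3, 5]

# Old style: `i` is a (position, value) tuple and must be indexed.
old_pairs = []
for i in enumerate(chunk_ids):
    old_pairs.append((i[0], i[1]))

# New style: unpack the tuple directly in the loop header.
new_pairs = []
for i, chunk_id in enumerate(chunk_ids):
    new_pairs.append((i, chunk_id))

# Both loops visit the same (position, id) pairs.
assert old_pairs == new_pairs
```

Unpacking removes the `i[0]` / `i[1]` lookups, which is why the rest of the diff can write `simpson[i]` and `indices[chunk_id]` directly.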

# get neighbors and distances
# read line i from indices matrix
- get_col = indices[i[1]]
+ get_col = indices[chunk_id]

if get_col.isnull().sum() > 0:
# not enough neighbors
- print(i[1] + " has not enough neighbors.")
- simpson[i[0]] = 1 # np.nan #set nan for testing
+ print(f'Chunk {chunk_id} does not have enough neighbors. Skipping...')
Collaborator Author
Bugfix here. F-strings used to avoid type mismatch

Contributor
Yes, better style!
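
To make the type mismatch concrete: when the id is an integer, concatenating it with a string raises `TypeError`, whereas an f-string formats any value. A minimal sketch, assuming an integer `chunk_id` (the value is hypothetical and not taken from the PR):

```python
chunk_id = 42  # hypothetical integer id, for illustration only

# Old style: `+` between an int and a str is not defined.
try:
    message = chunk_id + " has not enough neighbors."
    raised = False
except TypeError:
    raised = True

# New style: the f-string converts the id to text itself.
message = f"Chunk {chunk_id} does not have enough neighbors. Skipping..."
print(raised, message)
```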

+ simpson[i] = 1 # np.nan #set nan for testing
continue
- else:
- knn_idx = get_col.astype('int') - 1 # get 0-based indexing
+
+ knn_idx = get_col.astype('int') - 1 # get 0-based indexing

# read line i from distances matrix
- D_act = distances[i[1]].values.astype('float')
+ D_act = distances[chunk_id].values.astype('float')

# start lisi estimation
beta = 1
# negative infinity
betamin = -np.inf
# positive infinity
betamax = np.inf

H, P = Hbeta(D_act, beta)
Hdiff = H - logU
tries = 0

# first get neighbor probabilities
- while (np.logical_and(np.abs(Hdiff) > tol, tries < 50)):
- if (Hdiff > 0):
+ while np.logical_and(np.abs(Hdiff) > tol, tries < 50):
+ if Hdiff > 0:
betamin = beta
- if (betamax == np.inf):
+ if betamax == np.inf:
beta *= 2
else:
beta = (beta + betamax) / 2
else:
betamax = beta
- if (betamin == -np.inf):
+ if betamin == -np.inf:
mbuttner marked this conversation as resolved.
beta /= 2
else:
beta = (beta + betamin) / 2
@@ -475,14 +474,14 @@ def compute_simpson_index_graph(
Hdiff = H - logU
tries += 1

- if (H == 0):
- simpson[i[0]] = -1
+ if H == 0:
+ simpson[i] = -1
continue
# then compute Simpson's Index
batch = batch_labels[knn_idx]
B = convert_to_one_hot(batch, n_batches)
sumP = np.matmul(P, B) # sum P per batch
- simpson[i[0]] = np.dot(sumP, sumP) # sum squares
+ simpson[i] = np.dot(sumP, sumP) # sum squares

return simpson
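
For readers unfamiliar with the loop in this hunk, the following standalone sketch reproduces the bisection on `beta`. `Hbeta` is not shown in the diff, so the version here is an assumption modelled on the usual perplexity formulation (exponential kernel weights and their Shannon entropy). The `find_beta` wrapper, the distances, and the perplexity value are likewise made up for illustration, and the recomputation of `H, P` inside the loop is assumed to correspond to the lines hidden between the two hunks.

```python
import numpy as np


def Hbeta(D_row, beta):
    """Assumed helper: entropy H and normalized weights P for one row."""
    P = np.exp(-D_row * beta)
    sumP = np.nansum(P)
    if sumP == 0:
        return 0.0, np.zeros(len(D_row))
    H = np.log(sumP) + beta * np.nansum(D_row * P) / sumP
    return H, P / sumP


def find_beta(D_row, perplexity, tol=1e-5, max_tries=50):
    """Bisect beta until the entropy matches log(perplexity)."""
    logU = np.log(perplexity)
    beta, betamin, betamax = 1.0, -np.inf, np.inf
    H, P = Hbeta(D_row, beta)
    Hdiff = H - logU
    tries = 0
    while abs(Hdiff) > tol and tries < max_tries:
        if Hdiff > 0:
            # Entropy too high: sharpen the kernel by raising beta.
            betamin = beta
            beta = beta * 2 if betamax == np.inf else (beta + betamax) / 2
        else:
            # Entropy too low: flatten the kernel by lowering beta.
            betamax = beta
            beta = beta / 2 if betamin == -np.inf else (beta + betamin) / 2
        # Assumed to match the lines hidden between the two hunks.
        H, P = Hbeta(D_row, beta)
        Hdiff = H - logU
        tries += 1
    return beta, H, P


D = np.array([0.1, 0.4, 0.9, 1.6, 2.5])  # hypothetical neighbor distances
beta, H, P = find_beta(D, perplexity=3.0)
```

The doubling and halving steps only run while one side of the bracket is still infinite; once both `betamin` and `betamax` are finite, the search is plain bisection.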

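
The final step of the function, summing neighbor probabilities per batch and taking the sum of squares, can be checked on a tiny example. `convert_to_one_hot` is not shown in the diff; the `one_hot` helper below is a hypothetical stand-in that assumes integer batch labels in `0..n_batches-1`, and the probabilities and labels are made up.

```python
import numpy as np


def one_hot(labels, n_batches):
    """Hypothetical stand-in for convert_to_one_hot (integer labels assumed)."""
    return np.eye(n_batches)[labels]


P = np.array([0.25, 0.25, 0.25, 0.25])  # made-up neighbor probabilities
batch = np.array([0, 0, 1, 1])          # made-up batch label per neighbor
B = one_hot(batch, n_batches=2)

sumP = np.matmul(P, B)        # probability mass per batch
simpson = np.dot(sumP, sumP)  # sum of squared batch masses
print(sumP, simpson)
```

With the probability mass split evenly across batches, the index equals `1 / n_batches` (0.5 in this example); if every neighbor came from a single batch it would be 1.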