Fix python version in setup for numba errors #106
Thanks for putting this together, this would be our first external contribution! I think since numba is our only 3.12 blocker, we should actually make numba optional and jit the functions on demand, if it's available. Do you think you could do that instead?
Hi @ddkang, I could see |
aidb/vector_database/tasti.py
Outdated
    dists = np.sqrt(np.sum((x - embeddings[i]) ** 2))
    if dists < min_dists[i]:
        min_dists[i] = dists
_get_and_update_dists_no_numba(x, embeddings, min_dists)
I believe you can just define the function and move the shared code out of the try/except block
Hi @ddkang, I moved the shared code outside; let me know if you need any changes.
I meant you can simply share the function name and call the function outside of the try/except block
You mean like this? -
def get_and_update_dists(x: np.ndarray, embeddings: np.ndarray, min_dists: np.ndarray):
    '''
    :param x: embedding of cluster representatives
    :param embeddings: embeddings of all data
    :min_dists: array to record the minimum distance for each embedding to the embedding of cluster representatives
    '''
    try:
        from numba import njit, prange
        @njit(parallel=True)
        def _get_and_update_dists(x: np.ndarray, embeddings: np.ndarray, min_dists: np.ndarray):
            for i in prange(len(embeddings)):
                _get_and_update_dists_shared(x, i, min_dists, embeddings)
    except:
        def _get_and_update_dists(x: np.ndarray, embeddings: np.ndarray, min_dists: np.ndarray):
            for i in range(len(embeddings)):
                _get_and_update_dists_shared(x, i, min_dists, embeddings)
    _get_and_update_dists(x, embeddings, min_dists)
yes
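For readers following the thread, the optional-numba idea discussed above can be sketched as below. This is only a minimal illustration, not the code that was merged: the `HAVE_NUMBA` flag and the no-op `njit` fallback are assumptions introduced for the example, and the distance loop is inlined rather than split into a shared helper.

```python
import numpy as np

try:
    # numba is treated as an optional dependency in this sketch
    from numba import njit, prange
    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False
    prange = range  # plain-Python stand-in for numba.prange

    def njit(*args, **kwargs):
        # Hypothetical no-op replacement for numba.njit:
        # supports both @njit and @njit(parallel=True) usage
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]

        def decorator(func):
            return func

        return decorator


@njit(parallel=True)
def get_and_update_dists(x, embeddings, min_dists):
    # For each embedding, keep the smaller of its recorded distance and
    # its Euclidean distance to the cluster representative x
    for i in prange(len(embeddings)):
        dist = np.sqrt(np.sum((x - embeddings[i]) ** 2))
        if dist < min_dists[i]:
            min_dists[i] = dist
```

With this arrangement the same function name is used whether or not numba is installed; only the decorator changes behavior. `min_dists` is updated in place, as in the snippet quoted above.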
Okay, pushed the changes.
@ttt-77 can you review these changes?
aidb/vector_database/tasti.py
Outdated
from typing import Optional

from aidb.vector_database.vector_database_config import TastiConfig


@njit(parallel=True)
def _get_and_update_dists_shared(x, i, min_dists, embeddings):
I think it's better to move it into the function 'get_and_update_dists'.
Okay @ttt-77, moved it.
The code logic looks good to me. Could you test it both with and without numba, if possible?
Sure. One thing I wanted to ask, as |
Can we make the requirement optional? Like aidb[all] vs aidb[base] or something
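One way the `aidb[all]` vs `aidb[base]` split could look in a `setup.py` is sketched below, using the standard setuptools `install_requires` and `extras_require` keywords. The dependency lists here are illustrative assumptions, not the project's actual requirements, and the `setup()` call is left as a comment so the fragment can be read on its own.

```python
# Hypothetical fragment of a setup.py; names and dependency lists are illustrative.
BASE_REQUIRES = ["numpy"]  # assumed core dependency, always installed

EXTRAS_REQUIRE = {
    "base": [],            # nothing beyond the core dependencies
    "all": ["numba"],      # optional accelerator, installed only on request
}

SETUP_KWARGS = {
    "name": "aidb",        # placeholder distribution name
    "install_requires": BASE_REQUIRES,
    "extras_require": EXTRAS_REQUIRE,
}

# In an actual setup.py these would be passed on:
#   from setuptools import setup
#   setup(**SETUP_KWARGS)
```

Under these assumptions, `pip install "aidb[all]"` would pull in numba, while a plain install would skip it and rely on the non-jitted code path.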
@ddkang Sorry for the late reply, can't we just add |
Sure, sounds good
But I think |
@ddkang Do you also have a Slack or Discord channel for aidb? I am planning to add a new vector DB to aidb (the code is already written), so I may need to ask questions there.
Hi @sky-2002 could you email me for the Slack invite link? Thank you for helping with the development! Sounds good regarding the temporary solution.
Try pip install ai-db
Description: As per the issue given here, numba requires Python version <3.12 and >=3.8. According to the PEP 440 version specifiers, I have added <3.12 in the setup function in the setup.py file.
Note: The installation will still have the issue, but now the setup will have Python version requirements so users can change versions accordingly.
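As a rough illustration of the version bound described above, the snippet below shows a PEP 440 specifier string of the kind setuptools accepts through its `python_requires` keyword, together with an equivalent runtime check. The exact string used in this PR is not shown in the description, so `PYTHON_REQUIRES` and the helper `is_supported` are assumptions made for the example.

```python
import sys

# PEP 440 specifier string of the kind passed as python_requires in setup();
# assumed here to combine the two bounds mentioned in the description.
PYTHON_REQUIRES = ">=3.8,<3.12"


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version lies in [3.8, 3.12)."""
    return (3, 8) <= tuple(version_info[:2]) < (3, 12)
```

With such a specifier in place, pip refuses to install the package on an interpreter outside the range instead of failing later while building numba.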