How to use multiple GPUs from a node #411
I had to modify inference.py and inference_engine.py to detect two GPUs. Tinygrad detects them now, but I'm still trying to figure out how to have them displayed when you run exo.
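One general note for anyone following along: CUDA-backed libraries only see the GPUs listed in `CUDA_VISIBLE_DEVICES`, so both cards need to be exposed before the process starts. A minimal illustration, not exo-specific and not part of any patch discussed here:

```python
import os

# Expose both GPUs to any CUDA-backed library imported after this point.
# "0,1" assumes the two cards enumerate as devices 0 and 1 on this node.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0,1")
```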
Hi @jorge123255, could you please help me with the steps to fix this and what needs to be added or edited? By the way, I tried to find the files, but couldn't locate them either. Thank you.
In `exo/inference/inference_engine.py`:

```python
class InferenceEngine(ABC):
    ...

def get_available_gpus() -> List[int]:
    ...

def get_inference_engine(inference_engine_name: str, shard_downloader: 'ShardDownloader'):
    ...
```

and under `exo/tinygrad/inference.py`:

```python
from pathlib import Path

Tensor.no_grad = True

# default settings
TEMPERATURE = float(os.getenv("TEMPERATURE", 0.85))

def build_transformer(model_path: Path, shard: Shard, model_size="8B", devices=None):
    ...

class TinygradDynamicShardInferenceEngine(InferenceEngine):
    ...
```
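For anyone who can't access the fork, here is a minimal, hypothetical sketch of what a `get_available_gpus` helper could look like. It is not the actual code from the fork, and it assumes `nvidia-smi` is on the PATH:

```python
import subprocess
from typing import List

def get_available_gpus() -> List[int]:
    """Return the indices of the NVIDIA GPUs visible on this node."""
    try:
        # nvidia-smi prints one line per GPU, e.g.
        # "GPU 0: NVIDIA GeForce RTX 3090 (UUID: ...)"
        out = subprocess.check_output(["nvidia-smi", "--list-gpus"], text=True)
    except (OSError, subprocess.CalledProcessError):
        return []  # no driver available or no GPUs found
    return list(range(len(out.strip().splitlines())))
```

On a node with an RTX 3070 and an RTX 3090 this should return `[0, 1]`.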
@jorge123255 - Thanks for the update. I tried, but couldn't get it going, as I'm a beginner at this. If I could get the file, it would be a great help.
@udupicloud Here, I've uploaded the changes to my GitHub.
After modifying the code according to your method, I encountered the following error:
Ah, let me fix that.
Is it solved now?
Sorry, I will get to it soon. I had an unexpected death in the family.
I'm sorry to hear that, and I believe everything will be fine.
I am running the setup on Ubuntu 22.04 with Python 3.12; all the NVIDIA drivers, including CUDA 12.4, have been installed. I also installed the Llama 3.2 models. Each ML machine has two GPUs (an RTX 3070 and an RTX 3090).
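As a quick sanity check before launching exo, you can confirm the driver actually enumerates both cards. A minimal sketch, assuming the `nvidia-ml-py` bindings (imported as `pynvml`) are installed; this is just a generic check, not part of exo:

```python
# Sanity check: list the GPUs the NVIDIA driver can see.
# Assumes `pip install nvidia-ml-py`; illustrative only.
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    name = pynvml.nvmlDeviceGetName(pynvml.nvmlDeviceGetHandleByIndex(i))
    if isinstance(name, bytes):  # older bindings return bytes
        name = name.decode()
    print(f"GPU {i}: {name}")  # expect both the RTX 3070 and the RTX 3090
pynvml.nvmlShutdown()
```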