Using the MLC Python library currently outputs quite a bit of extra information:
    [21:56:11] /Users/catalyst/Workspace/mlc-ai-package-self-runner/_work/package/package/tvm/src/runtime/metal/metal_device_api.mm:167: Intializing Metal device 0, name=Apple M2 Max
    System automatically detected device: metal
    Using model folder: /Users/simon/Library/Application Support/io.datasette.llm/mlc/dist/prebuilt/mlc-chat-Llama-2-7b-chat-hf-q4f16_1
    Using mlc chat config: /Users/simon/Library/Application Support/io.datasette.llm/mlc/dist/prebuilt/mlc-chat-Llama-2-7b-chat-hf-q4f16_1/mlc-chat-config.json
    Using library model: /Users/simon/Library/Application Support/io.datasette.llm/mlc/dist/prebuilt/lib/Llama-2-7b-chat-hf-q4f16_1-metal.so
Since I'm building my own CLI tool for executing prompts against models, I needed a way to suppress this.
I ended up implementing a pretty convoluted set of hacks. First, to disable the print() statements in the mlc_chat.chat_module module:
    import mlc_chat.chat_module

    def noop(*args, **kwargs):
        pass

    mlc_chat.chat_module.__dict__["print"] = noop

But the first line - [21:56:11] /Users/catalyst/Workspace/mlc-ai-package... - wasn't disabled by that. So I had to use a really ugly hack from some code I wrote against llama-cpp-python:
    import os
    import sys

    from mlc_chat import ChatModule


    class SuppressOutput:
        def __enter__(self):
            # Save a copy of the current file descriptors for stdout and stderr
            self.stdout_fd = os.dup(1)
            self.stderr_fd = os.dup(2)
            # Open a file to /dev/null
            self.devnull_fd = os.open(os.devnull, os.O_WRONLY)
            # Replace stdout and stderr with /dev/null
            os.dup2(self.devnull_fd, 1)
            os.dup2(self.devnull_fd, 2)
            # Writes to sys.stdout and sys.stderr should still work
            self.original_stdout = sys.stdout
            self.original_stderr = sys.stderr
            sys.stdout = os.fdopen(self.stdout_fd, "w")
            sys.stderr = os.fdopen(self.stderr_fd, "w")

        def __exit__(self, exc_type, exc_val, exc_tb):
            # Flush anything still buffered in the temporary wrappers
            sys.stdout.flush()
            sys.stderr.flush()
            # Restore stdout and stderr to their original state
            os.dup2(self.stdout_fd, 1)
            os.dup2(self.stderr_fd, 2)
            # Close the wrappers, which also closes the saved copies of the
            # original stdout and stderr file descriptors that they own
            sys.stdout.close()
            sys.stderr.close()
            # Close the file descriptor for /dev/null
            os.close(self.devnull_fd)
            # Restore sys.stdout and sys.stderr
            sys.stdout = self.original_stdout
            sys.stderr = self.original_stderr


    with SuppressOutput():
        chat_mod = ChatModule(model="my-model")
        output = chat_mod.generate(prompt="my prompt")

My full code is here: https://github.com/simonw/llm-mlc/blob/df522d602de524a81c2d6ba1ef35a89f6da41f8b/llm_mlc.py
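For anyone wondering why replacing print() was not enough: my understanding is that the first log line comes from native code (the path points at a `.mm` file inside TVM), which presumably writes straight to a file descriptor and never goes through Python's `sys.stdout`. Here is a minimal sketch of that difference. It does not use MLC at all; `os.write(1, ...)` is just a stand-in I picked for "native code writing to the descriptor", and `demo` is a made-up name.

```python
import contextlib
import io
import os
import sys


def demo():
    """Return whatever a Python-level stdout redirect manages to capture."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        # Goes through sys.stdout, so the redirect sees it.
        print("python-level message")
        # Written directly to file descriptor 1, the way native code might;
        # sys.stdout is never consulted, so the redirect cannot intercept it.
        os.write(1, b"fd-level message\n")
    return captured.getvalue()


if __name__ == "__main__":
    text = demo()
    sys.stdout.write(f"captured by redirect: {text!r}\n")
```

Only the print() output ends up in the buffer, while the descriptor-level write still reaches the terminal. That is why the `os.dup2` approach, which swaps the descriptors themselves, is needed for the native log line.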
Earlier notes on this in this issue:
It would be great if this was easier! Posting it here because of this request for feedback on API usability: