
feat: Allow passing arbitrary Tensorizer serialization and deserialization kwargs; update docs #7

Merged
sangstar merged 24 commits into sangstar/tensorizer-aws-update-and-any-kwargs from sangstar/update-tensorizer-frontend-and-docs
Jun 12, 2025
Conversation

@sangstar sangstar commented May 30, 2025

This PR targets the sangstar/tensorizer-aws-update-and-any-kwargs branch, which aims to:

  • Expose arbitrary kwargs to Tensorizer from vLLM
  • Harmonize Tensorizer's AWS-style boto3 support with vLLM's
  • Update the docs to reflect these changes, greatly expanding on the intended usage pattern so that users can get started right after reading them

This is the kwargs-update portion. This PR:

  • Exposes arbitrary kwargs to TensorSerializer and TensorDeserializer, with tests to confirm functionality
  • Updates the docs thoroughly to reflect straightforward usage, including the new changes. The docs are not finished yet; they will be updated again alongside the AWS-update feature in the eventual PR for the aforementioned branch.

The changes are summarized by the following snippet from the updated docs, which shows how to load a model using Tensorizer with arbitrary kwargs without much pain in constructing the JSON string. The kwargs here are just throwaways demonstrating the support.

#!/bin/sh

MODEL_LOADER_EXTRA_CONFIG='{
  "tensorizer_uri": "s3://my-bucket/vllm/facebook/opt-125m/v1/model.tensors",
  "stream_kwargs": {"force_http": false},
  "deserialization_kwargs": {"verify_hash": true, "num_readers": 8}
}'

vllm serve facebook/opt-125m \
  --load-format=tensorizer \
  --model-loader-extra-config="$MODEL_LOADER_EXTRA_CONFIG"
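
For anyone launching the server from Python rather than a shell, the same JSON string can be built with `json.dumps` instead of being written by hand. This is only a minimal sketch: the `build_serve_command` helper is hypothetical and not part of this PR, and the kwargs are the same throwaway values as in the shell script above.

```python
import json
import shlex

# Same throwaway values as the shell example; note Python booleans
# (False/True) are serialized to JSON booleans (false/true).
extra_config = {
    "tensorizer_uri": "s3://my-bucket/vllm/facebook/opt-125m/v1/model.tensors",
    "stream_kwargs": {"force_http": False},
    "deserialization_kwargs": {"verify_hash": True, "num_readers": 8},
}


def build_serve_command(model: str, config: dict) -> list[str]:
    """Hypothetical helper: assemble the argv for `vllm serve`."""
    return [
        "vllm",
        "serve",
        model,
        "--load-format=tensorizer",
        f"--model-loader-extra-config={json.dumps(config)}",
    ]


command = build_serve_command("facebook/opt-125m", extra_config)

# Print a shell-quoted form that could be pasted into a terminal.
print(shlex.join(command))
```

Passing `command` as a list to something like `subprocess.run` avoids shell quoting issues with the embedded JSON entirely.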

Please hold off on any requests for changes relating to formatting. This will all be done in a later stage with vLLM's formatter.

sangstar added 13 commits May 28, 2025 16:42
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
…` and add to test

Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
…serializer`

Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
@sangstar sangstar requested review from Eta0 and arsenetar May 30, 2025 15:17
@sangstar sangstar changed the title from "Sangstar/update tensorizer frontend and docs" to "feat: Allow passing arbitrary Tensorizer serialization and deserialization kwargs; update docs" May 30, 2025
@sangstar sangstar self-assigned this May 30, 2025
@sangstar sangstar requested a review from wbrown June 3, 2025 16:09
sangstar added 2 commits June 3, 2025 15:18
Adjusts the regex string in `arg_utils.parse_type` to allow
for newlines within the JSON string

Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
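
The newline issue mentioned in the commit message above can be illustrated in isolation. This is a sketch with a hypothetical pattern, not the actual regex in `arg_utils.parse_type`: by default `.` does not match a newline in Python's `re` module, so a pattern written for single-line JSON rejects a multi-line string unless something like `re.DOTALL` is used.

```python
import json
import re

# Hypothetical pattern, for illustration only; it is NOT the regex used in vLLM.
JSON_OBJECT = r"\{.*\}"

# A JSON string spread over several lines, as in the shell example.
multiline_json = '{\n  "num_readers": 8\n}'

# Without DOTALL, "." stops at the first newline, so the match fails.
single_line_only = re.fullmatch(JSON_OBJECT, multiline_json)

# With DOTALL, "." also matches newlines, so the whole string matches.
with_newlines = re.fullmatch(JSON_OBJECT, multiline_json, flags=re.DOTALL)

print(single_line_only is None)
print(with_newlines is not None)

# The matched text is still ordinary JSON and parses as usual.
parsed = json.loads(with_newlines.group(0))
print(parsed)
```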
Comment on lines -275 to -279
self.deserializer_params = {
"verify_hash": self.verify_hash,
"encryption": self.encryption_keyfile,
"num_readers": self.num_readers
}

Please make sure that these three attributes are accounted for somewhere in self.deserializer_kwargs instead, even if they're deprecated.

Collaborator Author

As with the handling of the stream params, I'll bring this code snippet back and let any values in self.deserialization_kwargs override it, like:

        self.deserializer_params = {
            "verify_hash": self.verify_hash,
            "encryption": self.encryption_keyfile,
            "num_readers": self.num_readers
        }

        self.deserializer_params.update(**self.deserialization_kwargs)
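
As a self-contained illustration of the override order proposed above, here is a minimal sketch. The `LoaderConfigSketch` class is hypothetical and only mirrors the attribute names from the snippet; it is not the real vLLM config class.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class LoaderConfigSketch:
    """Hypothetical stand-in for the config object; not the vLLM class."""

    # Older, dedicated attributes (kept for backwards compatibility).
    verify_hash: bool = False
    encryption_keyfile: Optional[str] = None
    num_readers: Optional[int] = None
    # Newer, arbitrary kwargs forwarded to the deserializer.
    deserialization_kwargs: dict[str, Any] = field(default_factory=dict)

    def build_deserializer_params(self) -> dict[str, Any]:
        # Start from the dedicated attributes...
        params: dict[str, Any] = {
            "verify_hash": self.verify_hash,
            "encryption": self.encryption_keyfile,
            "num_readers": self.num_readers,
        }
        # ...then let explicitly passed kwargs take precedence.
        params.update(self.deserialization_kwargs)
        return params


config = LoaderConfigSketch(
    num_readers=2,
    deserialization_kwargs={"num_readers": 8, "verify_hash": True},
)
params = config.build_deserializer_params()
print(params)
```

Because `dict.update` is applied last, a key present in both places takes its value from `deserialization_kwargs`, while attributes that are not overridden (here `encryption`) still reach the deserializer.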

sangstar and others added 5 commits June 10, 2025 11:10
Co-authored-by: Eta <24918963+Eta0@users.noreply.github.com>
Apply the batch of commits suggested in this review.

Co-authored-by: Eta <24918963+Eta0@users.noreply.github.com>
Also fixes the example script's argument parsing so that it
handles each combination of arguments, whether they are passed
directly as CLI args or packaged in
--model-loader-extra-config

Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
@sangstar sangstar requested a review from Eta0 June 10, 2025 19:26
Co-authored-by: Eta <24918963+Eta0@users.noreply.github.com>
sangstar and others added 3 commits June 11, 2025 10:29
Co-authored-by: Eta <24918963+Eta0@users.noreply.github.com>
Co-authored-by: Eta <24918963+Eta0@users.noreply.github.com>
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
@Eta0 Eta0 left a comment

All good. Just update the PR description with the simplified shell script, please.

@sangstar sangstar merged commit e85685e into sangstar/tensorizer-aws-update-and-any-kwargs Jun 12, 2025
2 participants