[Bugfix] Adds outlines performance improvement#5053

Closed
lynkz-matt-psaltis wants to merge 5 commits into vllm-project:main from lynkz-matt-psaltis:feature/outlines-latest-perf

Conversation

@lynkz-matt-psaltis

Borrows the outlines upgrade from #4109 (targeting the `Guide` interface) and detects state resets to clear the cache.

@lynkz-matt-psaltis lynkz-matt-psaltis force-pushed the feature/outlines-latest-perf branch 2 times, most recently from af2c735 to 93481e2 Compare June 8, 2024 06:55
@lynkz-matt-psaltis lynkz-matt-psaltis force-pushed the feature/outlines-latest-perf branch from 93481e2 to d866130 Compare June 8, 2024 06:59
raise TypeError(f"Unsupported instruction type {type(instruction)}")

# Retrieve allowed tokens from cache using the current state
cacheKey = instruction.id
Contributor


It seems to me that `instruction` has no attribute named `id`.

Contributor


Yes, with outlines==0.0.46 there doesn't seem to be an `id` attribute. I'm testing with

cacheKey = hash(tuple(allowed_tokens))

instead
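The idea can be sketched as a small lookup keyed by the token list itself. This is a hypothetical, simplified version of the caching discussed above (function and cache names are mine, not from the PR); it still saves the repeated array construction even though the allowed tokens must be known to compute the key:

```python
import numpy as np

# Hypothetical module-level cache; the real PR caches per-FSM-state.
_allowed_tokens_cache = {}

def get_allowed_tokens_array(allowed_tokens):
    """Return a cached int32 array for this list of allowed token ids.

    Keyed by hash(tuple(...)) since outlines==0.0.46 no longer exposes
    an `id` on the instruction object.
    """
    cache_key = hash(tuple(allowed_tokens))
    cached = _allowed_tokens_cache.get(cache_key)
    if cached is None:
        cached = np.array(allowed_tokens, dtype=np.int32)
        _allowed_tokens_cache[cache_key] = cached
    return cached
```

On a cache hit the same array object is returned, so the per-step conversion cost is paid only once per distinct allowed-token set.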

# Cache miss, calculate allowed tokens and cache them

np_allowed_tokens = np.array(allowed_tokens, dtype=np.int32)
allowed_tokens_tensor = torch.from_numpy(np_allowed_tokens).pin_memory()
Contributor


I'm doing some of my testing with the cpu backend. I don't know if there is a better way to test for the availability of pin_memory() but I'm running it with:

allowed_tokens_tensor = torch.from_numpy(np_allowed_tokens)
try:
    allowed_tokens_tensor = allowed_tokens_tensor.pin_memory()
except NotImplementedError:
    pass
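An alternative to catching the exception is to gate pinning on whether a CUDA device is present at all, since page-locked host memory only helps host-to-device copies. A minimal sketch (the helper name is mine, not from the PR):

```python
import numpy as np
import torch

def to_tensor_maybe_pinned(allowed_tokens):
    """Convert a token-id list to an int32 tensor, pinning only
    when a CUDA device is available (the CPU backend cannot pin)."""
    t = torch.from_numpy(np.array(allowed_tokens, dtype=np.int32))
    if torch.cuda.is_available():
        t = t.pin_memory()
    return t
```

The try/except above is more defensive (it also covers builds where pinning is compiled out); the explicit check just avoids relying on exceptions for control flow.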

@maxdebayser
Contributor

I'm testing this change with this request:

curl http://localhost:8000/v1/completions   -H "Content-Type: application/json"   -d '{
    "model": "facebook/opt-125m",
    "prompt": ["Here is an example of a JSON document representing a user record: "],
    "max_tokens": 1000,
    "min_tokens": 900,
    "temperature": 0,
    "guided_decoding_backend": "outlines",
    "response_format": {"type":"json_object"},
    "frequency_penalty": 2.0,
    "presence_penalty": 2.0
  }' 

I'm using a very small model on purpose so that a potential performance improvement can stand out more against the time it takes to run the model's forward().

Excluding the first call and averaging the next 3 calls, the result is 22s both with the baseline and with the changes in this PR, so I'm not seeing a big difference. But perhaps this is a case that doesn't benefit from the caching in this PR. FYI, this test was run on an 80GB A100.

@mergify

mergify bot commented Jan 16, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @lynkz-matt-psaltis.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jan 16, 2025
@hmellor
Member

hmellor commented Feb 18, 2025

Closing as this PR is quite old and conflicts with main. If you want to continue working in this PR (or open a new one) please feel free to do so!

@hmellor hmellor closed this Feb 18, 2025


5 participants