
CodeGen Converter #229

Closed
michaelfeil wants to merge 36 commits into vllm-project:main from michaelfeil:gpt_j_convert

Conversation

@michaelfeil
Contributor

@michaelfeil michaelfeil commented Jun 24, 2023

This PR aims to integrate CodeGen. Work in progress, not ready.

@zhuohan123
Member

zhuohan123 commented Jun 26, 2023

Thank you for your contribution! Please let us know when this PR is ready for review!

@zhuohan123
Member

@michaelfeil Are you still working on the CodeGen model support? Do you need any help from our side?

@michaelfeil
Contributor Author

I'll try to get back to you in the coming days!

@michaelfeil michaelfeil marked this pull request as draft August 3, 2023 21:39
@zhuohan123 zhuohan123 added the new-model (Requests to new models) label Sep 12, 2023
@WoosukKwon
Collaborator

Closed as this PR is too old.

@WoosukKwon WoosukKwon closed this Dec 12, 2023
yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024
jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Sep 11, 2024
This fixes a very silly issue where mismatched values of the `warmup_mode` flag could cause graph recompilations and, eventually, memory leaks.
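As a hedged illustration of that failure mode (the cache and function names here are invented for the sketch, not actual vLLM internals), the bug class looks like a compilation cache whose key includes a flag that callers pass inconsistently:
```
# Illustrative sketch only; names are hypothetical, not vLLM internals.
_graph_cache = {}

def compile_graph(batch_shape):
    # Stand-in for an expensive device-graph compilation.
    return f"graph<{batch_shape}>"

def run_model(batch_shape, warmup_mode=False):
    # The flag leaks into the cache key, so mismatched values for the
    # same workload miss the cache, recompile, and keep stale graphs
    # alive: repeated recompilations and, eventually, a memory leak.
    key = (batch_shape, warmup_mode)
    if key not in _graph_cache:
        _graph_cache[key] = compile_graph(batch_shape)
    return _graph_cache[key]

# Two compiled graphs for the same shape, differing only in the flag:
run_model((4, 128), warmup_mode=True)
run_model((4, 128), warmup_mode=False)
assert len(_graph_cache) == 2
```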
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Sep 15, 2025
This PR adds pooling support for vllm-ascend.

Tested with `bge-base-en-v1.5` via `encode`:
```
from vllm import LLM

# Sample prompts.
prompts = [
  "Hello, my name is",
  "The president of the United States is",
  "The capital of France is",
  "The future of AI is",
]
# Create an LLM.
model = LLM(model="./bge-base-en-v1.5", enforce_eager=True)
# Generate embedding. The output is a list of EmbeddingRequestOutputs.
outputs = model.encode(prompts)
# Print the outputs.
for output in outputs:
    print(output.outputs.embedding)  # list of 768 floats
```

Tested via `embed`:
```
from vllm import LLM

llm = LLM(model="./bge-base-en-v1.5", task="embed")
(output,) = llm.embed("Hello, my name is")

embeds = output.outputs.embedding
print(f"Embeddings: {embeds!r} (size={len(embeds)})")
```

Related: vllm-project/vllm-ascend#200

## Known issue
The accuracy is not correct yet, since this feature relies on `enc-dec` support. That will be added in a follow-up PR by @MengqingCao.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
iwooook pushed a commit to moreh-dev/vllm that referenced this pull request Nov 29, 2025
…uired (vllm-project#232)

Extend vLLM v0 by adding automatic compat-sampling fallbacks to device sampling. This offers substantial performance benefits when not all requests require advanced features like structured outputs.

`always_compat_sampling` should now never need to be set in production. Whenever something otherwise unsupported is detected, such as:

- structured outputs
- non-greedy sampling for models that don't support it on device
- non-uniform top-k/top-p on the host, or with a model that doesn't support it

the system switches to host compat sampling. This also means that we never override the temperature, p, or k to force them to be uniform across the batch.

Also contains a fix for vllm-project#229 (it will fall back to compat sampling if non-uniform sampling is not supported on the device, or if device sampling is disabled).
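As a minimal sketch of the fallback decision described above (the function, the `Request` fields, and the capability flags here are hypothetical illustrations, not the actual vLLM API):
```
from dataclasses import dataclass

# Hypothetical per-request sampling state, for illustration only.
@dataclass
class Request:
    temperature: float
    top_k: int
    top_p: float
    structured_output: bool = False

def needs_host_compat_sampling(batch, device_supports_nongreedy,
                               device_supports_nonuniform):
    # Structured outputs always force the host fallback.
    if any(r.structured_output for r in batch):
        return True
    # Non-greedy sampling on a device that cannot do it.
    if any(r.temperature > 0 for r in batch) and not device_supports_nongreedy:
        return True
    # Non-uniform top-k/top-p across the batch falls back instead of
    # overriding the values to force uniformity.
    uniform = len({(r.top_k, r.top_p) for r in batch}) == 1
    return not uniform and not device_supports_nonuniform
```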
