feature/docker_improvements #4768

Merged
oobabooga merged 13 commits into oobabooga:dev from Callum17:feature/docker_improvements on Nov 30, 2023

Conversation

@Callum17 (Contributor) commented Nov 29, 2023

This is the cleanup requested in #4144 (comment).

Changes

  • dropped venv
  • smaller / simpler Dockerfile definitions
  • ability to set app UID and GID permissions (handy for cloud deployments); see the sketch after this list
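
A minimal sketch of the UID/GID idea, assuming build args named APP_UID/APP_GID (the arg names, defaults, and base image here are illustrative, not necessarily what this PR uses):

```dockerfile
# Illustrative only: arg names, defaults, and base image are assumptions.
FROM ubuntu:22.04

ARG APP_UID=1000
ARG APP_GID=1000

# Create a non-root user with the requested IDs so files created in
# mounted volumes match the host (or cloud) user's ownership.
RUN groupadd -g "${APP_GID}" app && \
    useradd -m -u "${APP_UID}" -g app app

USER app
WORKDIR /home/app/text-generation-webui
```

For example, building with `docker build --build-arg APP_UID=$(id -u) --build-arg APP_GID=$(id -g) .` keeps container-created files owned by the invoking user.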

Known issues

i. Some of the core dependencies conflict with extension dependencies, so I've made extension installation configurable via the BUILD_EXTENSIONS build arg.
These issues should be fixed downstream in the extensions.
Known culprits from extensions (there may be more):

ii. GPTQ-for-Llama is broken by dependency upgrades. I discovered the conflicts at build time and resolved them by loosening the constraints in GPTQ-for-Llama's requirements.txt, but it looks like there were breaking changes that cause failures at inference time.

The Dockerfile effectively applies the following changes:
oobabooga/GPTQ-for-LLaMa@cuda...Callum17:GPTQ-for-LLaMa:bugfix/text-generation-webui-dependency-conflicts

But it was a naive fix attempt.
Running a GPTQ model with actorder and group_size yielded the following error:

 File "/home/app/text-generation-webui/modules/ui_model_menu.py", line 209, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
File "/home/app/text-generation-webui/modules/models.py", line 85, in load_model
output = load_func_map[loader](model_name)
File "/home/app/text-generation-webui/modules/models.py", line 336, in GPTQ_loader
model = modules.GPTQ_loader.load_quantized(model_name)
File "/home/app/text-generation-webui/modules/GPTQ_loader.py", line 141, in load_quantized
model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, pre_layer)
File "/home/app/.local/lib/python3.10/site-packages/gptq_for_llama/gptq_old/llama_inference_offload.py", line 236, in load_quant
model.load_state_dict(safe_load(checkpoint))
File "/home/app/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
    Unexpected key(s) in state_dict: "model.layers.0.self_attn.k_proj.g_idx", "model.layers.0.self_attn.o_proj.g_idx",

@oobabooga do we want to try to fix the issues with the GPTQ-for-Llama fork, or are we planning to drop support for it? In the latter case, I'll remove the references to it in this PR.
I'm not actually sure whether GPTQ-for-Llama works in the current Docker image build.

Testing
Otherwise this Docker build seems fine. Successfully tested inference with Transformers, GGUF, AutoGPTQ, ExLlama, and ExLlamav2.

@oobabooga (Owner) commented

@Callum17 thanks for the new PR! GPTQ-for-LLaMa is a legacy loader that is only maintained because it is the only way to run GPTQ models on Pascal cards. Earlier this year there were additional compilation requirements to install it, but now it is simply a CUDA 12.1 wheel in the requirements.txt. If there is any special treatment of GPTQ-for-LLaMa in the Dockerfile, like cloning a repository or trying to compile it, it should be removed.
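
For illustration, the kind of entry being described is a prebuilt wheel pinned directly in requirements.txt, something like the line below (the URL and version are placeholders, not the real entry):

```text
# hypothetical requirements.txt line; URL and version are illustrative
gptq_for_llama @ https://example.com/wheels/gptq_for_llama-0.1.0+cu121-cp310-cp310-linux_x86_64.whl
```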

About extensions requirements: in the one-click installer, they are installed before the web UI requirements. This way, the web UI takes precedence and will not break due to an extension. I think that's the best way to do it.
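
A minimal sketch of that ordering with plain pip (the one-click installer's actual steps differ; this just shows why the web UI's pins win):

```bash
# Install every extension's requirements first...
for req in extensions/*/requirements.txt; do
    pip install -r "$req"
done

# ...then install the web UI's own requirements last, so any conflicting
# pins are resolved in the web UI's favor.
pip install -r requirements.txt
```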

If these two are already okay, let me know and I'll merge the PR.

@Callum17 (Contributor, Author) commented Nov 29, 2023

> but now it is simply a CUDA 12.1 wheel in the requirements.txt

That simplifies things :)
Added a commit to drop additional Docker build steps for GPTQ-for-LLaMa.

I did rebuild and try inference again with GPTQ-for-LLaMa.
Specifically with https://huggingface.co/TheBloke/LLaMa-7B-GPTQ -b gptq-4bit-32g-actorder_True
It failed with the same error as before. I think the issue is not so much the binaries as interface changes in the other packages.
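
For reference, that branch can be pulled with the repo's download script; treat the exact invocation below as an assumption about the script's interface:

```bash
# Assumed invocation; --branch selects the quantization branch on the HF repo.
python download-model.py TheBloke/LLaMa-7B-GPTQ --branch gptq-4bit-32g-actorder_True
```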

Do we expect GPTQ-for-LLaMa to be working?

> This way, the web UI takes precedence and will not break due to an extension. I think that's the best way to do it.

The Dockerfile build will simply fail and give you a list of conflicts if you try to install any incompatible extensions via the optional BUILD_EXTENSIONS ARG. It is probably better to fail at build time than at runtime, where problems are harder to catch.
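
For example, something like the following, where the extension names are placeholders and the exact value format for BUILD_EXTENSIONS is an assumption:

```bash
# Bake selected extensions in at build time; any dependency conflict then
# surfaces as a build failure instead of a runtime error.
docker build --build-arg BUILD_EXTENSIONS="openai,whisper_stt" -t text-generation-webui .
```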

@mongolu (Contributor) commented Nov 30, 2023

@Callum17, I have a ❓
Why not use the one-click installer in the Dockerfile?
It would make things a lot easier.

@oobabooga (Owner) commented

> Specifically with https://huggingface.co/TheBloke/LLaMa-7B-GPTQ -b gptq-4bit-32g-actorder_True
> It failed with the same error as before. I think the issue is not so much the binaries as interface changes in the other packages.

That's expected. GPTQ-for-LLaMa doesn't work with models that use both groupsize and actorder, like this one.

Since the changes have been tested, let's merge the PR. I appreciate the help with improving the Dockerfile -- it was stuck in March 2023 before this PR.

oobabooga merged commit 88620c6 into oobabooga:dev on Nov 30, 2023
