[Bug]: Run error GLM-4-32B-0414

### Your current environment


ERROR 04-16 08:38:03 [core.py:387] RuntimeError: ('Worker failed with error %s, please check the stack trace above for the root cause', 'TypeError <built-in function linear>: linear(): argument \'input\' (position 1) must be Tensor, not tuple\n\nfrom user code:\n   File "/home/cheng/anaconda3/envs/VLLM/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 360, in forward\n    hidden_states, residual = layer(positions, hidden_states, residual)\n  File "/home/cheng/anaconda3/envs/VLLM/lib/python3.12/site-packages/vllm/model_executor/models/glm4.py", line 204, in forward\n    hidden_states = self.mlp(hidden_states)\n  File "/home/cheng/anaconda3/envs/VLLM/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 92, in forward\n    x, _ = self.gate_up_proj(x)\n  File "/home/cheng/anaconda3/envs/VLLM/lib/python3.12/site-packages/vllm/model_executor/layers/linear.py", line 474, in forward\n    output_parallel = self.quant_method.apply(self, input_, bias)\n  File "/home/cheng/anaconda3/envs/VLLM/lib/python3.12/site-packages/vllm/model_executor/layers/linear.py", line 191, in apply\n    return F.linear(x, layer.weight, bias)\n\nSet TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information\n\n\nYou can suppress this exception and fall back to eager by setting:\n    import torch._dynamo\n    torch._dynamo.config.suppress_errors = True\n')
ERROR 04-16 08:38:03 [core.py:387]
CRITICAL 04-16 08:38:03 [core_client.py:359] Got fatal signal from worker processes, shutting down. See stack trace above for root cause issue.
Killed
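The log itself recommends setting `TORCH_LOGS="+dynamo"` and `TORCHDYNAMO_VERBOSE=1` to get more detail. A minimal sketch of preparing such an environment for a child process is below. The helper name `dynamo_debug_env` is hypothetical, and the launch command is not given in this report, so it is left out.

```python
import os


def dynamo_debug_env(base=None):
    """Return a copy of the environment with the two debug variables named in the log.

    `base` is an optional mapping to copy from; by default the current
    process environment is used. The original mapping is not modified.
    """
    env = dict(os.environ if base is None else base)
    env["TORCH_LOGS"] = "+dynamo"        # value quoted from the error message
    env["TORCHDYNAMO_VERBOSE"] = "1"     # value quoted from the error message
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` together with whatever command was used to start the model, so the rerun produces the more verbose Dynamo trace.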


### 🐛 Describe the bug


Running GLM-4-32B-0414 with vLLM fails during worker execution with the same error shown under "Your current environment" above. The root exception is `TypeError: linear(): argument 'input' (position 1) must be Tensor, not tuple`, raised at `F.linear(x, layer.weight, bias)` in `vllm/model_executor/layers/linear.py` (line 191). It is reached from `hidden_states = self.mlp(hidden_states)` in `vllm/model_executor/models/glm4.py` (line 204), and the worker process is killed afterwards.
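For readers unfamiliar with this class of error, here is a small standalone sketch of how `F.linear` reacts when it is handed a tuple instead of a tensor. It is not vLLM code and does not claim to identify the actual root cause. `toy_parallel_linear` is a hypothetical stand-in that returns an `(output, bias)` pair, mirroring the `x, _ = self.gate_up_proj(x)` call site visible in the trace, and all shapes are made up.

```python
import torch
import torch.nn.functional as F


def toy_parallel_linear(x, weight):
    """Hypothetical layer that returns (output, bias) instead of a bare tensor."""
    return F.linear(x, weight), None


weight_a = torch.randn(4, 8)   # maps 8 features to 4
weight_b = torch.randn(3, 4)   # maps 4 features to 3
hidden_states = torch.randn(2, 8)

# The layer returns a tuple, not a tensor.
result = toy_parallel_linear(hidden_states, weight_a)

# Passing the whole tuple on to the next linear call raises a TypeError.
try:
    F.linear(result, weight_b)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# Unpacking the tuple first gives the next call the tensor it expects.
output, _ = result
next_output = F.linear(output, weight_b)
```

If something similar happens inside the model code, the likely place to look is whichever call feeds `self.mlp(hidden_states)` a tuple, but that is an assumption based only on the trace above.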


### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.

[Bug]: Run error GLM-4-32B-0414 #16687
