
Can rknn-llm assign a specific backend to individual operators? #225

Open
wohaiaini opened this issue Mar 18, 2025 · 6 comments

Comments

@wohaiaini

When running LLM inference, is it possible to specify which operators run on the CPU and which on the NPU, or is that all determined internally? Thanks.

@waydong
Collaborator

waydong commented Mar 19, 2025

Hello, this is not currently supported.

@wohaiaini
Author

Thanks. One more question: does rkllm provide a way to obtain intermediate-layer outputs, for example printing the attention scores or the post-softmax probability tensor?

@waydong
Collaborator

waydong commented Mar 20, 2025

> Thanks. One more question: does rkllm provide a way to obtain intermediate-layer outputs, for example printing the attention scores or the post-softmax probability tensor?

Currently only the LAST_HIDDEN_LAYER can be returned.

@wohaiaini
Author

wohaiaini commented Mar 26, 2025

> Thanks. One more question: does rkllm provide a way to obtain intermediate-layer outputs, for example printing the attention scores or the post-softmax probability tensor?

> Currently only the LAST_HIDDEN_LAYER can be returned.

How is this result meant to be used? Is there a demo? @waydong
I wrote my own demo that computes the logits, applies softmax, and outputs the predicted tokens, but the result looks garbled, as follows:

)

a<|1|im =>
are the bot assistant, You ||im...|> You you|im_3|> is_ <|im_end|>|im_1|>user<<

@wohaiaini wohaiaini reopened this Mar 26, 2025
@waydong
Collaborator

waydong commented Mar 26, 2025

@wohaiaini
Author

Yes, that is exactly the parameter I set to obtain the last_hidden_layer.bin file. I then wrote the follow-up steps in C++: computing the logits, applying softmax, and producing the predicted tokens. The problem is that the output is not very reasonable; it reads as incoherent.
rkllm_infer_params.mode = RKLLM_INFER_GET_LAST_HIDDEN_LAYER;

I'm not sure where the problem is. Is there anything in particular to watch out for or to implement?
Is there a C++ demo showing how to use this bin file?
Or should I post my code here so you can take a look?
(I printed and compared the tensor values in last_hidden_layer.bin and lm_head.bin; as far as I can tell, this initial data is fine.)
