
How to use the mm-cot framework as a utility library with a local LLM? #73

Open
dszpr opened this issue Feb 5, 2024 · 1 comment

@dszpr

dszpr commented Feb 5, 2024

Hi! Many thanks for the excellent work!

I am working on a vision-QA task using BLIP2, which consists of three modules:
- a ViT that extracts visual features;
- a Q-Former that narrows the gap between the vision and language modalities;
- a T5-XXL that receives the question and the Q-Former's output to generate answers.

I wonder if it's possible to employ MM-CoT as a utility library within the BLIP2 model to enhance vision-QA inference?
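For reference, the three-module pipeline described above can be sketched as follows. This is a minimal illustration with hypothetical stub functions, not the real BLIP2 / LAVIS code; the function names simply mirror the module descriptions.

```python
# Hypothetical sketch of the BLIP2 inference flow described above.
# All three "modules" are stubs standing in for the real networks.

def vit_encode(image):
    """ViT: extract visual features from the raw image (stub)."""
    return {"patch_features": [hash(image) % 97]}  # placeholder features

def qformer_bridge(vision_features, question):
    """Q-Former: project visual features toward the language space (stub)."""
    return {"query_tokens": vision_features["patch_features"],
            "question": question}

def t5xxl_generate(bridged_inputs):
    """T5-XXL: consume the question plus Q-Former output, emit an answer (stub)."""
    return f"answer to: {bridged_inputs['question']}"

def blip2_vqa(image, question):
    """Chain the three modules: ViT -> Q-Former -> T5-XXL."""
    features = vit_encode(image)
    bridged = qformer_bridge(features, question)
    return t5xxl_generate(bridged)
```

In a real setup each stub would be replaced by the corresponding pretrained component, but the data flow (image features bridged into the language model together with the question) stays the same.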

@cooelf
Contributor

cooelf commented May 19, 2024

Hi, thanks for your interest! An efficient way would be to train your framework in just two steps, like MM-CoT: (i) rationale generation; (ii) answer inference, regardless of what the backbone modules are.
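The two-step scheme above could be wired around any backbone roughly like this. A hedged sketch: `generate` is a hypothetical stand-in for whatever seq2seq call your backbone exposes (e.g. the T5 decoder in BLIP2), and the prompt formats are illustrative, not MM-CoT's exact templates.

```python
# Minimal sketch of MM-CoT's two-step inference, with a stub generator.

def generate(prompt):
    """Hypothetical stand-in for the backbone's generate() call (stub)."""
    if "Solution:" in prompt:
        return "Because the sky scatters blue light."  # stage-1 rationale (stub)
    return "(B) blue"                                  # stage-2 answer (stub)

def mmcot_infer(question, context, options):
    # Step (i): rationale generation -- prompt the model to produce a
    # chain-of-thought before committing to an answer.
    stage1_input = f"{question}\nContext: {context}\nOptions: {options}\nSolution:"
    rationale = generate(stage1_input)

    # Step (ii): answer inference -- append the generated rationale to the
    # original input and let the model predict the final answer.
    stage2_input = (f"{question}\nContext: {context}\nOptions: {options}\n"
                    f"{rationale}\nAnswer:")
    answer = generate(stage2_input)
    return rationale, answer
```

Training mirrors this: one fine-tuning pass where the target is the rationale, and a second where the input is augmented with the rationale and the target is the answer.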
