[model]Understanding video with images as in-context #276

kassy11 · 2023-09-18T06:15:58Z

I want to give the model some images as in-context examples, then input a video and ask questions about its content.
(Specifically, I would like to teach the model types of dogs using images and then have it count the number of dogs in the video.)

The Otter-image model accepts images as context but cannot take video input.
Conversely, the Otter-video model accepts video input but cannot take images as context.

Is there an optimal implementation method or model for this type of situation?

hcwei13 · 2023-10-26T06:40:47Z

I have the same need! Have you solved it?

king159 added the area:model code of model label Sep 25, 2023
