VATEX data process: Do we need to split VATEX video into clips? #20

Open
fightingaaa opened this issue Nov 8, 2023 · 5 comments
Comments

@fightingaaa

fightingaaa commented Nov 8, 2023

Hi, thanks for sharing your great work.
I see annotations like the one below:

{'id': 'VATEX_zkbnKBewRLA_000069_000079', 'v_id': 'zkbnKBewRLA_000069_000079', 'video': 'v_zkbnKBewRLA.mp4', 'source': 'VATEX', 'conversations': [{'from': 'human', 'value': '

Do we need to split the VATEX videos into clips? For example, cut the video v_zkbnKBewRLA.mp4 into the clip zkbnKBewRLA_000069_000079.
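For readers who want to try the cut described in this question, here is a minimal sketch. It assumes the two six-digit suffixes in `v_id` are the clip's start and end times in seconds, that the source file is named `v_<YouTube id>.mp4` as in the annotation above, and that `ffmpeg` is installed. The helper names `parse_vatex_id` and `ffmpeg_clip_command` are hypothetical and are not part of the Valley repository.

```python
import re


def parse_vatex_id(v_id):
    """Split an id such as 'zkbnKBewRLA_000069_000079' into (video_id, start_s, end_s).

    Assumption: the two trailing six-digit fields are start/end times in seconds.
    """
    m = re.fullmatch(r"(.+)_(\d{6})_(\d{6})", v_id)
    if m is None:
        raise ValueError(f"unexpected v_id format: {v_id!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))


def ffmpeg_clip_command(src, v_id, out_dir="."):
    """Build (but do not run) an ffmpeg command that cuts the clip named by v_id."""
    _, start, end = parse_vatex_id(v_id)
    out_path = f"{out_dir}/{v_id}.mp4"
    return [
        "ffmpeg", "-y",
        "-i", src,                 # full source video
        "-ss", str(start),         # clip start, in seconds
        "-t", str(end - start),    # clip duration, in seconds
        out_path,                  # re-encoded with ffmpeg's default codecs
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Placing `-ss` after `-i` makes ffmpeg decode up to the start point, which is slower than input seeking but cuts at the requested time rather than at the nearest keyframe.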

@yeliudev

Same question.

@RupertLuo
Owner

RupertLuo commented Feb 19, 2024 via email

Yes.

@yeliudev

Many thanks for your reply! I also noticed that the VATEX annotations in Valley_instruct_73k.json are repeated twice (the jukin annotations are not). Why was this setting used?

@RupertLuo
Owner

This is a bug from when I first processed the data. The actual number of valley_instruct samples is less than 73k. I will put a notice in the README and fix this bug ASAP. You can mix the three files below to get the right data.
image
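
As a stopgap for the repeated entries discussed in this thread, the sketch below drops exact duplicate records from an annotation file. It is not the owner's fix (that is to mix the three files shown in the image). It assumes the JSON file holds a single list of annotation dicts, and the function names are hypothetical.

```python
import json


def drop_exact_duplicates(records):
    """Return records with exact repeats removed, keeping first occurrences in order."""
    seen = set()
    unique = []
    for rec in records:
        # Serialize with sorted keys so key order does not affect the comparison.
        key = json.dumps(rec, sort_keys=True, ensure_ascii=False)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def dedupe_file(src_path, dst_path):
    """Read a JSON list, write the de-duplicated list, return (before, after) counts."""
    with open(src_path, encoding="utf-8") as f:
        records = json.load(f)
    unique = drop_exact_duplicates(records)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(unique, f, ensure_ascii=False)
    return len(records), len(unique)
```

Comparing whole records rather than only the `id` field avoids assuming that `id` is unique per sample, which this thread does not confirm.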

@yeliudev

I see. Thank you!
