feat: Integrate video reading #883
Thanks @ZackYule! Left some initial comments below.
- Update .gitignore to include new download configuration files and temporary video directory
- Refactor test_video_function.py to use VideoDownloaderToolkit class and update test cases
Thank you! @Wendong-Fan . During the testing process, I found that our test folder was named |
…nto_chunks parameter and updated the
I see, thanks @ZackYule!
Thanks for the update, @ZackYule. Some comments below; please also resolve the conflicts with the master branch.
camel/toolkits/video_toolkit.py
Outdated
    print(
        f'''Warning: cookies.txt file not found at path
        {cookies_path}.'''
    )
    self._cookies_path = None
Here we can return the warning message directly so the agent can use it.
Thanks @Wendong-Fan! This is a good idea. However, when we download certain videos, cookies may not be necessary. So this is just a warning and does not terminate the program.
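A minimal sketch of the non-fatal behaviour described above, assuming the check is factored into a helper. The name resolve_cookies_path is hypothetical and not part of this PR; it logs a warning through the standard logging module and falls back to None instead of raising.

```python
import logging
import os
from typing import Optional

logger = logging.getLogger(__name__)


def resolve_cookies_path(cookies_path: Optional[str]) -> Optional[str]:
    """Return cookies_path if it points to an existing file, else None.

    Hypothetical helper: a missing cookies file is treated as a warning,
    not an error, because some videos can be downloaded without cookies.
    """
    if cookies_path and os.path.isfile(cookies_path):
        return cookies_path
    # Warn and continue; the caller proceeds without cookies.
    logger.warning("cookies.txt file not found at path %s.", cookies_path)
    return None
```

The caller can then pass the result on only when it is not None, so the download still runs for videos that do not need cookies.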
- Added calculate_file_hash and file_exists_and_is_identical functions to utils/__init__.py
- Added test_video_screenshots_download function to test_video_function.py
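For readers following the thread, here is one way two helpers with these names could work. This is only a sketch under assumptions: the signatures, the choice of SHA-256, and the chunk size are guesses, not the code added in this commit.

```python
import hashlib
from pathlib import Path


def calculate_file_hash(file_path: str, chunk_size: int = 8192) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as f:
        # Read fixed-size blocks so large videos are not loaded into memory.
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def file_exists_and_is_identical(file_path: str, expected_hash: str) -> bool:
    """Return True if file_path exists and its hash equals expected_hash."""
    return Path(file_path).is_file() and (
        calculate_file_hash(file_path) == expected_hash
    )
```

A test can use such a check to skip re-downloading a video whose file on disk already matches a known hash.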
print(f"Trying to download video from: {video_url}")
try:
    downloader = VideoDownloaderToolkit(video_url=video_url)
I believe we first need to understand how to support multimodal capability. The toolkits are defined for the agent, so ideally the agent would figure out the steps by itself, instead of developers calling the toolkit functions to process the data and passing the final results to the agent.
For now, making it fully automatic would require defining another agent to handle multimodal information. I think we can finish the functionality part first and add further support later if possible.
Because the agent currently has multiple input channels, I think we should first manage the selection of these input channels in the next step.
…r and update usage in web_video_object_recognition.py, web_video_description_extractor.py, and test_video_function.py
camel/toolkits/video_toolkit.py
Outdated
self.chunk_duration = chunk_duration
self.yt_dlp = importlib.import_module('yt_dlp')
self._chunk_durations: list[int] = []
self._cookies_path: Optional[str] = None
_cookies_path is only used in def cookies_path, so I don't think it needs to be stored as an attribute on the instance.
camel/toolkits/video_toolkit.py
Outdated
    download_directory or tempfile.mkdtemp()
).resolve()

print(f"self._download_directory: {self._download_directory}")
Unify by using logger.info?
camel/toolkits/video_toolkit.py
Outdated
# Download the video and get the filename
info = ydl.extract_info(url, download=True)
Add a logger.info call here.
camel/toolkits/video_toolkit.py
Outdated
    info = ydl.extract_info(url, download=True)
    return ydl.prepare_filename(info)
except yt_dlp.utils.DownloadError as e:
    raise RuntimeError(f"Failed to download video: {e}")
Suggested change:
-    raise RuntimeError(f"Failed to download video: {e}")
+    raise RuntimeError(f"Failed to download video from {url}: {e}")
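A small sketch of how the suggested message could look in context, with exception chaining added so the original traceback is preserved. The DownloadError class and the fetch callable below are stand-ins so the example runs without yt-dlp or network access; they are assumptions, not the toolkit's code.

```python
from typing import Callable


class DownloadError(Exception):
    """Stand-in for yt_dlp.utils.DownloadError (assumption for this sketch)."""


def download_video(url: str, fetch: Callable[[str], str]) -> str:
    """Call fetch(url) and wrap download failures with the offending URL."""
    try:
        return fetch(url)
    except DownloadError as e:
        # "from e" keeps the original exception available as __cause__.
        raise RuntimeError(f"Failed to download video from {url}: {e}") from e
```

Including the URL in the message makes a failure in a batch of downloads traceable to a specific video.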
camel/toolkits/video_toolkit.py
Outdated
if isinstance(timestamps, int):
    interval = video_length // (timestamps + 1)
Add a check for timestamps <= 0.
camel/toolkits/video_toolkit.py
Outdated
if isinstance(timestamps, int):
    interval = video_length // (timestamps + 1)
    tss = [int((i + 1) * interval) for i in range(timestamps)]
The entries in tss could all be 0: interval floors to 0 whenever timestamps + 1 exceeds video_length.
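The two checks requested for this hunk could be combined roughly as follows. This is a sketch, not the PR's implementation: the function name evenly_spaced_timestamps is hypothetical, and raising ValueError is one possible policy (the toolkit might prefer to clamp or log instead).

```python
from typing import List


def evenly_spaced_timestamps(video_length: int, timestamps: int) -> List[int]:
    """Return `timestamps` evenly spaced positions inside the video.

    Mirrors the integer arithmetic in the reviewed hunk, with guards added.
    """
    if timestamps <= 0:
        raise ValueError("timestamps must be a positive integer")
    interval = video_length // (timestamps + 1)
    if interval == 0:
        # Floor division collapsed to 0, so every position would be 0.
        raise ValueError(
            f"video_length={video_length} is too short for "
            f"{timestamps} distinct timestamps"
        )
    return [(i + 1) * interval for i in range(timestamps)]
```

For example, a video of length 100 with timestamps=4 gives an interval of 20 and positions [20, 40, 60, 80].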
camel/toolkits/video_toolkit.py
Outdated
r"""Returns a list of OpenAIFunction objects representing the
functions in the toolkit.

Returns:
    List[OpenAIFunction]: A list of OpenAIFunction objects representing
        the functions in the toolkit.
"""
Suggested change:
-    r"""Returns a list of OpenAIFunction objects representing the
-    functions in the toolkit.
-    Returns:
-        List[OpenAIFunction]: A list of OpenAIFunction objects representing
-            the functions in the toolkit.
-    """
+    r"""Returns a list of FunctionTool objects representing the
+    functions in the toolkit.
+    Returns:
+        List[FunctionTool]: A list of FunctionTool objects representing
+            the functions in the toolkit.
+    """
This needs to be updated and verified.
Description
Download the video using yt-dlp and provide it to the multimodal agent for analysis. Two download options are available: full download and block download.
Motivation and Context
close #744