feat: add data collector for dataset generation #1193
base: master
Based on the comments below, I added one more commit: 09b9d89
Feel free to leave your comments.
ori_update_memory = agent.update_memory

def update_memory(
    message: BaseMessage, role: OpenAIBackendRole
The role could be obtained from BaseMessage.
Hi Wendong, the new update_memory needs to keep its parameters consistent with the old one. Additionally, using BaseMessage's role would set messages of type function_call to role=assistant, which may make them hard to distinguish.
Hey Xukun, maybe we can check whether tool_calls exists in the BaseMessage; if it exists, we can set the role type to tool.
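The check suggested above could look roughly like the sketch below. It is only an illustration: SketchMessage is a hypothetical stand-in for BaseMessage, and the assumption that the message exposes a tool_calls attribute comes from this thread rather than from the actual class definition.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SketchMessage:
    """Hypothetical stand-in for BaseMessage (not the real CAMEL class)."""

    role: str
    content: str
    # Assumed attribute: a list of tool-call records, or None if absent.
    tool_calls: Optional[List[Dict[str, Any]]] = None


def infer_role(message: SketchMessage) -> str:
    """Return "tool" when the message carries tool calls, else its own role."""
    if message.tool_calls:
        return "tool"
    return message.role
```

With this approach the wrapper could keep the same parameters as the original update_memory and only fall back to the inferred role when none is passed explicitly.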
camel/data_collector/base.py
Outdated
role: OpenAIBackendRole,
name: Optional[str] = None,
These two pieces of information could be obtained from BaseMessage.
camel/data_collector/base.py
Outdated
if len(message.msgs) != 1:
    raise ValueError("Only supports one message in response")
I think we could also support multiple messages.
] = defaultdict(list)
self._recording = False
self.agents: List[Tuple[str, BaseAgent]] = []
self._id = 0
Use a dynamic ID: each history set should get its own UUID.
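One possible reading of this suggestion is sketched below, replacing the integer counter with a fresh UUID per history set. HistoryStore and its method names are hypothetical and do not come from the PR; only uuid.uuid4 and defaultdict are real standard-library calls.

```python
import uuid
from collections import defaultdict
from typing import Dict, List


class HistoryStore:
    """Hypothetical store keyed by one UUID per history set."""

    def __init__(self) -> None:
        self.histories: Dict[str, List[str]] = defaultdict(list)
        self.current_id = ""

    def start(self) -> str:
        """Begin a new history set and return its freshly generated ID."""
        self.current_id = uuid.uuid4().hex
        return self.current_id

    def record(self, text: str) -> None:
        """Append an entry to the history set that is currently active."""
        if not self.current_id:
            raise RuntimeError("call start() before record()")
        self.histories[self.current_id].append(text)
```

Compared with an incrementing counter, IDs generated this way stay unique across collector instances, which may matter if collected datasets are merged later.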
if len(self.agents) > 1:
    raise ValueError("AlpacaDataCollector only supports one agent")
if isinstance(agent, list):
    if len(agent) != 1:
        raise ValueError("AlpacaDataCollector only supports one agent")
Based on the current design, I think we can further support a list of agents and multiple messages. Could we add this support in this PR?
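As a rough sketch of what accepting either a single agent or a list might involve, the helper below normalizes both cases into the (name, agent) pairs used by self.agents. SketchAgent and normalize_agents are hypothetical names for illustration, and the assumption that an agent can be identified by a name attribute is not confirmed by the PR.

```python
from typing import List, Sequence, Tuple, Union


class SketchAgent:
    """Hypothetical agent with only a name (not CAMEL's BaseAgent)."""

    def __init__(self, name: str) -> None:
        self.name = name


def normalize_agents(
    agent: Union[SketchAgent, Sequence[SketchAgent]],
) -> List[Tuple[str, SketchAgent]]:
    """Accept one agent or a sequence of agents and return (name, agent) pairs."""
    agents = [agent] if isinstance(agent, SketchAgent) else list(agent)
    if not agents:
        raise ValueError("At least one agent is required")
    return [(a.name, a) for a in agents]
```

Multiple messages in a response could be handled in a similar spirit, by iterating over message.msgs and recording each one instead of raising when the length is not 1.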
Description
add data collector for dataset generation
This is only a prototype!
Motivation and Context
Why is this change required? What problem does it solve?
If it fixes an open issue, please link to the issue here.
You can use the syntax close #15213 if this solves the issue #15213
Types of changes
What types of changes does your code introduce? Put an x in all the boxes that apply:
Implemented Tasks
Checklist
Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!