Could I do multi-thread evaluation? #161

Open
Hodge931 opened this issue Jul 21, 2024 · 4 comments

Comments

@Hodge931

To speed up the evaluation, I would like to evaluate, say 64 examples in parallel with multiple threads. Does this affect the correctness of the evaluation? Thanks a lot!

@shuyanzhou
Contributor

That may affect the results. The reason is that we deliberately designed the order of the examples so that earlier examples won't affect later ones.

This is the script for 4 parallel runs.
You can also reset the environment more frequently to avoid the inter-example influence.
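As a rough illustration of the idea of several parallel runs that each keep the original example order, the sketch below splits an ID range into contiguous, ordered shards. It is not the script referred to above: `shard_task_ids` is a hypothetical helper, and the ID range 1-812 and the shard count of 4 are assumptions used only for the demo.

```python
def shard_task_ids(first_id: int, last_id: int, num_shards: int) -> list[range]:
    """Split the inclusive range [first_id, last_id] into contiguous shards.

    Each shard keeps the original ordering, so examples inside one shard
    still run in the order they were designed for. (Hypothetical helper.)
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    total = last_id - first_id + 1
    base, extra = divmod(total, num_shards)
    shards = []
    start = first_id
    for i in range(num_shards):
        # The first `extra` shards take one additional example each.
        size = base + (1 if i < extra else 0)
        shards.append(range(start, start + size))
        start += size
    return shards


# Assumed values for illustration: 812 examples, 4 parallel runs.
shards = shard_task_ids(1, 812, 4)
for run_index, shard in enumerate(shards):
    print(f"run {run_index}: examples {shard.start}-{shard.stop - 1}")
```

Each run would then need its own environment, reset before it starts, so that one shard cannot change the state seen by another.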

@Hodge931
Author

Thanks a lot for the reply!

  1. In my understanding, with the reset environment, the evaluation of each example is correct. Therefore, I may set up two AWS instances, and evaluate, say examples 1-406 with instance 1, and examples 407-812 with instance 2. Is such evaluation correct?
  2. Sometimes errors may happen in the middle. For example, if the evaluation of the 10th example breaks down, could I just continue to evaluate the 11th example without re-evaluating the first 10 examples and without resetting environments?
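The bookkeeping behind question 2 could look like the sketch below: results are appended to a log file, a failing example does not stop the run, and a later run retries only the examples that did not finish. Whether skipping the environment reset in this situation keeps the scores valid is not answered in this thread; the sketch covers only the resume logic. `run_with_resume`, the JSON-lines log layout, and `fake_evaluate` are all hypothetical names made up for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def run_with_resume(
    task_ids: Iterable[int],
    evaluate: Callable[[int], float],
    log_path: Path,
) -> dict:
    """Evaluate tasks in order, skipping those already logged as successful."""
    done = {}
    if log_path.exists():
        for line in log_path.read_text().splitlines():
            if line.strip():
                record = json.loads(line)
                # Later records for the same task override earlier ones.
                done[record["task_id"]] = record

    with log_path.open("a") as log:
        for task_id in task_ids:
            if task_id in done and done[task_id]["status"] == "ok":
                continue  # finished in a previous run
            try:
                record = {"task_id": task_id, "status": "ok",
                          "score": evaluate(task_id)}
            except Exception as exc:  # keep going after a broken example
                record = {"task_id": task_id, "status": "error",
                          "error": str(exc)}
            log.write(json.dumps(record) + "\n")
            log.flush()
            done[task_id] = record
    return done


# Demo with a stand-in evaluator that crashes once on example 10.
calls = []

def fake_evaluate(task_id: int) -> float:
    calls.append(task_id)
    if task_id == 10 and calls.count(10) == 1:
        raise RuntimeError("simulated crash")
    return 1.0

with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "results.jsonl"
    first = run_with_resume(range(9, 13), fake_evaluate, log_file)
    second = run_with_resume(range(9, 13), fake_evaluate, log_file)

print(first[10]["status"], second[10]["status"])
```

Note that the retry runs example 10 after examples 11 and 12, which changes the order; given the maintainer's point about ordering, resetting the environment before such a retry is probably the safer choice.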

Your kind suggestions are highly appreciated!

@dryingpaint

Hello! Do you mind elaborating on how the earlier tasks are dependent on later tasks? Is there any way to launch separate sites for each new task that we're evaluating so that we can run multiple agents at the same time? How often should the environment resets be happening? Thanks for your help :)

@leoozy

leoozy commented Aug 23, 2024

Hello, do you have any advice on how to set up multiple Docker containers for the same website? For example, we could set up 10 shopping websites on different ports so that we can evaluate in parallel. Thank you!
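One way to picture this setup is a simple plan that gives every copy of the site its own host port and base URL, with one evaluation worker per copy. The sketch below only computes that plan; it does not start any containers. `Replica`, `plan_replicas`, the host `localhost`, and the starting port 7770 are placeholders chosen for illustration, not values confirmed in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One independent copy of a website (hypothetical structure)."""
    index: int
    host_port: int
    base_url: str


def plan_replicas(num_replicas: int, base_port: int = 7770,
                  host: str = "localhost") -> list[Replica]:
    """Assign consecutive host ports to the replicas of a single site."""
    return [
        Replica(index=i, host_port=base_port + i,
                base_url=f"http://{host}:{base_port + i}")
        for i in range(num_replicas)
    ]


# Assumed for illustration: 10 copies of the shopping site.
replicas = plan_replicas(10)
for replica in replicas:
    # Worker i would talk only to replica i, so the state of one copy
    # is never changed by an agent working on another copy.
    print(f"worker {replica.index} -> {replica.base_url}")
```

Each container would then be started with its own host port mapped to the site's internal port. The site inside the container may also need to be told its external URL; that step is not covered in this thread.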


4 participants