Replies: 2 comments
-
Yep! We have an example we'll be merging soon where we got OpenAI's "learning to summarize" reward model working with TRLX on a 20B language model. We also have a very minimal version of CodeRL working; it's included as an example here. We've also been discussing TRLX with plenty of RLHF industry folks and have gotten a few seals of approval at this point.
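For context, wiring a reward model into TRLX mostly comes down to supplying a reward function that scores generated samples. Below is a minimal sketch of that shape, assuming the common `reward_fn(samples, **kwargs) -> list[float]` interface; the scoring heuristic is a placeholder, not OpenAI's actual learned reward model.

```python
# Sketch of a reward function in the shape TRLX's PPO training loop consumes:
# a callable mapping a batch of generated text samples to scalar rewards.
# The length heuristic below is a toy stand-in for a learned reward model.

def reward_fn(samples, **kwargs):
    """Score each generated summary; higher is better."""
    rewards = []
    for text in samples:
        n_words = len(text.split())
        # Toy heuristic: prefer summaries near 30 words
        # (a real setup would call the reward model here).
        rewards.append(-abs(n_words - 30) / 30.0)
    return rewards

# With the library itself, this would be passed along the lines of:
#   trlx.train("gpt2", reward_fn=reward_fn, prompts=prompts)
```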
-
What's the largest PPO model size that has been trained and tested with TRLX? Can you share some performance metrics, e.g. GPU count and training time?
-
Has this been tested on anything?