From a06fe9f15c237f5b725d25e0017d0bde6b5261ed Mon Sep 17 00:00:00 2001
From: Jonathan Tow <41410219+jon-tow@users.noreply.github.com>
Date: Tue, 24 Jan 2023 23:30:26 -0500
Subject: [PATCH] Update stale comment from results table (#222)

* Remove stale comment from results table

* Add details
---
 examples/summarize_rlhf/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/summarize_rlhf/README.md b/examples/summarize_rlhf/README.md
index 7f0dfb8f1..49369e0f2 100644
--- a/examples/summarize_rlhf/README.md
+++ b/examples/summarize_rlhf/README.md
@@ -40,7 +40,7 @@ For an in-depth description of the example, please refer to our [blog post](http
 
 ### Results
 
-On 1,000 samples from CNN/DailyMail test dataset:
+The following tables compare the ROUGE and reward scores of the SFT and PPO models on the test set of the TL;DR dataset.
 
 1. SFT vs PPO