Signed-off-by: Igor Gitman <igitman@nvidia.com>
📝 Walkthrough
The changes modify dataset preparation, testing infrastructure, and evaluation configurations. Key updates include removing editable installs for BFCL, replacing local dataset downloads with HuggingFace dataset loading for SciCode, adding a deterministic seed to virtual environment creation, updating metric ranges and structure for eval tests, and removing BFCL evaluation from the super_49b test suite.
gwarmstrong left a comment
just a couple comments/questions
with open(test_aai_file, "w", encoding="utf-8") as test_aai_fout:
    for hf_split, output_split in split_mapping.items():
        output_file = data_dir / f"{output_split}.jsonl"
        with open(output_file, "wt", encoding="utf-8") as fout:
            for entry in dataset[hf_split]:
                line = json.dumps(entry) + "\n"
                fout.write(line)
                test_aai_fout.write(line)
test_aai.jsonl is opened in the outer scope and then written inside the nested loop for each split. The outer with block keeps the file open for the entire loop, so the dev.jsonl and test.jsonl entries are appended to it sequentially. The writes go through two separate handles (fout and test_aai_fout) one after the other, so this works, but an exception partway through leaves test_aai.jsonl partially written. It is cleaner to write test_aai.jsonl after the split files are complete.
Additionally, per the CONTRIBUTING.md guidelines (lines 40-42), avoid data loss by completing computation before writing. Consider writing all split files first, then concatenating them, to avoid partial writes if there is a failure.
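One way to follow this suggestion is sketched below. It is only an illustration, not the code in this PR: the function name `write_splits_then_combine`, the helper `atomic_write`, and the `combined_name` parameter are hypothetical, and `dataset` is assumed to be any mapping from split name to an iterable of JSON-serializable entries (a plain dict stands in for the HuggingFace dataset here). All entries are serialized before any file is opened, and each file is written to a temporary path and renamed into place.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write(path, lines):
    """Write lines to a temp file next to `path`, then rename it over `path`."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fout:
            fout.writelines(lines)
        os.replace(tmp_name, path)  # the target appears only once fully written
    except BaseException:
        os.unlink(tmp_name)  # clean up the partial temp file, then re-raise
        raise


def write_splits_then_combine(dataset, split_mapping, data_dir,
                              combined_name="test_aai.jsonl"):
    """Write one JSONL file per split, then the combined file (hypothetical helper)."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    # Serialize everything first, so a bad entry fails before any file is touched.
    lines_by_split = {
        output_split: [json.dumps(entry) + "\n" for entry in dataset[hf_split]]
        for hf_split, output_split in split_mapping.items()
    }

    # Per-split files first.
    for output_split, lines in lines_by_split.items():
        atomic_write(data_dir / f"{output_split}.jsonl", lines)

    # Combined file last, in the same order as split_mapping.
    combined = [line for lines in lines_by_split.values() for line in lines]
    combined_path = data_dir / combined_name
    atomic_write(combined_path, combined)
    return combined_path
```

Holding all lines in memory is a trade-off that suits small benchmark files; for large datasets, streaming each split to its temp file and concatenating the finished files afterwards would be the safer variant.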
Signed-off-by: Igor Gitman <igitman@nvidia.com> Co-authored-by: George Armstrong <georgea@nvidia.com>
Signed-off-by: Igor Gitman <igitman@nvidia.com> Co-authored-by: George Armstrong <georgea@nvidia.com> Signed-off-by: dgitman <dgitman@nvidia.com>