@wqj2004 commented on Jan 16, 2025

Overview

The whole RLHF training pipeline

[Figure: onlinerlhf]

Goal

In this PR, we need to finish the collector part of the pipeline shown in the figure above and add unit tests for it.

TODO

  • add a vLLM inferencer for LLM/VLM
  • scale up the vLLM inferencer to multiple GPUs
  • add several dataset definitions (@PaParaZz1, feature(nyz): add rlhf dataset #854)
  • survey a suitable VQA dataset for RLHF training
  • add the final collector and test it on the above dataset
  • add a tutorial and API docs
  • (optional) add search tools (@PaParaZz1)

@wqj2004 changed the title from "vllm_test.py in ding/worker" to "add vllm_test.py in ding/worker(wqj)" on Jan 17, 2025
@wqj2004 changed the title from "add vllm_test.py in ding/worker(wqj)" to "feature(wqj):add vllm_test.py in ding/worker" on Jan 17, 2025
@PaParaZz1 added the "enhancement" (New feature or request) label on Jan 19, 2025
@PaParaZz1 changed the title from "feature(wqj):add vllm_test.py in ding/worker" to "feature(wqj): add vllm_test.py in ding/worker" on Jan 19, 2025
    stop_token_ids = None
    return prompts, stop_token_ids

def get_multi_modal_input(modality, filenames, questions):

Add Python type hints to this function so it passes the typing lint.
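A minimal sketch of what the requested annotations might look like. The parameter and return types are assumptions, because the diff excerpt shows only the signature; the body below is a hypothetical placeholder that just pairs each file with its question and is not the PR's actual logic.

```python
from typing import List, Tuple


def get_multi_modal_input(
    modality: str,
    filenames: List[str],
    questions: List[str],
) -> List[Tuple[str, str]]:
    """Pair each input file with its question (placeholder body, not the PR's logic)."""
    # Assumption: exactly one question per file; the real function may differ.
    if len(filenames) != len(questions):
        raise ValueError("filenames and questions must have the same length")
    # Assumption: only the 'image' modality is handled in this sketch.
    if modality != "image":
        raise NotImplementedError(f"unsupported modality: {modality}")
    return list(zip(filenames, questions))
```

With annotations like these in place, a static checker such as mypy can flag call sites that pass arguments of the wrong type.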

    # img_names = ['/mnt/afs/niuyazhe/data/meme/data/Eimages/Eimages/Eimages/image_ (2)']
    num_prompts = len(questions)
    image_repeat_prob = None
    modality = 'image'

Use an enum class to control this field rather than a plain string.
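A minimal sketch of the suggestion, assuming a hypothetical `Modality` enum and a `parse_modality` helper, neither of which appears in the PR. Only the `'image'` value comes from the diff; the `VIDEO` member is an illustrative assumption.

```python
from enum import Enum, unique


@unique
class Modality(Enum):
    """Supported input modalities (hypothetical; not taken from the PR)."""
    IMAGE = "image"
    VIDEO = "video"  # assumption: added only to show a second member


def parse_modality(value: str) -> Modality:
    """Convert a user-supplied string to a Modality, failing loudly on typos."""
    try:
        return Modality(value.lower())
    except ValueError:
        valid = ", ".join(m.value for m in Modality)
        raise ValueError(
            f"unknown modality {value!r}; expected one of: {valid}"
        ) from None


modality = Modality.IMAGE  # replaces the bare string 'image'
```

Compared with a bare string, a misspelled value such as `'imgae'` fails at the point of conversion instead of silently falling through a later comparison.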

This was referenced Jan 24, 2025
@PaParaZz1 changed the title from "feature(wqj): add vllm_test.py in ding/worker" to "feature(wqj): add vllm rlhf collector" on Feb 6, 2025
@PaParaZz1 closed this on Feb 7, 2025
Labels

enhancement New feature or request

2 participants