MMGPT with modified code. The earlier MMGPT repo has been archived, and development has moved to the FloRA repo.

vermaprince17/FloRA

🤖 Multi-modal GPT

Train a multi-modal chatbot with visual and language instructions!

Based on the open-source multi-modal model OpenFlamingo, we create various visual instruction data from open datasets, including VQA, Image Captioning, Visual Reasoning, Text OCR, and Visual Dialogue. We also train the language model component of OpenFlamingo on language-only instruction data.

The joint training of visual and language instructions effectively improves the performance of the model!

Features

  • Supports various vision and language instruction data
  • Parameter-efficient fine-tuning with LoRA
  • Tunes vision and language at the same time, so that the two complement each other
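
To give a rough sense of why LoRA counts as parameter-efficient, the sketch below compares the trainable parameters of a dense weight update with those of a low-rank update for a single linear layer. The layer size (4096 x 4096) and the rank (16) are illustrative assumptions, not values taken from this repo's configs/lora_config.py.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one linear layer."""
    # Full fine-tuning updates every entry of the d_out x d_in weight matrix.
    full = d_out * d_in
    # LoRA trains two small factors instead: B (d_out x rank) and A (rank x d_in).
    lora = rank * (d_out + d_in)
    return full, lora


# Hypothetical layer shape and rank, chosen only for illustration.
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=16)
print(f"full: {full:,}  lora: {lora:,}  fraction trainable: {lora / full:.2%}")
```

Under these assumed numbers the low-rank factors hold under one percent of the parameters of the dense matrix, which is the main reason the adapter weights can be shipped as a small separate file.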

Installation

To install the package in an existing environment, run

git clone https://github.com/vermaprince17/FloRA.git
cd FloRA
pip install -r requirements.txt
pip install -v -e .

Alternatively, create a new conda environment:

conda env create -f environment.yml

Launch Demo Locally

  1. Download the pre-trained weights.

    Use this script for converting LLaMA weights to Hugging Face format.

    Download the OpenFlamingo pre-trained model from openflamingo/OpenFlamingo-9B.

    Download our LoRA weights from here.

    Then place these models in the checkpoints folder like this:

    checkpoints
    ├── llama-7b_hf
    │   ├── config.json
    │   ├── pytorch_model-00001-of-00002.bin
    │   ├── ......
    │   └── tokenizer.model
    ├── OpenFlamingo-9B
    │   └── checkpoint.pt
    └── mmgpt-lora-v0-release.pt
    
    
  2. Launch the Gradio demo:

    python app.py

  3. Run single-example inference by executing the command in inference_cmd.txt. This assumes the Flamingo checkpoint is at checkpoints/OpenFlamingo-9B/checkpoint.pt.

    python inference.py <path to language ckpts> <path to fine tuned ckpts> <text input> <path to image input>

    Example:

    python inference.py openlm-research/open_llama_3B_V2 prod/run_LLama-aokvaq-train_ds_8k/ckpt_per_steps/checkpoint_0_2176.pt "What is this image content?" ./docs/images/demo_image.jpg
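
Before launching the demo or running inference, it can help to confirm that the checkpoints folder matches the layout from step 1. The following is a minimal sketch, not part of the repo: the helper name `missing_checkpoint_files` is hypothetical, and it only checks the files that the layout names explicitly (the sharded weight files are elided there, so they are skipped).

```python
from pathlib import Path

# Files named explicitly in the checkpoint layout from step 1.
# Sharded weight files are elided in that layout, so they are not checked.
EXPECTED = [
    "llama-7b_hf/config.json",
    "llama-7b_hf/tokenizer.model",
    "OpenFlamingo-9B/checkpoint.pt",
    "mmgpt-lora-v0-release.pt",
]


def missing_checkpoint_files(root="checkpoints"):
    """Return the expected files (relative to root) that are not present."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoint_files()
    if missing:
        print("Missing checkpoint files:")
        for rel in missing:
            print(f"  checkpoints/{rel}")
    else:
        print("All expected checkpoint files are present.")
```

Run it from the repository root so that the relative checkpoints path resolves correctly.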

Examples

Recipe:

image4

Travel plan:

image3

Movie:

image2

Famous person:

image

Fine-tuning

Prepare datasets

  1. A-OKVQA

    Download the annotations from this link and unzip them to data/aokvqa/annotations.

    It also requires images from the COCO dataset, which can be downloaded from here.

  2. COCO Caption

    Download from this link and unzip to data/coco.

    It also requires images from the COCO dataset, which can be downloaded from here.

  3. OCR VQA

    Download from this link and place in data/OCR_VQA/.

  4. LLaVA

    Download from liuhaotian/LLaVA-Instruct-150K and place in data/llava/.

    It also requires images from the COCO dataset, which can be downloaded from here.

  5. MiniGPT-4

    Download from Vision-CAIR/cc_sbu_align and place in data/cc_sbu_align/.

  6. Dolly 15k

    Download from databricks/databricks-dolly-15k and place it in data/dolly/databricks-dolly-15k.jsonl.

  7. Alpaca GPT4

    Download it from this link and place it in data/alpaca_gpt4/alpaca_gpt4_data.json.

  8. Baize

    Download it from this link and place it in data/baize/quora_chat_data.json.

  9. PubMedQA

    Download it from this link and place it in data/pubmedqa/ori_pqal.json.

You can also customize the data paths in configs/dataset_config.py.
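
Since the datasets come from several sources, a quick check of which target paths exist can save a failed training run. This is a sketch under the assumption that the default locations listed in this section are used unchanged. The helper name `dataset_status` is hypothetical, and the script does not read configs/dataset_config.py, so custom paths need to be edited in by hand.

```python
from pathlib import Path

# Default locations as listed in this section (directories and single files).
DATASET_PATHS = {
    "A-OKVQA": "data/aokvqa/annotations",
    "COCO Caption": "data/coco",
    "OCR VQA": "data/OCR_VQA",
    "LLaVA": "data/llava",
    "MiniGPT-4": "data/cc_sbu_align",
    "Dolly 15k": "data/dolly/databricks-dolly-15k.jsonl",
    "Alpaca GPT4": "data/alpaca_gpt4/alpaca_gpt4_data.json",
    "Baize": "data/baize/quora_chat_data.json",
    "PubMedQA": "data/pubmedqa/ori_pqal.json",
}


def dataset_status(root="."):
    """Map each dataset name to True if its default path exists under root."""
    root = Path(root)
    return {name: (root / rel).exists() for name, rel in DATASET_PATHS.items()}


if __name__ == "__main__":
    for name, present in dataset_status().items():
        print(f"{'ok     ' if present else 'MISSING'}  {name}: {DATASET_PATHS[name]}")
```

The check only confirms that a path exists. It does not validate file contents or the COCO images that several datasets depend on.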

Start training

torchrun --nproc_per_node=8 mmgpt/train/instruction_finetune.py \
  --lm_path checkpoints/llama-7b_hf \
  --tokenizer_path checkpoints/llama-7b_hf \
  --pretrained_path checkpoints/OpenFlamingo-9B/checkpoint.pt \
  --run_name train-my-gpt4 \
  --learning_rate 1e-5 \
  --lr_scheduler cosine \
  --batch_size 1 \
  --tuning_config configs/lora_config.py \
  --dataset_config configs/dataset_config.py \
  --report_to_wandb
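
When adapting the command to a different number of GPUs, it helps to track how many samples contribute to each optimizer step. The sketch below assumes that --batch_size is applied per process and that no gradient accumulation is used. Neither is confirmed by this README, so check mmgpt/train/instruction_finetune.py before relying on the result. The function name and its parameters are hypothetical.

```python
def global_batch_size(nproc_per_node, per_process_batch_size,
                      nnodes=1, grad_accum_steps=1):
    """Samples per optimizer step, assuming the batch size is per process."""
    return nnodes * nproc_per_node * per_process_batch_size * grad_accum_steps


# Values from the command above: 8 processes on one node, batch size 1 each.
print(global_batch_size(nproc_per_node=8, per_process_batch_size=1))  # prints 8
```

If the process count changes, the learning rate may need retuning, because the global batch size changes with it under these assumptions.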

Acknowledgements
