Image2Prompt - Make it easy to write prompts

Base Model: BLIP2 in LAVIS

[Model Release] Jan 2023, released implementation of BLIP-2

BLIP-2 is a generic and efficient pre-training strategy that builds on existing pretrained vision models and large language models (LLMs) for vision-language pretraining. BLIP-2 beats Flamingo on zero-shot VQAv2 (65.0 vs 56.3) and establishes a new state of the art on zero-shot captioning (121.6 CIDEr on NoCaps vs the previous best of 113.2). In addition, when equipped with powerful LLMs (e.g. OPT, FlanT5), BLIP-2 unlocks new zero-shot instructed vision-to-language generation capabilities for a range of applications.

Introduction

Image-to-prompt is a Python deep learning model that generates a prompt from an image, for use in text-to-image tasks.

Installation

(Optional) Create a conda environment

conda create -n lavis python=3.8
conda activate lavis

Install from PyPI

pip install salesforce-lavis

Or, for development, you may build from source

git clone https://github.com/salesforce/LAVIS.git
cd LAVIS
pip install -e .
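
After either install route, a quick import check can confirm that the package is visible to the active environment. The sketch below is a minimal example and not part of LAVIS: the helper name `is_importable` is hypothetical, and it assumes the package installs under the module name `lavis`, which is the name used by the import in the captioning example.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "lavis" is assumed to be the module name, as in the imports used by this README.
status = "found" if is_importable("lavis") else "missing"
print(f"lavis: {status}")
```

If the check reports the module as missing, the usual cause is that the conda environment created above is not the one currently activated.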

Getting Started

Model Zoo

The model weights are hosted on Google Drive:

https://drive.google.com/file/d/1IGxTKDwQX5o3-C6ttZNJwt5j2Sbvjtk1/view?usp=share_link

Image Captioning

In this example, we use the BLIP-2 model to generate a prompt for an image. To make inference easier, each pre-trained model is associated with its preprocessors (transforms), which are returned together with the model by load_model_and_preprocess().

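Before running, set the model path in the config file below by changing the finetuned field to the local path of the downloaded checkpoint: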

lavis/configs/models/blip2/blip2_caption_opt2.7b.yaml
finetuned: local_path
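
The config change above can also be scripted. The following is a rough sketch under stated assumptions: `set_finetuned_path` is a hypothetical helper (not a LAVIS function), it works on the raw config text rather than parsing YAML, and it assumes the file contains a line starting with the `finetuned:` key. The sample config text and checkpoint path are made up for illustration.

```python
import re

# Matches a line whose key is exactly "finetuned" (not e.g. "load_finetuned").
_FINETUNED_LINE = re.compile(r"^([ \t]*finetuned[ \t]*:[ \t]*).*$", flags=re.MULTILINE)


def set_finetuned_path(config_text: str, local_path: str) -> str:
    """Return config_text with the value of the 'finetuned' key set to local_path."""
    # A function replacement avoids backslash-escape issues with Windows paths.
    new_text, count = _FINETUNED_LINE.subn(
        lambda match: f'{match.group(1)}"{local_path}"', config_text
    )
    if count == 0:
        raise ValueError("no 'finetuned:' key found in the config text")
    return new_text


# Illustrative input only; the real YAML file may contain other keys.
sample = 'model:\n  load_finetuned: True\n  finetuned: "https://example.com/x.pth"\n'
print(set_finetuned_path(sample, "/data/checkpoints/image2prompt.pth"))
```

To apply it to the real file, read lavis/configs/models/blip2/blip2_caption_opt2.7b.yaml as text, pass it through the helper, and write the result back. Editing the file by hand works just as well.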

import torch
from lavis.models import load_model_and_preprocess
from PIL import Image

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model, vis_processors, _ = load_model_and_preprocess(name="blip2_opt", model_type="caption_coco_opt2.7b", is_eval=True, device=device)
raw_image = Image.open("docs/_static/rooster.jpg").convert("RGB")
# preprocess the image
# vis_processors stores image transforms for "train" and "eval" (validation / testing / inference)
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
# generate caption
res = model.generate({"image": image})
print("res: {}".format(res))
#['rooster in oriental armor pattern, kung fu style, intricate, high resolution, art style, kirby, kirby art,']
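
The sample output above is a comma-separated list of tags that ends with a trailing comma. If a tidier prompt is wanted before passing it to a text-to-image model, a small post-processing step can help. This is only a sketch: `clean_prompt` is a hypothetical helper that is not part of this repository or LAVIS, and it assumes captions are comma-separated as in the sample.

```python
from typing import List


def clean_prompt(caption: str) -> str:
    """Split a comma-separated caption into tags, dropping empties and exact repeats."""
    seen = set()
    tags: List[str] = []
    for raw_tag in caption.split(","):
        tag = raw_tag.strip()
        key = tag.lower()  # compare case-insensitively, keep original casing
        if tag and key not in seen:
            seen.add(key)
            tags.append(tag)
    return ", ".join(tags)


# Made-up caption with a repeated tag and a trailing comma.
print(clean_prompt("rooster, kung fu style, rooster, art style,"))
```

Since model.generate() returns a list of strings, the helper could be applied as `[clean_prompt(c) for c in res]`. Only exact repeats are removed; near-duplicates such as "kirby" and "kirby art" are kept.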

Contact us

If you have any questions, comments or suggestions, please do not hesitate to contact us at [email protected].

License

BSD 3-Clause License
