【Not An Issue But Tutorial】A complete tutorial of pre-processing steps #45
Comments
Thanks for the detailed instructions. Is there any reason you chose CIHP-PGN instead of Self-Correction-Human-Parsing (https://github.com/GoGoDuck912/Self-Correction-Human-Parsing) for human parsing? Can I use rembg (https://github.com/danielgatis/rembg) instead of carvekit to generate the cloth mask? |
Absolutely, all I used is mentioned by the author in Preprocessing.md. If you find a better solution to do it, feel free to use it. |
"One thing to mention, the images provided by official dataset keep both visualization (colored) and label (0 - 20). I don't know how they did that. I also tried P mode in PIL, but found nothing." I think they put color palette before saving the images. (https://pillow.readthedocs.io/en/stable/_modules/PIL/Image.html#Image.putpalette) |
Thank you for your explanation! I believe this will help others who want to reproduce this in the future. |
Thank you for such a great explanation. |
An update on the OpenPose installation section: you may want to change the last line of code if you run into problems with cuDNN on Colab, from
to
This solution is from CMU-Perceptual-Computing-Lab/openpose/issues/1527 |
Is there any constraint on the images for Human Parse, like image resolution or anything? I am getting a human parse for the first 4 images (from the CIHP_PGN repo test folder), but not for the 5th image. Any help is much appreciated! |
Dear lujiazho, regarding human parse, Preprocessing.md says: "I inferenced a parse map on 256x192 resolution, and upsample it to 1024x768. Then you can see that it has a alias artifact, so I smooth it using "torchgeometry.image.GaussianBlur((15, 15), (3, 3))". I saved a parse map image using PIL.Image with P mode. The color of the parse map image in our dataset(VITON-HD) is just for the visualization, it has 0~19 uint values." I applied torchgeometry.image.GaussianBlur((15, 15), (3, 3)), but the np.unique of the image changed from [ 0 2 5 9 10 13 14 15] to [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]. Is there a way to do the smoothing while keeping the unique classes consistent? Looking forward to your feedback! |
It's been a long time since I played with this. I guess you applied the blur directly to the image that contains the labels, which is why the labels [ 0 2 5 9 10 13 14 15] changed. |
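For the smoothing question above: one way to avoid new label values is to blur a one-hot encoding of the parse map and take an argmax afterwards, rather than blurring the label image directly. A rough sketch of that idea (my own, using the same torchgeometry blur quoted from Preprocessing.md):
import numpy as np
import torch
import torchgeometry as tgm

def smooth_parse(parse_labels, num_classes=20):
    # parse_labels: HxW numpy array of integer labels (e.g. 0-19) after upsampling
    blur = tgm.image.GaussianBlur((15, 15), (3, 3))
    labels = torch.from_numpy(parse_labels).long()
    # One-hot encode to a (1, C, H, W) float tensor, blur each class channel,
    # then pick the most likely class per pixel; np.unique of the result stays within 0-19
    onehot = torch.nn.functional.one_hot(labels, num_classes).permute(2, 0, 1).unsqueeze(0).float()
    smoothed = blur(onehot)
    return smoothed.argmax(dim=1).squeeze(0).numpy().astype(np.uint8)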
I followed the OpenPose steps on Google Colab but got this error. I am sure cuDNN is already installed, but OpenPose cannot recognize it. Any advice? /content |
@ljcljc I believe it's not possible to run OpenPose on Google Colab; I tried various methods. However, I switched to the PyTorch version of OpenPose, which produces a more limited set of 'pose_keypoints_2d' keypoints. Fortunately, the results were nearly identical to the agnostic version of the pre-processed data, so there's no need to worry about that. |
Hello @lujiazho, I appreciate the effort you've put in. I would like to ask about 'image-parse-agnostic-v3.2': have you attempted to train the HR-VITON model using this approach? Specifically, I am interested in whether the model trains without errors and is able to reproduce the authors' results. Also, should I put this image under the 'image-parse-agnostic-v3.2' folder, or just like you did in the code: |
First of all, thanks for elaborating on the preprocessing. I'm having trouble following along with the human parsing step, since I also needed to downscale the input image before passing it through CIHP_PGN due to my low-performance system. After running the Gaussian blur, how do you get back a clean label image? I tried implementing my own program to "round" pixels to the nearest valid colour, giving me the following image: However, this doesn't work well with the model. My guess is that the outlines of the segments are too random. Afterwards, I found @thuongmhh 's comment:
I used
The colours are very messed up, but I found that the shape is similar to the one in the dataset. On a side note, using all the colours listed in the CIHP_PGN repo gives me a black image for some reason. Could anyone please explain how I should handle the upscaling of this image? Thanks in advance. |
Just trying to help you @Mingy123, I think it will be simpler to use this cleaned code for CIHP_PGN. Also, I would like to mention that the output will be colored and non-colored images; all you need is the non-colored images for training (human parse) and also for getting the agnostic cloth/human map. |
@MosbehBarhoumi Thanks for the reply. The code you linked helped speed things up :) |
@lujiazho Can you tell me whether I should change train_pairs.txt? The cloth and person images fed to the model are different, so should I make the input cloth image and the person image the same? Please reply ASAP |
Hi @lujiazho, I am a beginner and there are many steps I am not sure how to complete. How can I use the HR-VITON project after I have completed the above steps? |
@lazyseacow run the train condition python file and then run the train generator file |
Hi @lujiazho Base image: Output image: I have been scratching my head for a long time. I am able to generate a proper output with the VITON-HD cloths, but whenever I use any other image to generate a mask it does not work. Any help will be deeply appreciated! |
Never mind, I solved it: I used 130 instead of 0 in the line (img[...,0]==130)&(img[...,1]==130)&(img[...,2]==130) |
rewrite: |
I used the cleaned version of CIHP; the generated cihp_edge_maps are completely white, and the cihp_parsing_maps are completely black. Can anyone help? |
Can image-parse-v3 be obtained with Graphonomy? |
Hey, I have tried this, but when I am using the code & output from pytorch-openpose for generating
Thanks for your great tutorial. I see that the data in the dataset is aligned in some way. Can you share your code for how to align it if I want to test an image downloaded from the web? Thanks! |
I also encountered the same problem, have you solved it yet? |
I have completed all the steps, but running 'test_generator.py' gives this error: |
At the Human Parse step (step 2), could you show me how to get a result just like the one in the original dataset (the image below)?
To clarify my issue, my input image's shape was (192, 256) then upsampled to (768, 1024) just like the author said. I |
The openpose models give download errors; please find the models here:
you can use |
For DensePose using detectron2, when I follow the command using Anaconda on Windows:
I keep getting this error:
After I went through the GETTING_STARTED.md of the repo and removed the
Just a heads up for anyone getting the same problem as me, to save you some time :) |
In the openpose part, run openpose.bin:
!./build/examples/openpose/openpose.bin --image_dir ../image_path --hand --disable_blending --display 0 --write_json ../json_path --write_images ../img_path --num_gpu 1 --num_gpu_start 0 |
I had the same trouble; you can work around this by using MediaPipe for landmark detection and then converting the results to the OpenPose JSON format. |
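A rough sketch of that MediaPipe-to-OpenPose conversion (illustrative only: the mapping below covers just a handful of BODY_25 keypoints, and derived points such as the neck and mid-hip would still need to be computed):
import cv2
import json
import mediapipe as mp

# Partial MediaPipe-landmark -> OpenPose BODY_25-index mapping: nose, shoulders, elbows, wrists
MP_TO_OPENPOSE = {0: 0, 12: 2, 14: 3, 16: 4, 11: 5, 13: 6, 15: 7}

def image_to_openpose_json(image_path, json_path, num_points=25):
    image = cv2.imread(image_path)
    h, w = image.shape[:2]
    with mp.solutions.pose.Pose(static_image_mode=True) as pose:
        results = pose.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    keypoints = [0.0] * (num_points * 3)          # flat (x, y, confidence) triplets
    if results.pose_landmarks:
        for mp_idx, op_idx in MP_TO_OPENPOSE.items():
            lm = results.pose_landmarks.landmark[mp_idx]
            keypoints[op_idx * 3 + 0] = lm.x * w  # MediaPipe coordinates are normalized
            keypoints[op_idx * 3 + 1] = lm.y * h
            keypoints[op_idx * 3 + 2] = lm.visibility

    with open(json_path, "w") as f:
        json.dump({"version": 1.3, "people": [{"pose_keypoints_2d": keypoints}]}, f)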
I did some research on how to do a quick upsample of an image. A useful paper here may be: McGuire, Morgan, and Mara Gagiu. "MMPX Style-Preserving Pixel Art Magnification." Journal of Computer Graphics Techniques (2021). The idea comes from console emulators that need to keep the original style while enhancing image quality. Such methods also include Nearest, EPX, and XBR. All of them, like MMPX, do not add new colors when upsampling, and they do not reassign boundary pixels to other classes. I chose MMPX since it is one of the more recent methods. You need to clone the repository (https://github.com/ITotalJustice/mmpx) and build it using cmake. It is important to add SHARED to CMakeLists.txt to get a ".so" file, not ".a":
cd mmpx; mkdir build; cd build; cmake ..; make
The following code calls MMPX from the shared object file:
import cv2
import numpy as np
import ctypes as ct

# Load the MMPX shared library and declare the signature of mmpx_scale2x
mmpx_lib = ct.cdll.LoadLibrary("mmpx/build/libmmpx.so")
mmpx_scale2x = mmpx_lib.mmpx_scale2x
mmpx_scale2x.argtypes = [ct.POINTER(ct.c_uint32), ct.POINTER(ct.c_uint32), ct.c_uint32, ct.c_uint32]
mmpx_scale2x.restype = None

def upscale_image(input_image_path, output_image_path):
    # Load the image using OpenCV (BGR channel order)
    image = cv2.imread(input_image_path)
    srcHeight, srcWidth, _ = image.shape
    dstHeight, dstWidth = 2 * srcHeight, 2 * srcWidth

    # Pack each pixel into a 32-bit integer (0x00RRGGBB)
    srcBuffer = np.zeros(srcHeight * srcWidth, dtype=np.uint32)
    for y in range(srcHeight):
        for x in range(srcWidth):
            b, g, r = image[y, x]
            srcBuffer[y * srcWidth + x] = (int(r) << 16) | (int(g) << 8) | int(b)

    # Create the buffer for the 2x result
    dstBuffer = np.zeros(dstHeight * dstWidth, dtype=np.uint32)

    # Call the mmpx_scale2x function
    mmpx_scale2x(
        (ct.c_uint32 * len(srcBuffer)).from_buffer_copy(srcBuffer),
        (ct.c_uint32 * len(dstBuffer)).from_buffer(dstBuffer),
        ct.c_uint32(srcWidth),
        ct.c_uint32(srcHeight)
    )

    # Convert the packed buffer back to a 2D array of pixels
    result_image = np.frombuffer(dstBuffer, dtype=np.uint32).reshape((dstHeight, dstWidth))

    # Extract the color channels
    red_channel = (result_image >> 16) & 255
    green_channel = (result_image >> 8) & 255
    blue_channel = result_image & 255

    # Combine the channels into a BGR image and save the result
    result_image_bgr = np.stack((blue_channel, green_channel, red_channel), axis=-1)
    cv2.imwrite(output_image_path, result_image_bgr.astype(np.uint8))

# Call the function with the image path (applying 2x twice gives 4x)
upscale_image('CIHP_PGN/output/cihp_parsing_maps/00013_00_vis.png', '00013_00_x2.png')
upscale_image('00013_00_x2.png', '00013_00_x4.png') |
Hi! Thank you so much for this tutorial, I am really grateful! |
@lujiazho - I tried the above steps for this image |
I am getting an error in this code: 'no such directory exists'. Please provide a solution. |
Is it possible to use my own garments and models instead of the dataset present in the GitHub repo? |
@MuhammadHashir28 Have you figured out the problem that caused this? I am having a similar issue where the black region painted for parse agnostic is accurate but the grey region painted for human agnostic is way off. |
After investigating for 3 days, I got almost everything done except for some minor problems. Here is a link to my personally made study case of HR-VITON
Pre
According to the explanation from the authors in Preprocessing.md, at least a few steps are needed to get all the required inputs of the model.
Most of those can be reproduced on Colab, except Human Parse, which needs Tensorflow 1.15, and a GPU is highly preferred.
1、OpenPose (On Colab, needs GPU)
(1) Install OpenPose, taking about 15 minutes
Now, OpenPose will be installed under your current path.
(2) Get all needed models
(3) Prepare your test data
(4) Run
Then json files will be saved under ../json_path and images will be saved under ../img_path.
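To sanity-check the output, each pose_keypoints_2d entry in the JSON is a flat list of (x, y, confidence) triplets; a small sketch for loading one file (the filename here is just an example):
import json
import numpy as np

with open("../json_path/00001_00_keypoints.json") as f:
    data = json.load(f)

# Reshape the flat list into one (x, y, confidence) row per keypoint
pose = np.array(data["people"][0]["pose_keypoints_2d"]).reshape(-1, 3)
print(pose.shape)   # (25, 3) for the BODY_25 model
print(pose[:5])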
The image result looks like
More details about results can be found at openpose
2、Human Parse
In this section, you can do it on Colab, on a cloud server, or locally. Unfortunately, I didn't manage to use the GPU on Colab and could only use the CPU, which is super slow at an image size of 768×1024 (about 13 minutes per image).
Method 1: Colab
If you can accept that, install Tensorflow 1.15, before which you have to change the Python version to 3.7 or 3.6.
(1) Get pretrained model
unzip
(2) Get repo
Note: I just saved the repo and cleaned it for my own purposes, but you can use the officially provided code as well.
(3) Prepare data and model
(4) Configuration
Change to Python 3.6
Install dependencies (Tensorflow 1.15)
(5) Run
now you can run your code
Note: in the official repo, the file is named inf_pgn.py; it leads to the same result as mine.
Finally, you get a result that looks like
More details can be found at CIHP_PGN
Method 2: Local or Server
In this section, I will give more explanation about what we really need.
You need conda for this part; at least, that is what I used.
(1) Create a new env for oldschool Tensorflow
(2) Configuration
install GPU dependencies: cudatoolkit=10.0 cudnn=7.6.5
install Tensorflow 1.15 GPU
You may need to install the packages below in a new env
More info about compatibility between Tensorflow and CUDA can be found here
(3) Prepare data, repo and model as mentioned before
A final dir looks like
So you basically just put the model under checkpoint/CIHP_pgn
And put data under datasets/images
It can be just a few images of people. A repo of my cleaned version can be found on Google Drive; feel free to download it. If you use the officially provided inf_pgn.py, the same results will be generated.
(4) Run
Then you should see the output. Unfortunately, I couldn't get it to run inference on the GPU, neither on the server nor locally.
Locally, my GPU is an MX250 with 2 GB of memory, which is not enough for inference.
On the server, the GPU is an RTX A5000, but for some unknown reason, probably an incompatibility, the GPU is not used for inference, even though the model is successfully loaded onto it.
Fortunately, the server I used has 24 cores with 2 threads per core, which keeps it reasonably fast (20 to 30 seconds per 768×1024 image) even on the CPU.
Final result looks like
However, the result inferred from a 768×1024 input is not the same as from a 192×256 input; the former looks worse, as shown above.
Note: the black images are what we really need, because the values of the colored ones are, for example, 0, 51, 85, 128, 170, 221, 255, which are not in 0-20 and are inconsistent with HR-VITON. The values of the black ones are, for example, 0, 2, 5, 10, 12, 13, 14, 15, which are the labels needed for getting the agnostic images.
One thing to mention: the images provided by the official dataset keep both the visualization (colored) and the labels (0-20). I don't know how they did that. I also tried P mode in PIL, but found nothing.
3、DensePose (On colab, GPU or CPU)
(1) get repo of detectron2
(2) install dependencies
(3) install packages for DensePose
(4) Prepare your images
(5) Modify code
At the time I used DensePose, there were some bugs, and I had to modify some code to make it work the way I wanted. When you follow this tutorial, the situation may have changed.
This modification is needed because the change above is not enough: image_target_bgr = image_bgr * 0 makes a copy instead of a reference, so our result gets lost.
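The copy-vs-reference point can be illustrated with plain numpy (a generic example, not the detectron2 code itself):
import numpy as np

image_bgr = np.full((4, 4, 3), 255, dtype=np.uint8)

canvas_copy = image_bgr * 0   # a brand-new array: anything drawn on it never reaches image_bgr
canvas_view = image_bgr       # the same array: writes through this name are visible everywhere
canvas_view[:] = 0            # in-place zeroing; image_bgr itself is now black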
(6) Run
If you are using CPU, add
--opts MODEL.DEVICE cpu
to the end of the command below. Then you can get results that look like
4、Cloth Mask (On colab, GPU or CPU)
This is a lot easier.
(1) Install
(2) Download models
(3) Prepare cloth images
Prepare a dir for the results
(4) Run
Make sure your cloth mask results are the same size as the input cloth image (768×1024). They should look like
Note: you may have to change the above code to get the right results, because sometimes the generated results differ, and I didn't investigate this tool too much. Especially the line
idx = (img[...,0]==0)&(img[...,1]==0)&(img[...,2]==0)
: you may get 0 or 130 as the background value, depending on the model you use and the settings.
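One way to make that step less brittle, sketched below, is to estimate the background color from the image corners instead of hard-coding 0 or 130 (this is my own workaround, not part of the original code):
import numpy as np
from PIL import Image

def make_cloth_mask(in_path, out_path, tol=10):
    img = np.array(Image.open(in_path).convert("RGB")).astype(np.float32)
    # Guess the background color from the four corner pixels
    corners = np.stack([img[0, 0], img[0, -1], img[-1, 0], img[-1, -1]])
    bg = corners.mean(axis=0)
    # Pixels close to the background color become 0, everything else 255
    dist = np.linalg.norm(img - bg, axis=-1)
    mask = np.where(dist < tol, 0, 255).astype(np.uint8)
    Image.fromarray(mask, mode="L").save(out_path)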
5、Parse Agnostic (On Colab)
Here is the parse label and the corresponding body parts, which you may or may not need.
(1) Install packages
(2) Prepare data
After all the above steps, you should now have a data structure like this under the test directory. If you are not sure which results go in which dir, check out the official dataset structure, which you can download from here.
You can zip them into test.zip and unzip them on Colab with !unzip test.zip. Note: the images under image-parse-v3 (black images with labels) do not look the same as the official data (colored images with labels); the reason was mentioned before.
(3) Run
You can check the results under ./test/parse, but they are all black as well. To ensure you are getting the right agnostic parse images, do the following:
The output may look like
The first row is longer than the second row.
You can also visualize it.
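For instance, a simple sketch of my own that just checks the labels and stretches them into the visible range (the file name is only an example):
import numpy as np
from PIL import Image

labels = np.array(Image.open("./test/parse/00001_00.png"))
print(np.unique(labels))                                    # should only contain values in 0-19
vis = (labels.astype(np.float32) * (255.0 / 19)).astype(np.uint8)
Image.fromarray(vis, mode="L").save("parse_agnostic_vis.png")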
The result may look like the below, which is cloth-agnostic:
Save all the images under parse to image-parse-agnostic-v3.2
6、Human Agnostic
The steps are almost the same as in the section above.
(1) Install
(2) Prepare data
Now it looks like
(3) Run
Results look like
Save them to the agnostic-v3.2 dir. Now you are almost done. The final structure of the preprocessing results is
7、Conclusion
Thanks for reading. It's not easy to get all of this done. Before you run HR-VITON with your preprocessed dataset, note that each person image needs a corresponding cloth image even though it's not used during inference. If you don't want this behavior, you can either change the source code manually or just add some random images with the same names as the person images (a sketch of this is shown after the directory list below). Once everything is done, suppose you are testing 5 person images and 3 cloth images, all unpaired; you should end up with 3 images under the cloth dir and 3 images under cloth-mask, and 5 images under each of the other dirs: agnostic-v3.2, image, image-densepose, image-parse-agnostic-v3.2, image-parse-v3, openpose_img, and openpose_json.
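As a sketch of that second option (purely illustrative, with made-up paths): copy any existing garment once per person image so every person has a cloth file with the expected name. Note that this adds extra placeholder files to the cloth dir on top of the real cloths.
import os
import shutil

test_dir = "./test"                                              # hypothetical dataset root
placeholder = os.path.join(test_dir, "cloth", "03615_00.jpg")    # any existing cloth image

for name in os.listdir(os.path.join(test_dir, "image")):
    cloth_path = os.path.join(test_dir, "cloth", name)
    if not os.path.exists(cloth_path):
        shutil.copy(placeholder, cloth_path)   # a matching cloth-mask placeholder may be needed too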
Final test result