This folder contains examples of image segmentation in MXNet.
We have trained simple fcn-xs models; the training parameters are listed below:
| model   | lr (fixed) | epoch |
| ------- | ---------- | ----- |
| fcn-32s | 1e-10      | 31    |
| fcn-16s | 1e-12      | 27    |
| fcn-8s  | 1e-14      | 19    |
(When using the newest MXNet, you should use a larger learning rate, such as 1e-4, 1e-5, or 1e-6, because the newest MXNet performs gradient normalization in SoftmaxOutput.)
The training set contains only 2027 images, and the validation set contains 462 images.
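
For reference, here is a minimal sketch of how a fixed learning rate from the table might be plugged into an MXNet SGD optimizer (the momentum and weight-decay values are illustrative assumptions, not taken from `fcn_xs.py`):

```python
import mxnet as mx

# Fixed-learning-rate SGD matching the table above (fcn-32s on older MXNet).
# momentum and wd are illustrative assumptions only.
optimizer = mx.optimizer.SGD(learning_rate=1e-10, momentum=0.99, wd=0.0005)
```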
- Install the Pillow Python package (required by `image_segmentaion.py`):

  ```shell
  [sudo] pip install Pillow
  ```
- Assume we are in a working directory, such as `~/train_fcn_xs`, and MXNet is built at `~/mxnet`. Now, copy the example scripts into the working directory:

  ```shell
  cp ~/mxnet/example/fcn-xs/* .
  ```
- vgg16fc model: you can download `VGG_FC_ILSVRC_16_layers-symbol.json` and `VGG_FC_ILSVRC_16_layers-0074.params` from baidu yun or dropbox. This is a fully convolutional version of the original VGG_ILSVRC_16_layers.caffemodel (with the corresponding VGG_ILSVRC_16_layers_deploy.prototxt). Note that the vgg16 model is licensed for non-commercial use only.
- experiment data: you can download `VOC2012.rar` from robots.ox.ac.uk and extract it. The files/folders will look like this: `JPEGImages` folder, `SegmentationClass` folder, `train.lst`, `val.lst`, `test.lst`.
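
  Before training, you can sanity-check the extracted layout with a few lines of Python (this snippet is a hypothetical helper, not part of the example scripts):

  ```python
  import os

  # Verify the extracted VOC2012 folder matches the layout listed above.
  root = "./VOC2012"
  for name in ["JPEGImages", "SegmentationClass", "train.lst", "val.lst", "test.lst"]:
      path = os.path.join(root, name)
      print(path, "OK" if os.path.exists(path) else "MISSING")
  ```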
- Configure GPU/CPU for training in `fcn_xs.py`:

  ```python
  # ctx = mx.cpu(0)
  ctx = mx.gpu(0)
  ```
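
  If you are unsure whether a GPU is available, a small sketch like this (not part of `fcn_xs.py`) falls back to the CPU:

  ```python
  import mxnet as mx

  # Try to allocate on GPU 0; asnumpy() forces a synchronization so a missing
  # or unusable GPU surfaces as an error here rather than later.
  try:
      mx.nd.zeros((1,), ctx=mx.gpu(0)).asnumpy()
      ctx = mx.gpu(0)
  except mx.base.MXNetError:
      ctx = mx.cpu(0)
  ```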
- If you want to train the fcn-8s model, it is better to train the fcn-32s and fcn-16s models first. To train the fcn-32s model, run `./run_fcnxs.sh` in a shell; the script contains:

  ```shell
  python -u fcn_xs.py --model=fcn32s --prefix=VGG_FC_ILSVRC_16_layers --epoch=74 --init-type=vgg16
  ```
- In `fcn_xs.py`, you may need to change `root_dir`, `flist_name`, and `fcnxs_model_prefix` for your own data.
- When you train the fcn-16s or fcn-8s model, change the command in `run_fcnxs.sh` accordingly. For example, to train fcn-16s, comment out the fcn32s command so the script looks like this:

  ```shell
  python -u fcn_xs.py --model=fcn16s --prefix=FCN32s_VGG16 --epoch=31 --init-type=fcnxs
  ```
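
  The `--prefix`/`--epoch` pair names a saved checkpoint. As a rough sketch of what the script resolves internally (an assumption about `fcn_xs.py`, shown for orientation only):

  ```python
  import mxnet as mx

  # --prefix=FCN32s_VGG16 --epoch=31 corresponds to FCN32s_VGG16-symbol.json
  # and FCN32s_VGG16-0031.params on disk.
  sym, arg_params, aux_params = mx.model.load_checkpoint("FCN32s_VGG16", 31)
  ```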
- The output log looks like this (when training fcn-8s):

  ```
  INFO:root:Start training with gpu(3)
  INFO:root:Epoch[0] Batch [50] Speed: 1.16 samples/sec Train-accuracy=0.894318
  INFO:root:Epoch[0] Batch [100] Speed: 1.11 samples/sec Train-accuracy=0.904681
  INFO:root:Epoch[0] Batch [150] Speed: 1.13 samples/sec Train-accuracy=0.908053
  INFO:root:Epoch[0] Batch [200] Speed: 1.12 samples/sec Train-accuracy=0.912219
  INFO:root:Epoch[0] Batch [250] Speed: 1.13 samples/sec Train-accuracy=0.914238
  INFO:root:Epoch[0] Batch [300] Speed: 1.13 samples/sec Train-accuracy=0.912170
  INFO:root:Epoch[0] Batch [350] Speed: 1.12 samples/sec Train-accuracy=0.912080
  ```
- Similarly, you should first download the pre-trained model from yun.baidu; the symbol and parameter files are `FCN8s_VGG16-symbol.json` and `FCN8s_VGG16-0019.params`.
- Then put the image you want to segment in your directory, and change `img = YOUR_IMAGE_NAME` in `image_segmentaion.py`.
- Lastly, run `python image_segmentaion.py` in a shell to segment one image; you will then get a segmentation image like the sample result above. A minimal sketch of this flow is shown below.
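
  For orientation, here is a minimal sketch of the inference flow (the file names, RGB mean, and label workaround are assumptions; `image_segmentaion.py` is the authoritative version):

  ```python
  import mxnet as mx
  import numpy as np
  from PIL import Image

  ctx = mx.gpu(0)
  rgb_mean = np.array([123.68, 116.779, 103.939])

  # Load one RGB image and convert HWC -> NCHW with the mean subtracted.
  img = np.array(Image.open("YOUR_IMAGE_NAME.jpg"), dtype=np.float32)
  img = (img - rgb_mean).transpose(2, 0, 1)[np.newaxis, :]

  sym, args, auxs = mx.model.load_checkpoint("FCN8s_VGG16", 19)
  args["data"] = mx.nd.array(img, ctx)
  # SoftmaxOutput expects a label argument even when only doing a forward pass.
  args["softmax_label"] = mx.nd.empty((1, img.shape[2] * img.shape[3]), ctx)

  exe = sym.bind(ctx, args, grad_req="null", aux_states=auxs)
  exe.forward(is_train=False)
  # Per-pixel class indices; save them as a grayscale label image.
  pred = np.uint8(exe.outputs[0].asnumpy().argmax(axis=1).squeeze())
  Image.fromarray(pred).save("segmented.png")
  ```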
- This example trains on whole images; that is, we do not resize/crop the images to the same size, so `batch_size` is set to 1 during training.
- The fcn-xs model is based on the vgg16 model, with some crop, deconv, and element-sum layers added, so the model is fairly large. Moreover, the example uses whole-image training, so if the input image is large (such as 700*500), training may consume a lot of memory. We therefore suggest using a GPU with 12GB of memory.
- If you don't have a GPU with 12GB of memory, you should set `cut_off_size` to a small value when you construct your FileIter, like this:

  ```python
  train_dataiter = FileIter(
      root_dir     = "./VOC2012",
      flist_name   = "train.lst",
      cut_off_size = 400,
      rgb_mean     = (123.68, 116.779, 103.939),
      )
  ```
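
  The matching validation iterator, under the same assumptions:

  ```python
  from data import FileIter  # FileIter is defined in the example's data.py

  # Same assumed cut_off_size and rgb_mean as the training iterator above.
  val_dataiter = FileIter(
      root_dir     = "./VOC2012",
      flist_name   = "val.lst",
      cut_off_size = 400,
      rgb_mean     = (123.68, 116.779, 103.939),
      )
  ```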
- We look forward to your contributions to make this example more powerful. Thanks.