Sample scripts for training the Mask R-CNN model on the Penn-Fudan Database for Pedestrian Detection and Segmentation using PyTorch on DirectML.
These scripts are collected from the tutorial here
Install the prerequisites by running the following command from the root
directory of the DirectML folder:
pip install -r pytorch\cv\objectDetection\maskrcnn\requirements.txt
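After the install command above finishes, it can be useful to confirm that the packages are importable. The following is a minimal sketch and not part of the sample scripts. It assumes the requirements include `torch` and `torchvision` (the actual contents of requirements.txt may differ), and `missing_packages` is a hypothetical helper name.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package names; adjust this list to match requirements.txt.
    required = ["torch", "torchvision"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

The check only looks up each package without importing it, so it stays fast even when the packages are large.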
After installing the PyTorch on DirectML package (see GPU accelerated ML training), open a console in the root
directory and run the setup script to download and convert the data:
python pytorch\cv\data\dataset.py
Running dataset.py should take a minute or more, since it downloads the CIFAR-10 dataset. The output should look similar to the following:
>python pytorch\cv\data\dataset.py
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to E:\work\dml\pytorch\cv\data\cifar-10-python\cifar-10-python.tar.gz
Failed download. Trying https -> http instead. Downloading http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to E:\work\dml\pytorch\cv\data\cifar-10-python\cifar-10-python.tar.gz
170499072it [00:32, 5250164.09it/s]
Extracting E:\work\dml\pytorch\cv\data\cifar-10-python\cifar-10-python.tar.gz to E:\work\dml\pytorch\cv\data\cifar-10-python
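If the download is interrupted, a quick check of the data folder can save a confusing failure later. The sketch below is an illustration rather than part of the sample. It assumes the archive name and relative folder shown in the log above, assumes dataset.py leaves the archive in place after extraction, and uses a hypothetical helper named `archive_present`.

```python
from pathlib import Path


def archive_present(data_dir, archive_name="cifar-10-python.tar.gz"):
    """Return True if the downloaded archive exists and is not empty."""
    archive = Path(data_dir) / archive_name
    return archive.is_file() and archive.stat().st_size > 0


if __name__ == "__main__":
    # Assumed location relative to the root directory, based on the sample log.
    data_dir = Path("pytorch") / "cv" / "data" / "cifar-10-python"
    print("Archive found:", archive_present(data_dir))
```

Run it from the same root directory used for dataset.py so the relative path resolves correctly.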
A helper script exists to train Mask R-CNN with PennFudanPed data:
cd pytorch\cv\objectDetection\maskrcnn
python .\maskrcnn.py
The first few lines of output should look similar to the following (exact numbers may change):
>python .\maskrcnn.py
Epoch: [0] [ 0/60] eta: 0:38:26 lr: 0.000090 loss: 2.9777 (2.9777) loss_classifier: 0.7217 (0.7217) loss_box_reg: 0.0754 (0.0754) loss_mask: 1.6228 (1.6228) loss_objectness: 0.4175 (0.4175) loss_rpn_box_reg: 0.1404 (0.1404) time: 38.4439 data: 1.0955
Epoch: [0] [10/60] eta: 0:29:44 lr: 0.000936 loss: 2.4268 (2.4919) loss_classifier: 0.4056 (0.4158) loss_box_reg: 0.1691 (0.3631) loss_mask: 1.1679 (1.1600) loss_objectness: 0.1162 (0.3120) loss_rpn_box_reg: 0.1257 (0.2410) time: 35.6972 data: 0.1034
Epoch: [0] [20/60] eta: 0:23:14 lr: 0.001783 loss: 1.2172 (1.6717) loss_classifier: 0.0669 (0.2410) loss_box_reg: 0.1331 (0.2466) loss_mask: 0.5935 (0.8376) loss_objectness: 0.0565 (0.1873) loss_rpn_box_reg: 0.0574 (0.1593) time: 34.6860 data: 0.0042
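To track how the losses evolve, the log lines above can be parsed into numbers. The following is a rough sketch that assumes the line format shown in the sample output; `parse_log_line` is a hypothetical helper, not something provided by maskrcnn.py. Each metric is returned as a pair: the first number printed and the number in parentheses (which appears to be a running average, though the sample does not state this).

```python
import re

# Matches the "Epoch: [0] [10/60]" prefix of a training log line.
HEADER_RE = re.compile(r"Epoch: \[(\d+)\]\s+\[\s*(\d+)/(\d+)\]")
# Matches "name: 1.2345 (1.2345)" pairs such as "loss_mask: 1.6228 (1.6228)".
METRIC_RE = re.compile(r"(\w+): ([0-9.]+) \(([0-9.]+)\)")


def parse_log_line(line):
    """Parse one training log line; return None if it is not an epoch line."""
    header = HEADER_RE.search(line)
    if header is None:
        return None
    epoch, step, total = (int(group) for group in header.groups())
    metrics = {
        name: (float(first), float(in_parens))
        for name, first, in_parens in METRIC_RE.findall(line)
    }
    return {"epoch": epoch, "step": step, "total": total, "metrics": metrics}


if __name__ == "__main__":
    sample = (
        "Epoch: [0] [ 0/60] eta: 0:38:26 lr: 0.000090 "
        "loss: 2.9777 (2.9777) loss_classifier: 0.7217 (0.7217) "
        "loss_mask: 1.6228 (1.6228) time: 38.4439 data: 1.0955"
    )
    parsed = parse_log_line(sample)
    print(parsed["step"], parsed["total"], parsed["metrics"]["loss"])
```

Fields without a parenthesized value, such as eta, lr, time, and data, are skipped on purpose, so only the loss terms end up in the `metrics` dictionary.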