Augmentations on GPU #8

Open
jhagege opened this issue Nov 5, 2018 · 4 comments

Comments

jhagege commented Nov 5, 2018

Hi, great code!
I've noticed that GPU usage is fairly low (around 40%), and I'm trying to optimize it.
I've also noticed that HLSTransform is very CPU intensive.
Are you aware of any way to run it on the GPU instead of the CPU?
Do you think that would help?
Thanks

Owner

cypw commented Nov 9, 2018

I haven't found any HLS implementation on GPU. It might be helpful if the color augmentation could be done on the GPU side.

Besides reducing the cost of data augmentation, you can also reduce the cost of decoding the video files. For the Kinetics dataset, I found that converting the default *.mp4 files with a command like the one below can significantly speed up the decoding stage.

For example:

ffmpeg -y -i ${SRC_VID} -c:v mpeg4 -filter:v "scale=min(iw\,(256*iw)/min(iw\,ih)):-1" -b:v 512k -an ${DST_VID}

Author

jhagege commented Nov 9, 2018

Thanks much for your feedback, this is helpful.
Will give it a look.

Author

jhagege commented Nov 15, 2018

@cypw By the way, did you try converting the videos to H.264 / H.265?
Did you notice a significant improvement with mpeg4 compared to those?
Thanks!

georkap commented Dec 3, 2019

Hi, this comes a bit late, but replacing numpy functions with their cv2 equivalents wherever possible in the __call__ function of the RandomHLS augmentation saves significant CPU processing time. Essentially, this means substituting the np.minimum and np.maximum calls. Snippet below, hope it helps.

def __call__(self, data):
    # data: H x W x (3*N) array, N RGB frames stacked along the channel axis (assumed uint8)
    assert data.ndim == 3, 'cannot operate on a single channel'
    h, w, c = data.shape
    assert c % 3 == 0, "input channel = %d, illegal" % c
    num_ims = c // 3

    # One random (H, L, S) offset shared by all frames; the trailing 0 pads it to a 4-element cv2 scalar
    random_vars = tuple(int(round(self.rng.uniform(-x, x))) for x in (self.vars + [0]))
    augmented_data = np.zeros(data.shape, dtype=np.uint8)

    for i_im in range(num_ims):  # process each RGB frame in turn
        start, end = 3 * i_im, 3 * (i_im + 1)
        augmented_data[:, :, start:end] = cv2.cvtColor(data[:, :, start:end], cv2.COLOR_RGB2HLS)
        # cv2.add saturates the uint8 result to [0, 255], which covers the lower bound and the L/S upper bound
        augmented_data[:, :, start:end] = cv2.add(augmented_data[:, :, start:end], random_vars, dtype=cv2.CV_8UC3)
        # 8-bit hue is stored as degrees / 2, so clamp any hue above 180
        mask = cv2.inRange(augmented_data[:, :, start], 0, 180)
        augmented_data[mask == 0, start] = 180
        augmented_data[:, :, start:end] = cv2.cvtColor(augmented_data[:, :, start:end], cv2.COLOR_HLS2RGB)

    return augmented_data
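
To make the clamping logic easier to follow, here is what the transform does to a single pixel, written with only the Python standard library. This is an illustrative sketch, not the repository's code: `clamp`, `shift_pixel_hls`, and `HLS_LIMITS` are hypothetical names, the 8-bit layout (H up to 180, L and S up to 255) follows the limits used in the snippet above, and `colorsys` rounds differently from OpenCV, so the results will not match `cv2.cvtColor` bit for bit.

```python
import colorsys

# Upper limits in the 8-bit HLS layout assumed here: H as degrees / 2, L and S as 0..255.
HLS_LIMITS = (180, 255, 255)


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def shift_pixel_hls(rgb, offsets):
    """Shift one 8-bit RGB pixel by (dH, dL, dS) with clamping, then convert back to RGB."""
    r, g, b = (v / 255.0 for v in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)  # each component is in [0, 1]
    hls8 = (round(h * 180), round(l * 255), round(s * 255))
    # Add the offset per channel and clamp, like np.minimum(np.maximum(x + d, 0), limit).
    shifted = [clamp(v + d, 0, lim) for v, d, lim in zip(hls8, offsets, HLS_LIMITS)]
    h, l, s = shifted[0] / 180.0, shifted[1] / 255.0, shifted[2] / 255.0
    r, g, b = colorsys.hls_to_rgb(h, l, s)
    return tuple(round(v * 255) for v in (r, g, b))
```

For example, a mid-gray pixel such as `(100, 100, 100)` with a very large positive lightness offset is clamped to the maximum lightness and comes back as white, while a zero offset leaves it unchanged. The cv2 version above applies the same idea to whole frames at once, which is where the CPU savings come from.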

3 participants