
ops.image.affine_transform() does not work as a layer in GPU #20191

Open
kwchan7 opened this issue Aug 30, 2024 · 5 comments
Assignees
Labels
stat:awaiting keras-eng Awaiting response from Keras engineer type:Bug

Comments

@kwchan7

kwchan7 commented Aug 30, 2024

Hi,

I've noticed that ops.image.affine_transform() does not work as part of a model on GPU.
TF version: 2.16.1
Keras version: 3.5.0

Some observations from testing:

  1. model.predict_step() works, but model.predict() does not
  2. it works when using CPU only
  3. other similar functions, such as ops.image.resize() and ops.image.pad_images(), work fine

Sample code below:

import tensorflow as tf
import keras
from keras import layers
from keras import ops
import numpy as np

print('TF version: {:s}'.format(tf.__version__))
print('keras version: {:s}'.format(keras.__version__))

shape = (20, 18, 1)
inputs = layers.Input(shape)
# 8-parameter affine transform (identity)
transform = ops.stack([1, 0, 0, 0, 1, 0, 0, 0], axis=0)

# Toggle between the failing op (affine_transform) and a working one (resize)
if 1:
    img = ops.image.affine_transform(inputs, transform)
else:
    img = ops.image.resize(inputs, (10, 9))

y = layers.Flatten()(img)
outputs = layers.Dense(1)(y)
model = keras.Model(inputs, outputs)
model.summary()

x = np.random.uniform(-1, 1, (10000, *shape))
yp = model.predict_step(x)  # works
print(yp)
yp = model.predict(x)  # fails on GPU
print(yp)
@ghsanti
Contributor

ghsanti commented Aug 31, 2024

Just adding some extra info.

  • similar issue
  • It seems XLA GPU JIT cannot compile the affine-transform operation: the underlying op is unregistered (i.e. TF cannot lower this op to the XLA_GPU_JIT device):

InvalidArgumentError: Graph execution error:
Detected unsupported operations when trying to compile graph (...) on XLA_GPU_JIT: ImageProjectiveTransformV3 (No registered 'ImageProjectiveTransformV3' OpKernel for XLA_GPU_JIT ...

  • Disabling JIT works, but may not be desirable:

model.compile(jit_compile=False)

  • Some sources recommend using bilinear interpolation, but that doesn't help when jit_compile is enabled.

You could also try jit_scope, but it seems more involved.
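A minimal sketch of the jit_scope route, assuming TensorFlow's tf.xla.experimental.jit_scope API; the function name and the placeholder body are illustrative, not from the repro. Note that jit_scope only takes effect inside graph contexts such as tf.function:

```python
import tensorflow as tf

@tf.function
def affine_part(x):
    # compile_ops=False asks XLA not to compile ops inside this scope, so
    # the unsupported ImageProjectiveTransformV3 could stay on the regular
    # GPU kernel path while the rest of the graph is still compiled.
    with tf.xla.experimental.jit_scope(compile_ops=False):
        return x * 2.0  # placeholder for the affine_transform call

y = affine_part(tf.constant(3.0))
```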



@fchollet
Member

Can you try JAX? I'd like to see if this is an XLA issue or a TF issue.

@ghsanti
Contributor

ghsanti commented Sep 10, 2024

I've run several tests (gist); I'll try to summarise them below, @fchollet:

  • TF: the table here also shows that the operation has no GPU support; there have been tickets asking for it since 2021.

  • Torch CPU & GPU:

--> 365     transform = torch.reshape(transform, (batch_size, 3, 3))
    366     offset = transform[:, 0:2, 2].clone()
    367     offset = torch.nn.functional.pad(offset, pad=[0, 1, 0, 0])

RuntimeError: shape '[10000, 3, 3]' is invalid for input of size 9
  • JAX

Same error as Torch GPU for both CPU & GPU (see observation below).


Using transform = torch.reshape(transform, (1, 3, 3)) instead of torch.reshape(transform, (batch_size, 3, 3)) seems to run fine in JAX (CPU, GPU) and Torch (CPU).
The same happens when replacing batch_size=10000 with batch_size=1.

Torch GPU returns RuntimeError: "baddbmm_cuda" not implemented for 'Int'
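A possible user-side workaround, sketched under the assumption that the backends mishandle only the unbatched (8,) transform: pass one transform row per image, so the internal reshape to (batch_size, 3, 3) has enough elements. The names below are illustrative, not from the Keras source:

```python
import numpy as np

# Identity affine transform in the 8-parameter form
# (a0, a1, a2, b0, b1, b2, c0, c1) that affine_transform expects.
single = np.array([1, 0, 0, 0, 1, 0, 0, 0], dtype="float32")

batch_size = 10000  # matches the repro above
# One row per image -> shape (batch_size, 8); float32 also sidesteps
# the Torch 'Int' baddbmm error mentioned below.
batched = np.tile(single[None, :], (batch_size, 1))
```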

@fchollet
Member

Torch GPU returns RuntimeError: "baddbmm_cuda" not implemented for 'Int'

For that one, you can simply cast your input to float32 (and cast it back to int afterwards if you need ints)
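A minimal sketch of that cast, shown with NumPy so it stands alone; in the repro, keras.ops.cast(transform, "float32") would play the same role:

```python
import numpy as np

# Build the transform as float32 from the start, so the Torch backend's
# baddbmm sees floating-point inputs instead of ints.
transform = np.array([1, 0, 0, 0, 1, 0, 0, 0], dtype="float32")

# ...then pass `transform` to ops.image.affine_transform(inputs, transform);
# cast the result back with ops.cast(result, "int32") if ints are needed.
```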

Same error as Torch GPU for both CPU & GPU (see observation below).

What is the JAX error message? I don't understand the PyTorch error message (as is often the case with those).

@ghsanti
Contributor

ghsanti commented Sep 10, 2024

@fchollet

cast your input to float32

should that be done automatically within the layer?

What is the JAX error message? I don't understand the PyTorch error message (as is often the case with those).

The failing line (in this case it just seems to be misuse of the function, since it takes a state):

Error one (line and screenshot)

yp = model.predict_step(x) 

[Screenshot: error traceback from predict_step, 2024-09-10 18-11-56]

Error two (line and screenshot)

Commenting that line out, the next error, at model.predict(x), is:

[Screenshot: error traceback from predict, 2024-09-10 18-12-16]


The last error looks identical to Torch's, just with batch_size = 32.

(Note that I'm not the OP, just reading out of curiosity.)

@sachinprasadhs sachinprasadhs added stat:awaiting keras-eng Awaiting response from Keras engineer type:Bug labels Sep 10, 2024