should we use caer.train_val_split or sklearn.model_selection when splitting data on line 62? #9

harryhancock · 2021-01-31T13:34:04Z

import matplotlib.pyplot as plt
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.callbacks import LearningRateScheduler
from sklearn.model_selection import train_test_split

IMG_SIZE = (80,80)
@@ -58,9 +59,10 @@
labels = to_categorical(labels, len(characters))

Creating train and validation data

split_data = caer.train_val_split(featureSet, labels, val_ratio=.2)

this produces error:

InvalidArgumentError Traceback (most recent call last)
in
47 validation_data=(x_val,y_val),
48 validation_steps=len(y_val)//BATCH_SIZE,
---> 49 callbacks = callbacks_list)
50
51 characters

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
1098 _r=1):
1099 callbacks.on_train_batch_begin(step)
-> 1100 tmp_logs = self.train_function(iterator)
1101 if data_handler.should_sync:
1102 context.async_wait()

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
826 tracing_count = self.experimental_get_tracing_count()
827 with trace.Trace(self._name) as tm:
--> 828 result = self._call(*args, **kwds)
829 compiler = "xla" if self._experimental_compile else "nonXla"
830 new_tracing_count = self.experimental_get_tracing_count()

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
886 # Lifting succeeded, so variables are initialized and we can run the
887 # stateless function.
--> 888 return self._stateless_fn(*args, **kwds)
889 else:
890 _, _, _, filtered_flat_args = \

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in __call__(self, *args, **kwargs)
2941 filtered_flat_args) = self._maybe_define_function(args, kwargs)
2942 return graph_function._call_flat(
-> 2943 filtered_flat_args, captured_inputs=graph_function.captured_inputs) # pylint: disable=protected-access
2944
2945 @property

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
1917 # No tape is watching; skip to running the function.
1918 return self._build_call_outputs(self._inference_function.call(
-> 1919 ctx, args, cancellation_manager=cancellation_manager))
1920 forward_backward = self._select_forward_and_backward_functions(
1921 args,

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in call(self, ctx, args, cancellation_manager)
558 inputs=args,
559 attrs=attrs,
--> 560 ctx=ctx)
561 else:
562 outputs = execute.execute_with_cancellation(

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
58 ctx.ensure_initialized()
59 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60 inputs, attrs, num_outputs)
61 except core._NotOkStatusException as e:
62 if name is not None:

InvalidArgumentError: Can not squeeze dim[2], expected a dimension of 1, got 10
[[node binary_crossentropy/remove_squeezable_dimensions/Squeeze (defined at :49) ]] [Op:__inference_train_function_35716]

Function call stack:
train_function

OR
split_data = train_test_split(featureSet, labels, val_ratio=.2)
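The error message above says that a tensor reaching the loss had three dimensions with a final size of 10, where a size of 1 was expected. One way to catch this kind of mismatch early is to check the shapes of the split outputs before calling fit. The sketch below is only illustrative: `check_split_shapes` is a hypothetical helper, and the array shapes are assumptions based on IMG_SIZE = (80,80) and 10 classes, not values taken from the notebook.

```python
import numpy as np


def check_split_shapes(x, y, num_classes):
    """Return True if features and one-hot labels line up, else raise ValueError.

    Hypothetical helper for illustration; it is not part of caer, Keras or sklearn.
    """
    # Every sample needs exactly one label row.
    if x.shape[0] != y.shape[0]:
        raise ValueError(f"sample count mismatch: x has {x.shape[0]}, y has {y.shape[0]}")
    # One-hot labels are expected to be 2-D: (num_samples, num_classes).
    if y.ndim != 2 or y.shape[1] != num_classes:
        raise ValueError(f"expected labels of shape (N, {num_classes}), got {y.shape}")
    return True


# Assumed shapes: 40 grayscale 80x80 images and 10 one-hot classes.
x_ok = np.zeros((40, 80, 80, 1), dtype="float32")
y_ok = np.eye(10, dtype="float32")[np.arange(40) % 10]
print(check_split_shapes(x_ok, y_ok, num_classes=10))

# Labels with an extra dimension are rejected before they reach model.fit.
y_bad = np.zeros((40, 10, 10), dtype="float32")
try:
    check_split_shapes(x_ok, y_bad, num_classes=10)
except ValueError as err:
    print("shape problem:", err)
```

Running a check like this on x_train, y_train, x_val and y_val right after the split shows whether the variables were unpacked in the expected order and with the expected shapes.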

jasmcaus · 2021-01-31T13:54:10Z

Use sklearn.model_selection.train_test_split(featureSet, labels, test_size=.2) instead of caer.train_val_split(). The latter function is deprecated in caer. Note that the sklearn function takes test_size rather than val_ratio.

So instead of this syntax (which was used in the course):
x_train, x_val, y_train, y_val = caer.train_val_split(featureSet, labels, val_ratio=.2)
use
x_train, x_val, y_train, y_val = sklearn.model_selection.train_test_split(featureSet, labels, test_size=.2)
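For reference, here is a minimal, self-contained sketch of the sklearn call on dummy data. The arrays are stand-ins for featureSet and labels, with shapes assumed from IMG_SIZE = (80,80) and 10 classes; random_state is an optional extra added here only to make the split reproducible. train_test_split expects the validation fraction as test_size; it has no val_ratio parameter.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy stand-ins for the thread's featureSet and labels (shapes are assumptions).
featureSet = np.zeros((50, 80, 80, 1), dtype="float32")
labels = np.eye(10, dtype="float32")[np.arange(50) % 10]  # one-hot, shape (50, 10)

# test_size is sklearn's name for the held-out fraction (here used as validation data).
x_train, x_val, y_train, y_val = train_test_split(
    featureSet, labels, test_size=0.2, random_state=42
)

print(x_train.shape, x_val.shape)  # (40, 80, 80, 1) (10, 80, 80, 1)
print(y_train.shape, y_val.shape)  # (40, 10) (10, 10)
```

The return order is train features, validation features, train labels, validation labels, which matches the unpacking used in the course code.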

locorez · 2021-01-31T17:48:33Z

Hi guys,
I had the same error as @harryhancock.
After I changed the syntax from:
x_train, x_val, y_train, y_val = caer.train_val_split(featureSet, labels, val_ratio=.2)

to
x_train, x_val, y_train, y_val = sklearn.model_selection.train_test_split(featureSet, labels, test_size=.2)

it worked.

Thank you!

jasmcaus · 2021-02-09T05:56:01Z

That's great to hear @locorez! How about you @harryhancock? Did changing the syntax work for you?

jasmcaus closed this as completed Jul 2, 2021
