should we use caer.train_val_split or sklearn.model_selection when splitting data on line 62? #9

Closed
harryhancock opened this issue Jan 31, 2021 · 3 comments

@harryhancock

@jasmcaus pls help!

import matplotlib.pyplot as plt
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.callbacks import LearningRateScheduler
from sklearn.model_selection import train_test_split

IMG_SIZE = (80,80)
@@ -58,9 +59,10 @@
labels = to_categorical(labels, len(characters))

# Creating train and validation data

split_data = caer.train_val_split(featureSet, labels, val_ratio=.2)

this produces error:

InvalidArgumentError Traceback (most recent call last)
in
47 validation_data=(x_val,y_val),
48 validation_steps=len(y_val)//BATCH_SIZE,
---> 49 callbacks = callbacks_list)
50
51 characters

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
1098 _r=1):
1099 callbacks.on_train_batch_begin(step)
-> 1100 tmp_logs = self.train_function(iterator)
1101 if data_handler.should_sync:
1102 context.async_wait()

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
826 tracing_count = self.experimental_get_tracing_count()
827 with trace.Trace(self._name) as tm:
--> 828 result = self._call(*args, **kwds)
829 compiler = "xla" if self._experimental_compile else "nonXla"
830 new_tracing_count = self.experimental_get_tracing_count()

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
886 # Lifting succeeded, so variables are initialized and we can run the
887 # stateless function.
--> 888 return self._stateless_fn(*args, **kwds)
889 else:
890 _, _, _, filtered_flat_args = \

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in __call__(self, *args, **kwargs)
2941 filtered_flat_args) = self._maybe_define_function(args, kwargs)
2942 return graph_function._call_flat(
-> 2943 filtered_flat_args, captured_inputs=graph_function.captured_inputs) # pylint: disable=protected-access
2944
2945 @property

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
1917 # No tape is watching; skip to running the function.
1918 return self._build_call_outputs(self._inference_function.call(
-> 1919 ctx, args, cancellation_manager=cancellation_manager))
1920 forward_backward = self._select_forward_and_backward_functions(
1921 args,

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in call(self, ctx, args, cancellation_manager)
558 inputs=args,
559 attrs=attrs,
--> 560 ctx=ctx)
561 else:
562 outputs = execute.execute_with_cancellation(

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
58 ctx.ensure_initialized()
59 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60 inputs, attrs, num_outputs)
61 except core._NotOkStatusException as e:
62 if name is not None:

InvalidArgumentError: Can not squeeze dim[2], expected a dimension of 1, got 10
[[node binary_crossentropy/remove_squeezable_dimensions/Squeeze (defined at :49) ]] [Op:__inference_train_function_35716]

Function call stack:
train_function

OR
split_data = train_test_split(featureSet, labels, val_ratio=.2)
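The error message ("Can not squeeze dim[2], expected a dimension of 1, got 10") points at a rank-3 tensor reaching the loss, which suggests the labels or predictions had one more dimension than expected. The thread does not establish the root cause, so the following is only a minimal sketch of a shape check that could be run on each split before `model.fit`. The helper name `find_shape_problems` and the stand-in arrays are hypothetical, not part of caer, Keras, or the course code.

```python
import numpy as np

def find_shape_problems(x, y, num_classes):
    """Return a list of readable shape problems (empty if none are found).

    Hypothetical helper: assumes one-hot labels of shape (samples, num_classes).
    """
    problems = []
    if len(x) != len(y):
        problems.append(f"sample count mismatch: {len(x)} features vs {len(y)} labels")
    if y.ndim != 2:
        problems.append(f"labels should be 2-D (samples, classes), got shape {y.shape}")
    elif y.shape[1] != num_classes:
        problems.append(f"expected {num_classes} classes, got {y.shape[1]}")
    return problems

# Stand-in data: four 80x80 single-channel images and 10 classes (assumed values).
x_val = np.zeros((4, 80, 80, 1), dtype=np.float32)
good_labels = np.eye(10, dtype=np.float32)[[0, 3, 7, 9]]   # shape (4, 10)
bad_labels = np.stack([good_labels, good_labels], axis=1)   # shape (4, 2, 10)

print(find_shape_problems(x_val, good_labels, 10))
print(find_shape_problems(x_val, bad_labels, 10))
```

Running the same check on both the training and validation splits shows quickly whether the split function returned the arrays in the order the unpacking assumed.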

@jasmcaus
Owner

jasmcaus commented Jan 31, 2021

Use sklearn.model_selection.train_test_split(featureSet, labels, test_size=.2) instead of caer.train_val_split(). The latter function is deprecated in caer.

So instead of this syntax (which was used in the course):
x_train, x_val, y_train, y_val = caer.train_val_split(featureSet, labels, val_ratio=.2)
use
x_train, x_val, y_train, y_val = sklearn.model_selection.train_test_split(featureSet, labels, test_size=.2)
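For reference, here is a minimal, self-contained sketch of the scikit-learn split. scikit-learn names the held-out fraction `test_size`. The random arrays are stand-ins for `featureSet` and `labels` (assumed here to be 50 images of 80x80x1 and 10 one-hot classes), and `random_state=0` is added only to make the run repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in data (assumed shapes, not the real dataset).
featureSet = rng.random((50, 80, 80, 1), dtype=np.float32)
labels = np.eye(10, dtype=np.float32)[rng.integers(0, 10, size=50)]

# Return order is: train features, held-out features, train labels, held-out labels.
x_train, x_val, y_train, y_val = train_test_split(
    featureSet, labels, test_size=0.2, random_state=0
)

print(x_train.shape, x_val.shape)  # (40, 80, 80, 1) (10, 80, 80, 1)
print(y_train.shape, y_val.shape)  # (40, 10) (10, 10)
```

The labels stay two-dimensional after the split, which is the shape a 10-class softmax output would be compared against.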

@locorez

locorez commented Jan 31, 2021

Hi guys,
I had the same error as @harryhancock.
After I changed the syntax from:
x_train, x_val, y_train, y_val = caer.train_val_split(featureSet, labels, val_ratio=.2)

to
x_train, x_val, y_train, y_val = sklearn.model_selection.train_test_split(featureSet, labels, test_size=.2)

it worked.

Thank you!
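The keyword matters in this change: `val_ratio` belongs to the caer call, while scikit-learn's `train_test_split` expects `test_size`. The sketch below, on small made-up arrays, illustrates that an unknown keyword such as `val_ratio` should be rejected with a `TypeError`. Treat the exact error message as version-dependent.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Tiny made-up arrays, used only to exercise the call signature.
features = np.arange(20).reshape(10, 2)
targets = np.arange(10)

try:
    train_test_split(features, targets, val_ratio=0.2)
    rejected = False
except TypeError as err:
    rejected = True
    print("val_ratio rejected:", err)

# The same split with the keyword scikit-learn understands.
x_train, x_val, y_train, y_val = train_test_split(
    features, targets, test_size=0.2, random_state=0
)
print(len(x_train), len(x_val))  # 8 2
```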

@jasmcaus
Owner

jasmcaus commented Feb 9, 2021

That's great to hear @locorez! How about you @harryhancock? Did changing the syntax work for you?

@jasmcaus jasmcaus closed this as completed Jul 2, 2021