Prediction with MXNET Cuda version not working #331

Closed
InzamamAnwar opened this issue Sep 20, 2019 · 2 comments
Labels: bug (Something isn't working)

Comments

@InzamamAnwar

Description

A DeepAR model trained with mxnet-cu100 1.5.0 fails at prediction time with the error "vector::_M_range_insert".

To Reproduce

The data cannot be shared. The code is given below.

# Imports assume the GluonTS 0.3.x module layout
import matplotlib.pyplot as plt
from gluonts.dataset.util import to_pandas
from gluonts.distribution.gaussian import GaussianOutput
from gluonts.model.deepar import DeepAREstimator
from gluonts.trainer import Trainer

trainer_config = Trainer(epochs=400, batch_size=32, num_batches_per_epoch=75,
                         clip_gradient=20.0, weight_decay=1e-6, init="xavier")
estimator = DeepAREstimator(
    freq="W", prediction_length=52, context_length=52, trainer=trainer_config,
    num_layers=3, num_cells=52, cell_type="lstm", dropout_rate=0.15,
    use_feat_dynamic_real=True, use_feat_static_cat=True,
    cardinality=[5, 22, 90, 1343, 45, 3], embedding_dimension=100,
    distr_output=GaussianOutput(), scaling=False,
)

predictor = estimator.train(training_data=train_data)

# Plot the first test series together with its forecast, then stop
for test_entry, forecast in zip(test_data, predictor.predict(test_data)):
    to_pandas(test_entry).plot(linewidth=2)
    forecast.plot(color='g', prediction_intervals=[50.0, 90.0])
    break
plt.grid(which='both')
plt.legend()

Error Message

---------------------------------------------------------------------------
MXNetError                                Traceback (most recent call last)
<ipython-input-13-41efef67ca9e> in <module>
----> 1 for test_entry, forecast in zip(test_ds, predictor.predict(test_ds)):
      2     to_pandas(test_entry).plot(linewidth=2)
      3     forecast.plot(color='g', prediction_intervals=[50.0, 90.0])
      4     break
      5 plt.grid(which='both')

~/anaconda3/envs/gluonts/lib/python3.7/site-packages/gluonts/model/predictor.py in predict(self, dataset, num_eval_samples)
    303         for batch in inference_data_loader:
    304             inputs = [batch[k] for k in self.input_names]
--> 305             outputs = self.prediction_net(*inputs).asnumpy()
    306             if self.output_transform is not None:
    307                 outputs = self.output_transform(batch, outputs)

~/anaconda3/envs/gluonts/lib/python3.7/site-packages/mxnet/gluon/block.py in __call__(self, *args)
    546             hook(self, args)
    547 
--> 548         out = self.forward(*args)
    549 
    550         for hook in self._forward_hooks.values():

~/anaconda3/envs/gluonts/lib/python3.7/site-packages/mxnet/gluon/block.py in forward(self, x, *args)
    923                     params = {i: j.data(ctx) for i, j in self._reg_params.items()}
    924 
--> 925                 return self.hybrid_forward(ndarray, x, *args, **params)
    926 
    927         assert isinstance(x, Symbol), \

~/anaconda3/envs/gluonts/lib/python3.7/site-packages/gluonts/model/deepar/_network.py in hybrid_forward(self, F, feat_static_cat, past_time_feat, past_target, past_observed_values, future_time_feat)
    583             static_feat=static_feat,
    584             scale=scale,
--> 585             begin_states=state,
    586         )

~/anaconda3/envs/gluonts/lib/python3.7/site-packages/gluonts/model/deepar/_network.py in sampling_decoder(self, F, static_feat, past_target, time_feat, scale, begin_states)
    520 
    521             # (batch_size * num_samples, 1, *target_shape)
--> 522             new_samples = distr.sample()
    523 
    524             # (batch_size * num_samples, seq_len, *target_shape)

~/anaconda3/envs/gluonts/lib/python3.7/site-packages/gluonts/distribution/transformed_distribution.py in sample(self, num_samples)
     80     def sample(self, num_samples: Optional[int] = None) -> Tensor:
     81         with autograd.pause():
---> 82             s = self.base_distribution.sample(num_samples=num_samples)
     83             for t in self.transforms:
     84                 s = t.f(s)

~/anaconda3/envs/gluonts/lib/python3.7/site-packages/gluonts/distribution/gaussian.py in sample(self, num_samples)
     87             mu=self.mu,
     88             sigma=self.sigma,
---> 89             num_samples=num_samples,
     90         )
     91 

~/anaconda3/envs/gluonts/lib/python3.7/site-packages/gluonts/distribution/distribution.py in _sample_multiple(sample_func, num_samples, *args, **kwargs)
    245         k: _expand_param(v, num_samples) for k, v in kwargs.items()
    246     }
--> 247     samples = sample_func(*args_expanded, **kwargs_expanded)
    248     return samples

~/anaconda3/envs/gluonts/lib/python3.7/site-packages/mxnet/ndarray/register.py in sample_normal(mu, sigma, shape, dtype, out, name, **kwargs)

~/anaconda3/envs/gluonts/lib/python3.7/site-packages/mxnet/_ctypes/ndarray.py in _imperative_invoke(handle, ndargs, keys, vals, out)
     90         c_str_array(keys),
     91         c_str_array([str(s) for s in vals]),
---> 92         ctypes.byref(out_stypes)))
     93 
     94     if original_output is not None:

~/anaconda3/envs/gluonts/lib/python3.7/site-packages/mxnet/base.py in check_call(ret)
    251     """
    252     if ret != 0:
--> 253         raise MXNetError(py_str(_LIB.MXGetLastError()))
    254 
    255 

MXNetError: vector::_M_range_insert

Environment

  • Operating system: Ubuntu 16.04
  • Python version: 3.7.4
  • GluonTS version: 0.3.3
  • MXNet version: mxnet-cu100 1.5.0

Hack
Save the trained model and load it again in an environment that has a non-CUDA build of MXNet (1.4.0).
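
The issue does not show how the model was saved and reloaded. A minimal sketch of one way to do it, assuming the `serialize` and `deserialize` methods of the GluonTS `Predictor` class are used, might look like the following. The helper names `save_predictor` and `load_predictor` and the directory name are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def save_predictor(predictor, model_dir):
    """Write a trained predictor to model_dir (run in the training environment)."""
    path = Path(model_dir)
    # serialize() is assumed to expect an existing directory
    path.mkdir(parents=True, exist_ok=True)
    predictor.serialize(path)
    return path


def load_predictor(model_dir):
    """Read the predictor back (run in the environment with MXNet 1.4.x)."""
    # Imported lazily so these helpers can be imported without GluonTS installed
    from gluonts.model.predictor import Predictor

    return Predictor.deserialize(Path(model_dir))
```

In the training environment one would call `save_predictor(predictor, "deepar_model")`, and in the second environment `predictor = load_predictor("deepar_model")` before calling `predictor.predict(test_data)`.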

InzamamAnwar added the bug label on Sep 20, 2019
@vafl
Contributor

vafl commented Sep 20, 2019

GluonTS currently supports only MXNet 1.4.1.

There is a bug in MXNet 1.5 (see this PR and the linked issue #245) that prevents us from upgrading. I think your issue is related to it. Could you try with MXNet 1.4.1? Thanks
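
To catch this kind of mismatch before training or predicting, a small version check can help. The sketch below is not part of GluonTS; `parse_version` and `is_supported_mxnet` are hypothetical helpers. Treating every 1.4.x release as acceptable is an assumption: the comment above names 1.4.1, and the reporter's workaround used 1.4.0.

```python
import re

# Assumption: any release in the 1.4 series is acceptable (see the note above)
SUPPORTED_SERIES = (1, 4)


def parse_version(version):
    """Return (major, minor, patch) from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component is treated as 0
    return tuple(int(part) if part else 0 for part in match.groups())


def is_supported_mxnet(version, series=SUPPORTED_SERIES):
    """True when the major and minor components match the supported series."""
    return parse_version(version)[:2] == series
```

With MXNet installed, the check could be run as `is_supported_mxnet(mxnet.__version__)`; the 1.5.0 build from this report would return `False`.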

@InzamamAnwar
Author

@vafl MXNet 1.4.1 with CUDA 10.0 works. Thanks!
