This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Asking for advices about tuning wavenet #3558

Closed
shuokay opened this issue Oct 18, 2016 · 5 comments

Comments

@shuokay
Contributor

shuokay commented Oct 18, 2016

I am trying to reproduce the results of WaveNet; the code is at https://github.com/shuokay/mxnet-wavenet
I have worked on this for several days, but the net still does not converge. I would appreciate any advice on tuning the training process.

At present, I suspect three things:

  1. The MXNet convolution op can't pad only one side, so I pad the data with Concat.
  2. The SoftmaxOutput. I am not familiar with multi-output, and I think there may be something wrong in the current implementation.
  3. The dilate bug. I modified the convolution code according to "What will mxnet convolution do if the dilate shape is greater than the input shape" #3479. Can someone confirm the fix?
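
For readers following point 1, here is a minimal plain-Python sketch of what one-sided (left-only) padding is meant to achieve in a causal dilated convolution. It is not MXNet code and not the implementation from the linked repository; the function name `causal_dilated_conv1d` and its arguments are hypothetical and chosen only for illustration.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """Causal 1-D convolution over a list of samples.

    output[t] depends only on x[t], x[t - dilation], x[t - 2 * dilation], ...
    kernel[-1] weights the current sample, kernel[0] the oldest one.
    """
    # Pad with zeros on the left only, so the output has the same length
    # as the input and never looks at future samples.
    pad = (len(kernel) - 1) * dilation
    padded = [0.0] * pad + list(x)

    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            acc += w * padded[t + k * dilation]
        out.append(acc)
    return out


# Hypothetical usage: a 2-tap kernel with dilation 2.
result = causal_dilated_conv1d([1, 2, 3, 4], [1, 1], dilation=2)
```

A quick way to check a padding scheme like the Concat approach is to change the last input sample and confirm that every earlier output stays the same.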
@piiswrong
Contributor

Try to simplify the network to only a few layers and see if you can get it to converge on synthetic data
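
One possible source of synthetic data for such a test is a quantized sine wave, sketched below in plain Python. This is an assumption about what "synthetic data" could look like, not something specified in the thread: the helper names `mu_law_encode` and `synthetic_sine`, the period, and the amplitude are all hypothetical. The 256-level mu-law quantization follows the scheme described in the WaveNet paper.

```python
import math


def mu_law_encode(x, mu=255):
    """Map a sample in [-1, 1] to an integer class in [0, mu] via mu-law companding."""
    x = max(-1.0, min(1.0, x))
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Shift from [-1, 1] to [0, mu] and round to the nearest integer class.
    return int((compressed + 1.0) / 2.0 * mu + 0.5)


def synthetic_sine(num_samples=1024, period=32, amplitude=0.8):
    """Return (inputs, targets) where targets are the inputs shifted by one step."""
    wave = [amplitude * math.sin(2.0 * math.pi * t / period)
            for t in range(num_samples)]
    codes = [mu_law_encode(s) for s in wave]
    # Next-sample prediction: the target at step t is the input at step t + 1.
    return codes[:-1], codes[1:]


inputs, targets = synthetic_sine(num_samples=64)
```

A periodic signal like this is easy to predict, so a small network that cannot fit it probably has a bug in the model or the training loop rather than a tuning problem.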

@shuokay
Contributor Author

shuokay commented Oct 19, 2016

Apologies to everyone.
The net does in fact converge. The problem was the evaluation metric.
mx.metric.MAE doesn't match this net. The pred holds values between 0 and 1 and has shape (batch_size, channels, height, width). When computing the EvalMetric update, the right procedure is to first take the argmax of pred along axis=1 and then compute the MAE (or MSE) against the label,
so I defined a new EvalMetric:

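import numpy
import mxnet as mx
from mxnet.metric import check_label_shapes


class MYMAE(mx.metric.EvalMetric):
    """Mean absolute error between labels and the argmax of pred over axis 1."""

    def __init__(self):
        super(MYMAE, self).__init__('mymae')

    def update(self, labels, preds):
        check_label_shapes(labels, preds)
        for label, pred in zip(labels, preds):
            label = label.asnumpy()
            pred = pred.asnumpy()
            if len(label.shape) == 1:
                label = label.reshape(label.shape[0], 1)
            # pred holds per-class probabilities; reduce it to class indices first
            predicted = numpy.argmax(pred, axis=1).reshape(label.shape)
            self.sum_metric += numpy.abs(label - predicted).mean()
            # Count one instance per batch, since sum_metric adds a per-batch mean
            self.num_inst += 1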
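
To illustrate the argmax-then-MAE step without MXNet, here is a small plain-Python sketch on made-up numbers. The helper name `argmax_mae` and the sample probabilities are hypothetical. It handles only the simple (batch_size, channels) case, not the four-dimensional pred described above.

```python
def argmax_mae(labels, probs):
    """Mean absolute error between integer labels and the argmax of each probability row."""
    predicted = [max(range(len(row)), key=row.__getitem__) for row in probs]
    return sum(abs(l - p) for l, p in zip(labels, predicted)) / len(labels)


# Two samples, three classes each (made-up probabilities).
probs = [[0.1, 0.7, 0.2],
         [0.05, 0.15, 0.8]]

exact = argmax_mae([1, 2], probs)    # both predictions match the labels
off_by_one = argmax_mae([0, 2], probs)  # first prediction is one class off
```

Comparing class-index labels with the raw probabilities instead would measure the distance between an index such as 2 and values in [0, 1], which says little about whether the net has converged.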

@zihaolucky
Member

@shuokay Great~

@zihaolucky
Member

@shuokay How does it work? Does it work well?

@yajiedesign
Contributor

This issue is closed due to lack of activity in the last 90 days. Feel free to reopen if this is still an active issue. Thanks!

This issue was closed.