On Windows I got error code 0xC0000374, and on Linux I got the error message "corrupted double-linked list"; both appear to be memory corruption problems. I am training on CPU. The code is as follows:
try (Model model = Model.newInstance("time-series")) {
    NDManager nd = model.getNDManager();
    NDArray inputs = nd.create(new float[][] {
            {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f},
            {2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f},
            {3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f},
            {4.0f, 5.0f, 6.0f, 7.0f, 8.0f, 9.0f},
            {5.0f, 6.0f, 7.0f, 8.0f, 9.0f, 10.0f},
            {6.0f, 7.0f, 8.0f, 9.0f, 10.0f, 11.0f},
            {7.0f, 8.0f, 9.0f, 10.0f, 11.0f, 12.0f},
            {8.0f, 9.0f, 10.0f, 11.0f, 12.0f, 13.0f},
            {9.0f, 10.0f, 11.0f, 12.0f, 13.0f, 14.0f},
            {10.0f, 11.0f, 12.0f, 13.0f, 14.0f, 15.0f}
    });
    Shape inputShape = inputs.getShape();
    long cnt = inputShape.get(0);   // number of series: 10
    long dur = inputShape.get(1);   // time steps per series: 6
    long predDur = 2L;
    long trainDur = 3L;
    long start = dur - trainDur - predDur - 1;
    // Encoder sees trainDur steps, decoder predicts the following predDur steps.
    NDArray encoderInputs = inputs.get(":," + start + ":" + (start + trainDur))
            .reshape(new Shape(cnt, trainDur, 1L));
    NDArray decoderInputs = inputs.get(":," + (start + trainDur) + ":" + (start + trainDur + predDur))
            .reshape(new Shape(cnt, predDur, 1L));
    int batchSize = 1;
    ArrayDataset trainingDataset = new ArrayDataset.Builder()
            .setData(encoderInputs)
            .optLabels(decoderInputs)
            .setSampling(batchSize, false)
            .build();
    Encoder encoder = new SimpleTextEncoder(LSTM.builder()
            .setNumStackedLayers(1)
            .setStateSize(2)
            .build());
    Decoder decoder = new SimpleTextDecoder(LSTM.builder()
            .setNumStackedLayers(1)
            .setStateSize(2)
            .build(), 1);
    EncoderDecoder net = new EncoderDecoder(encoder, decoder);
    model.setBlock(net);
    Loss loss = Loss.l1Loss();
    Tracker tracker = Tracker.fixed(0.001f);
    Optimizer optimizer = Optimizer.sgd().setLearningRateTracker(tracker).build();
    TrainingListener[] listeners = TrainingListener.Defaults.logging();
    TrainingConfig config = new DefaultTrainingConfig(loss)
            .optOptimizer(optimizer)
            .addTrainingListeners(listeners);
    int numEpochs = 10;
    try (Trainer trainer = model.newTrainer(config)) {
        trainer.initialize(encoderInputs.getShape(), decoderInputs.getShape());
        // EasyTrain.fit already iterates over epochs internally,
        // so it should be called once rather than inside another epoch loop.
        EasyTrain.fit(trainer, numEpochs, trainingDataset, null);
    }
}
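For reference, the window indices that the slicing above computes can be checked standalone (plain Java only, no DJL; the variable names mirror the snippet above):

```java
// Verifies the sliding-window slice arithmetic used to build the
// encoder and decoder inputs: encoder window [start, start + trainDur),
// decoder window [start + trainDur, start + trainDur + predDur).
public class WindowIndices {
    public static void main(String[] args) {
        long dur = 6;       // time steps per series
        long trainDur = 3;  // encoder (history) window length
        long predDur = 2;   // decoder (prediction) window length

        long start = dur - trainDur - predDur - 1;   // 6 - 3 - 2 - 1 = 0
        long encEnd = start + trainDur;              // encoder columns [0, 3)
        long decEnd = start + trainDur + predDur;    // decoder columns [3, 5)

        System.out.println("encoder cols [" + start + ", " + encEnd + ")");
        System.out.println("decoder cols [" + encEnd + ", " + decEnd + ")");
        // Note: decEnd == 5 < dur == 6, so the "- 1" in start means the
        // last column of every series is never used for training.
    }
}
```

Note that because of the `- 1` in `start`, the decoder window ends one step before the end of each series, so the final column of `inputs` is unused.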
The library versions are djl-0.9.0 and mxnet-1.7.0, and the crash always seems to occur at EasyTrain.java line 83, collector.backward(lossValue). Why does the backward pass fail, and how can I fix it? Thanks!