LSTM is extremely slow ( 2 hours ) #470

Closed
lynxionxs opened this issue Oct 20, 2019 · 11 comments

Comments

@lynxionxs

Hi. I'm training an LSTM network, and it takes 2 hours to train on 1,000 training examples with only 2,000 iterations. Why is it so slow?

// learns if string is like a date

// get training data
const trainingData = [
    {"input":"33 minutes ago","output":"yes"},
    {"input":"20 hours ago","output":"yes"},
    {"input":"May 7 at 13:42 AM","output":"yes"},
    {"input":"Feb 21 at 8:43 AM","output":"yes"},
    {"input":"Jul 22, 2012 ","output":"yes"},
    {"input":"Apr 14, 2018 ","output":"yes"}
    // 1000 total...
]

const network = new brain.recurrent.LSTM();

// create configuration for training
const config = {
    iterations: 2000,
    log: true,
    logPeriod: 200,
    layers: [10],
    log(detail) {
        console.log(detail);
    }
};

network.train(trainingData, config);

const output = network.run('Apr 6, 2014');

console.log(`Is like a date: ${(output == 'yes') ? 'YES' : 'NO' }`);
@lynxionxs changed the title from "LSTM is extremely slow" to "LSTM is extremely slow ( 2 hours )" on Oct 20, 2019
@tymmesyde

Try replacing your outputs with 0 or 1 instead of 'no' / 'yes', since you only have two types of output.
It might speed up the process a little.

@tymmesyde

I didn't pay enough attention to your trainingSet in my previous answer, but I see that you are trying to train your network on dates in the form of strings.
I suggest taking a different approach and normalizing your data:

Look for the features your data have in common and use those as inputs in your trainingSet instead of the literal string.
Example: hasSpaces, hasNumber, hasStrings, hasCommas, hasYear, ...
(I suggest you add more of those)

First write a short script that goes through all your data and checks each entry against the criteria above.
(e.g. if hasSpaces is true, then your first input value should be 1, and so on)

Then fill your trainingSet like this:

[
   { input: [1, 0, 1, 1, 0], output: [0] },
   { input: [1, 1, 1, 1, 1], output: [1] }
]

And when you want to use run, go through the same process.
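
The approach described above could be sketched roughly as follows. This is only an illustration: `extractFeatures` and `toTrainingSet` are hypothetical helper names, the regular expressions are assumptions about what each feature should mean (for instance, `hasYear` is taken to mean a four-digit year between 1900 and 2099), and the 'no' example is made up for the demonstration.

```javascript
// Hypothetical helper: turns a raw string into five 0/1 features.
// Feature names follow the suggestion above; the regexes are assumptions.
function extractFeatures(text) {
  return [
    /\s/.test(text) ? 1 : 0,               // hasSpaces
    /\d/.test(text) ? 1 : 0,               // hasNumber
    /[A-Za-z]/.test(text) ? 1 : 0,         // hasStrings (any letters)
    /,/.test(text) ? 1 : 0,                // hasCommas
    /\b(19|20)\d{2}\b/.test(text) ? 1 : 0  // hasYear (1900-2099)
  ];
}

// Hypothetical helper: converts { input: string, output: 'yes' | 'no' }
// rows into numeric rows with 0/1 outputs.
function toTrainingSet(rows) {
  return rows.map(({ input, output }) => ({
    input: extractFeatures(input),
    output: [output === 'yes' ? 1 : 0]
  }));
}

const dateTrainingSet = toTrainingSet([
  { input: 'Apr 14, 2018 ', output: 'yes' },
  { input: 'hello world', output: 'no' } // made-up negative example
]);

console.log(dateTrainingSet);
```

The same `extractFeatures` call would be applied to any string before passing it to `run`, so that training and prediction see inputs in the same shape. Fixed-length numeric arrays like these are the kind of input a feed-forward net such as `brain.NeuralNetwork` takes, so this sketch assumes that network type rather than the LSTM.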

@lynxionxs
Author

@tymmesyde That makes sense now. Thanks

@Shubbair

@tymmesyde thank you so much

@ninjaferrari90

The training takes too long, around 14 hours or even more.
My data set has different outputs for different input strings.

Is there a way I can reduce my training time? I am already storing the trained results in JSON and using that when retrieving the output.

Using Node with brain.js version 2.0.0-alpha.11

const trainingData = [
    {"input":"How are you","output":"very well"},
    {"input":"How have you been","output":"very well"},
    {"input":"welcome to new york","output":"thanks"},
    {"input":"welcome to our city","output":"thanks"},
    {"input":"welcome to usa","output":"thanks"},
    {"input":"Lets catchup today","output":"ofcourse"},
    {"input":"Lets meet today","output":"ofcourse"}
    // ...... many more....
    // large data set with different responses for different scenarios..
]

const network = new brain.recurrent.LSTM();

// create configuration for training
const config = {
    iterations: 10000,
    log: true,
};

network.train(trainingData, config);

@Paulsy10

Paulsy10 commented Aug 30, 2021

I am having the exact same problem as above. Can someone please help me?

  1. I am getting different outputs for different inputs... and the bot is so... unintelligent.
  2. I only have about 4 text inputs and outputs to train on, and it takes about 30 minutes.

@tymmesyde

@ninjaferrari90 @Paulsy10, this is not a bot. It is a neural net, a component you can use to build your bot.
Read my answer above: you cannot feed plain text data to the neural net; you need to normalize your dataset first.

@Paulsy10

I am still a bit confused on what you mean by "normalizing"... Do you mean to use numbers instead of strings?

@tymmesyde

> I am still a bit confused on what you mean by "normalizing"... Do you mean to use numbers instead of strings?

Read the issue and my comment in response to it; this is explained there in detail.

@Paulsy10

> > I am still a bit confused on what you mean by "normalizing"... Do you mean to use numbers instead of strings?
>
> Read the issue and my comment in response to this issue, this is explained in details.

I have a different situation: I am not using dates, I am just training it to chat. I don't see any similarities in my text.

@tymmesyde

tymmesyde commented Aug 30, 2021

This is the same issue: you need to normalize your dataset.
The net isn't meant to be fed text data. However, you can give it numeric values ranging from 0 to 1.
You need to find a way to translate those text values into values the net can read.

In your case, if you just want to make a simple answer bot, you only need to create a dictionary of strings like so:

Questions:

How are you = 0
How have you been = 1
...

Answers:

very well = 0
thanks = 1
...

Then normalize your data by using your dictionary to translate these values to a [0,1] range.

{
    input: [0], // How are you
    output: [0] // very well
},
{
    input: [1], // How have you been
    output: [0] // very well
}
...
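
One possible way to build such a dictionary and scale its indices into the [0,1] range is sketched below. The helper names (`buildDictionary`, `normalize`, `denormalize`) are hypothetical, and dividing each index by `size - 1` is just one assumed scaling choice, not something prescribed in this thread.

```javascript
// Hypothetical helper: maps each distinct string to an index,
// in order of first appearance.
function buildDictionary(strings) {
  const dict = new Map();
  for (const s of strings) {
    if (!dict.has(s)) dict.set(s, dict.size);
  }
  return dict;
}

// Hypothetical helper: scales a dictionary index into [0, 1].
function normalize(index, size) {
  return size > 1 ? index / (size - 1) : 0;
}

// Hypothetical helper: maps a value in [0, 1] back to the nearest
// dictionary entry.
function denormalize(value, dict) {
  const index = Math.round(value * (dict.size - 1));
  return [...dict.keys()][index];
}

const pairs = [
  { input: 'How are you', output: 'very well' },
  { input: 'How have you been', output: 'very well' },
  { input: 'welcome to new york', output: 'thanks' }
];

const questions = buildDictionary(pairs.map(p => p.input));
const answers = buildDictionary(pairs.map(p => p.output));

const chatTrainingSet = pairs.map(p => ({
  input: [normalize(questions.get(p.input), questions.size)],
  output: [normalize(answers.get(p.output), answers.size)]
}));

console.log(chatTrainingSet);
console.log(denormalize(0.9, answers));
```

Note that a lookup like this only covers the exact strings present in the dictionary; a sentence that was never added has no index and would need to be handled separately.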

More on the matter here:
https://github.com/cazala/synaptic/wiki/Normalization-101
https://github.com/adadgio/neural-data-normalizer
cazala/synaptic#72

Hope it helps.

5 participants