Help getting going #72

34r7h · 2016-02-10T04:36:25Z

Newb to ML and NNets. What are some ways I could use synaptic to find hidden relationships in standard json objects?

Made a S.O query but doesn't seem very well received 👂
I have a long array of objects that are created to track daily actions.

Example:

[
    {name:'workout', duration:'120', enjoy: true, time:1455063275, tags:['gym', 'weights']},
    {name:'lunch', duration:'45', enjoy: false, time:1455063275, tags:['salad', 'wine']},
    {name:'sleep', duration:'420', enjoy: true, time:1455063275, tags:['bed', 'romance']}
]

I'm having a hard time understanding how to use this data in a neural network to predict if future actions would be enjoyable. Additionally, I want to find hidden relationships between various activities.

Not sure how to get the rubber on the road. How do I feed the network my array of objects and read the results?

If anyone can answer this within the context of https://github.com/cazala/synaptic that would be great. It's also super if the answer is a straight machine learning lesson.

Thanks all!
https://stackoverflow.com/questions/35304800/hidden-relationships-with-javascript-machine-learning


thgie · 2016-03-25T16:02:43Z

Hey irthos. It's been quite a while since you asked this question, and I wanted to know whether you got any further. I studied the examples thoroughly, but I still can't wrap my head around how to apply synaptic to a real-world scenario. How do I know what to feed into the network, and what do I do with the output? Have you found any good sources so far?

cazala · 2016-03-25T18:50:00Z

Hi @irthos and @thgie. For neural nets to work you need to feed them a training set, consisting of inputs and their desired outputs. In your case this would be like:

("workout",  120, ['gym', 'weights']) => 'is enjoyable'
("lunch",  45, ['salad', 'wine']) => 'is not enjoyable'
("sleep",  420, ['bed', 'romance']) => 'is enjoyable'

But neural networks don't know what "workout" or 45 or ['salad', 'wine'] are. They only understand a single input, containing only values between 0 and 1, and it has to have a fixed size, so all the inputs have the same length. So you need to normalize your input/output data.

The 'name' input can be normalized into categories. Let's say you have 3 categories: "workout", "lunch" and "sleep", each can be represented with a flag bit. So we can use 3 bits:

"workout" => 0, 0, 1
"lunch" => 0, 1, 0
"sleep" => 1, 0, 0

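The flag-bit idea can be sketched in a few lines of plain JavaScript. `oneHot` and `NAMES` are hypothetical names made up for this illustration (Synaptic does not ship such a helper), and the category order is only chosen so that the bits line up with the table above.

```javascript
// Hypothetical helper: turn one category into a list of flag bits.
// The order of `categories` is arbitrary, but it has to stay fixed
// between training and prediction.
function oneHot(value, categories) {
  var index = categories.indexOf(value);
  if (index === -1) {
    throw new Error('Unknown category: ' + value);
  }
  return categories.map(function (_, i) {
    return i === index ? 1 : 0;
  });
}

// Ordered so the bits match the table above.
var NAMES = ['sleep', 'lunch', 'workout'];

console.log(oneHot('workout', NAMES)); // [ 0, 0, 1 ]
console.log(oneHot('lunch', NAMES));   // [ 0, 1, 0 ]
console.log(oneHot('sleep', NAMES));   // [ 1, 0, 0 ]
```

Throwing on an unknown category is a design choice for this sketch; it keeps a typo in the data from silently becoming an all-zero input.
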
Then, the 'duration' can be normalized to a value between 0 and 1 by choosing a maximum value and dividing by it. So let's say your maximum duration is 1000, then your inputs would look like this:

120 => 0.12
45 => 0.045
420 => 0.42
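
As a rough sketch, the division could look like this. `normalizeDuration` is a hypothetical name, and clamping values above the maximum is an assumption added here so the result never exceeds 1. The `Number()` call is there because the sample data stores durations as strings.

```javascript
// Hypothetical helper: scale a duration into the 0..1 range.
function normalizeDuration(duration, maxDuration) {
  var minutes = Number(duration); // the sample data uses strings like '120'
  if (!isFinite(minutes) || minutes < 0) {
    throw new Error('Invalid duration: ' + duration);
  }
  // Assumption: anything above the maximum is clamped to the maximum.
  return Math.min(minutes, maxDuration) / maxDuration;
}

console.log(normalizeDuration('120', 1000)); // 0.12
console.log(normalizeDuration('45', 1000));  // 0.045
console.log(normalizeDuration('420', 1000)); // 0.42
```
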

For the 'tags' you can use categories again, but combined. So let's say that you have 6 tag categories:

gym => 0,0,0,0,0,1
weights => 0,0,0,0,1,0
salad => 0,0,0,1,0,0
wine => 0,0,1,0,0,0
bed => 0,1,0,0,0,0
romance => 1,0,0,0,0,0

Then your inputs would look like this:

['gym', 'weights'] => 0,0,0,0,1,1
['salad', 'wine'] => 0,0,1,1,0,0
['bed', 'romance'] => 1,1,0,0,0,0
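
A minimal sketch of the combined encoding, assuming a fixed tag vocabulary. `multiHot` and `TAGS` are hypothetical names for this illustration, and the vocabulary order is chosen only to reproduce the bits above.

```javascript
// Hypothetical helper: set one flag bit per tag that is present.
function multiHot(tags, vocabulary) {
  return vocabulary.map(function (tag) {
    return tags.indexOf(tag) !== -1 ? 1 : 0;
  });
}

// Ordered so the bits match the table above.
var TAGS = ['romance', 'bed', 'wine', 'salad', 'weights', 'gym'];

console.log(multiHot(['gym', 'weights'], TAGS)); // [ 0, 0, 0, 0, 1, 1 ]
console.log(multiHot(['salad', 'wine'], TAGS));  // [ 0, 0, 1, 1, 0, 0 ]
console.log(multiHot(['bed', 'romance'], TAGS)); // [ 1, 1, 0, 0, 0, 0 ]
```

Note that in this sketch a tag missing from the vocabulary is silently ignored, so the vocabulary should be built from all the tags you expect to see.
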

And finally, 'enjoyable' is the easiest one. Since it is a boolean, it can be written as:

true => 1
false => 0

Putting all this together your training set would be like:

("workout",  120, ['gym', 'weights']) => 0,0,1 + 0.12 + 0,0,0,0,1,1 => [0,0,1,0.12,0,0,0,0,1,1]
("lunch",  45, ['salad', 'wine']) => 0,1,0 + 0.045 + 0,0,1,1,0,0 => [0,1,0,0.045,0,0,1,1,0,0]
("sleep",  420, ['bed', 'romance']) => 1,0,0 + 0.42 + 1,1,0,0,0,0 => [1,0,0,0.42,1,1,0,0,0,0]

And your outputs:

true => 1 => [1]
false => 0 => [0]
true => 1 => [1]
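
Putting the pieces together by hand gets tedious, so here is one way the whole conversion could be sketched in code. `flags`, `toSample`, and the constants are hypothetical names for this illustration, the vocabularies are ordered only to reproduce the vectors above, and the `time` field is ignored just as it is in the walkthrough.

```javascript
// Assumed fixed vocabularies and maximum duration for this data set.
var NAMES = ['sleep', 'lunch', 'workout'];
var TAGS = ['romance', 'bed', 'wine', 'salad', 'weights', 'gym'];
var MAX_DURATION = 1000;

// One flag bit per vocabulary item that appears in `present`.
function flags(vocabulary, present) {
  return vocabulary.map(function (item) {
    return present.indexOf(item) !== -1 ? 1 : 0;
  });
}

// Convert one activity object into an { input, output } pair.
function toSample(record) {
  var input = flags(NAMES, [record.name])
    .concat([Number(record.duration) / MAX_DURATION])
    .concat(flags(TAGS, record.tags));
  return { input: input, output: [record.enjoy ? 1 : 0] };
}

var records = [
  { name: 'workout', duration: '120', enjoy: true, time: 1455063275, tags: ['gym', 'weights'] },
  { name: 'lunch', duration: '45', enjoy: false, time: 1455063275, tags: ['salad', 'wine'] },
  { name: 'sleep', duration: '420', enjoy: true, time: 1455063275, tags: ['bed', 'romance'] }
];

var trainingSet = records.map(toSample);
console.log(JSON.stringify(trainingSet[0]));
// {"input":[0,0,1,0.12,0,0,0,0,1,1],"output":[1]}
```
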

Now, you need to translate this to synaptic. You need a network with 10 neurons in the input layer and 1 in the output layer (since that's the size of your inputs and outputs). You can choose from different architectures. If the sequence of the set matters for the training, you need to use a network with context memory, like LSTM. If the sequence is not important, then you should use a Perceptron, which is context unaware.

What I mean is like, if this:

("workout",  120, ['gym', 'weights']) => true 
("lunch",  45, ['salad', 'wine']) => false 
("sleep",  420, ['bed', 'romance']) => true

Should or should not be the same as this:

("lunch",  45, ['salad', 'wine']) => false 
("workout",  120, ['gym', 'weights']) => true 
("sleep",  420, ['bed', 'romance']) => true

If, let's say, having lunch before working out would change the output of the sleep set, then you need to use a network that remembers its previous activations (LSTM).

But to keep it simple let's say that the order of the set doesn't matter, and use a Perceptron.

var Architect = require('synaptic').Architect; // assumes the synaptic npm package is installed
var myNet = new Architect.Perceptron(10, 7, 1);

I created it with 10 inputs, 7 hidden neurons, and 1 output neuron. There is no straightforward way to work out the right number of hidden neurons. Usually you use a number in between the number of inputs and outputs.

Now, you feed your training set to the network's trainer:

var trainingSet = [
  {
    input: [0,0,1,0.12,0,0,0,0,1,1],
    output: [1]
  },
  {
    input: [0,1,0,0.045,0,0,1,1,0,0],
    output: [0]
  },
  {
    input: [1,0,0,0.42,1,1,0,0,0,0],
    output: [1]
  }
];

var trainingOptions = {
  rate: .1,
  iterations: 20000,
  error: .005,
}

myNet.trainer.train(trainingSet, trainingOptions);

The training options should be tweaked for each particular case. You can read more about the trainer options in the Trainer Documentation Page.
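
To read a result back, you activate the trained network with an input encoded the same way as the training set. Here is a small sketch: `predictEnjoyable` is a hypothetical helper, the 0.5 cutoff is an assumption, and `net` can be any object with an `activate(input)` method that returns an array, which is how a Synaptic network such as `myNet` behaves.

```javascript
// Hypothetical helper: run one encoded input through a trained network
// and turn the single output value into a yes/no answer.
function predictEnjoyable(net, input, threshold) {
  // Assumption: 0.5 is a reasonable default cutoff between "no" and "yes".
  var cutoff = threshold === undefined ? 0.5 : threshold;
  var score = net.activate(input)[0]; // the network has one output neuron
  return { score: score, enjoyable: score >= cutoff };
}

// Usage with the trained network from the example:
// var result = predictEnjoyable(myNet, [0,0,1,0.12,0,0,0,0,1,1]);
// console.log(result.score, result.enjoyable);
```

The raw score is kept next to the boolean so you can see how confident the network is, rather than only the rounded answer.
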

I hope this helps :)

thgie · 2016-03-25T20:05:55Z

@cazala: That is by far the most exhaustive answer ever. I will study it with rigor! Thank you very much.

cazala · 2016-03-25T21:16:35Z

haha, I'm glad to help (:

thgie · 2016-03-26T10:07:55Z

Hey @cazala, that was indeed a very helpful answer. The normalization process was missing from my understanding of the whole process. I feel almost enlightened now. Thank you very much :)

34r7h · 2016-04-05T23:01:30Z

Bravo, @cazala

I'll look forward to reading this a few more times but you've answered the question in the best possible way. Many thanks!

qrpike · 2016-05-18T06:04:21Z

@cazala You should link to this on the README somewhere. This is the part that is hard to wrap your mind around when starting out. Thank you!

UniqueFool · 2016-05-18T19:50:50Z

agreed, that's an awesome answer, and could well serve as an excellent introduction

cazala · 2016-05-23T13:46:13Z

@qrpike @UniqueFool I just created a wiki page with this example. @irthos I hope you don't mind that I used your data set in the wiki.

UniqueFool · 2016-05-23T14:06:07Z

that's looking very good, thanks for doing this - I didn't realize that, but it seems any of us could have done the same thing, i.e. create that wiki article and add to it.

Besides, are there any helpers/utilities in Synaptic that help with the normalization part ?

cazala · 2016-05-23T14:12:22Z

no, so far there are not, and I haven't seen many tools like that out there, other than underscore.normalize (which is a mixin, not an official underscore feature), but I haven't used it myself.

UniqueFool · 2016-05-23T14:19:30Z

just wondering, because scikit-learn (Python) and most other toolkits offer a library with normalization tools, which obviously helps with getting started - for instance, they even provide a diagram like this: http://scikit-learn.org/stable/tutorial/machine_learning_map/

chris-rcn · 2016-06-17T06:30:35Z

It is inaccurate to say, "They only understand a single input, containing only values between 0 and 1". A network will handle negative inputs just as well as positive inputs. And there is nothing special about the value "1". Values close to zero are preferred, so staying around -1 to 1 is not a bad idea. But -2 to 2 is fine, too.

The outputs are a different story. They are always going to be 0 to 1.

https://en.wikipedia.org/wiki/Sigmoid_function#/media/File:Logistic-curve.svg
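
The output range follows from the curve linked above. Assuming the output neuron uses the logistic function as its activation, any input, negative or positive, gets squashed into a value between 0 and 1. A quick sketch (`logistic` is just a name for this illustration):

```javascript
// The logistic (sigmoid) function from the linked curve.
function logistic(x) {
  return 1 / (1 + Math.exp(-x));
}

// Negative, zero and positive inputs all land between 0 and 1.
[-10, -2, 0, 2, 10].forEach(function (x) {
  console.log(x, logistic(x));
});
```

An input of 0 maps to exactly 0.5, and the curve flattens out toward 0 and 1 as the input moves further from zero.
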

skerit · 2016-06-24T17:04:58Z

The wiki page helps a lot, but I still struggle with some of the "trainer" options.

The explanation of error, for example: error: minimum error

That doesn't really tell me what it does. The same for learning rate and cost.
I'm not even sure if 'log' is working, because I've set it to 1000, with 1000 iterations, but I get no feedback.

Could those things be expanded upon?
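
For what it's worth, here is a rough sketch of what options like these generally mean in a training loop. This is the general idea only, not Synaptic's actual source, so the Trainer documentation remains the reference. In this sketch, `error` is a target threshold (training stops once the measured error drops to it or below), `iterations` is a hard cap, and `log` prints progress every N iterations. `meanSquaredError`, `trainUntil`, and `step` are hypothetical names.

```javascript
// One common cost function: the average squared difference between
// the desired output and what the network actually produced.
function meanSquaredError(target, output) {
  var sum = 0;
  for (var i = 0; i < target.length; i++) {
    sum += Math.pow(target[i] - output[i], 2);
  }
  return sum / target.length;
}

// Generic loop shape. `step` is a hypothetical callback that runs one
// pass over the training set and returns the current error.
function trainUntil(step, options) {
  var iterations = 0;
  var error = Infinity;
  while (iterations < options.iterations && error > options.error) {
    error = step();
    iterations++;
    if (options.log && iterations % options.log === 0) {
      console.log('iterations', iterations, 'error', error);
    }
  }
  return { iterations: iterations, error: error };
}

// Demo with a fake step whose error halves on every pass.
var fakeError = 1;
var summary = trainUntil(function () {
  fakeError = fakeError / 2;
  return fakeError;
}, { iterations: 20000, error: 0.005, log: 2 });
console.log(summary);
```

In the demo the loop ends long before 20000 iterations because the error threshold is reached first, which is the usual reason training stops early.
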

cooper09 · 2018-06-11T20:47:43Z

That's great for one or two lines of data, but what do you do with a CSV file that can be hundreds, even thousands of rows long? How do you "automate" the process?
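
One possible way to automate it is to build the category lists and the maximum duration from the data itself, then map every row through the same encoding. The sketch below assumes a hypothetical CSV layout (`name,duration,enjoy,tags`, with tags separated by `|`) and uses a naive comma split that does not handle quoted fields, so a real file would likely need a proper CSV parser. All function names are made up for this illustration.

```javascript
// Assumed CSV layout: name,duration,enjoy,tags (tags separated by "|").
var csv = [
  'name,duration,enjoy,tags',
  'workout,120,true,gym|weights',
  'lunch,45,false,salad|wine',
  'sleep,420,true,bed|romance'
].join('\n');

// Naive parser: skips the header row, no support for quoted commas.
function parseRows(text) {
  return text.trim().split(/\r?\n/).slice(1).map(function (line) {
    var cells = line.split(',');
    return {
      name: cells[0],
      duration: Number(cells[1]),
      enjoy: cells[2] === 'true',
      tags: cells[3] ? cells[3].split('|') : []
    };
  });
}

// Keep each distinct value once, in first-seen order.
function unique(values) {
  return values.filter(function (v, i) { return values.indexOf(v) === i; });
}

// Derive vocabularies and the maximum from the rows, then encode them all.
function buildTrainingSet(rows) {
  var names = unique(rows.map(function (r) { return r.name; }));
  var tags = unique([].concat.apply([], rows.map(function (r) { return r.tags; })));
  var maxDuration = Math.max.apply(null, rows.map(function (r) { return r.duration; }));
  return rows.map(function (r) {
    var input = names.map(function (n) { return n === r.name ? 1 : 0; })
      .concat([r.duration / maxDuration])
      .concat(tags.map(function (t) { return r.tags.indexOf(t) !== -1 ? 1 : 0; }));
    return { input: input, output: [r.enjoy ? 1 : 0] };
  });
}

var trainingSet = buildTrainingSet(parseRows(csv));
console.log(trainingSet.length, trainingSet[0].input.length); // 3 10
```

Because the vocabularies come from the data, the input size grows with the number of distinct names and tags, and the network's input layer has to be created with that same size.
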

lucasBertola mentioned this issue Apr 1, 2016

Basketball AI #81

Open

34r7h closed this as completed Apr 5, 2016

tymmesyde mentioned this issue Aug 30, 2021

LSTM is extremely slow ( 2 hours ) BrainJS/brain.js#470

Closed
