Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce functionality #103

Merged
merged 12 commits into from
Oct 24, 2013
Merged

Reduce functionality #103

merged 12 commits into from
Oct 24, 2013

Conversation

guillermobox
Copy link
Contributor

Hello!

I have implemented the reduce function in the collection file.

This function reduces a known n dimensional matrix into a n-1 dimensional matrix, aplying a given callback to each element in the desired dimension to reduce. A callback should take two arguments, and return a single argument, like for example:

function sum(x, y){ return x+y; }

Now reducing a vector with this callback should go from: [1,2,3,4] to 10. The function has been tested with the 'max' procedure, look in the code to see how easy is now to apply the 'max' function to a matrix.

In theory, every "reducing" function (statistical functions, mostly) can be implemented using this "reduce" function, as it's commonly done in Scheme/Haskell/Python/Hadoop and some other functional-related programming languages.

Also, more testing in the reduce function are needed, specially empty cases and things like that, I know they are not implemented yet, but this is a proof of concept.

@josdejong
Copy link
Owner

Very nice, I really like your idea behind this reduce function.

One minor thing in the _reduce function, you test for a number like if( typeof mat[0] === "number" ){, but I think this variable can contain a couple of different types: number, string, complex, boolean. So maybe better to test whether !isArray(math[0])? (and testing for a number is safer with util.number.isNumber).

And shouldn't the reduce function be like this, so on Array input it returns an Array, and on Matrix input it returns a Matrix?

exports.reduce = function reduce (mat, dim, callback) {
  if (mat instanceof Matrix) {
    return new Matrix(_reduce(mat.valueOf(), dim, callback));
  }
  else {
    return _reduce(mat, dim, callback);
  }
}

@guillermobox
Copy link
Contributor Author

Yes, both your corrections are completely right. I'll make those changes and apply this code to both the mean and the min functions. I just wanted to know your opinion before working more on the code.

I'll change those two things as soon as I can.

@josdejong
Copy link
Owner

I like it :)

No need to hurry, take your time.

@guillermobox
Copy link
Contributor Author

Ok, that's it.

@josdejong
Copy link
Owner

That's fast, great work :)

From the tests it looks like the functions max, min, mean use a one-based dimension and reduce uses a zero-based dimension. shouldn't everything consistently use zero-based dimesions everywhere? (see #23, #66)

@guillermobox
Copy link
Contributor Author

Sure, I just used one-based indexes in max/min/mean as a silly reflex of another code I'm working on (has one based indexes in the external functions but works with zero-based in internal calls).

I'll change that and add more tests.

@josdejong
Copy link
Owner

Me the same - I did a lot of work in Matlab, where everything is one-based (as in most math applications)... See discussion #66. Now the math.js library is zero-based, while the expression parser is one-based. We may need a kind of one-based-to-zero-based mapping in the expression parser for these functions max, min, mean, concat.

@guillermobox
Copy link
Contributor Author

I have finished the changes and added more tests for the statistical functions, specially the 'mean' function, now is fully tested with a 4x3x2 matrix calculating the mean across the three dimensions. I think everything works fine. With this new reduce function, coupled with the already existing map function, a lot of matrix related operations can be simplified.

Regarding the parser I don't know how to change that at all (I have only written one or two parsers in my life, and mosly hello-world-like parsers, for educational purposes). So I guess that is work for other person.

josdejong added a commit that referenced this pull request Oct 24, 2013
@josdejong josdejong merged commit 13b7e6a into josdejong:develop Oct 24, 2013
@josdejong
Copy link
Owner

Super! Thanks, looks good.

Yes I will take care of changing the expression parser.

@guillermobox
Copy link
Contributor Author

Another thing related to this...

I couldn't find a function to get the list of dimensions of a matrix, so I have a small snippet of 3 lines in the 'mean' function to do it. I think it should be a good idea that apart from length (which returns the size of the first dimension), anothe function should return all the lengths, like from a 4x3x2 matrix, it should return [4,3,2], and for a single array of 4 elements maybe [4] or 4. Programming this is easy, only a recursive function that populates an array.

Anyway there are two problems here:

  1. I don't know where to write that particular function to be coherent with your actual model.
  2. There is a problem if the user creates a non-regular matrix, with more elements in one array than in the other, like this: [ [1,2,3], [4,5] ]. This is valid as a nested array, but what is the length of it? [2,3] or [2,2] ? It depends on which array you measure. In my opinion, an object of type Matrix should not be able to have that behaviour (it should always be a regular matrix), so my proposal of a function to calculate all the dimension lengths should have no problem.

I can open an issue if you want, but I'm not sure if this function already exists or not.

@josdejong
Copy link
Owner

Yes, this function already exists, it is called size. Implemented low level
in lib/util/array.js (only accepting Array), and also available as higher
level math.size (accepting any data type). The function validates the size
of the array and throws an error if you feed it something like [ [1,2,3],
[4,5] ].

Furthermore, the Matrix data type has a function Matrix.size() returning
its own size. This size is cached, so requesting the size does not cause
re-calculating and validating the matrix again and again.

On Thu, Oct 24, 2013 at 1:49 PM, guillermobox [email protected]:

Another thing related to this...

I couldn't find a function to get the list of dimensions of a matrix, so I
have a small snippet of 3 lines in the 'mean' function to do it. I think it
should be a good idea that apart from length (which returns the size of the
first dimension), anothe function should return all the lengths, like from
a 4x3x2 matrix, it should return [4,3,2], and for a single array of 4
elements maybe [4] or 4. Programming this is easy, only a recursive
function that populates an array.

Anyway there are two problems here:

  1. I don't know where to write that particular function to be coherent
    with your actual model.
  2. There is a problem if the user creates a non-regular matrix, with
    more elements in one array than in the other, like this: [ [1,2,3], [4,5]
    ]. This is valid as a nested array, but what is the length of it? [2,3] or
    [2,2] ? It depends on which array you measure. In my opinion, an object of
    type Matrix should not be able to have that behaviour (it should always be
    a regular matrix), so my proposal of a function to calculate all the
    dimension lengths should have no problem.

I can open an issue if you want, but I'm not sure if this function already
exists or not.


Reply to this email directly or view it on GitHubhttps://github.com//pull/103#issuecomment-26985613
.

@guillermobox
Copy link
Contributor Author

Oh ok, I changed it to use the new size function, this way the code is way more clear.

Can you re-open this pull request or do I have to make a new one?

@josdejong
Copy link
Owner

You will have to make a new pull request, you can only merge once

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants