Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subset of matrix should always return the same type (matrix) #2344

Open
petres opened this issue Oct 30, 2021 · 10 comments
Open

subset of matrix should always return the same type (matrix) #2344

petres opened this issue Oct 30, 2021 · 10 comments

Comments

@petres
Copy link

petres commented Oct 30, 2021

Depending on the index the function subset returns a matrix or a scalar.
How can I set that the return type is always a matrix? Is there another function?

@josdejong
Copy link
Owner

Have you tried the configuration option predictable?

@josdejong
Copy link
Owner

I just tried it but I see predictable is not implemented there.

so, what we would like is:

math.config({ predictable: false })
math.subset(math.matrix([1,2,3,4]), math.index(1)) // number 2

math.config({ predictable: true })
math.subset(math.matrix([1,2,3,4]), math.index(1)) // Matrix/Array containing one value 2

Anyone interested in making subset adhere to the predictable config and always return a matrix in that case? Help would be welcome

@petres
Copy link
Author

petres commented Nov 5, 2021

subset should always return a Matrix. If the user wants to extract an element of that matrix he could use get. In the case he doesn't know about the returned dimensionality (because the param index is dynamic) she will always prefer to get the same type.

@josdejong
Copy link
Owner

What "the user" wants depends on the use case and which user you're talking about 😄 .

In a programmatic JavaScript environment it indeed makes sense to use two different, explicit methods get and subset, each giving back predictable results.

The method subset (and the whole of mathjs) was originally created though for end users using the expression parser to do mathematics. It focuses on making it easy to work with matrices, and was heavily inspired by how you can work with matrices in mathematical applications like Matlab, Octave, Maple, Mathematica.

So in the expression parser you do things like the following, where it is implicit whether you get back a value or a matrix:

A = [1, 2; 3, 4]
myvalue = A[1, 2]    # will return number 2
mysubset = A[2, 1:2] # will return the second row [[3, 4]]

You could argue to make an explicit API in the expression parser too, but this is how the mathematical world is used to work and it works like a charm.

Similarly, a function like sqrt will normally return a complex result instead of a number with value NaN when you enter a negative value, like sqrt(-4). That makes the function unpredictable. In some cases like mathematical end users this is desirable, in other cases like programmatically using the function you want a predictable output type. That's why the configuration option predictable was introduced, and I think this issue will be addressed by extending subset to adhere to this config option.

@dvd101x
Copy link
Collaborator

dvd101x commented Jul 19, 2023

By checking other references I think the logic is more in line of the type of the input instead of the size of the index so that if every range in index comes from a scalar then it yields a scalar.

A = [1, 2; 3, 4]
A[1, 2]      # yields  2
A[1, [2]]    # yields [2]
A[[1], 2]    # yields [2]
A[1:1, 2]    # yields [2]
A[1, 2:2]    # yields [2]
A[2, 1:2]    # yields [3, 4]
A[[2], 1:2]  # yields [[3, 4]]

I see also some differences on higher dimension arrays regarding if it comes from an array or comes from a range.

Some of that shows promising capabilities but there is still utility in checking the size of the index.

@gwhitney
Copy link
Collaborator

This issue came up in #2994, and I think there is a clear convention that could/should be adopted here that will eliminate any use of the "predictable" mechanism. That is to say, I would propose that indexing any dimension with a scalar eliminates that dimension, whereas indexing it with an array leaves that dimension in place, regardless of the length of the array. For example, if A =[1,2,3; 4,5,6] then (with 1-based indexing as in the parser):

A[2,3]  --> 6
A[[1,2],[2,3]] --> [2,3; 5,6]
A[[2],[2,3]] --> [[5, 6]]
A[2,[2,3]] --> [5,6]
A[[1,2],[2]] --> [2; 5]  // which is the same as [[2], [5]]
A[[1,2],2] --> [2,5]
A[[2], [2]] --> [[5]]
A[2, [2]] --> [5]
A[[2], 2] --> [5]

This way, the type of the result is exactly predictable just from the types of the indices; it's compatible with boolean indexing as recently implemented; and I believe it covers the original poster's use-case. Thoughts? The change from current behavior is that for example all three of the last expressions currently return just 5. Conceptually, it would not be the count of entries that determines the type of the result, but just the types of the indices.

There are some edge cases with empty vectors for indices:

A[1,[]] --> []  // Seems clear
A[[],2] --> []  // Seems OK, no rows.
A[[],[]] --> ??? // Seems like it should be 2-D, but it also seems like it should have no rows. Is there any distinction
// between a 1-D and 2-D empty matrix in mathjs? Right now, I don't think so... There is [[]], but that seems to have one (empty) row, rather than being 2D but having no rows...

but I think these things could be ironed out in some reasonable way, I don't think they detract from the basic proposal.

@josdejong
Copy link
Owner

Ow, that is a great idea! I love it. It definitely makes sense to me.

@dvd101x
Copy link
Collaborator

dvd101x commented Jul 27, 2023

It also makes sense to me.

As a reference for the edge cases, in numpy it would be like this:

A[[],[]] --> []

The same would apply for N dimensions:

A[[],2,3]   --> []
A[1,[],3]   --> []
A[1,2,[]]   --> []
A[1,[],[]]  --> []
A[[],2,[]]  --> []
A[[],[],3]  --> []
A[[],[],[]] --> []

@gwhitney
Copy link
Collaborator

gwhitney commented Oct 6, 2023

This is actually a duplicate of the much older #1484, but since the discussion here is more in-depth, I'm actually going to close the older one.

@DbCrWk
Copy link

DbCrWk commented Oct 6, 2023

This is actually a duplicate of the much older #1484, but since the discussion here is more in-depth, I'm actually going to close the older one.

I originally opened #1484, but didn’t get around to opening a PR. This resolution looks great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants