Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add generalized repmat function as repeat #3605

Merged
merged 3 commits into from
Aug 7, 2013
Merged

Conversation

johnmyleswhite
Copy link
Member

This pull request provides a function called repeat that generalizes both R's rep function and Matlab repmat function. It allows the user to repeat the inner entries of a tensor as many times as desired and to also repeat outer slices of the tensor as many times as desired.

The major question I have is where the function fld1 should be placed: if it's not generally useful, I can hide it more.

@ViralBShah
Copy link
Member

I believe there is an open discussion on meshgrid. In some cases, it is useful to have a RepeatMatrix type, which keeps the representation compact but overloads indexing operations and such. In some other cases, the cost of creating the repeated matrix (largely memory usage) may be small compared to overhead of accessing it. Clearly, one needs both.

One thought is that routines such as ndgrid, repmat, repeat, can create these RepeatedMatrix types, and you can then do something like [ a ] to force the conversion to a Matrix.

This is just food for thought - and should not hold up this PR in any way. This discussion should possibly be captured elsewhere.

@johnmyleswhite
Copy link
Member Author

I'm not an experienced Matlab user, but my understanding is that meshgrid is effectively a way to compute Cartesian products. Is that right? If meshgrid behaves like R's expand.grid, then it has quite different behavior from what repeat is trying to do. repeat is just an attempt to have short notation for many recurring patterns in statistics -- like saying that the first 1000 rows of a matrix belong to Condition 1 and the second belong to Condition 2. This came up a lot while demoing out Vega.jl, which is why I finally focused on finishing this function. repeat would solve that problem by letting you write things like: repeat([1, 2], inner = [1000]).

Producing a RepeatMatrix type would be nice in many cases. I suspect I'd often want to get a standard editable Array back and not a repeated version, but having both sounds really helpful.

@ViralBShah
Copy link
Member

Could you add repeat to the help as well?

indices_in[t] = fld1(indices_in[t], inner[t])
end
end
index_in = sub2ind(size_A, tuple(indices_in[1:ndims_A]...)...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The computation of indices seem to be very expensive, and such computation is repeated for each element ..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can avoid that computation for vectors and matrices easily by writing special case functions, which will probably handle most use cases. I couldn't figure out how to express this computation for higher-order tensors using direct linear indexing.

@johnmyleswhite
Copy link
Member Author

I'll write special case code to improve performance, then write up the function for help.

Extend ind2sub and sub2ind to accept a reusable array of indices
Expose rem1 and fld1, defined by analogy with mod1
@johnmyleswhite
Copy link
Member Author

@ViralBShah, I've added repeat to the help docs.

@lindahua, I've tried to make repeat faster by writing specialized ind2sub and sub2ind functions. We could try to write branches to handle special cases to make the code faster for matrices, but this revision of repeat is already only 2x slower than repmat on my machine while being substantially more general. With profiling I suspect we can make the gap smaller still.

@timholy
Copy link
Member

timholy commented Jul 24, 2013

I've been previously frustrated by our repmat's limitation to 2d, so I support merging this.

@johnmyleswhite
Copy link
Member Author

I'll try to speed this up later today. Are you happy with this interface, @timholy?

I've come to want fld1 and rem1 more over time. I can make a separate pull request for those.

@timholy
Copy link
Member

timholy commented Jul 24, 2013

Yes indeed. I strongly approve of its generality :-).

And to clarify, speed improvements would be nice, but I'd rate them as optional because I frankly don't expect major improvements. (Dahua's suggestions were much more important; this is trying to get the last bit of juice out something that has already been squeezed pretty hard.) I would say removing the overly-restrictive type-qualifier is more important, and the optimizations are optional. Merge at will.

@johnmyleswhite
Copy link
Member Author

Anybody else have comments on this? @StefanKarpinski, @JeffBezanson, @ViralBShah?

@johnmyleswhite
Copy link
Member Author

I'm going to merge this in a bit unless someone says otherwise since it seems to not step on most people's toes and helps the rest of us a lot.

@StefanKarpinski
Copy link
Member

Go for it, although I may tinker with the fld1 function.

johnmyleswhite added a commit that referenced this pull request Aug 7, 2013
Add generalized repmat function as repeat
@johnmyleswhite johnmyleswhite merged commit b8c462d into master Aug 7, 2013
@johnmyleswhite johnmyleswhite deleted the jmw/repeater branch August 7, 2013 15:48
@bjarthur
Copy link
Contributor

given repeat, should repmat be deprecated? the latter cannot handle more than 2 dimensions, and does not have the flexibility provided by the former's inner argument.

KristofferC pushed a commit that referenced this pull request Sep 5, 2023
Stdlib: Pkg
URL: https://github.com/JuliaLang/Pkg.jl.git
Stdlib branch: master
Julia branch: master
Old commit: 047734e4c
New commit: f570abd39
Julia version: 1.11.0-DEV
Pkg version: 1.11.0
Bump invoked by: @KristofferC
Powered by:
[BumpStdlibs.jl](https://github.com/JuliaLang/BumpStdlibs.jl)

Diff:
JuliaLang/Pkg.jl@047734e...f570abd

```
$ git log --oneline 047734e4c..f570abd39
f570abd39 tweak how Pkg is loaded for precompiling when testing (#3606)
d3bd38b90 sort compat entries in `pkg> compat` (#3605)
5e07cfed0 Ensure that `string(::VersionSpec)` is compatible with `semver_spec()` (#3580)
6bed7c41a Clarify handling of LOAD_PATH in docstring and REPL help (#3589)
5261b3be7 tweak test dep docs (#3574)
72b430d50 status: expand 2 symbol upgrade note to highlight *may* be upgradable (#3576)
b32db473d precompile: show ext parent in output report (#3603)
```

Co-authored-by: Dilum Aluthge <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants