Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement rand(::Range{BigInt}) #9122

Merged
merged 1 commit into from
Dec 2, 2014
Merged

Implement rand(::Range{BigInt}) #9122

merged 1 commit into from
Dec 2, 2014

Conversation

rfourquet
Copy link
Member

This is a rebase of #8255, with minor modifications.
The travis test doesn't pass on linux because functions only available in GMP 6 are used, but isn't version 6 a requirement now? (cf. #7957).

@rfourquet rfourquet added the randomness Random number generation and the Random stdlib label Nov 23, 2014
@ivarne
Copy link
Member

ivarne commented Nov 23, 2014

No, currently we require GMP (>= 5.0) in the README. Not sure about the situation in other parts of Julia, but from the discussion in #7957 it seems likely that we can bump the requirement. Packagers on Linux systems seem to have problems with updating dependencies, so we should probably be carefull.

@rfourquet
Copy link
Member Author

OK, so GMP 6 may not be a requirement for Julia 0.4 in the end? in this case I would see if I can implement this using only GMP 5.

@nalimilan
Copy link
Member

If you need it I guess we'll deal with that dependency when building packages, but if that's not really needed it would be easier for packagers not to use it. :-)

@ViralBShah
Copy link
Member

I think that given julia 0.4 is going to release after a few months, we can go with GMP 6 as a dependency. All the distros will have it by then.

@ivarne ivarne changed the title Implement rand(::Range{BigInt}) WIP: Implement rand(::Range{BigInt}) Nov 23, 2014
@ivarne
Copy link
Member

ivarne commented Nov 23, 2014

I changed the title to WIP, so that it is clear that it can't be merged before we update GMP requirement.

@ViralBShah
Copy link
Member

@nalimilan Would it be crazy to require GMP 6 for julia 0.4?

@nalimilan
Copy link
Member

Absolutely not, it's already in the upcoming Fedora 21 (but generally speaking, raising requirements should only happen with relatively strong reasons).

@rfourquet
Copy link
Member Author

So there was a solution in GMP5, which is no less efficient despite having to copy data instead of generating it directly into the BigInt.
Removing WIP as there are no more GMP requirements.

@rfourquet rfourquet changed the title WIP: Implement rand(::Range{BigInt}) Implement rand(::Range{BigInt}) Nov 24, 2014
@rfourquet
Copy link
Member Author

Here is the link to the doc of mpz_import, if someone wanted to check the wrapping code.
The Travis failure seems to be unrelated (it happens in parallel.jl/sharedarray.jl).

@ViralBShah
Copy link
Member

@rfourquet Can you squash these commits?

@ViralBShah
Copy link
Member

Also, in the squashed commits, can you give a slightly more detailed explanation of the commits?

@tkelman
Copy link
Contributor

tkelman commented Nov 29, 2014

opened #9176 for the travis failure - it actually looks like it's in bitarray

@rfourquet rfourquet force-pushed the rf/rand-bigrange branch 2 times, most recently from 3f19fc1 to beff573 Compare November 30, 2014 04:20
@rfourquet
Copy link
Member Author

This is squashed now, with some explanations added in the commit.

@ViralBShah
Copy link
Member

A few points as I look through this and other related code:

  1. Let's have the GMP 6 code in there, but with a version check. If there is no easy way to check the GMP version, let's have it in there, but commented out. Is there a significant reason to use the new API - performance or memory?
  2. The names could be a little more consistent so that the relationships are clearer. How about renaming RangeGenerator to AbstractRangeGenerator, RandIntGen to RangeGeneratorInt, and RandIntGenBigInt to RangeGeneratorBigInt? Could inrange then just be rand_int_gen?
  3. I notice that the range length is an unsigned integer, and we specialize for both UInt32 and UInt64. What if we just use Int, which is essentially what we use in most places? I don't think we need to have every possible combination and the full range - if we can keep things simpler, unless there is a good reason for it.

@rfourquet
Copy link
Member Author

  1. There is a way to get the GMP version, I didn't thought of that. For small ranges, I couldn't detect significative speed differences between the two versions, but on bigger ranges the GMP5 version is visibly slower (e.g. 30% slower for rand(1:big(2)^10000)).
  2. I mostly agree with your changes, but I don't feel the need (yet) of "Abstract" in AbstractRangeGenerator. I was unsure for the name inrange, but it reads nicely in rand(mt, inrange(r)), where r is a range. But when squashing I wanted to change it to rangegen. I would prefer that the name has "range" in it, unlike rand_int_gen.
  3. I started a refactoring to simplify this (a bit similar to rand of AbstractArray #8649). But there was opposition to removing the full range. The UInt32 version could maybe be removed if that doesn't make things slower for small types.

@rfourquet rfourquet force-pushed the rf/rand-bigrange branch 2 times, most recently from 622c530 to e209872 Compare December 1, 2014 14:28
@rfourquet
Copy link
Member Author

Updated:

  1. the code for the two versions of GMP is there. I named a constant GMP_VERSION in gmp.jl, but if I call it simply VERSION , this message is issued during compilation "Warning: imported binding for VERSION overwritten in module GMP"... aren't they in a different namespace?
  2. I kept the name RangeGenerator, and used also RangeGenerator as the constructor (instead of inrange).

@ViralBShah
Copy link
Member

It seems that the Windows 64-bit has failed. The worker running the random test is the one that never came back - but there are no error messages.

https://ci.appveyor.com/project/StefanKarpinski/julia/build/1.0.133/job/yhkll194gvedibs9

@tkelman How do we figure this one out? Can you try this branch?

@ViralBShah
Copy link
Member

@rfourquet The PR looks good otherwise.

@tkelman
Copy link
Contributor

tkelman commented Dec 1, 2014

@ViralBShah sure I'm building this branch now, will let you know what I see

@tkelman
Copy link
Contributor

tkelman commented Dec 1, 2014

This does seem to freeze or go into an infinite loop of some kind:

  | | |_| | | | (_| |  |  Version 0.4.0-dev+1904 (2014-12-01 14:28 UTC)
 _/ |\__'_|_|_|\__'_|  |  rf/rand-bigrange/e209872* (fork: 1 commits, 1 day)
|__/                   |  x86_64-w64-mingw32

julia> T = UInt32;

julia> s = big(typemax(T)-1000) : big(typemax(T)) + 10000
4294966295:4294977295

julia> using Base.Test

julia> @test rand(s) != rand(s)

@tkelman
Copy link
Contributor

tkelman commented Dec 1, 2014

On master at fd531ed that same sequence of inputs gives ERROR: stack overflow

@rfourquet
Copy link
Member Author

Thanks @tkelman. The stack overflow on master is expected, but I don't see anything that could cause the freeze, I believe it happens in the while true in rand(rng::AbstractRNG, g::RangeGeneratorBigInt).
I'm not familiar with endianness, so I may have put the wrong endianness argument to mpz_import: is the processor big endian? Could you show the value of Base.GMP.GMP_VERSION, and of Base.Random.RangeGenerator(s) in the example above?

@tkelman
Copy link
Contributor

tkelman commented Dec 2, 2014

julia> Base.GMP.GMP_VERSION
v"6.0.0"

julia> Base.Random.RangeGenerator(s)
Base.Random.RangeGeneratorBigInt(4294966295,11000,1,0x00003fff)

@tkelman
Copy link
Contributor

tkelman commented Dec 2, 2014

I'm not sure about endianness (I don't think it's that different on Windows?) but I have seen in the past that GMP and MPFR tend to use C longs in many of their API's, and I do know that long is only 32 bits on Win64. But I'll defer to you on whether that would make a difference to this code.

@rfourquet
Copy link
Member Author

The other AppVeyor build seems to succeed on a pentium 4 32-bits machine... I read again the code without finding any bug. Would you accept to make another test after adding prints around line 457:

print(limbs, " ")
ccall((:__gmpz_limbs_finish, :libgmp), Void, (Ptr{BigInt}, Clong), &x, g.nlimbs)
println(x)

and then run

T = UInt32; s = big(typemax(T)-1000) : big(typemax(T)) + 10000;  println(rand(s)); rand(s)

?

@tkelman
Copy link
Contributor

tkelman commented Dec 2, 2014

https://gist.github.com/2e22f8b016fda3d35c76

That was terminated by me with ctrl-C, didn't show any signs of stopping. Make sure you're not assuming Int and Clong are the same anywhere, on Win64 they're different.

@ViralBShah
Copy link
Member

The code seems to use Clong everywhere in ccall, and I assume that Ints passed are converted appropriately by ccall.

@rfourquet
Copy link
Member Author

Thanks a lot! The manual says indeed that ccall converts arguments to what is declared in ccall argument types.
So the first call to rand works (0x00000722 == UInt32(1826)), but then in the second call we find that 0x000006ac == 326417516204 % UInt32 but with 326417516204 = UInt64(0x4c)<<32 + 0x000006ac. So my guess is that for some reason, mpz_limbs_finish assumes that limbs are not of type Culong but UInt64, and that when reading the first limb, it reads beyond the passed array (UInt32[0x000006ac] of size 1), which contains 0x4c after the end position, interpreting those two UInt32 as one UInt64 limb.
In gmp.jl, the data field is Ptr{Culong} so I had assumed that it's enforced that limbs are Culong (default value in gmp.h), but if a local GMP installation is used, it can be different, and I think that this particular installation has configured GMP with #define _LONG_LONG_LIMB. I will see if there is a way to detect the size of the limbs.

@rfourquet
Copy link
Member Author

I pushed a new commit replacing the use of Culong (for limbs) by an auto-detected Limb type. If this fixes the problem above, I'll squash it.

@ViralBShah
Copy link
Member

Since you are squashing, could you also update the comments with a bit about the type of Limb? Should be good to merge after that.

* Previously, calling e.g. rand(big(1:4)) caused a stack overflow,
  because rand(mt::MersenneTwister, r::AbstractArray) was calling
  itself recursively. This is fixed here, but a mecanism handling
  not-implemented types should be added.
* RandIntGen is renamed to RangeGeneratorInt.
* A type RangeGeneratorBigInt similar to RangeGeneratorInt (both
  subtypes of RangeGenerator) is introduced to handle rand on
  BigInt ranges. RangeGenerator is the generic constructor of such
  objects, taking a range as parameter.
  Note: two different versions are implemented depending on the
  GMP version (it is faster with version 6 on big ranges).
* BigInt tests from commit bf8c452 are re-added.
@rfourquet
Copy link
Member Author

@ViralBShah I added some comment above the Limb definition in gmp.jl, is this fine?

@rfourquet
Copy link
Member Author

Referencing squashed commit to record discussion.

ViralBShah added a commit that referenced this pull request Dec 2, 2014
@ViralBShah ViralBShah merged commit 7eb9a14 into master Dec 2, 2014
@ViralBShah
Copy link
Member

Thanks - this got done nicely.

@tkelman
Copy link
Contributor

tkelman commented Dec 2, 2014

Good catch @rfourquet. I don't think we're specifically asking for this configuration item but it's probably buried somewhere in config.log.

@rfourquet rfourquet deleted the rf/rand-bigrange branch December 3, 2014 03:03
@rfourquet
Copy link
Member Author

@tkelman and it was a good configuration as far as tests are concerned!
I fixed the limbs field and put the check suggested in 8b0921c#commitcomment-8799280 at https://github.com/JuliaLang/julia/compare/rf/rand-bigrange-fixup. I thougth it didn't deserve a new PR but wanted someone to check before meging.

rfourquet referenced this pull request Dec 3, 2014
Now the runtime version of GMP is checked against the compile time
version, as suggested by @JeffBezanson. He also noted that the limbs
field of RangeGeneratorBigInt was not specific enough.
@rfourquet rfourquet added the bignums BigInt and BigFloat label May 23, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bignums BigInt and BigFloat randomness Random number generation and the Random stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants