Error from StatsBase that only occurs inside Pluto #1958

Closed
yha opened this issue Feb 27, 2022 · 4 comments
Labels: other packages (Integration with other Julia packages)

yha commented Feb 27, 2022

The following simple notebook generates an error, apparently from StatsBase, which is not reproduced when the notebook is run manually from the REPL in a clean environment.

### A Pluto.jl notebook ###
# v0.18.1

using Markdown
using InteractiveUtils

# ╔═╡ a5fbc918-84ce-468f-b254-2e8eaac24805
begin
	import Pkg
	Pkg.activate(mktempdir())
	Pkg.add([
		Pkg.PackageSpec(name="StatsBase", version="0.33.16"),
		Pkg.PackageSpec(name="HypothesisTests", version="0.10.6"),
	])
	using StatsBase, HypothesisTests
end

# ╔═╡ 88cb7974-eda1-4294-a852-a9bba4a1263e
x = [rand(1000) for _=1:10]

# ╔═╡ 14947510-97d3-11ec-1e7d-9182a6e65d1c
P = pairwise(eachindex(x); symmetric=true) do j,i
	pval = pvalue(SignedRankTest(x[i], x[j]))
	(pval, pval)
end

# ╔═╡ Cell order:
# ╠═a5fbc918-84ce-468f-b254-2e8eaac24805
# ╠═88cb7974-eda1-4294-a852-a9bba4a1263e
# ╠═14947510-97d3-11ec-1e7d-9182a6e65d1c

[Image attachment not preserved]

To test this cleanly, I tried running Pluto in a new temporary environment, and also running the notebook directly in a new temporary environment. When it's run from the REPL, it works fine:

julia> cd()

(@v1.7) pkg> activate --temp
  Activating new project at `C:\Users\sternlab\AppData\Local\Temp\jl_Fc9lvW`

julia> include(".julia/pluto_notebooks/pairwise-notebook-pkg.jl")
  Activating new project at `C:\Users\sternlab\AppData\Local\Temp\jl_0HjFMr`
   Resolving package versions...
    Updating `C:\Users\sternlab\AppData\Local\Temp\jl_0HjFMr\Project.toml`
  [09f84164] + HypothesisTests v0.10.6
  [2913bbd2] + StatsBase v0.33.16
    Updating `C:\Users\sternlab\AppData\Local\Temp\jl_0HjFMr\Manifest.toml`
  [49dc2e85] + Calculus v0.5.1
  [d360d2e6] + ChainRulesCore v1.13.0
  [9e997f8a] + ChangesOfVariables v0.1.2
  [861a8166] + Combinatorics v1.0.2
  [38540f10] + CommonSolve v0.2.0
  [34da2185] + Compat v3.41.0
  [187b0558] + ConstructionBase v1.3.0
  [9a962f9c] + DataAPI v1.9.0
  [864edb3b] + DataStructures v0.18.11
  [b429d917] + DensityInterface v0.4.0
  [31c24e10] + Distributions v0.25.49
  [ffbed154] + DocStringExtensions v0.8.6
  [fa6b7ba4] + DualNumbers v0.6.6
  [1a297f60] + FillArrays v0.13.0
  [34004b35] + HypergeometricFunctions v0.3.8
  [09f84164] + HypothesisTests v0.10.6
  [3587e190] + InverseFunctions v0.1.2
  [92d709cd] + IrrationalConstants v0.1.1
  [692b3bcd] + JLLWrappers v1.4.1
  [2ab3a3ac] + LogExpFunctions v0.3.6
  [1914dd2f] + MacroTools v0.5.9
  [e1d29d7a] + Missings v1.0.2
  [77ba4419] + NaNMath v0.3.7
  [bac558e1] + OrderedCollections v1.4.1
  [90014a1f] + PDMats v0.11.6
  [21216c6a] + Preferences v1.2.4
  [1fd47b50] + QuadGK v2.4.2
  [189a3867] + Reexport v1.2.2
  [ae029012] + Requires v1.3.0
  [79098fc4] + Rmath v0.7.0
  [f2b01f46] + Roots v1.3.14
  [efcf1570] + Setfield v0.8.2
  [a2af1166] + SortingAlgorithms v1.0.1
  [276daf66] + SpecialFunctions v2.1.4
  [82ae8749] + StatsAPI v1.2.1
  [2913bbd2] + StatsBase v0.33.16
  [4c63d2b9] + StatsFuns v0.9.16
  [efe28fd5] + OpenSpecFun_jll v0.5.5+0
  [f50d1b31] + Rmath_jll v0.3.0+0
  [0dad84c5] + ArgTools
  [56f22d72] + Artifacts
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [8bb1440f] + DelimitedFiles
  [8ba89e20] + Distributed
  [f43a241f] + Downloads
  [9fa8497b] + Future
  [b77e0a4c] + InteractiveUtils
  [b27032c2] + LibCURL
  [76f85450] + LibGit2
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [a63ad114] + Mmap
  [ca575930] + NetworkOptions
  [44cfe95a] + Pkg
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA
  [9e88b42a] + Serialization
  [1a1011a3] + SharedArrays
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays
  [10745b16] + Statistics
  [4607b0f0] + SuiteSparse
  [fa267f1f] + TOML
  [a4e569a6] + Tar
  [8dfed614] + Test
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
  [e66e0078] + CompilerSupportLibraries_jll
  [deac9b47] + LibCURL_jll
  [29816b5a] + LibSSH2_jll
  [c8ffd9c3] + MbedTLS_jll
  [14a3606d] + MozillaCACerts_jll
  [4536629a] + OpenBLAS_jll
  [05823500] + OpenLibm_jll
  [83775a58] + Zlib_jll
  [8e850b90] + libblastrampoline_jll
  [8e850ede] + nghttp2_jll
  [3f19e933] + p7zip_jll
10×10 Matrix{Tuple{Real, Real}}:
 (1, 1)                (0.246841, 0.246841)    (0.113274, 0.113274)    (0.205988, 0.205988)    …  (0.310006, 0.310006)    (0.209662, 0.209662)    (0.365882, 0.365882)    (0.450699, 0.450699)
 (0.246841, 0.246841)  (1, 1)                  (0.580073, 0.580073)    (0.695352, 0.695352)       (0.974021, 0.974021)    (0.965643, 0.965643)    (0.661773, 0.661773)    (0.0297561, 0.0297561)
 (0.113274, 0.113274)  (0.580073, 0.580073)    (1, 1)                  (0.838281, 0.838281)       (0.518781, 0.518781)    (0.395065, 0.395065)    (0.419029, 0.419029)    (0.0111193, 0.0111193)
 (0.205988, 0.205988)  (0.695352, 0.695352)    (0.838281, 0.838281)    (1, 1)                     (0.74415, 0.74415)      (0.877297, 0.877297)    (0.544634, 0.544634)    (0.0114328, 0.0114328)
 (0.23305, 0.23305)    (0.961281, 0.961281)    (0.410636, 0.410636)    (0.656464, 0.656464)       (0.763022, 0.763022)    (0.95317, 0.95317)      (0.917916, 0.917916)    (0.103205, 0.103205)
 (0.678443, 0.678443)  (0.682696, 0.682696)    (0.209027, 0.209027)    (0.319384, 0.319384)    …  (0.579624, 0.579624)    (0.476059, 0.476059)    (0.918959, 0.918959)    (0.174345, 0.174345)
 (0.310006, 0.310006)  (0.974021, 0.974021)    (0.518781, 0.518781)    (0.74415, 0.74415)         (1, 1)                  (0.816534, 0.816534)    (0.653066, 0.653066)    (0.0340973, 0.0340973)
 (0.209662, 0.209662)  (0.965643, 0.965643)    (0.395065, 0.395065)    (0.877297, 0.877297)       (0.816534, 0.816534)    (1, 1)                  (0.808212, 0.808212)    (0.0926432, 0.0926432)
 (0.365882, 0.365882)  (0.661773, 0.661773)    (0.419029, 0.419029)    (0.544634, 0.544634)       (0.653066, 0.653066)    (0.808212, 0.808212)    (1, 1)                  (0.0638147, 0.0638147)
 (0.450699, 0.450699)  (0.0297561, 0.0297561)  (0.0111193, 0.0111193)  (0.0114328, 0.0114328)     (0.0340973, 0.0340973)  (0.0926432, 0.0926432)  (0.0638147, 0.0638147)  (1, 1)

Pangoraw commented Feb 27, 2022

Thanks for reporting!

This is due to the automatic function wrapping (see #720). You can work around the problem by defining a dummy function inside the same cell (f() = 1). The error itself is interesting when reproduced outside Pluto, because it seems to depend on whether the variable is a function parameter or a global.

Reproducer outside Pluto:

function test(x)
	pairwise(eachindex(x); symmetric=true) do j,i
		pval = pvalue(SignedRankTest(x[i], x[j]))
		(pval, pval)
	end
end

x = [rand(1000) for _=1:10]

test(x) # Fails

Interestingly, using a global variable does not trigger the bug:

function test2()
	pairwise(eachindex(x); symmetric=true) do j,i
		pval = pvalue(SignedRankTest(x[i], x[j]))
		(pval, pval)
	end
end

test2() # works

This seems related to the use of Core.Compiler.return_type in StatsBase/src/pairwise.jl.
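
As a rough illustration of why a parameter and a global can behave differently here, the sketch below uses only Base and asks `Core.Compiler.return_type` about two closures with the same body. The names `data`, `f_global`, `make_local`, and `f_local` are hypothetical and are not taken from StatsBase or Pluto. It shows only that inference may see different types in the two situations; it does not reproduce the StatsBase error itself.

```julia
# Non-const global: inference cannot rely on its type.
data = [rand(10) for _ in 1:3]

# Closure that reads the global, analogous to `test2()` above.
f_global = (i, j) -> sum(data[i]) - sum(data[j])

# Closure over a function argument, analogous to `test(x)` above
# (and, by assumption, to a cell that has been wrapped in a function).
make_local(d) = (i, j) -> sum(d[i]) - sum(d[j])
f_local = make_local(data)

# Ask inference for the return type of each closure called with two `Int`s.
T_global = Core.Compiler.return_type(f_global, Tuple{Int, Int})
T_local  = Core.Compiler.return_type(f_local, Tuple{Int, Int})

println("closure over global:   ", T_global)
println("closure over argument: ", T_local)
```

When run at top level, the first line is expected to report a wider type than the second, because the closure over the argument carries the concrete type of `d`. Code that picks a container element type from such an inferred result can therefore take different paths for the same user code.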

Pangoraw added the "other packages" (Integration with other Julia packages) label Feb 27, 2022

yha commented Feb 27, 2022

This is due to the automatic function wrapping (see #720).

Interesting. Am I right in thinking this is not currently documented? Maybe this should be mentioned around the docs about environments (since it likewise relates to reproducibility).

StatsBase issue: JuliaStats/StatsBase.jl#771

fonsp closed this as completed Mar 5, 2022

fonsp commented Mar 5, 2022

We should document this. It's not really a reproducibility problem, but more a problem where code runs differently inside Pluto than outside of it (which we also want to avoid).

yha commented Mar 5, 2022

It's not really a reproducibility problem

I was thinking of "reproducibility" in the sense of reproducing a problem outside of Pluto, so it seemed natural to me to look in the docs under this heading when trying to figure out why there might be a difference (but I can see why you might want to call it something different).
EDIT: Also, the docs on environments deal with some related topics, such as how to give a notebook the same environment as a project that may be developed outside Pluto.
