StatPlots

Primary author: Thomas Breloff (@tbreloff)

This package contains many statistical recipes for concepts and types introduced in the JuliaStats organization, intended to be used with Plots.jl:

Types:
- DataFrames
- Distributions

Recipes:
- histogram/histogram2d
- boxplot
- violin
- marginalhist
- corrplot/cornerplot

Initialize:

#Pkg.clone("git@github.com:JuliaPlots/StatPlots.jl.git")
using StatPlots
gr(size=(400,300))

The DataFrames support allows passing DataFrame columns as symbols. Operations on DataFrame columns can be specified using quoted expressions, e.g.

using DataFrames
df = DataFrame(a = 1:100, b = randn(100), c = abs.(randn(100)))  # abs. broadcasts over the vector
plot(df, :a, [:b :c])
scatter(df, :a, :b, markersize = :(4 * log(:c + 0.1)))

If you find an operation that the DataFrames support does not handle, please open an issue. An alternative to the StatPlots syntax is the DataFramesMeta macro @with. Symbols that do not refer to DataFrame columns must be escaped with ^(), e.g.

using DataFramesMeta
@with(df, plot(:a, [:b :c], colour = ^([:red :blue])))

marginalhist with DataFrames

using RDatasets
iris = dataset("datasets","iris")
marginalhist(iris, :PetalLength, :PetalWidth)

corrplot and cornerplot

M = randn(1000,4)
# Make columns 2 and 3 depend on the others so the plots show visible correlations
M[:,2] += 0.8 * sqrt.(abs.(M[:,1])) - 0.5M[:,3] .+ 5
M[:,3] -= 0.7M[:,1].^2 .+ 2
corrplot(M, label = ["x$i" for i=1:4])

cornerplot(M)

cornerplot(M, compact=true)

boxplot and violin

import RDatasets
singers = RDatasets.dataset("lattice","singer")
violin(singers,:VoicePart,:Height,marker=(0.2,:blue,stroke(0)))
boxplot!(singers,:VoicePart,:Height,marker=(0.3,:orange,stroke(2)))

using Distributions
plot(Normal(3,5), fill=(0, .5,:orange))

dist = Gamma(2)
scatter(dist, leg=false)
bar!(dist, func=cdf, alpha=0.3)

Grouped Bar plots

groupedbar(rand(10,3), bar_position = :stack, bar_width=0.7)

The default is bar_position = :dodge:

groupedbar(rand(10,3), bar_position = :dodge, bar_width=0.7)

groupapply for population analysis

The groupapply function splits the data according to the keyword argument group, then applies summarize to obtain the average and variability of a given analysis. Density, cumulative, and local regression analyses are supported so far, and you can also supply your own function. There are three ways to compute the average and variability:

- compute_error = (:across, col_name): the data is split according to column col_name before being summarized. compute_error = :across splits across all observations. The default summary is (mean, sem), but the summarize keyword can change it to any pair of functions.
- compute_error = (:bootstrap, n_samples): n_samples fake datasets distributed like the real dataset are generated and then summarized (nonparametric bootstrapping). compute_error = :bootstrap is equivalent to compute_error = (:bootstrap, 1000). The default summary is (mean, std). This method works with any analysis but is computationally very expensive.
- compute_error = :none: no error is computed or displayed, and the analysis is carried out normally.
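
The first two error estimates can be sketched in plain Julia, independent of StatPlots. This is a minimal illustration, not the package's implementation. It assumes Julia 1.0 or later (for the Statistics and Random standard libraries), and it assumes that the bootstrap resamples observations with replacement. The helper names summarize_across and summarize_bootstrap are hypothetical and not part of the StatPlots API. The analysis used here is simply the mean rather than a density or loess curve.

```julia
using Statistics, Random

# Standard error of the mean, written out so no extra package is needed.
sem(v) = std(v) / sqrt(length(v))

# :across-style estimate (hypothetical helper): run the analysis once per
# subgroup, then summarize the per-subgroup results with (mean, sem).
function summarize_across(analysis, obs, groups)
    results = [analysis(obs[groups .== g]) for g in unique(groups)]
    return mean(results), sem(results)
end

# :bootstrap-style estimate (hypothetical helper): resample the observations
# with replacement n_samples times, run the analysis on each resample,
# then summarize the results with (mean, std).
function summarize_bootstrap(analysis, obs, n_samples, rng)
    n = length(obs)
    results = [analysis(obs[rand(rng, 1:n, n)]) for _ in 1:n_samples]
    return mean(results), std(results)
end

obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
groups = [:a, :a, :b, :b, :c, :c]

m_across, e_across = summarize_across(mean, obs, groups)
m_boot, e_boot = summarize_bootstrap(mean, obs, 1000, MersenneTwister(1))
```

The bootstrap version repeats the analysis n_samples times, which is why that method is described as computationally expensive.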

The local regression uses Loess.jl and the density plot uses KernelDensity.jl. For a categorical x variable, these functions are computed by splitting the data across the x variable and then computing the density/average per bin. The choice of a continuous or discrete axis can be forced with axis_type = :continuous or axis_type = :discrete.
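
The per-bin average for a categorical x variable can be sketched as follows. This is only an illustration of the idea, assuming Julia 1.0 or later; mean_per_category is a hypothetical helper, not a StatPlots function, and the data is made up for the example.

```julia
using Statistics

# Split y by the categories of x and compute the mean of y within each category.
function mean_per_category(x, y)
    return Dict(c => mean(y[x .== c]) for c in unique(x))
end

x = ["Female", "Male", "Female", "Male"]
y = [10.0, 12.0, 14.0, 16.0]
averages = mean_per_category(x, y)
```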

Example use:

using DataFrames
import RDatasets
using StatPlots
gr()
school = RDatasets.dataset("mlmRev","Hsb82");
grp_error = groupapply(:cumulative, school, :MAch; compute_error = (:across,:School), group = :Sx)
plot(grp_error, line = :path)

Keywords for loess or kerneldensity can be given to groupapply:

df = groupapply(:density, school, :CSES; bandwidth = 1., compute_error = (:bootstrap,500), group = :Minrty)
plot(df, line = :path)

A bar plot, with a categorical x variable:

pool!(school, :Sx)
grp_error = groupapply(school, :Sx, :MAch; compute_error = :across, group = :Minrty)
plot(grp_error, line = :bar)
