MPI warmup auto metric #821

Closed
wants to merge 68 commits into from

@bbbales2 bbbales2 commented Feb 18, 2020

Submission Checklist

  • Run tests: ./runCmdStanTests.py src/test
  • Declare copyright holder and open-source license: see below

Summary:

This pulls in the auto metric from #729. Most of the code is in stan-dev/stan#2886.

How to Verify:

You can get a working version of this with:

git clone --recursive --branch mpi_warmup_auto https://github.com/stan-dev/cmdstan.git cmdstan-mpi-auto
cd cmdstan-mpi-auto

My make/local is:

CXXFLAGS += -isystem /usr/include/mpi/
MPI_ADAPTED_WARMUP = 1

Here's an example model that works best with the dense_e metric: https://github.com/bbbales2/cmdstan-warmup/tree/develop/examples/diamonds

Here's an example model that works best with the diag_e metric: https://github.com/bbbales2/cmdstan-warmup/tree/develop/examples/radon

Build the models and such:

make -j8 radon diamonds bin/stansummary

With MPICH, you can run a model and view the output with something like:

mpiexec -n 4 -l ./diamonds sample algorithm=hmc metric=auto_e data file=diamonds.dat && bin/stansummary output.csv.mpi.*

When you're running these things you'll see printouts like:

[1] adapt dense, max: 6.00221
[1] adapt diag, max: 516.659
[3] adapt dense, max: 5.98076
[3] adapt diag, max: 514.808
[2] adapt dense, max: 6.00669
[2] adapt diag, max: 517.094
[0] adapt dense, max: 6.01311
[0] adapt diag, max: 517.663

A lower number there is better. Every chain does the calculation, but chain 0 still broadcasts its metric and overwrites the metrics on all the other chains.
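
For comparing the two numbers per chain without reading the log by eye, here is a minimal sketch of a log filter. The function name `pick_lower_metric` is hypothetical and not part of CmdStan. It only assumes the line format shown in the printouts above (with the `[rank]` prefix that `mpiexec -l` adds) and reports whichever metric printed the lower value for each chain.

```shell
# Hypothetical helper, not part of CmdStan: reads lines such as
#   [1] adapt dense, max: 6.00221
# on standard input and prints, per chain, the metric with the lower value.
pick_lower_metric() {
  awk '
    /adapt (dense|diag), max:/ {
      chain = $1                 # e.g. "[1]", the rank prefix from mpiexec -l
      metric = $3                # "dense," or "diag,"
      sub(/,$/, "", metric)      # drop the trailing comma
      value = $5 + 0             # numeric value for comparison
      if (!(chain in best) || value < best[chain]) {
        best[chain] = value
        name[chain] = metric
        text[chain] = $5         # keep the value exactly as printed
      }
    }
    END { for (c in name) print c, name[c], text[c] }
  ' | sort
}
```

Piping the printouts above through `pick_lower_metric` gives one line per chain, for example `[0] dense 6.01311`. Whether the printouts arrive on standard output or standard error is not stated here, so the redirection you need may differ.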

Side Effects:

I also changed how file naming works in an earlier commit in this branch:

  1. Went back to output.csv.mpi.x (so that output paths like output=/tmp/output/output would work)
  2. Added mpi naming for save diagnostics
  3. If there's only a single chain running, keep the default naming (nothing added on the end)

This is so the mpi stuff can work with a branch of cmdstanr.
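
The naming rules listed above can be sketched as a small shell function. This is an illustration only, not the code in this branch: `mpi_output_name` is a hypothetical name, and it assumes the `x` in `output.csv.mpi.x` is the chain index.

```shell
# Hypothetical illustration of the naming rules; not CmdStan code.
# Assumes the "x" in output.csv.mpi.x is the chain index.
mpi_output_name() {
  base=$1         # output path as given, e.g. output.csv or /tmp/output/output
  chain=$2        # index of this chain
  num_chains=$3   # total number of chains running
  if [ "$num_chains" -le 1 ]; then
    # Single chain: keep the default name, nothing added on the end.
    printf '%s\n' "$base"
  else
    # Multiple chains: append the suffix to the full path.
    printf '%s.mpi.%s\n' "$base" "$chain"
  fi
}
```

For instance, `mpi_output_name /tmp/output/output 2 4` prints `/tmp/output/output.mpi.2`, while `mpi_output_name output.csv 0 1` prints `output.csv`. Appending the suffix to the whole path is what lets output paths containing directories work.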

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Columbia University

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@bbbales2 bbbales2 requested a review from yizhang-yiz February 18, 2020 18:32
@bbbales2 bbbales2 mentioned this pull request Feb 18, 2020
3 tasks
@bbbales2
Member Author

Closing for now. There'll be a more up to date branch somewhere else.

@bbbales2 bbbales2 closed this Mar 15, 2021