Act #1

Merged
merged 37 commits into from
Jul 31, 2018

Conversation

ZhennanQin
Description

(Brief description on what this PR is about)

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
      • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
      • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
      • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
      • For user-facing API changes, the API doc string has been updated.
      • For new C++ functions in header files, their functionalities and arguments are documented.
      • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
      • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

OneRaynyDay and others added 30 commits July 23, 2018 22:58
* Fix quantization bug

* Added tests and made sure the edge case is now handled correctly, without off-by-one errors

* Changed back to original truncated distribution but with different kl divergence calc

* Reorder back to original format

* Reorder back to original format (again)

* Change comments

* Clarified comments

* Changed norm division
* add flakiness checker

* fixed style and argument parsing

* added verbosity option, further documentation, etc.

* added logging

* Added check for invalid argument

* updated error message for specificity

* fixed help message
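
The flakiness-checker commits above describe a tool that reruns a test to see whether it fails intermittently. A minimal sketch of that idea, not the actual script from the commit: the function name `check_flakiness` and its parameters are hypothetical, and it assumes a test signals failure by raising `AssertionError`.

```python
import random


def check_flakiness(test_fn, num_trials=100, base_seed=0):
    """Run test_fn repeatedly under different seeds; return the seeds that failed.

    Hypothetical helper for illustration only.
    """
    if num_trials <= 0:
        # Reject invalid arguments up front with a specific message.
        raise ValueError("num_trials must be positive, got %d" % num_trials)

    failing_seeds = []
    for trial in range(num_trials):
        seed = base_seed + trial
        random.seed(seed)  # make each trial reproducible from its seed
        try:
            test_fn()
        except AssertionError:
            failing_seeds.append(seed)
    return failing_seeds
```

Returning the failing seeds, rather than just a count, lets a failure be reproduced by rerunning the test with that seed.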
* [MXAPPS-581] Nightly Straight Dope tests.

The Straight Dope notebooks will be retrieved from the GitHub repo, run, and
scanned for warnings and errors. Because we are not checking accuracy of
the training, we set the number of epochs to 1 to reduce the integration
test run time.
* Common functionality for running and testing notebooks has been
  factored into a common test util module.
* Support for running UTF-8 notebooks added (Python2 and 3 compatible).
* Notebooks requiring a single GPU and multi GPUs have been split
  into two different test suites so that they can be run on different
  hardware.
* Add test to make sure that all notebooks are tested.
* Comment out broken notebooks while they are being fixed (I will
  uncomment them in a follow up PR).

* [MXAPPS-581] Download notebooks in test setup.

* Moving logic to download the Straight Dope notebooks to the test
harness.
* Remove cache logic as it is unnecessary.

* [MXAPPS-581] Add a timeout for download of notebooks.

* [MXAPPS-581] Move notebooks requiring multi-gpus.

Move two notebooks requiring multi-GPUs out of the single GPU test suite.
…pdated) (apache#11591)

* add multiroot all-reduce communication pattern

* fix bug with UpdateWeight

* fix PCI-E links appearing in weight matrix bug

* optimization to skip CopyFromTo in ReduceInner gains a bit of throughput

* remove unnecessary if statement

* Add tests

* add more tests, 6 tests left to add

* get rid of some dead code

* Add comments

* Add randomized tests for backtrack and kernighan-lin

* Fix Postprocess

* Add switch for first valid tree when num_gpus > 8, and for maximum weight when num_gpus <= 8

* Kernighan-Lin seems to find better trees

* get rid of printfs

* change defaults

* inherit from CommDevice instead of Comm

* Fix lint errors

* Add Python test using MXNET_KVSTORE_USETREE, fix CMake compilation problem, add header guard

* fix lint errors

* better header guard that works for tests

* get rid of unused variable warning

* retrigger jenkins

* resolve 2 comments

* address comment using Class to do test, get rid of extraneous test, use PCI-E as fallback for GPUs that are not linked by NVLink

* address comments

* fix a few bugs

* get rid of printfs

* get rid of print

* Comment out test for now

* fix 2 more bugs

* fix segfault

* change PrintVector, PrintTopo, PrintMatrix to LOG(INFO) instead of stdout

* Fix code alignment

* get rid of todo

* Make changes to env variable names to indicate they are TREE-related

* Add note saying when ARRAY_BOUND env var takes effect
* Fix file name creation for Windows

* Forcing build

* Force build again
* update vgg pretrained model

* Trigger CI

* Trigger CI
* Add verify_ssl option to gluon.utils.download

Sometimes datasets may be hosted on servers that serve invalid SSL certificates.

* Add warning

* Add test

* Mock gluon.utils.download tests

* Add Py2 mock dependency to Jenkinsfile
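
The `verify_ssl` commits above add a switch for downloading from servers with invalid SSL certificates, plus a warning. The sketch below illustrates the general pattern with the Python standard library only; it is not the `gluon.utils.download` implementation, and the helper name `build_ssl_context` is hypothetical.

```python
import ssl
import warnings


def build_ssl_context(verify_ssl=True):
    """Return an SSLContext; certificate checks are skipped when verify_ssl is False.

    Hypothetical helper for illustration only.
    """
    context = ssl.create_default_context()
    if not verify_ssl:
        warnings.warn(
            "Unverified HTTPS request: certificate verification is disabled. "
            "Only do this for servers you trust.")
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

Keeping verification on by default and warning when it is turned off makes the insecure path an explicit, visible choice.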
…e Release & Maven Central Repo (apache#11862)

* pom file changes for maven builds
This enabled retries for Docker build commands executed by our master and PR builds.
* Return if iteration counter `N` is less than or equal to zero.

* Fix spelling.
* refactor R optimizers to fix memory leak

* add Adadelta and Adagrad

* fix comments

* fix comments

* fix comments

* add tests

* fix whitespaces

* fix whitespaces

* fix typo

* fix typo

* add doc on clipping
* Add logistic regression tutorial

* Code review fix

* Add F1 metric, fix code review comments

* Add Download buttons script
* fix nondeterminism of dot(csr.T, dns) = dns with tests

* address code reviews
apache#11587)

* [MXNET-378] Adding depth_to_space and space_to_depth operator

* fixed lint and windows CPU errors

* compliance with C++ style guide and address shortcomings in unittests

* fixed documentation and nitpicky suggestions

* added operator references in API docs and removed inplace optimization support

* Added references in symbol.md and ndarray.md. Improved test cases and added block_size check

* Fixing bugs in documentation. Tests now include tensors of random shapes.
* fix ctc_loss GPU bug

* add blank_label parameter for CTCLoss

* Revert "add blank_label parameter for CTCLoss"

This reverts commit aab11f7.
* add more ops

* use dict.get

* add list comprehension

* retrigger CI due to unrelated flaky test failure
* Replace cublassgemm with cublassgemmex for >= 7.5

* Add comment for cublassgemmex
* Remove fixed seed for test_sparse_nd_save_load

* Add comments related to the commit
Corrected a race condition with stopping profiling. Added mx.nd.waitall to ensure all operations have completed, including GPU operations that might otherwise be missing.

Also added alternative code for GPU vs. CPU context selection, which previously raised an error on machines with nvidia-smi.

* fix bugs and improve tutorial

* improve logging

* update benchmark_score

* Update float16.md

* update link to dmlc web data

* fix train cifar and add random mirroring

* set aug defaults

* fix whitespace

* fix typo
* adding param for list of tags to display on website

* using new website display argument for artifact placement in version folder

* adding display logic

* remove restricted setting for testing

* update usage instructions

* reverted Jenkinsfile to use restricted nodes
* Update relative paths pointing to the data directory to point to the
  correct place in the testing temporary folder.

* Enable the notebooks that were previously broken because of relative
  file paths not pointing to the correct place.

* Move some notebooks we do not plan to test to the whitelist. These
  notebooks are not published in the Straight Dope book.

* Clean-up: Convert print statements to info/warn/error logging
  statements. Add some logging statements for better status.
* add linux and macos doc

* update doc

* Update MKL_README.md

* Update MKL_README.md

Add convolution code to verify mkldnn backend

* add homebrew link

* rename to MKLDNN_README

* add mkl verify

* trigger

* trigger

* set mac compiler to gcc47

* add VS2017 support experimentally

* improve quality

* improve quality

* modify mac build instruction since prepare_mkldnn.sh has been removed

* trigger

* add some improvement
* add changes to example

* place the file to the util

* add retry scheme

* fix the retry logic

* change the DownloadUtil to Util

* Trigger the CI
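
The commits above add a retry scheme to a download utility. As a rough sketch of what such a scheme usually looks like (not the code from the commit, and in Python rather than the language of that example), the hypothetical `retry` helper below calls an action up to a fixed number of times and re-raises the last error if every attempt fails.

```python
import time


def retry(action, attempts=3, delay_seconds=0.0):
    """Call action() until it succeeds or attempts are exhausted.

    Hypothetical helper for illustration only.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # broad on purpose: any failure triggers a retry
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds)  # pause before the next attempt
    raise last_error
```

Sleeping only between attempts, and not after the final one, avoids a pointless delay before the error is reported.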
DickJC123 and others added 6 commits July 30, 2018 13:34
…req='add' (apache#11338)

* Add tests that fail due to issue 11241

* Fix apache#11241 Conv1D throws CUDNN_STATUS_EXECUTION_FAILED

* Force algo 1 when grad_req==add with large c.  Expand tests.

* Shorten test runtimes.
…ning with Gluon (apache#11910)

* Add description about update on kvstore

* add async check for gluon

* only raise error if user set update_on_kvstore

* fix condition

* add async nightly test

* fix case when no kvstore

* add example for trainer creation in doc
* fix R windows install docs

* addressed PR comments

* PR comments

* PR comments

* fixed line wrappings

* fixed line wrappings
@luobao-intel luobao-intel merged commit b9729a4 into luobao-intel:fallback Jul 31, 2018
@ZhennanQin ZhennanQin deleted the act branch October 10, 2018 02:11