Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[Discussion] 1.5.1 Patch Release #15613

Closed
samskalicky opened this issue Jul 19, 2019 · 62 comments
Closed

[Discussion] 1.5.1 Patch Release #15613

samskalicky opened this issue Jul 19, 2019 · 62 comments
Assignees

Comments

@samskalicky
Copy link
Contributor

Let's start a discussion here about the known issues with 1.5.0 to put into a patch release.

Create (or locate existing) issue/pull request for the item, note the issue/pull request number.
Comment in this issue: 1) the above issue number, 2) one sentence of what the item is about and why it's important.
Indicate whether you'd be willing to help out on the item.
Share the ETA if you're driving the item and have an guesstimate on when it will be done.

cc @apache/mxnet-committers

@samskalicky
Copy link
Contributor Author

Theres a problem with the WarpCTC plugin in 1.5.0 #15612. A PR has been created with a fix #15601

@ThomasDelteil
Copy link
Contributor

@roywei
Copy link
Member

roywei commented Jul 19, 2019

Possible issue with asnumpy(), but not able to reproduce yet #15431

@roywei
Copy link
Member

roywei commented Jul 19, 2019

license issues need to be fixed before next release #15542
MKL-DNN can be resolved in 1.6.0, Cub issue depends on Nvidia to release new version of Cub, other issue should be easy to resolve in 1.5.1.

@roywei
Copy link
Member

roywei commented Jul 19, 2019

FYI #15281 (comment) is fixed in 1.5.0

@roywei
Copy link
Member

roywei commented Jul 19, 2019

FYI #15337 is also resolved

@keerthanvasist
Copy link

@mxnet-label-bot add [discussion]

@iblislin
Copy link
Member

iblislin commented Jul 20, 2019

I will track Julia stuffs that need to be backported here:

@iblislin
Copy link
Member

Hi, I just created a label "Backport 1.5" (https://github.com/apache/incubator-mxnet/labels/Backport%201.5).
I think we can create a single PR to collecting the stuffs that need to be backported, use this label as a checklist, and keep that PR open until we want to make a release candidate.

@marcoabreu
Copy link
Contributor

I'd prefer if we use multiple PRs which only contain cherry-picks from master. That way, we don't squash and keep the references.

@iblislin
Copy link
Member

ah, okay

@szha
Copy link
Member

szha commented Jul 26, 2019

@samskalicky thanks for starting the discussion. @PatricZhao @TaoLv could you drive this release?

@pengzhao-intel
Copy link
Contributor

It's great to lead the release. @TaoLv @juliusshufan will be working on this.

@roywei
Copy link
Member

roywei commented Jul 31, 2019

Update: moved this to 1.6.0 scope

nightly test failure need to be fixed:
#15374

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/NightlyTestsForBinaries/detail/master/395/pipeline/

@sandeep-krishnamurthy
Copy link
Contributor

#15672

@samskalicky
Copy link
Contributor Author

samskalicky commented Aug 5, 2019

@TaoLv Lets include #15692 in the 1.5.1 release to fix #15737

@TaoLv
Copy link
Member

TaoLv commented Aug 6, 2019

@TaoLv Lets include #15692 in the 1.5.1 release to fix #15737

Thanks for reporting. Added it to https://cwiki.apache.org/confluence/display/MXNET/1.5.1+Release+Plan+and+Status

@wkcn
Copy link
Member

wkcn commented Aug 6, 2019

Hi, I’m fixing the bug of two C APIs MXEnginePushAsyncND and MXEnginePushSyncND
#15751

It will be finished this week.

@TaoLv
Copy link
Member

TaoLv commented Aug 7, 2019

Thanks for proposing, @wkcn .
Is this bug firstly introduced in the 1.5.0 release? Why do you think it should be included into the 1.5.1 patch release?

@wkcn
Copy link
Member

wkcn commented Aug 7, 2019

@TaoLv Yes, it is a bug in 1.5.0 release https://github.com/apache/incubator-mxnet/blob/v1.5.x/src/c_api/c_api.cc#L1537

It need to be fixed, otherwise users couldn't use these two C APIs in the release version of MXNet.

@TaoLv
Copy link
Member

TaoLv commented Aug 7, 2019

I mean does it also exist in v1.4.x releases?

@wkcn
Copy link
Member

wkcn commented Aug 7, 2019

@TaoLv It doesn't exist in v1.4.x release. The bug was introdueced firstly in v1.5.x release.

@TaoLv
Copy link
Member

TaoLv commented Aug 7, 2019

Thanks for the clarification. Do you have a github issue for that? I add it to the track list.

@wkcn
Copy link
Member

wkcn commented Aug 7, 2019

@TaoLv I have added the issue #15774

@TaoLv
Copy link
Member

TaoLv commented Aug 7, 2019

@wkcn, thanks for your report and contribution! The issue and your fix are added to https://cwiki.apache.org/confluence/display/MXNET/1.5.1+Release+Plan+and+Status.

@KellenSunderland
Copy link
Contributor

What's the issue with MShadow? Did they have to do a force push / history re-write again? We may indeed have to backport that to other release branches.

@TaoLv
Copy link
Member

TaoLv commented Aug 8, 2019

@KellenSunderland mshadow source code is donated and merged to MXNet. See #15600 and dmlc/mshadow#373. So it's not a git submodule any more.

@wkcn
Copy link
Member

wkcn commented Aug 8, 2019

For the source downloaded before mshadow merged into MXNet, users need to remove
the directory mshadow firstly, then git pull to update the code.

@samskalicky
Copy link
Contributor Author

@wkcn that doesnt work for me:

ubuntu@ip-172-31-76-214:~$ git clone --recursive https://github.com/apache/incubator-mxnet.git mxnet2
Cloning into 'mxnet2'...

ubuntu@ip-172-31-76-214:~/mxnet2$ git checkout v1.5.x
M	3rdparty/dlpack
M	3rdparty/dmlc-core
M	3rdparty/mkldnn
M	3rdparty/tvm
Branch v1.5.x set up to track remote branch v1.5.x from origin.
Switched to a new branch 'v1.5.x'
ubuntu@ip-172-31-76-214:~/mxnet2$ cd 3rdparty/
ubuntu@ip-172-31-76-214:~/mxnet2/3rdparty$ ls
ctc_include  dlpack  dmlc-core  googletest  mkldnn  mshadow  nvidia_cub  onnx-tensorrt  openmp  ps-lite  tvm
ubuntu@ip-172-31-76-214:~/mxnet2/3rdparty$ ls mshadow/
ubuntu@ip-172-31-76-214:~/mxnet2/3rdparty$ rm -rf mshadow/
ubuntu@ip-172-31-76-214:~/mxnet2/3rdparty$ git pull
Already up-to-date.
ubuntu@ip-172-31-76-214:~/mxnet2/3rdparty$ ls
ctc_include  dlpack  dmlc-core  googletest  mkldnn  nvidia_cub  onnx-tensorrt  openmp  ps-lite  tvm
ubuntu@ip-172-31-76-214:~/mxnet2/3rdparty$ git submodule update --recursive
Submodule path 'dlpack': checked out '10892ac964f1af7c81aae145cd3fab78bbccd297'
Submodule path 'dmlc-core': checked out '3943914eed66470bd010df581e29e4dca4f7df6f'
Submodule path 'mkldnn': checked out '41bee20d7eb4a67feeeeb8d597b3598994eb1959'
Submodule path 'tvm': checked out '21935dcbf56ad3bd66ebff9891a6bc3865b8106d'
Submodule path 'tvm/3rdparty/dlpack': checked out '5c792cef3aee54ad8b7000111c9dc1797f327b59'

Submodule path 'tvm/3rdparty/dmlc-core': checked out '82bf4c2e2af312b3d52513aa727483803a2f8734'
ubuntu@ip-172-31-76-214:~/mxnet2/3rdparty$ 
ubuntu@ip-172-31-76-214:~/mxnet2/3rdparty$ ls
ctc_include  dlpack  dmlc-core  googletest  mkldnn  nvidia_cub  onnx-tensorrt  openmp  ps-lite  tvm
ubuntu@ip-172-31-76-214:~/mxnet2/3rdparty$ 

@wkcn
Copy link
Member

wkcn commented Aug 8, 2019

Hi @samskalicky , I can download mshadow by git submodule update --init --recursive

@TaoLv
Copy link
Member

TaoLv commented Aug 9, 2019

@KellenSunderland Could you please include the TensorRT patches you just mentioned into the cwiki? https://cwiki.apache.org/confluence/display/MXNET/1.5.1+Release+Plan+and+Status

Thanks for your support!

@TaoLv
Copy link
Member

TaoLv commented Aug 12, 2019

@iblis17 Could you help to pick #15608 #15609 to the v1.5.x branch?

@KellenSunderland
Copy link
Contributor

KellenSunderland commented Aug 13, 2019

For some reason not able to edit the cwiki?

Proposals for fixes are:
#15645
#15666
#14898
#15344
#14860

@TaoLv
Copy link
Member

TaoLv commented Aug 13, 2019

Thank you @KellenSunderland . I just included these patches to the table on cwiki. Please help take a look: https://cwiki.apache.org/confluence/display/MXNET/1.5.1+Release+Plan+and+Status
BTW, do you know someone with TensorRT background can help to pick them to the v1.5.x branch? Because I'm wondering if they need be ported in a certain order.

@KellenSunderland
Copy link
Contributor

No problem Tao, I can probably cherry-pick them a bit later today.

@KellenSunderland
Copy link
Contributor

KellenSunderland commented Aug 13, 2019

First three cherry-picked PRs:
#15875
#15874
#15877

@ChaiBapchya
Copy link
Contributor

Benchmark doc fix PR #15769

@samskalicky
Copy link
Contributor Author

@TaoLv I'd like to pull in the following PRs, they are necessary fixes for some of my use-cases:
#15245
#15917

@apeforest
Copy link
Contributor

This PR fixes a memory misalignment bug in topk operator introduced recently. Please add it to 1.5.1 patch release:
#15948

Thanks

@TaoLv
Copy link
Member

TaoLv commented Aug 28, 2019

First three cherry-picked PRs:
#15875
#15874
#15877

@KellenSunderland, the 3 PRs were already merged to v1.5.x. Please update for the other two if we still need them.

@TaoLv
Copy link
Member

TaoLv commented Aug 28, 2019

Benchmark doc fix PR #15769

@ChaiBapchya, thank you for the report. Could you please cherry pick it to the v1.5.x branch and tag me in the PR?

@TaoLv
Copy link
Member

TaoLv commented Aug 28, 2019

@TaoLv I'd like to pull in the following PRs, they are necessary fixes for some of my use-cases:
#15245
#15917

@samskalicky , please pick them to v1.5.x and tag me in the PR. I will update the cwiki page accordingly.

@TaoLv
Copy link
Member

TaoLv commented Aug 28, 2019

This PR fixes a memory misalignment bug in topk operator introduced recently. Please add it to 1.5.1 patch release:
#15948

Thanks

@apeforest Any update? Is it still true that we can pick this to v1.5.x?

@KellenSunderland
Copy link
Contributor

KellenSunderland commented Aug 30, 2019

Hey @TaoLv . Updated the last two:

#16043
#16044

@iblislin
Copy link
Member

@iblis17 Could you help to pick #15608 #15609 to the v1.5.x branch?

@TaoLv Just can I just push to the v1.5.x branch direcetly?

@marcoabreu
Copy link
Contributor

No, please create pull requests instead

@apeforest
Copy link
Contributor

apeforest commented Sep 11, 2019 via email

@TaoLv
Copy link
Member

TaoLv commented Sep 11, 2019

Yes, we have passed code freeze and the rc0 was already tagged. Front-end packaging is in progress. One can still commit code change to the v1.5.x branch but that will be not included in the vote for rc0.

@lostella
Copy link
Contributor

1.5.0 apparently broke sampling operators on Linux, see #16135. A PR is open for this #16139. I hope this has a chance to make it to 1.5.1.

@eric-haibin-lin
Copy link
Member

If we have a chance for another RC, we should port #15240 and #15342

@iblislin
Copy link
Member

iblislin commented Sep 22, 2019

No idea why my backporting PR got closed (#16142). reopened

@strawberrypie
Copy link

Any news on 1.5.1?

@eric-haibin-lin
Copy link
Member

released: https://github.com/apache/incubator-mxnet/releases/tag/1.5.1

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests