Fixes to CUDA problems introduced by recent commits: normalize code c…#1228
Merged
danpovey merged 1 commit intokaldi-asr:masterfrom Nov 30, 2016
Merged
Fixes to CUDA problems introduced by recent commits: normalize code c…#1228danpovey merged 1 commit intokaldi-asr:masterfrom
danpovey merged 1 commit intokaldi-asr:masterfrom
Conversation
…de crash; compile problem on nvcc 8.0; fix thread-sync errors.
kangshiyin
reviewed
Nov 30, 2016
| ssum[tid] += ssum[tid + shift]; | ||
| __syncthreads(); | ||
| } | ||
| __syncthreads(); |
Contributor
There was a problem hiding this comment.
My bad… I missed this sync bug in unit test.
kangshiyin
reviewed
Nov 30, 2016
| void NormalizePerRow(const CuMatrixBase<Real>& in, const Real target_rms, | ||
| const bool add_log_stddev, CuMatrixBase<Real>* out) { | ||
| const Real kSquaredNormFloor = 1.35525271560688e-20; // 2^-66 | ||
| KALDI_ASSERT(SameDim(in, *out)); |
Contributor
There was a problem hiding this comment.
This does not hold if add_log_stddev is true.
|
|
||
|
|
||
| inline __device__ static float max_generic(float a, float b) { | ||
| return fmaxf(a, b); |
Contributor
There was a problem hiding this comment.
Math functions such as fmax() are already overloaded for both double and float as documented here. http://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH.html#group__CUDA__MATH
It seems that max() has been overloaded for all possible types in this header but this is not documented.
/usr/local/cuda/include/math_functions.h
Contributor
Author
|
I would have asked for review but that patch fixed a bug so I did it
quickly.
Would you mind fixing that assert problem by replacing with a more accurate
assert? No hurry-- we have no scripts that use that feature. [but the
tests were OK, so I guess we were also not testing that feature... would be
good to have a test that covers it.]
Regarding 'max'... I was not sure if that was a problem, it was my first
guess at fixing the problem and I was in too much of a hurry to test it
out. I couldn't find any documentation of the overloaded function so I
assumed it didn't exist. It would probably be good to revert it, since my
alternative is a bit ugly. You can make a PR for that.
…On Tue, Nov 29, 2016 at 9:43 PM, Shiyin Kang ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/cudamatrix/cu-math.cc
<#1228 (review)>:
> @@ -246,6 +246,7 @@ template<typename Real>
void NormalizePerRow(const CuMatrixBase<Real>& in, const Real target_rms,
const bool add_log_stddev, CuMatrixBase<Real>* out) {
const Real kSquaredNormFloor = 1.35525271560688e-20; // 2^-66
+ KALDI_ASSERT(SameDim(in, *out));
This does not hold if add_log_stddev is true.
------------------------------
In src/cudamatrix/cu-kernels.cu
<#1228 (review)>:
> @@ -28,6 +28,16 @@
#include <math_constants.h>
#include "cudamatrix/cu-kernels-ansi.h"
+
+
+inline __device__ static float max_generic(float a, float b) {
+ return fmaxf(a, b);
Math functions such as fmax() are already overloaded for both double and
float as documented here. http://docs.nvidia.com/cuda/
cuda-math-api/group__CUDA__MATH.html#group__CUDA__MATH
It seems that max() has been overloaded for all possible types in this
header but this is not documented.
/usr/local/cuda/include/math_functions.h
—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
<#1228 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ADJVuxqSKzqc8yWHqIZGKgYovQeX1k6zks5rDOLsgaJpZM4K_qpN>
.
|
Contributor
|
I will make a PR soon. I think |
Contributor
|
@kangshiyin NormalizeComponent |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
…rashed on zero; compile problem on nvcc 8.0.