Implement Tokens in Swift and Kotlin #227
// Pointer to continuous memory which holds string based tokens
// which are seperated by \0
const char *tokens;
I suggest that we define

const char *_tokens;
const char *const *tokens;

where `_tokens` is your current `tokens`, and the new `tokens` is a pointer array. `tokens[0]` contains the address of the first token in `_tokens`, and `tokens[1]` contains the address of the second token in `_tokens`.

This simplifies life for users, since they only need to iterate over `tokens`:

for (int32_t i = 0; i != count; ++i) {
  const char *t = tokens[i];
  // process this token
}
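To illustrate why the pointer array is easier on callers, here is a minimal sketch that reads the same tokens both ways. The function names `ReadFromBuffer` and `ReadFromArray` are hypothetical and are not part of the sherpa-onnx C API; the sketch only assumes the layout described above, where each token in the buffer ends with `\0`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper. Without the pointer array, the caller walks the
// '\0'-separated buffer and has to track where each token starts.
std::vector<std::string> ReadFromBuffer(const char *buffer, int32_t count) {
  assert(count >= 0);
  std::vector<std::string> out;
  const char *p = buffer;
  for (int32_t i = 0; i != count; ++i) {
    out.emplace_back(p);
    p += std::strlen(p) + 1;  // skip the token and its terminating '\0'
  }
  return out;
}

// Hypothetical helper. With the pointer array, tokens[i] is already the
// start of the i-th token, so no offset bookkeeping is needed.
std::vector<std::string> ReadFromArray(const char *const *tokens,
                                       int32_t count) {
  assert(count >= 0);
  std::vector<std::string> out;
  for (int32_t i = 0; i != count; ++i) {
    out.emplace_back(tokens[i]);
  }
  return out;
}
```

Both functions return the same tokens; the second one simply moves the offset arithmetic out of user code and into the library.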
@csukuangfj I added `const char *const *tokensArr;` to make this easy to implement, and I also tested it.
sherpa-onnx/c-api/c-api.cc
Outdated
} else {
  r->count = 0;
  r->timestamps = nullptr;
  r->tokens = nullptr;
  r->tokensArr = nullptr;
Please also add code to free it to avoid a memory leak.
@csukuangfj Sorry, what do you want to free? This line just initializes the pointer to null in case there is no token from SherpaOnnxOnlineRecognizerResult.
Please change
sherpa-onnx/sherpa-onnx/c-api/c-api.cc
Lines 179 to 182 in 48038a7
void DestroyOnlineRecognizerResult(const SherpaOnnxOnlineRecognizerResult *r) {
  delete[] r->text;
  delete r;
}
to
void DestroyOnlineRecognizerResult(const SherpaOnnxOnlineRecognizerResult *r) {
delete[] r->text;
delete[] r->json;
delete[] r->tokensArr;
delete r;
}
By the way, please rename these variables to follow the code convention in sherpa-onnx:
totalLength -> total_length
tokensTemp -> tokens_temp
tokensArr -> tokens_arr
sherpa-onnx/c-api/c-api.cc
Outdated
    r->timestamps[i] = result.timestamps[i];
  }

  r->tokensArr = const_cast<const char *const *>(tokensTemp);
- r->tokensArr = const_cast<const char *const *>(tokensTemp);
+ r->tokensArr = tokensTemp;
I'm done
  return r;
}

void DestroyOnlineRecognizerResult(const SherpaOnnxOnlineRecognizerResult *r) {
  delete[] r->text;
  delete[] r->json;
  delete[] r->tokens;
  delete[] r->tokens_arr;
Please also delete
delete[] r->timestamps;
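Taken together, the requests in this thread amount to releasing every array the result owns before deleting the struct itself. The following is a minimal sketch of that idea, not the code that was merged: `DemoResult`, `CreateEmptyDemoResult`, and `DestroyDemoResult` are hypothetical stand-ins, and the field types are assumptions based on the names used in this review.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Simplified stand-in for the result struct discussed in this review.
// Field names follow the thread; the real definition lives in c-api.h.
struct DemoResult {
  const char *text;
  const char *json;
  const char *tokens;             // contiguous buffer, tokens end with '\0'
  const char *const *tokens_arr;  // tokens_arr[i] points into `tokens`
  float *timestamps;              // assumed: one entry per token
  int32_t count;
};

// Hypothetical helper: allocates a result that contains no tokens.
DemoResult *CreateEmptyDemoResult() {
  DemoResult *r = new DemoResult;
  char *text = new char[1];
  text[0] = '\0';
  char *json = new char[3];
  std::memcpy(json, "{}", 3);  // copies '{', '}', and the trailing '\0'
  r->text = text;
  r->json = json;
  r->tokens = nullptr;
  r->tokens_arr = nullptr;
  r->timestamps = nullptr;
  r->count = 0;
  return r;
}

// Releases every array member, then the struct.
// delete[] on a null pointer is a no-op, so the empty case is safe.
void DestroyDemoResult(const DemoResult *r) {
  if (r == nullptr) return;
  assert(r->count >= 0);
  delete[] r->text;
  delete[] r->json;
  delete[] r->tokens;
  delete[] r->tokens_arr;  // frees only the pointer array, not the tokens
  delete[] r->timestamps;
  delete r;
}
```

Note that `tokens_arr` only stores addresses inside `tokens`, so the individual entries must not be freed one by one; freeing the array and the buffer once each is enough.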
char **tokens_temp = new char*[r->count];
int pos = 0;
for (int32_t i = 0; i < r->count; ++i) {
  tokens_temp[i] = const_cast<char*>(r->tokens) + pos;
- tokens_temp[i] = const_cast<char*>(r->tokens) + pos;
+ tokens_temp[i] = r->tokens + pos;
The reason for the `const_cast<char*>` is a build error:
error: assigning to 'char *' from 'const char *' discards qualifiers
tokens_temp[i] = r->tokens + pos;
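The error arises because `tokens_temp` is declared as `char **`, so each element is a `char *` and cannot hold a `const char *`. One possible way to drop the cast, sketched here as an assumption rather than as the change that was merged, is to declare the temporary array as `const char **`. The helper name `MakeTokenArray` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not taken from c-api.cc.
// `tokens` holds `count` strings back to back, each terminated by '\0'.
// The element type is `const char *`, so `tokens + pos` can be stored
// directly and no const_cast is needed.
// The caller owns the returned array and must release it with delete[].
const char *const *MakeTokenArray(const char *tokens, int32_t count) {
  assert(tokens != nullptr && count >= 0);
  const char **tokens_temp = new const char *[count];
  int32_t pos = 0;
  for (int32_t i = 0; i < count; ++i) {
    tokens_temp[i] = tokens + pos;
    pos += static_cast<int32_t>(std::strlen(tokens + pos)) + 1;
  }
  // `const char **` converts implicitly to `const char *const *`.
  return tokens_temp;
}
```

With this declaration, neither the element assignment nor the final conversion to `const char *const *` requires a cast.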
sherpa-onnx/c-api/c-api.cc
Outdated
      total_length);
  r->timestamps = new float[r->count];
  char **tokens_temp = new char*[r->count];
  int pos = 0;
- int pos = 0;
+ int32_t pos = 0;
sherpa-onnx/c-api/c-api.h
Outdated
const char *const *tokens_arr;

// Pointer to continuous memory which holds timestamps which
// are seperated by \0
Please update the comment. It is not separated by \0 for timestamps.
fixed
Thanks for your contribution!
Implemented tokens in Swift and Kotlin API. Tested on iOS and Android.