add pybind for fst::StdVectorFst #88
|
This is amazing! |
|
Before I merge: we need to figure out what we'll do to wrap lattices. Those will be needed at some point. Would it make sense to template-ize some of the wrapping code? LMK what you think. |
|
I agree. Templates would be helpful when we wrap |
@@ -0,0 +1,111 @@
+#!/usr/bin/env python
Just a small detail: wouldn't it be better to have this as
`#!/usr/bin/env python3`?
Yes, I agree with you, but I am not sure whether kaldi pybind supports only Python3.
Python 2.7 will not be maintained past 2020.
Is this a good time to switch to Python3 and to support only Python3 in the future for kaldi pybind?
|
I think so... the Makefile in pybind branch explicitly calls python3-config
and python3, so I think it's safe to assume the module will be python3 i.e.
you will need python3 to import it.
And in general, we ask that new scripts in Kaldi PRs be written for
python3.
y.
…On Thu, Dec 19, 2019 at 9:20 AM Fangjun Kuang ***@***.***> wrote:
In src/pybind/fst/symbol_table_pybind_test.py
<#88 (comment)>:
> @@ -0,0 +1,111 @@
+#!/usr/bin/env python
Yes, I agree with you, but I am not sure whether kaldi pybind supports
only Python3.
Python 2.7 will not be maintained past 2020.
Is this a good time to switch to Python3 and to support only Python3 in
the future for kaldi pybind?
|
|
thanks. I will change all python scripts to python3. |
|
an initial version for for usage. |
|
I'll fix the conflicts soon. |
Force-pushed from a8715e1 to e6610bb.
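We are using the `__iter__` method in Python, which calls Next() immediately after Value() is returned, so a reference returned by Value() no longer refers to the current element by the time Python uses it. This problem does not exist in C++, where the caller finishes using Value() before calling Next(). We currently use copy semantics to solve this problem.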
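The effect can be reproduced without any C++ at all. The following is a minimal pure-Python sketch, not the kaldi pybind code: `Entry`, `MockReader`, `iterate_by_reference` and `iterate_by_copy` are hypothetical names, and the single reused buffer is an assumption that stands in for a C++ reference that stays valid only until the next call to Next().

```python
import copy


class Entry:
    """Hypothetical stand-in for the object that Value() refers to."""

    def __init__(self, key=None):
        self.key = key


class MockReader:
    """Mock of a Done()/Value()/Next() iterator that reuses one buffer.

    Value() hands out the same Entry object every time, which mimics a C++
    reference that is only valid until the next call to Next().
    """

    def __init__(self, keys):
        self._keys = list(keys)
        self._pos = 0
        self._current = Entry()
        self._load()

    def _load(self):
        # Overwrite the shared buffer with the element at the current position.
        if not self.Done():
            self._current.key = self._keys[self._pos]

    def Done(self):
        return self._pos >= len(self._keys)

    def Value(self):
        return self._current

    def Next(self):
        self._pos += 1
        self._load()


def iterate_by_reference(reader):
    """Take Value(), call Next() at once, then hand the reference out."""
    while not reader.Done():
        value = reader.Value()
        reader.Next()  # the shared buffer now holds the following element
        yield value


def iterate_by_copy(reader):
    """Same order of calls, but the value is copied before Next() runs."""
    while not reader.Done():
        value = copy.copy(reader.Value())
        reader.Next()
        yield value


stale = [e.key for e in iterate_by_reference(MockReader(["a", "b", "c"]))]
fresh = [e.key for e in iterate_by_copy(MockReader(["a", "b", "c"]))]
print(stale)  # ['b', 'c', 'c']
print(fresh)  # ['a', 'b', 'c']
```

In this mock, returning by reference skips the first element and repeats the last one, while copying gives the expected sequence at the cost of one copy per element.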
|
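We can always refactor the current OpenFst Pybind11 code to use templates. I guess the current priority is to make LF-MMI work as soon as possible in kaldi pybind and potentially with PyTorch. As LF-MMI requires only fst::StdVectorFst, we wrapped it and its dependencies with non-template code, since this approach is much faster. When the training pipeline is finished, we can consider converting the OpenFst wrapper related code to templates to wrap Lattices so that we can also perform decoding in Python. |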
|
OK that sounds good.
…On Fri, Dec 20, 2019 at 9:54 AM Fangjun Kuang ***@***.***> wrote:
We can always refactor the current OpenFst Pybind11 code to use templates.
I guess the current *priority* is to make LF-MMI working as soon as
possible in kaldi pybind
and potentially with PyTorch. As LF-MMI requires only fst::StdVectorFst,
we wrapped it
and its dependencies with non-template code, since this approach is much
faster.
When the training pipeline is finished, we can consider to convert OpenFst
wrapper
related code to templates to wrap Lattices so that we can also perform
decoding in Python.
|
 "non-const to enable things like shallow swap on the held object in "
 "situations where this would avoid making a redundant copy.",
-py::return_value_policy::reference)
+py::return_value_policy::copy)
I am a little concerned about the waste here.
I think it would be more efficient to have the iterator call Next() before providing the value, but have a boolean member to see whether this is the first time next() was called, and if so, skip calling Next().
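A rough pure-Python sketch of that suggestion, under the same assumption as a mock reader that reuses a single buffer, might look as follows. `Entry`, `MockReader` and `DeferredNextIterator` are hypothetical names used only for illustration; the real change would live in the Pybind11 wrapper.

```python
class Entry:
    """Hypothetical stand-in for the object that Value() refers to."""

    def __init__(self, key=None):
        self.key = key


class MockReader:
    """Mock Done()/Value()/Next() iterator; Value() reuses one Entry buffer."""

    def __init__(self, keys):
        self._keys = list(keys)
        self._pos = 0
        self._current = Entry()
        self._load()

    def _load(self):
        if not self.Done():
            self._current.key = self._keys[self._pos]

    def Done(self):
        return self._pos >= len(self._keys)

    def Value(self):
        return self._current

    def Next(self):
        self._pos += 1
        self._load()


class DeferredNextIterator:
    """Calls Next() before providing a value, except on the first call."""

    def __init__(self, reader):
        self._reader = reader
        self._first_call = True  # the boolean member from the suggestion

    def __iter__(self):
        return self

    def __next__(self):
        if self._first_call:
            self._first_call = False  # first element is already loaded
        else:
            self._reader.Next()  # advance only when the caller asks again
        if self._reader.Done():
            raise StopIteration
        return self._reader.Value()  # no copy; valid until the next step


keys = [e.key for e in DeferredNextIterator(MockReader(["a", "b", "c"]))]
print(keys)  # ['a', 'b', 'c']

# Caveat: references kept beyond one loop step all alias the same buffer.
kept = list(DeferredNextIterator(MockReader(["a", "b", "c"])))
aliased = [e.key for e in kept]
print(aliased)  # ['c', 'c', 'c']
```

The trade-off in this sketch is that no copy is made, but a value is only safe to use inside the loop body that received it, which matches how the reference behaves in C++.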
|
Merging this to keep things moving; you can fix the iterator thing later. |
At Mobvoi AI Lab in Beijing, we are trying to wrap Kaldi with Pybind11
so that we can reuse the components from Kaldi instead of reinventing the wheel.
We have a pipeline for LF-MMI AM training,
where the neural network part is from PyTorch and the loss function
is from Kaldi wrapped by Pybind11. The Fsts required by the loss function
are currently read in C++ from an `rxfilename` passed from the Python side.
The Fsts can be passed from Python to C++ directly once we have wrapped
FST for Python, and we will open-source the updated loss function in the future.
There are existing Python bindings for OpenFst, such as PyKaldi (using CLIF) and
the official Python binding (using Cython). So why do we need another one?
Since we use Pybind11 for Kaldi, it is very difficult, if not impossible, to pass a Python
object not wrapped by Pybind11 to the C++ part. Therefore, we need another python
binding for OpenFst.
As Kaldi is also going to be wrapped by Pybind11, we, at Mobvoi, would like to
open source what we have done and hope it would be beneficial to the community.
The Python binding for OpenFst is still in progress.
We hope this pull request will be a good starting point.