Ruby wrapper #27

Open
v0dro opened this issue Jun 14, 2016 · 5 comments

Comments

@v0dro

v0dro commented Jun 14, 2016

It would be great to have a Ruby wrapper over paratext to bring fast CSV reading to Ruby.

We can use it in daru, for example.

@deads
Contributor

deads commented Jun 15, 2016

I am not very familiar with the internals of Ruby, but I am happy to provide guidance for the effort.

@v0dro
Author

v0dro commented Jun 15, 2016

Works :)

Should we use SWIG, or is it OK to use the Ruby C API directly? Were there specific reasons you used SWIG for the Python bindings?

@wesm
Collaborator

wesm commented Jun 15, 2016

SWIG will likely make your life easier if you have straightforward type mappings. I may write a Cython binding for paratext (mainly to simplify talking to other Python modules that use Cython; you can define an internal C API via pxd files), but either way it is less work than a hand-coded C extension, at least for Python.

@v0dro
Author

v0dro commented Jun 16, 2016

Alright. I'll explore all the options and post them here soon. Thanks :)

@deads
Contributor

deads commented Jun 17, 2016

Hi.

I went with SWIG because it does a lot of the hard work of parsing and validating input for you. If you write your own wrapper, you have to write a lot of boilerplate code yourself. Given a large enough codebase (100K+ lines), custom wrappers become very hard to maintain.

The Python typemap lives here: https://github.com/wiseio/paratext/blob/master/src/python. The interface file (python.i) is pretty rudimentary. It converts std::vector<int/double/string> to NumPy vectors.

%typemap(out) std::vector<int> {
  $result = (PyObject*)::build_array<std::vector<int>>($1);
}

%typemap(out) std::vector<double> {
  $result = (PyObject*)::build_array<std::vector<double>>($1);
}

%typemap(out) std::vector<size_t> {
  $result = (PyObject*)::build_array<std::vector<size_t>>($1);
}

%typemap(out) const std::vector<std::string> & {
  { auto range = std::make_pair($1->begin(), $1->end());
   $result = (PyObject*)::build_array_from_range(range);
  }
}

%typemap(out) std::vector<std::string> {
  $result = (PyObject*)::build_array<std::vector<std::string>>($1);
}

%typemap(out) ParaText::CSV::ColBasedPopulator {
  $result = (PyObject*)::build_populator<ParaText::CSV::ColBasedPopulator>($1);
}

You need to write template functions to convert from C++ containers to Ruby arrays.

Ruby::Value build_array<ContainerType>(const ContainerType &container)

You also need to write a function to convert CSV columns to Ruby arrays:

Ruby::Value build_populator<ParaText::CSV::ColBasedPopulator>(ParaText::CSV::ColBasedPopulator &populator)

This is needed to call the loader.get_column(column_index) function.
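
As a rough illustration of the build_array template described above, here is a minimal sketch. It is not paratext code. To keep it compilable without ruby.h, RValue, RArray, and to_rvalue are hypothetical stand-ins: in a real extension the return type would be Ruby's VALUE, the array would be built with rb_ary_new_capa and rb_ary_push, and the element conversions would use INT2NUM, SIZET2NUM, DBL2NUM, and rb_str_new.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for Ruby's VALUE (assumption, not part of paratext or Ruby).
struct RValue {
  enum Kind { Integer, Float, String };
  Kind kind;
  long long integer;
  double real;
  std::string text;
};

// Hypothetical stand-in for a Ruby Array.
using RArray = std::vector<RValue>;

// One overload per element type; a real wrapper would call INT2NUM, SIZET2NUM,
// DBL2NUM, and rb_str_new here instead of filling a struct.
inline RValue to_rvalue(int x) {
  return RValue{RValue::Integer, static_cast<long long>(x), 0.0, std::string()};
}
inline RValue to_rvalue(std::size_t x) {
  return RValue{RValue::Integer, static_cast<long long>(x), 0.0, std::string()};
}
inline RValue to_rvalue(double x) {
  return RValue{RValue::Float, 0, x, std::string()};
}
inline RValue to_rvalue(const std::string &x) {
  return RValue{RValue::String, 0, 0.0, x};
}

// Generic container-to-array conversion: the element type selects the
// to_rvalue overload, so one template covers every std::vector<T> typemap.
template <class ContainerType>
RArray build_array(const ContainerType &container) {
  RArray result;
  result.reserve(container.size());
  for (const auto &element : container) {
    result.push_back(to_rvalue(element));
  }
  return result;
}
```

With something like this in place, each Ruby typemap could stay a one-liner in the same style as the Python ones, and build_populator could presumably reuse the same element conversions once the column type is known.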

Best,
Damian
