Skip to content

Commit

Permalink
Initial commit
Browse files Browse the repository at this point in the history
  • Loading branch information
mityax committed Mar 25, 2022
0 parents commit 0fbe173
Show file tree
Hide file tree
Showing 31 changed files with 1,527 additions and 0 deletions.
2 changes: 2 additions & 0 deletions .coveragerc
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
[run]
source=cppimport
14 changes: 14 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
*.swp
*.pyc
*.so
*.cppimporthash
.rendered.*
__pycache__
build
cppimport.egg-info
dist
.cache
.tox
.mypy_cache/
.coverage
htmlcov
15 changes: 15 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
repos:
- repo: https://github.com/psf/black
rev: 20.8b1
hooks:
- id: black
language_version: python3
- repo: https://gitlab.com/pycqa/flake8
rev: 3.8.4
hooks:
- id: flake8
- repo: https://github.com/pycqa/isort
rev: 5.7.0
hooks:
- id: isort
name: isort (python)
41 changes: 41 additions & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# Contributing

When contributing to this repository, feel free to add an issue or pull request! There really aren't any rules, but if you're mean, I'll be sad. I'm happy to collaborate on pull requests if you would like. There's no need to submit a perfect, finished product.

To install in development mode and run the tests:
```
git clone [email protected]:tbenthompson/cppimport.git
cd cppimport
conda env create
conda activate cppimport
pre-commit install
pip install --no-use-pep517 --disable-pip-version-check -e .
pytests
```

# Architecture

## Entrypoints:

The main entrypoint for cppimport is the `cppimport.import_hook` module, which interfaces with the Python importing system to allow things like `import mycppfilename`. For a C++ file to be a valid import target, it needs to have the word `cppimport` in its first line. Without this first line constraint, it is possible for the importing system to cause imports in other Python packages to fail. Before adding the first-line constraint, the cppimport import_hook had the unfortunate consequence of breaking some scipy modules that had adjacent C and C++ files in the directory tree.

There is an alternative, and more explicit interface provided by the `imp`, `imp_from_filepath` and `build` functions here.
* `imp` does exactly what the import hook does except via a function so that instead of `import foomodule` we would do `foomodule = imp('foomodule')`.
* `imp_from_filepath` is even more explicit, allowing the user to pass a C++ filepath rather than a modulename. For example, `foomodule = imp('../cppcodedir/foodmodule.cpp')`. This is rarely necessary but can be handy for debugging.
* `build` is similar to `imp` except that the library is only built and not actually loaded as a Python module.

`imp`, `imp_from_filepath` and `build` are in the `__init__.py` to separate external facing API from the guts of the package that live in internal submodules.

## What happens when we import a C++ module.

1. First the `cppimport.find.find_module_cpppath` function is used to find a C++ file that matches the desired module name.
2. Next, we determine if there's already an existing compiled extension that we can use. If there is, the `cppimport.importer.is_build_needed` function is used to determine if the extension is up to date with the current code. If the extension is up to date, we attempt to load it. If the extension is loaded successfully, we return the module and we're done! However, if for whichever reason, we can't load an existing extension, we need to build the extension, a process directed by `cppimport.importer.template_and_build`.
3. The first step of building is to run the C++ file through the Mako templating system with the `cppimport.templating.run_templating` function. The main purpose of this is to allow users to embed configuration information into their C++ file. Without some sort of similar mechanism, there would be no way of passing information to build system because the `import modulename` statement can't carry information. The templating serves a secondary benefit in that simple code generation can be performed if needed. However, most users probably stick to a simple header or footer similar to the one demonstrated in the README.
4. Next, we use setuptools to build the C++ extension using `cppimport.build_module.build_module`. This function calls setuptools with the appropriate arguments to build the extension in place next to the C++ file in the directory tree.
5. Next, we call `cppimport.checksum.checksum_save` to add a hash of the appended contents of all relevant source and header files. This checksum is appended to the end of the `.so` or `.dylib` file. This seems legal according to specifications and, in practice, causes no problems.
6. Finally, the compiled and loaded extension module is returned to the user.

## Useful links

* PEP 302 that made this possible: https://www.python.org/dev/peps/pep-0302/
* The gory details of Python importing: https://docs.python.org/3/reference/import.html
21 changes: 21 additions & 0 deletions LICENSE
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2021 T. Ben Thompson

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
3 changes: 3 additions & 0 deletions MANIFEST.in
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
include README.md
include VERSION
recursive-include tests *
196 changes: 196 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,196 @@
# cppimport - Import C++ directly from Python!

<p align=center>
<a target="_blank" href="https://www.python.org/downloads/" title="Python version"><img src="https://img.shields.io/badge/python-%3E=_3.6-green.svg"></a>
<a target="_blank" href="https://pypi.org/project/cppimport/" title="PyPI version"><img src="https://img.shields.io/pypi/v/cppimport?logo=pypi"></a>
<a target="_blank" href="https://pypi.org/project/cppimport/" title="PyPI"><img src="https://img.shields.io/pypi/dm/cppimport"></a>
<a target="_blank" href="LICENSE" title="License: MIT"><img src="https://img.shields.io/badge/License-MIT-blue.svg"></a>
<a target="_blank" href="https://github.com/tbenthompson/cppimport/actions" title="Test Status"><img src="https://github.com/tbenthompson/cppimport/actions/workflows/test.yml/badge.svg"></a>
<a target="_blank" href="https://codecov.io/gh/tbenthompson/cppimport" title="Code coverage"><img src="https://codecov.io/gh/tbenthompson/cppimport/branch/stable/graph/badge.svg?token=GWpX62xMt5"/></a>
</a>

</p>

## Contributing and architecture

See [CONTRIBUTING.md](CONTRIBUTING.md) for details on the internals of `cppimport` and how to get involved in development.

## Installation

Install with `pip install cppimport`.

## A quick example

Save the C++ code below as `somecode.cpp`.
```c++
// cppimport
#include <pybind11/pybind11.h>

namespace py = pybind11;

int square(int x) {
return x * x;
}

PYBIND11_MODULE(somecode, m) {
m.def("square", &square);
}
/*
<%
setup_pybind11(cfg)
%>
*/
```
Then open a Python interpreter and import the C++ extension:
```python
>> > import rustimport.import_hook
>> > import somecode # This will pause for a moment to compile the module
>> > somecode.square(9)
81
```

Hurray, you've called some C++ code from Python using a combination of `cppimport` and [`pybind11`](https://github.com/pybind/pybind11).

I'm a big fan of the workflow that this enables, where you can edit both C++ files and Python and recompilation happens transparently! It's also handy for quickly whipping together an optimized version of a slow Python function.

## An explanation

Okay, now that I've hopefully convinced you on how exciting this is, let's get into the details of how to do this yourself. First, the comment at top is essential to opt in to cppimport. Don't forget this! (See below for an explanation of why this is necessary.)
```c++
// cppimport
```

The bulk of the file is a generic, simple [pybind11](https://github.com/pybind/pybind11) extension. We include the `pybind11` headers, then define a simple function that squares `x`, then export that function as part of a Python extension called `somecode`.

Finally at the end of the file, there's a section I'll call the "configuration block":
```
<%
setup_pybind11(cfg)
%>
```
This region surrounded by `<%` and `%>` is a [Mako](https://www.makotemplates.org/) code block. The region is evaluated as Python code during the build process and provides configuration info like compiler and linker flags to the cppimport build system.

Note that because of the Mako pre-processing, the comments around the configuration block may be omitted. Putting the configuration block at the end of the file, while optional, ensures that line numbers remain correct in compilation error messages.

## Building for production
In production deployments you usually don't want to include a c/c++ compiler, all the sources and compile at runtime. Therefore, a simple cli utility for pre-compiling all source files is provided. This utility may, for example, be used in CI/CD pipelines.

Usage is as simple as

```commandline
python -m cppimport build
```

This will build all `*.c` and `*.cpp` files in the current directory (and it's subdirectories) if they are eligible to be imported (i.e. contain the `// cppimport` comment in the first line).

Alternatively, you may specifiy one or more root directories or source files to be built:

```commandline
python -m cppimport build ./my/directory/ ./my/single/file.cpp
```
_Note: When specifying a path to a file, the header check (`// cppimport`) is skipped for that file._

### Fine-tuning for production
To further improve startup performance for production builds, you can opt-in to skip the checksum and compiled binary existence checks during importing by either setting the environment variable `CPPIMPORT_RELEASE_MODE` to `true` or setting the configuration from within Python:
```python
cppimport.settings['release_mode'] = True
```
**Warning:** Make sure to have all binaries pre-compiled when in release mode, as importing any missing ones will cause exceptions.

## Frequently asked questions

### What's actually going on?

Sometimes Python just isn't fast enough. Or you have existing code in a C or C++ library. So, you write a Python *extension module*, a library of compiled code. I recommend [pybind11](https://github.com/pybind/pybind11) for C++ to Python bindings or [cffi](https://cffi.readthedocs.io/en/latest/) for C to Python bindings. I've done this a lot over the years. But, I discovered that my productivity is slower when my development process goes from *Edit -> Test* in just Python to *Edit -> Compile -> Test* in Python plus C++. So, `cppimport` combines the process of compiling and importing an extension in Python so that you can just run `import foobar` and not have to worry about multiple steps. Internally, `cppimport` looks for a file `foobar.cpp`. Assuming one is found, it's run through the Mako templating system to gather compiler options, then it's compiled and loaded as an extension module.

### Does cppimport recompile every time a module is imported?
No! Compilation should only happen the first time the module is imported. The C++ source is compared with a checksum on each import to determine if any relevant file has changed. Additional dependencies (e.g. header files!) can be tracked by adding to the Mako header:
```python
cfg['dependencies'] = ['file1.h', 'file2.h']
```
The checksum is computed by simply appending the contents of the extension C++ file together with the files in `cfg['sources']` and `cfg['dependencies']`.

### How can I set compiler or linker args?

Standard distutils configuration options are valid:

```python
cfg['extra_link_args'] = ['...']
cfg['extra_compile_args'] = ['...']
cfg['libraries'] = ['...']
cfg['include_dirs'] = ['...']
```

For example, to use C++11, add:
```python
cfg['extra_compile_args'] = ['-std=c++11']
```

### How can I split my extension across multiple source files?

In the configuration block:
```python
cfg['sources'] = ['extra_source1.cpp', 'extra_source2.cpp']
```

### cppimport isn't doing what I want, can I get more verbose output?
`cppimport` uses the standard Python logging tools. Please add logging handlers to either the root logger or the `"cppimport"` logger. For example, to output all debug level log messages:

```python
root_logger = logging.getLogger()
root_logger.setLevel(logging.DEBUG)

handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.DEBUG)
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
handler.setFormatter(formatter)
root_logger.addHandler(handler)
```

### How can I force a rebuild even when the checksum matches?

Set:
```python
cppimport.settings['force_rebuild'] = True
```

And if this is a common occurence, I would love to hear your use case and why the combination of the checksum, `cfg['dependencies']` and `cfg['sources']` is insufficient!

### How can I get information about filepaths in the configuration block?
The module name is available as the `fullname` variable and the C++ module file is available as `filepath`.
For example,
```
<%
module_dir = os.path.dirname(filepath)
%>
```

### How can I make compilation faster?

In single file extensions, this is a fundamental issue with C++. Heavily templated code is often quite slow to compile.

If your extension has multiple source files using the `cfg['sources']` capability, then you might be hoping for some kind of incremental compilation. For the uninitiated, incremental compilation involves only recompiling those source files that have changed. Unfortunately this isn't possible because cppimport is built on top of the setuptools and distutils and these standard library components do not support incremental compilation.

I recommend following the suggestions on [this SO answer](http://stackoverflow.com/questions/11013851/speeding-up-build-process-with-distutils). That is:

1. Use `ccache` to reduce the cost of rebuilds
2. Enable parallel compilation. This can be done with `cfg['parallel'] = True` in the C++ file's configuration header.

As a further thought, if your extension has many source files and you're hoping to do incremental compiles, that probably indicates that you've outgrown `cppimport` and should consider using a more complete build system like CMake.

### Why does the import hook need "cppimport" on the first line of the .cpp file?
Modifying the Python import system is a global modification and thus affects all imports from any other package. As a result, when I first implemented `cppimport`, other packages (e.g. `scipy`) suddenly started breaking because import statements internal to those packages were importing C or C++ files instead of the modules they were intended to import. To avoid this failure mode, the import hook uses an "opt in" system where C and C++ files can specify they are meant to be used with cppimport by having a comment on the first line that includes the text "cppimport".

As an alternative to the import hook, you can use `imp` or `imp_from_filepath`. The `cppimport.imp` and `cppimport.imp_from_filepath` performs exactly the same operation as the import hook but in a slightly more explicit way:
```
foobar = cppimport.imp("foobar")
foobar = cppimport.imp_from_filepath("src/foobar.cpp")
```
By default, these explicit function do not require the "cppimport" keyword on the first line of the C++ source file.

### Windows?
The CI system does not run on Windows. A PR would be welcome adding further Windows support. I've used `cppimport` with MinGW-w64 and Python 3.6 and had good success. I've also had reports that `cppimport` works on Windows with Python 3.6 and Visual C++ 2015 Build Tools. The main challenge is making sure that distutils is aware of your available compilers. Try out the suggestion [here](https://stackoverflow.com/questions/3297254/how-to-use-mingws-gcc-compiler-when-installing-python-package-using-pip).

## cppimport uses the MIT License
1 change: 1 addition & 0 deletions VERSION
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
21.03.07
5 changes: 5 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
[tool.isort]
profile = "black"

[tool.pytest.ini_options]
addopts = "-s --tb=short"
21 changes: 21 additions & 0 deletions release
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
PUBLISH TO PYPI:
update the version number in VERSION with calver: yy.mm.dd
python setup.py sdist upload -r pypi

SANITY TEST:
open new terminal
mamba create -n testenv python=3 pip
conda activate testenv
pip install cppimport
cd tests
python -c 'import cppimport; assert(cppimport.imp("mymodule").add(1,2) == 3);'

GIT:
git commit -m "yy.mm.dd"
git push
git tag yy.mm.dd
git push origin yy.mm.dd
create release on github

TODO:
automate most of this via github actions
6 changes: 6 additions & 0 deletions requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
black
flake8
isort
pytest
pytest-cov
pre-commit
Loading

0 comments on commit 0fbe173

Please sign in to comment.