Embedding Cython modules in C/C++ applications¶

This is a stub documentation page. PRs very welcome.

Initialising your main module¶

Most importantly, DO NOT call the module init function instead of importing the module. This is not the right way to initialise an extension module. (It was always wrong, although it happened to work in older Python versions. Since Python 3.5 it no longer works at all.)

For details, see the documentation of the module init function in CPython and PEP 489 regarding the module initialisation mechanism in CPython 3.5 and later.

The PyImport_AppendInittab() function in CPython allows registering statically (or dynamically) linked extension modules for later imports. An example is given in the documentation of the module init function that is linked above.

Embedding example code¶

The following is a simple example that shows the main steps for embedding a Cython module (embedded.pyx) in Python 3.x.

First, here is a Cython module that exports a C function to be called by external code. Note that the say_hello_from_python() function is declared as public to export it as a linker symbol that can be used by other C files, which in this case is embedded_main.c.

# embedded.pyx

# The following two lines are for test purposes only, please ignore them.
# distutils: sources = embedded_main.c
# tag: py3only
# tag: no-cpp

TEXT_TO_SAY = 'Hello from Python!'

cdef public int say_hello_from_python() except -1:
    print(TEXT_TO_SAY)
    return 0

The C main() function of your program could look like this:

/* embedded_main.c */

/* This include file is automatically generated by Cython for 'public' functions. */
#include "embedded.h"

#ifdef __cplusplus
extern "C" {
#endif

int
main(int argc, char *argv[])
{
    PyObject *pmodule;
    wchar_t *program;

    program = Py_DecodeLocale(argv[0], NULL);
    if (program == NULL) {
        fprintf(stderr, "Fatal error: cannot decode argv[0], got %d arguments\n", argc);
        exit(1);
    }

    /* Add a built-in module, before Py_Initialize */
    if (PyImport_AppendInittab("embedded", PyInit_embedded) == -1) {
        fprintf(stderr, "Error: could not extend in-built modules table\n");
        exit(1);
    }

    /* Pass argv[0] to the Python interpreter */
    Py_SetProgramName(program);

    /* Initialize the Python interpreter.  Required.
       If this step fails, it will be a fatal error. */
    Py_Initialize();

    /* Optionally import the module; alternatively,
       import can be deferred until the embedded script
       imports it. */
    pmodule = PyImport_ImportModule("embedded");
    if (!pmodule) {
        PyErr_Print();
        fprintf(stderr, "Error: could not import module 'embedded'\n");
        goto exit_with_error;
    }

    /* Now call into your module code. */
    if (say_hello_from_python() < 0) {
        PyErr_Print();
        fprintf(stderr, "Error in Python code, exception was printed.\n");
        goto exit_with_error;
    }

    /* ... */

    /* Clean up after using CPython. */
    PyMem_RawFree(program);
    Py_Finalize();

    return 0;

    /* Clean up in the error cases above. */
exit_with_error:
    PyMem_RawFree(program);
    Py_Finalize();
    return 1;
}

#ifdef __cplusplus
}
#endif

(Adapted from the CPython documentation.)

Instead of writing such a main() function yourself, you can also let Cython generate one into your module’s C file with the cython --embed option. Or use the cython_freeze script to embed multiple modules. See the embedding demo program for a complete example setup.

Be aware that your application will not contain any external dependencies that you use (including Python standard library modules) and so may not be truly portable. If you want to generate a portable application we recommend using a specialized tool (e.g. PyInstaller or cx_freeze) to find and bundle these dependencies.

Troubleshooting¶

Here are some of the things that can go wrong when embedding Cython code.

Not initializing the Python interpreter¶

Cython doesn’t compile to “pure stand-alone C code”. Instead Cython compiles to a bunch of Python C API calls that depend on the Python interpreter. Therefore, in your main function you must initialize the Python interpreter with Py_Initialize(). You should do this as early as possible in your main() function.

Very occasionally you may get away without it, for exceptionally simple programs. This is pure luck, and you should not rely on it. There is no “safe subset” of Cython that’s designed to run without the interpreter.

The consequence of not initializing the Python interpreter is likely to be crashes.

You should only initialize the interpreter once. Many modules, including most Cython modules and NumPy, currently do not cope with being imported multiple times. Therefore, if you’re doing occasional Python/Cython calculations in a larger program, what you should not do is:

void run_calculation() {
     Py_Initialize();
     // Use Python/Cython code
     Py_Finalize();
}

If you do this, chances are you will get mysterious, unexplained crashes.

Not setting the Python path¶

If your module imports anything (and possibly even if it doesn’t) then it’ll need the Python path set so it knows where to look for modules. Unlike the standalone interpreter, embedded Python doesn’t set this up automatically.

PySys_SetPath(...) is the easiest way of doing this (just after Py_Initialize() ideally). You could also use PySys_GetObject("path") and then append to the list that it returns.

If you forget to do this you will likely see import errors.

Not importing the Cython module¶

Cython doesn’t create standalone C code - it creates C code that’s designed to be imported as a Cython module. The “import” function sets up a lot of the basic infrastructure necessary for your code to run. For example, strings are initialized at import time, and built-ins like print are found and stashed within your Cython module.

Therefore, if you decide to skip the initialization and just go straight to running your public functions you will likely experience crashes (even for something as simple as using a string).

InitTab¶

The preferred way to set up an extension module so that it’s available for import in modern Python (>=3.5) is to use the inittab mechanism, which is detailed elsewhere in the documentation. This should be done before Py_Initialize().

Forcing single-phase¶

If for some reason you aren’t able to add your module to the inittab before Python is initialized (a common reason is trying to import another Cython module built into a single shared library) then you can disable multi-phase initialization by defining CYTHON_PEP489_MULTI_PHASE_INIT=0 for your C compiler (for gcc this would be -DCYTHON_PEP489_MULTI_PHASE_INIT=0 at the command line). If you do this then you can run the module init function directly (PyInit_<module_name> on Python 3). This really isn’t the preferred option.

Working with multi-phase¶

It is possible to run the multi-phase initialization yourself. One of the Cython developers has written a guide showing how to do this. However, he considers it sufficiently hacky that it is only linked here, and not reproduced directly. It is an option though, if you’re unable to use the inittab mechanism before initializing the interpreter.

Problems with multiprocessing and pickle¶

If you try to use multiprocessing while using a Cython module embedded into an executable it will likely fail with errors related to the pickle module. multiprocessing often uses pickle to serialize and deserialize data to be run in another interpreter. What happens depends on the multiprocessing “start method”. With the “spawn” start method used on Windows, it starts a fresh copy of the Python interpreter (rather than a fresh copy of your embedded program) and then tries to import your Cython module. Since your Cython module is only available through the inittab mechanism and not through a regular import, that import fails.

The solution likely involves calling multiprocessing.set_executable() to point to your embedded program, then modifying that program to handle the --multiprocessing-fork command-line argument that multiprocessing passes to the Python interpreter. You may also need to call multiprocessing.freeze_support().

At the moment that solution is untested, so you should treat multiprocessing from an embedded Cython executable as unsupported.