Sending global values to native extensions from Numba intrinsics

It can be tricky to pass values defined globally in Python to natively built extensions when using Numba’s intrinsics. This post shows two possible solutions.

Background/Motivation

Numba is an LLVM-based compiler for Python that aims to bring Python performance closer to that of languages like C or C++. It is often complemented by Python’s extension mechanism, which allows one to write extensions in C/C++ (or, more generally, any compiled language) whose functions execute without the interpreter’s overhead.

Numba provides a mechanism for defining intrinsics, which let you control the LLVM IR that is emitted when the intrinsic is called from a function being compiled. Conceptually it works like this:

import numba
from numba.core import types
from numba.extending import intrinsic

@intrinsic
def myIntrinsic(typingctx):
    def codegen(context, builder, sig, args):
        # emit some custom LLVM IR
        ...
    # Define the types for the inputs/outputs for this function - in this
    # example, we take no args, and return none
    sig = types.none()
    return sig, codegen

@numba.jit
def myFn():
    # some code
    ...
    myIntrinsic()
    # more code
    ...

# numba.jit will produce
#   <instructions for "some code" - generated by numba>
#   <instructions hand-written in myIntrinsic>
#   <instructions for "more code" - generated by numba>

This can be used to call C/C++ functions directly by writing an intrinsic that inserts a call to the desired function. This is useful because it avoids the overhead of going through Python’s function-call machinery. Let’s look at a hello world example:

// In extension.cpp
#define PY_SSIZE_T_CLEAN
#include <Python.h>

#include <iostream>

// Takes no arguments, matching the void(*)() type declared in the intrinsic
static void hello() {
    std::cout << "Hello World" << std::endl;
}

static PyMethodDef methods[] = {
    // No methods
    {nullptr, nullptr, 0, nullptr} /* Sentinel */
};

static struct PyModuleDef module = {
    PyModuleDef_HEAD_INIT,
    "foo",  /* name of module, matching PyInit_foo and `import foo` */
    nullptr,/* module documentation, may be NULL */
    -1,     /* size of per-interpreter state of the module,
               or -1 if the module keeps state in global variables. */
    methods};

PyMODINIT_FUNC PyInit_foo(void) {
  auto m = PyModule_Create(&module);
  if (m == nullptr) {
    return nullptr;  // module creation failed; let the import error propagate
  }

  {
    // Register the function pointer as an attribute of the extension
    PyObject *tmp = PyLong_FromVoidPtr((void *)&hello);
    PyObject_SetAttrString(m, "hello", tmp);
    Py_DECREF(tmp);
  }

  return m;
}

Now we can directly call extension.cpp::hello by doing:

import numba
from numba.core import cgutils, types
from numba.extending import intrinsic

# These libraries are used for writing the LLVM IR
import llvmlite.binding as ll
from llvmlite import ir as lir

# Import the C++ extension
import foo

ll.add_symbol("hello", foo.hello)

@intrinsic
def hello(typingctx):
    def codegen(context, builder, sig, args):
        # type of the function is void(*)()
        fnty = lir.FunctionType(lir.VoidType(), [])
        # ensure that this function is included in the LLVM module being built
        fn_typ = cgutils.get_or_insert_function(
            builder.module, fnty, name="hello"
        )
        # insert a call to `hello` in the output IR
        builder.call(fn_typ, ())
        # return None
        return context.get_dummy_value()

    sig = types.none()
    return sig, codegen

# We can only call intrinsics from compiled contexts, so wrap the function in a
# numba.jit decorator
@numba.jit
def sayhello():
    hello()

sayhello()
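
To see why exposing the function pointer as a plain integer is enough, here is a small pure-Python sketch of the same idea that uses ctypes in place of the C++ extension. The names HELLO_T, hello_cb, hello_addr and calls are made up for this illustration and are not part of the extension above; the callback simply stands in for the native hello.

```python
import ctypes

calls = []

# C signature void(*)(void), the same shape the intrinsic declares for `hello`
HELLO_T = ctypes.CFUNCTYPE(None)


@HELLO_T
def hello_cb():
    # Stand-in for the native function; records the call instead of printing
    calls.append("Hello World")


# Rough equivalent of PyLong_FromVoidPtr((void *)&hello): the address as an int
hello_addr = ctypes.cast(hello_cb, ctypes.c_void_p).value

# Anything that knows the signature can rebuild a callable from that integer
hello_again = HELLO_T(hello_addr)
hello_again()

print(calls)
```

In the Numba version, ll.add_symbol plays the role of the last step: it tells LLVM which address the name "hello" refers to, so the call emitted by the intrinsic can be resolved.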

Using globals

There are times when an object defined globally in Python needs to be accessed by your native extension. For this example, this is the global object we’ll be referring to:

class Data:
    x = 100

global_data = Data()

There are two ways we could go about exposing this to C++. Note that for this discussion, we will not require the code to work with Numba’s caching; if caching is required, Method 1 is the only working answer presented here. This is also a lot easier to do if you relax the constraint of doing it entirely from an intrinsic.

Method 1 - using PyImport

C++ extensions can import Python modules by using PyImport_ImportModule. The code will look something like this:

static void hello_global() {
  // "foo" must name the Python module in which global_data is defined
  PyObject *foo_mod = PyImport_ImportModule("foo");
  PyObject *glbl_data = PyObject_GetAttrString(foo_mod, "global_data");
  PyObject *x = PyObject_GetAttrString(glbl_data, "x");

  long val = PyLong_AsLong(x);
  std::cout << "global x: " << val << std::endl;

  Py_DECREF(x);
  Py_DECREF(glbl_data);
  Py_DECREF(foo_mod);
}

This has its drawbacks though, and it’s not always feasible to re-import a module from the extension code.
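
The same lookup chain can be tried out from Python without compiling anything, since ctypes.pythonapi exposes the C API functions used above. This is only a sketch: the module name globals_demo is hypothetical and is registered by hand in sys.modules so that the import can succeed.

```python
import ctypes
import sys
import types


class Data:
    x = 100


# Hypothetical module standing in for wherever global_data really lives
demo_mod = types.ModuleType("globals_demo")
demo_mod.global_data = Data()
sys.modules["globals_demo"] = demo_mod

api = ctypes.pythonapi

# Declare the C signatures; py_object makes ctypes manage the returned references
api.PyImport_ImportModule.argtypes = [ctypes.c_char_p]
api.PyImport_ImportModule.restype = ctypes.py_object
api.PyObject_GetAttrString.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.PyObject_GetAttrString.restype = ctypes.py_object

# Same three steps as the C++ hello_global: module -> global_data -> x
mod = api.PyImport_ImportModule(b"globals_demo")
data = api.PyObject_GetAttrString(mod, b"global_data")
x = api.PyObject_GetAttrString(data, b"x")

print("global x:", x)
```

If any of these calls fails, ctypes raises the pending Python exception here. The C++ version gets a null pointer back instead and would need to check for it explicitly.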

Method 2 - pass in the global from the intrinsic

While this method does not require any imports from the extension, it is a bit more involved on the intrinsic side. On the Python side, we need to get the address of the global we want to expose (in CPython, the built-in function id returns it) and then send it over to the extension code. Because the address is baked into the generated IR as a constant, the object must stay alive for as long as the compiled function can run. This may look something like:

@intrinsic
def hello_global(typingctx):
    def codegen(context, builder, sig, args):
        # note that we are taking in a parameter for hello_global now
        fnty = lir.FunctionType(lir.VoidType(), [lir.IntType(8).as_pointer()])
        fn_typ = cgutils.get_or_insert_function(
            builder.module, fnty, name="hello_global"
        )

        # Get the address of global_data and emit it as a constant in the IR
        addr = lir.Constant(lir.IntType(64), id(global_data))
        # Convert the integer constant to a pointer
        ptr = addr.inttoptr(lir.IntType(8).as_pointer())
        # Pass the pointer to the global to the extension
        builder.call(fn_typ, (ptr,))
        return context.get_dummy_value()

    # This type signature remains unchanged
    sig = types.none()
    return sig, codegen

On the C++ side we no longer need any importing:

static void hello_global(void* global_data) {
  PyObject* py_global_data = (PyObject*)global_data;
  PyObject *x = PyObject_GetAttrString(py_global_data, "x");

  long val = PyLong_AsLong(x);
  std::cout << "global x: " << val << std::endl;

  Py_DECREF(x);
}
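
As a quick sanity check of the idea behind Method 2, the address round trip can be reproduced in pure Python with ctypes. This sketch assumes CPython, where id returns the object’s memory address; the names addr and recovered are only for illustration.

```python
import ctypes


class Data:
    x = 100


# Module-level reference: keeps the object alive while its address is in use
global_data = Data()

# What the intrinsic embeds in the IR as a 64-bit integer constant
# (assumes CPython, where id() is the object's address)
addr = id(global_data)

# Python-side analogue of the C++ cast (PyObject*)global_data
recovered = ctypes.cast(addr, ctypes.py_object).value

print("same object:", recovered is global_data)
print("global x:", recovered.x)
```

The cast is only valid while global_data is still referenced somewhere. An address captured at compile time means nothing in a later process, which is consistent with this method not being the one to use when caching is required.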

Conclusion

Using global values within Numba intrinsics can seem daunting, and in my experience there aren’t many examples of this on the internet to go off of. Overall, the mechanisms that enable this are relatively straightforward, and the only part that really tripped me up at first was not realizing that the inttoptr call was necessary at all. If I hadn’t had prior LLVM knowledge before working with Numba, I would have been stuck for a lot longer. Hopefully this post will help some wayward compiler devs!

Written on April 2, 2024