nim-pymod

by jboy

Auto-generate a Python module that wraps a Nim module.

Pymod

Auto-generate a Python module that wraps a Nim programming language module.

The Pymod software consists of Nim bindings & Python scripts to automate the generation of Python C-API extension module boilerplate for Nim "procs" (procedures). After the Pymod script has been run, there will be an auto-generated, auto-compiled Python module that exposes the Nim procs in Python.

There's even a type (`PyArrayObject`) that provides a Nim interface to Numpy arrays, so you can pass Numpy arrays into your Nim procs from Python and access them natively.

The auto-generated C-API boilerplate code handles the parsing & type-checking of the function arguments passed from Python, including correct handling of Python ref-counts if a type error occurs or an exception is raised. The boilerplate code also translates Nim exceptions (including back-traces) to Python exceptions. The boilerplate code even includes auto-generated Python docstrings that have been extracted from the Nim procs.

Pymod is definitely still in the in-development phase of software maturity, and it's far from feature-complete, but it's been usable for our work for about 9 months now (and we've been using it regularly during that time). There's a lot of hacky code in there, but it gets the job done.

Table of contents

  1. Motivation
  2. Nim
  3. Example
  4. Usage
  5. System requirements
  6. Per-project configuration
  7. Procedure parameter & return types
  8. Docstrings
  9. PyArrayObject type
  10. PyArrayIter types
  11. PyArrayObject & PyArrayIter usage example
  12. PyArrayIter loop idioms
  13. Tips, warnings & gotchas
  14. What about calling Python from Nim?
  15. Implementation details

Motivation

Perhaps you have a large body of existing Python code that you can't or don't want to rewrite. Perhaps you want to use Numpy, Scipy or Matplotlib. Perhaps your program's main loop simply must be in Python.

However, you would like to write Nim code and then call your Nim procs from your Python code. (There are many more systems written in Python than in Nim, but it would be great to start extending them in Nim!) You can write Python C-API extension modules to wrap your Nim procs, but all the C-API boilerplate is a huge drag, especially if you check types and manage reference counts and handle Nim exceptions properly.

That's what Pymod is for.

Nim

If you'd like to learn more about the Nim programming language, we recommend:

Example

Here's a short "Hello world" example (assumed to be in a file called `greeting.nim`):
## Compile this Nim module using the following command:
##   python path/to/pmgen.py greeting.nim

import strutils  # `%` operator

import pymod
import pymodpkg/docstrings

proc greet*(audience: string): string {.exportpy.} =
  docstring"""Greet the specified audience with a familiar greeting.

  The string returned will be a greeting directed specifically at that audience.
  """
  return "Hello, $1!" % audience

initPyModule("hw", greet)

Use the Python script `pmgen.py` to auto-generate & compile the boilerplate code:
python path/to/pmgen.py greeting.nim

There will now be a compiled Python extension module `hw.so` in the current directory. (It is called `hw` because that is the name that was specified in the `initPyModule()` macro.)

In a Python interpreter, you can import the module and invoke the `greet` function:
>>> import hw
>>> hw.greet
<built-in function greet>
>>> hw.greet("World")
'Hello, World!'
>>>

You can also use the Python interpreter's built-in `help` function to view the documentation for the `greet` function:
>>> help(hw.greet)
Help on built-in function greet in module hw:

greet(...)
    greet(audience: str) -> (str)

    Parameters
    ----------
    audience : str -> string

    Returns
    -------
    out : (str)

There is additional example code in the examples directory.

Usage

Using Pymod is a 4-step process. In brief:

  1. `import pymod` at the top of your Nim module.
  2. Add the `{.exportpy.}` pragma after each Nim proc.
  3. Invoke the `initPyModule("modname", proc1, proc2, proc3)` macro at the bottom of your Nim module.
  4. Run the `pmgen.py` Python script to compile everything.

In more detail:

  1. At the top of your Nim module, import the module `pymod`.
    • Tip: You might additionally wish to import `pymodpkg/docstrings` (to enable Python-like docstrings) and/or `pymodpkg/pyarrayobject` (to enable the `PyArrayObject` type that corresponds to Numpy's array type).
  2. In your Nim module, annotate each Nim proc to be exported to Python with the `{.exportpy.}` pragma. (This pragma is named by analogy with the standard Nim `{.exportc.}` pragma.)
  3. At the end of your Nim module, configure the Python module to be generated, using the `initPyModule()` macro: Specify the desired Python module name as a string (without a filename suffix), followed by the names of the Nim procs that should be compiled into the Python module.
    • For example, if you supply the string `"foo"` as the first argument to `initPyModule()`, the generated Python module will be called `foo.so`.
    • Tip: You can use the `initPyModule()` macro multiple times at the end of your Nim module, with different Python module names & different combinations of Nim procs, to generate multiple Python modules.
    • Tip: If you specify the empty string `""` as the Python module name, the generated Python module will have the same name as the Nim module, but with an underscore `"_"` prepended. So for example, a Nim module `bar.nim` would be compiled to a Python module `_bar.so`.
  4. Invoke the supplied Python script `pmgen.py`, supplying the filename of your Nim module as a command-line argument, to auto-generate & invoke a set of Makefiles that will in turn initiate & run the Pymod process.
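The module-naming rule in step 3 can be summarised in a few lines of Python. This is only an illustrative sketch of the rule as described above, not code taken from Pymod; the function name `python_module_filename` is hypothetical.

```python
import os


def python_module_filename(nim_path, modname):
    """Return the .so filename implied by the first argument to initPyModule().

    Hypothetical helper: it restates the documented naming rule and is not
    part of Pymod itself.
    """
    if modname == "":
        # Empty name: reuse the Nim module's base name with "_" prepended.
        base = os.path.splitext(os.path.basename(nim_path))[0]
        modname = "_" + base
    return modname + ".so"


print(python_module_filename("foo_impl.nim", "foo"))  # foo.so
print(python_module_filename("bar.nim", ""))          # _bar.so
```

The leading underscore follows the common Python convention of naming a compiled extension `_bar` so that a pure-Python `bar` module can wrap it.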

When the script `pmgen.py` is run, it will create a subdirectory `pmgen` in the current directory. All the source code auto-generated by Pymod will be placed into this subdirectory and compiled. At the end of the compilation process, the new Python module, a `.so` (shared object) file, will be moved back into the current directory.

Note that the `{.exportpy.}` pragma & `initPyModule()` macro are inert by default (that is, they have no effect), so you can add them to existing Nim code without changing the default operation of that Nim code. It's only when you run the script `pmgen.py` (which, among other actions, supplies the switch `--define:pmgen` to the Nim compiler) that the `{.exportpy.}` pragma & `initPyModule()` macro are activated.

System requirements

Per-project configuration

If there is a file `pymod.cfg` in the same directory as the Nim module you want to wrap, Pymod will read this as a configuration file for that project.

By default, Pymod runs the Nim compiler in non-release mode, and additionally performs per-dereference bounds-checking of the `PyArrayObject` iterators. This is safe (and catches all sorts of pesky bugs!) but slow.

If the file `pymod.cfg` in the current directory contains the following directives:
[all]
nimSetIsRelease: true

then the Nim compiler will be invoked in release mode, and bounds-checking of the `PyArrayObject` iterators will be switched off. Your code will now run much faster!
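If you want to check from a build script which mode a project is configured for, the directives shown above happen to be valid input for Python's standard `configparser` module. The sketch below relies on that; it is an assumption that this matches how Pymod itself reads the file, and the helper name `is_release_mode` is hypothetical.

```python
import configparser

# The same directives as shown above.
CFG_TEXT = """\
[all]
nimSetIsRelease: true
"""


def is_release_mode(cfg_text):
    """Return True if the configuration text requests release mode."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # fallback=False mirrors the documented default (non-release mode)
    # when the section or the option is absent.
    return parser.getboolean("all", "nimSetIsRelease", fallback=False)


print(is_release_mode(CFG_TEXT))  # True
print(is_release_mode(""))        # False
```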

Procedure parameter & return types

The following Nim types are currently supported by Pymod:

| Type family | Nim types | Python2 type | Python3 type |
| ---------------- | --------- | ------------ | -------------|
| floating-point | `float`, `float32`, `float64`, `cfloat`, `cdouble` | `float` | `float` |
| signed integer | `int`, `int16`, `int32`, `int64`, `cshort`, `cint`, `clong` | `int` | `int` |
| unsigned integer | `uint`, `uint8`, `uint16`, `uint32`, `uint64`, `cushort`, `cuint`, `culong`, `byte` | `int` | `int` |
| non-unicode character | `char`, `cchar` | `str` | `bytes` |
| string | `string` | `str` | `str` |
| Numpy array | `ptr PyArrayObject` | `numpy.ndarray` | `numpy.ndarray` |

Support for the following Nim types is in development:

| Type family | Nim types | Python2 type | Python3 type |
| ---------------- | --------- | ------------ | -------------|
| signed integer | `int8` | `int` | `int` |
| boolean | `bool` | `bool` | `bool` |
| unicode code point (character) | `unicode.Rune` | `unicode` | `str` |
| non-unicode character sequence | `seq[char]` | `str` | `bytes` |
| unicode code point sequence | `seq[unicode.Rune]` | `unicode` | `str` |
| sequence of a single type T | `seq[T]` | `list` | `list` |

Procedure parameters may be any of the above supported Nim types. Default parameters are supported to a limited extent, although the parameter type must be specified explicitly, and is currently restricted to the `string`, integer & floating-point types.

Procedure return values may be any of the above supported Nim types or a Nim tuple of any of these types. Nested tuples are currently not supported. By default, named tuples in Nim are returned as raw tuples to Python:

# Nim                   # Python
tuple[ a, b: int ]  =>  (a_value, b_value)

If `{.exportpy.}` is specified as `{.exportpy returnDict.}` then the generated code will instead return a Python dict containing the named properties:
# Nim                   # Python
tuple[ a, b: int ]  =>  { "a": a_value, "b": b_value }
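From the Python caller's point of view, the difference between the two return styles looks like the sketch below. It mimics the two result shapes in pure Python; the function `as_python_result` is a hypothetical stand-in for the generated wrapper, not something Pymod provides.

```python
def as_python_result(field_names, values, return_dict=False):
    """Shape a named Nim tuple's fields the way the two modes are described.

    Hypothetical stand-in: return_dict=False models plain {.exportpy.},
    return_dict=True models {.exportpy returnDict.}.
    """
    if return_dict:
        return dict(zip(field_names, values))
    return tuple(values)


# Plain tuple: the caller unpacks by position.
a, b = as_python_result(["a", "b"], [3, 4])
print(a, b)  # 3 4

# Dict: the caller looks fields up by name.
result = as_python_result(["a", "b"], [3, 4], return_dict=True)
print(result["a"], result["b"])  # 3 4
```

The dict form costs a little more per call, but it keeps Python call sites readable when a proc returns several values of the same type.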

You can tell Pymod about additional Nim types using the `definePyObjectType()` macro. This will include your additional type-mapping in Pymod's type-mapping registry, similar to how Pymod maps its own `PyArrayObject` type to Numpy's array type. This provides Pymod with a mapping from an already-defined Nim type to the corresponding Python & C-API types, enabling Pymod to generate the Nim-Python conversions & type-checking boilerplate for additional types.

Docstrings

Pymod will also auto-generate a Python docstring for each function in the extension module, specifying the function's parameter types & return type, based upon the parameter types & return types of the exported Nim proc. You can embed additional documentation in each Nim proc you want to export, using the supplied `docstring"""Text goes here."""` string type. Any docstrings in the proc will be extracted automatically and included in the generated Python docstring. There is an example of docstring usage in the code sample above.

PyArrayObject type

Pymod provides the `PyArrayObject` type to allow Python code to pass Numpy ndarrays into Nim procs with appropriate type-safety. To access the `PyArrayObject` Nim type definition, import `pymodpkg/pyarrayobject` in your Nim module (after you have already imported `pymod`).

Because the Numpy array object was allocated in Python, the type of the Nim proc parameter or return value is `ptr PyArrayObject`. Note that it is a Nim `ptr`, not a Nim `ref`. Your code should pass around `ptr PyArrayObject`.

Pymod also wraps many C functions from the Numpy C-API for Numpy array manipulation & attribute access. To review the full list of `PyArrayObject` procs that Pymod provides, browse the Pymod source file `pymodpkg/pyarrayobject.nim`.

Here are some of the Numpy array attributes that Pymod exposes:

  • `.data` (returns `pointer`)
  • `.data(T)` (returns `ptr T`)
  • `.descr`
  • `.dimensions`
  • `.dtype`
  • `.nd`
  • `.ndim` (an alias for `.nd`)
  • `.shape` (an alias for `.dimensions`)
  • `.strides`

Here are some of the Numpy functions for array creation & manipulation that Pymod wraps:

  • `createSimpleNew(dims, npType)`
  • `createNewCopyNewData(oldArray, order)`
  • `copy(oldArray)` (an alias for `createNewCopyNewData`)
  • `createAsTypeNewData(oldArray, newType)`
  • `doCopyInto(destArray, srcArray)`
  • `doFILLWBYTE(destArray, val)`
  • `doResizeDataInplace(oldArray, newShape, doRefCheck)`

PyArrayIter types

Note that, due to its typeless Pythonic origin, `PyArrayObject` is not a Nim generic type. So the element data-type of a `PyArrayObject` instance is unknown to Nim. The Nim code must specify the correct element data-type for the `PyArrayObject` elements. The preferred method of accessing the (appropriately-typed) elements of a `PyArrayObject` instance is to use one of the two supplied `PyArrayIter` types:
  • `PyArrayForwardIter[T]`, a C++-style Forward Iterator
    • returned by `.iterateFlat(T)`
    • can only be incremented & dereferenced
    • the fastest & safest iteration style
  • `PyArrayRandAccIter[T]`, a C++-style Random-Access Iterator
    • returned by `.accessFlat(T)`
    • can be incremented or decremented by any integer; offset (using `+` or `-`) by any integer; indexed by any integer; & dereferenced
    • basically a C pointer with bounds-checking

Both of the `PyArrayIter` types offer 1-D iteration & indexing over a "flat" interpretation of the Numpy N-D array data. These two iterator types are inspired by the C++ iterator category model. By default, the iterators implement per-dereference bounds-checking. This bounds-checking can be disabled, as described above in the section Per-project configuration.

Note that the `PyArrayIter` types can't handle any of the following usage scenarios:
  • non-C-contiguous array data
  • strides
  • multi-dimensional indexing

If you attempt to iterate over a Numpy array with non-C-contiguous data, an `AssertionError` will be raised (even in release mode). If you supply the incorrect array element data-type when invoking `.iterateFlat(T)` or `.accessFlat(T)`, an `ObjectConversionError` will be raised (even in release mode).

PyArrayObject & PyArrayIter usage example

Here is a simple example of how to use `PyArrayObject` & `PyArrayForwardIter[T]`:
import strutils  # `%`
import pymod
import pymodpkg/docstrings
import pymodpkg/pyarrayobject

proc addVal*(arr: ptr PyArrayObject, val: int32) {.exportpy} =
  docstring"""Add val to each element in the supplied Numpy array.

  The array is assumed to have dtype int32; otherwise, a ValueError will be raised.
  The elements in the array will be modified in-place.
  """
  let dt = arr.dtype
  echo "PyArrayObject has shape $1 and dtype $2" % [$arr.shape, $dt]
  if dt == np_int32:
    let bounds = arr.getBounds(int32)  # Iterator bounds
    var iter = arr.iterateFlat(int32)  # Forward iterator
    while iter in bounds:
      iter[] += val
      inc(iter)  # Increment the iterator manually.
  else:
    let msg = "expected array of dtype $1, received dtype $2" % [$np_int32, $dt]
    raise newException(ValueError, msg)

initPyModule("_myModule", addVal)

You can test the Pymod-wrapped Nim proc `addVal` using a Python script like this:
import numpy as np
import _myModule as mm

int32arr = np.arange(10, dtype=np.int32).reshape((2, 5))
print(int32arr)
mm.addVal(int32arr, 101)
print(int32arr)

print("")

float32arr = np.arange(10, dtype=np.float32).reshape((2, 5))
print(float32arr)
mm.addVal(float32arr, 101)  # Uh-oh! Our addVal proc wants an array with dtype == np.int32!
print(float32arr)

The output from running this script will look something like this:

[[0 1 2 3 4]
 [5 6 7 8 9]]
PyArrayObject has shape @[2, 5] and dtype numpy.int32
[[101 102 103 104 105]
 [106 107 108 109 110]]

[[ 0. 1. 2. 3. 4.]
 [ 5. 6. 7. 8. 9.]]
PyArrayObject has shape @[2, 5] and dtype numpy.float32
Traceback (most recent call last):
  File "test_addvalmod.py", line 13, in <module>
    mm.addVal(float32arr, 101)  # Uh-oh! Our addVal proc wants an array with dtype == np.int32!
ValueError: expected array of dtype numpy.int32, received dtype numpy.float32
Nim traceback (most recent call last):
  File "pmgen_myModule_wrap.nim", line 26, in exportpy_addVal
  File "addvalmod.nim", line 22, in addVal

PyArrayIter loop idioms

Observe the `while` loop that was used in `addVal` to iterate over the array. This is the most flexible loop idiom for forward-iterating over an array, since you are able to control where, and how many times, the forward iterator will be incremented within the body of the loop:
let bounds = arr.getBounds(int32)  # Iterator bounds
var iter = arr.iterateFlat(int32)  # Forward iterator
while iter in bounds:
  iter[] += val
  inc(iter)  # Increment the iterator manually

However, this `while` loop idiom is sometimes more verbose than it needs to be. Often, you only need to increment the forward iterator once per iteration, at the end of the body of the loop. If this is all you need, there are 4 iterators that you can use, which enable shorter `for` loop idioms:
  • values(arr, T) -> T
  • items(PyArrayIter[T]) -> T
  • mitems(PyArrayIter[T]) -> var T
  • iitems(PyArrayIter[T]) -> PyArrayIter[T]

For example, if you need to modify the items in the array:

for iter in iitems(arr.iterateFlat(int32)):  # `iitems(iter)` iterator yields a PyArrayIter[T].
  iter[] += val

or:

for mval in mitems(arr.iterateFlat(int32)):  # `mitems(iter)` iterator yields a mutable value.
  mval += val

If you don't need to modify the array data at all, there are even shorter `for` loop idioms that yield a succession of (read-only) array values:
var maxVal: int32 = low(int32)
for v in arr.values(int32):  # This uses the `values(arr, T)` iterator.
  if v > maxVal:
    maxVal = v

or:

var maxVal: int32 = low(int32)
for v in arr.iterateFlat(int32):  # This uses the implicit `items(iter)` iterator.
  if v > maxVal:
    maxVal = v

Likewise for `PyArrayRandAccIter[T]`, there are several loop idioms, which offer different levels of control & convenience. For the most flexibility, use a `while` loop:
let bounds = arr.getBounds(int32)  # Iterator bounds
var iter = arr.accessFlat(int32)  # Random access iterator
while iter in bounds:
  iter[0] += val  # or equivalently: iter[] += val
  inc(iter, incDelta)  # Increment the iterator manually

There are several `for` loop forms available for `PyArrayRandAccIter[T]`, to make it convenient to iterate over C-contiguous N-dimensional arrays. If you want to visit every iterator position in turn, but retain the ability to index/offset arbitrarily, use the 1-argument form of `accessFlat` with the `iitems()` iterator:
for iter in arr.accessFlat(int32).iitems:
  iter[0] += val  # or equivalently: iter[] += val

If you want to modify mutable values, but you don't need arbitrary index or offset capability, you can use the `mitems()` iterator:
for mval in arr.accessFlat(int32).mitems:
  mval += val

If you want to increment by a specific delta each time, use the 2-argument form of `accessFlat` with any of the `items()`, `mitems()` or `iitems()` iterators:
for iter in arr.accessFlat(int32, incDelta).iitems:
  iter[] += val

or:

for mval in arr.accessFlat(int32, incDelta).mitems:
  mval += val

And finally, if you want the iteration to begin at a certain initial offset, then increment by a specific delta each time, use the 3-argument form of `accessFlat`:
for iter in arr.accessFlat(int32, initialOffset, incDelta).iitems:
  iter[] += val

or:

for mval in arr.accessFlat(int32, initialOffset, incDelta).mitems:
  mval += val
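To see which flat positions such a loop touches, it can help to enumerate them. The sketch below does that in plain Python. It assumes that the 3-argument form visits `initialOffset`, `initialOffset + incDelta`, and so on while the position remains in bounds, as described above. The helper `strided_offsets` is hypothetical, not a Pymod proc.

```python
def strided_offsets(size, initial_offset, inc_delta):
    """List the flat element offsets visited by an offset-and-delta traversal.

    Hypothetical helper: size is the total number of elements in the array.
    """
    if inc_delta <= 0:
        raise ValueError("inc_delta must be positive")
    return list(range(initial_offset, size, inc_delta))


# A 12-element array, starting at offset 1 and stepping by 3.
print(strided_offsets(12, 1, 3))  # [1, 4, 7, 10]
```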

For example, if you want to visit just the "green" channel of an RGB image, you might use a loop like this:

let greenIdx = 1
let numChans = img.shape[2]
for g in img.accessFlat(uint8, greenIdx, numChans):  # implicit `items(iter)` iterator.
  processGreenComponent(g)

These code examples are all available in full in the examples directory.

Tips, warnings & gotchas

Here are some helpful hints about a few sharp edges of Pymod (some of them due to sharp edges in Nim that we haven't been able to cover over completely) that can trip you up (and then confuse you with obscure compiler error messages):

  • If you want to `exportpy` a proc using Pymod, don't give your proc the same name as the Nim module that contains the proc (or in fact, the same name as any other procs in that same module).
  • If you want to `exportpy` a proc using Pymod, ensure that the proc is also exported in Nim by marking it with an asterisk after the proc-name.

What about calling Python from Nim?

Pymod enables you to wrap your Nim procs so they can be called from Python.

If instead you want to call Python functions (maybe even the interpreter) from Nim (i.e., the control flows in the opposite direction), Pymod is not what you're looking for.

In a situation like this, python.nim or the NimBorg project might be what you're looking for.

Implementation details

We want to make existing Python types (and extended types like Numpy arrays) available in Nim procs. The idea is that objects of these Python types will be created in the existing Python code, then passed through to Nim procs for processing, and then the results will be returned to the Python code for the pre-existing processing or viewing in Python.

For this reason, we decided not to use the Python ctypes module; the focus of `ctypes` seems to be on propagating types in the opposite direction. Instead of making fully-fledged Python types easily available in C, `ctypes` wraps lowest-common-denominator C types and structs for basic access in Python.

The way ctypes handles pointers & structs requires you to build your object types up in terms of C primitive types, so they can be accessed as method-less, field-only objects in Python. But the Python types we want to use in Nim are already defined in Python! So `ctypes` would be more applicable to us if we wanted to make existing Nim data-structures available in Python, rather than the other way around.

On an initial reading, Python CFFI & cffiwrap appear to be more useful for exposing existing Python types in C than `ctypes` is.

However, for the initial implementation of Pymod, we chose the CPython C-API & Numpy C-API over `cffi` and/or `cffiwrap`. We reasoned that Nim is going to produce & auto-compile its own C code anyway, so if we generate C-API C code for our Python integration, this C code can be compiled & linked into a shared library by the Nim compiler itself, thus ensuring that the same compiler settings are used for both the compilation & linking stages.

We are considering implementing an additional `cffi` back-end for Pymod in the future. This would be a significant step towards compatibility with the PyPy Python interpreter.
