I very much like the concept, but I dislike the "magic" feel of overloading "import", especially as it creates namespace collisions, e.g. when you have a Python module and an object file with the same name. IMHO something like "bitey.import_obj(name)" would have been nicer and clearer.
I wonder how complicated it would be to parse header files to populate the field names of structs automatically? Maintaining separate .pre.py and .h files seems like a recipe for trouble.
Parsing headers for field names would require a complete C preprocessor and parser. That wouldn't be a problem for this author (who wrote a very popular parser generator for Python), but it still wouldn't be perfect until it completely replicated the system compiler's behaviour with respect to system headers (consider conditional compilation). It is particularly annoying when the host and target systems differ, i.e. in cross-compilation. I've tried this exact thing (parsing headers to get type information) and it is quite a pain to get right.
Maybe you could use LLVM/Clang to parse the headers. That way at least the parsing would be consistent with the compilation. I don't know how difficult it would be to access such type information, but I thought the whole point of LLVM/Clang was to be usable as a library.
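For what it's worth, Clang does expose this through libclang's Python bindings. A minimal sketch, assuming the bindings are installed and using "example.h" as a placeholder header name:

    # Sketch: extract struct field names with libclang's Python bindings.
    # Assumes libclang is installed and findable; "example.h" is a placeholder.
    from clang.cindex import Index, CursorKind

    index = Index.create()
    tu = index.parse("example.h", args=["-std=c99"])

    for cursor in tu.cursor.walk_preorder():
        if cursor.kind == CursorKind.STRUCT_DECL and cursor.is_definition():
            fields = [(f.spelling, f.type.spelling)
                      for f in cursor.get_children()
                      if f.kind == CursorKind.FIELD_DECL]
            print(cursor.spelling, fields)

Since libclang uses the compiler's own preprocessor, conditional compilation and system headers get handled the same way the actual compilation handles them.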
I'll be interested to see how this compares to Numba and how easily it fits into the scientific Python ecosystem. I'm also curious about performance versus Numba and my current go-to solution, Cython.
It would be hard in the presence of parametric polymorphism (multiple functions having the same name but different numbers and types of parameters), as well as the difference between C++'s and Python's method resolution semantics (in Python, for instance, everything is virtual).
That's not parametric polymorphism, that's function overloading. The former makes it possible to use the same function implementation on different types, whereas the latter is about using the same name for different functions.
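To make the distinction concrete in Python terms (a sketch with made-up names; functools.singledispatch only approximates overloading, since it dispatches on the first argument alone):

    from functools import singledispatch
    from typing import TypeVar

    T = TypeVar("T")

    def first(xs: list[T]) -> T:
        # Parametric polymorphism: one implementation that works for
        # any element type; the behaviour does not depend on T.
        return xs[0]

    @singledispatch
    def describe(x):
        # Overloading (approximated): the same name maps to different
        # implementations, selected by argument type.
        return f"something: {x!r}"

    @describe.register
    def _(x: int):
        return f"an int: {x}"

    @describe.register
    def _(x: str):
        return f"a string: {x!r}"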
Thanks for the reply. I just spent the last 3-4 days hacking around osgswig; a good (and simple) binding solution for C++/Python is still an open problem ...
While the mapping between C and LLVM IR is fairly straightforward, things are more complex with C++. C++ constructs get dismantled when compiled to IR, and a binding generator would have to reconstruct all the C++-specific information from metadata and type info, which isn't easy.
It seems cool, but why would you use it instead of a C lib (or C object file) interfaced with ctypes or SWIG ...? Maybe I'm missing what LLVM brings?
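To be concrete, this is the kind of ctypes binding I mean (a sketch; "libexample.so" and "add" are placeholder names for a shared library exporting int add(int, int)):

    from ctypes import CDLL, c_int

    lib = CDLL("./libexample.so")      # placeholder shared library
    lib.add.argtypes = [c_int, c_int]  # int add(int a, int b)
    lib.add.restype = c_int

    print(lib.add(2, 3))  # -> 5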
Thanks
In addition to cdavid's comment: LLVM IR is system-independent, so if you had a deployment over heterogeneous machines you could write the extension code once and have it run everywhere without recompilation.
It's not portable enough to let me take IR generated for my x86 laptop and run it on my arm board, in general, even though both platforms are ILP32, little-endian, and running the same OS.
Yes, but this is a problem with the source language and/or the host system libraries, not with LLVM IR itself. There is a broad domain of applications for which it is portable.
Once you can import LLVM code at runtime, you are pretty close to injecting LLVM code at runtime: you write some LLVM IR in a string and "llvm_eval" it.
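Here's a rough sketch of what such an "llvm_eval" could look like, using the llvmlite package (my choice of library, not something bitey provides; the IR string and function name are for illustration):

    import llvmlite.binding as llvm
    from ctypes import CFUNCTYPE, c_int

    llvm.initialize()
    llvm.initialize_native_target()
    llvm.initialize_native_asmprinter()

    # Hand-written LLVM IR for: int add(int a, int b) { return a + b; }
    ir = r"""
    define i32 @add(i32 %a, i32 %b) {
    entry:
      %sum = add i32 %a, %b
      ret i32 %sum
    }
    """

    mod = llvm.parse_assembly(ir)  # the "eval" step: IR from a string
    mod.verify()

    tm = llvm.Target.from_default_triple().create_target_machine()
    engine = llvm.create_mcjit_compiler(mod, tm)
    engine.finalize_object()

    # Wrap the JIT-compiled function with ctypes and call it from Python.
    add = CFUNCTYPE(c_int, c_int, c_int)(engine.get_function_address("add"))
    print(add(2, 3))  # -> 5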
LLVM performs its optimization passes on the LLVM IR itself (the "middle-end" of the compiler) before finally translating the optimized IR into machine-specific binary code.
I'm not an LLVM expert, though, so I could be glossing over a few details.