
... If you're willing to write your own extension modules. The problem with pypy is that there is a multitude of important libraries using the CPython API.



And a significant chunk of these important extension libraries are supported in PyPy using the emulated C-extension API (cpyext).


Right, the performance cost of cpyext is what we're contrasting with ctypes-like approaches like LuaJIT's FFI in this thread. In https://news.ycombinator.com/item?id=42656395 Antonio Cuni linked the standard explanation of why cpyext is so slow and also HPy, which I'm embarrassed to say I didn't know about.


Yes, and those libraries mostly don't exist for Lua. It's a big reason to use Python instead of Lua, and to use CPython rather than much better implementations like PyPy, but not much of a reason to use PUC Lua instead of LuaJIT.

On the other hand, there are also a multitude of important libraries using the C ABI, and, as you said, you can call those C libraries pretty easily with the LuaJIT FFI, without "writing extension modules". This is a big reason to use Lua instead of Python, as long as you can use LuaJIT.

Here's an example of the activity you're describing as "writing an extension module". Let's imagine that we have a garbage file we want to delete, and for some reason we're trapped in Lua, so we have to "write an extension module" to invoke unlink() from libc and call it:

    $ touch garbagefile
    $ luajit
    LuaJIT 2.1.0-beta3 -- Copyright (C) 2005-2022 Mike Pall. https://luajit.org/
    JIT: ON SSE3 SSE4.1 BMI2 fold cse dce fwd dse narrow loop abc sink fuse
    > ffi = require 'ffi'
    > ffi.cdef 'int unlink(const char *pathname);'
    > libc = ffi.load '/lib/x86_64-linux-gnu/libc.so.6'
    > =libc.unlink
    cdata<int ()>: 0x7ff25ed39a00
    > libc.unlink 'garbagefile'
    > 
    $ ls -l garbagefile
    ls: cannot access 'garbagefile': No such file or directory
That took literally three lines of code and less than two minutes. You can call that "writing an extension module" if you want, but I think that phrasing is really misleading; the impression it gives of what we're talking about is pretty far from the truth. It's like when I wired two RJ-45 jacks together crossing over the appropriate pairs for a 10BaseT null modem and said I'd built a "low-power full-duplex Ethernet switch".

This works for any library, not just libc. Let's see what version of libcdparanoia I think I have installed:

    > ffi.cdef 'extern char *cdda_version();'
    > cdda = ffi.load '/usr/lib/x86_64-linux-gnu/libcdda_interface.so.0'
    > =ffi.string(cdda.cdda_version())
    10.2
As a more extended example, take a look at https://gitlab.com/kragen/bubbleos/-/blob/master/yeso/yeso.l..., a binding I wrote for a C library I'd written without giving any thought to Lua. Basically I copied and pasted the relevant sections from my .h file into the Lua code and added a few lines of Lua to load the relevant shared library:

    local yeso = ffi.load(sodir .. lib)
And then the C functions defined in the .so and declared to the LuaJIT FFI were directly callable as properties of that `yeso` table, like `yeso.yw_wait`, `yeso.yw_close`, etc. There's another couple of pages in that .lua file but it's just a simple, convenient OO façade over the procedural-style C interface. Plus defining some constants from the .h file.

Can't you do the same thing in Python with `ctypes`? Well, kind of. I mean, I did! But it's a huge pain in the ass, and the result is still worse. Contrast https://gitlab.com/kragen/bubbleos/-/blob/master/yeso/yeso.p..., which provides a more limited binding to the same API in the same way. For example, here's the definition of `ypic` from yeso.h:

    typedef struct { ypix *p; yp_p2 size; int stride; } ypic;
And here's the definition of `ypic` in yeso.lua:

    typedef struct { ypix *p; yp_p2 size; int stride; } ypic;
I literally just copied and pasted the C. LuaJIT's C parser parses this at runtime. (Then at https://gitlab.com/kragen/bubbleos/-/blob/master/yeso/yeso.l... I added some methods to it, which is something you can't do with `ctypes`; you have to make a separate wrapper class. But in a sense those are just syntactic sugar.)

Now, here's the definition of `ypic` in yeso.py:

    class ypic(Structure):
        _fields_ = [
            ('p', POINTER(ypix)),
            ('size', yp_p2),
            ('stride', c_int),
        ]
It's a lot more work for much less return. It's not just that it's more verbose; there are also many more opportunities to screw up the types in a subtle way, and then instead of an exception traceback you get a core dump to debug with GDB. It's still better than using CPython's shitty PyObject API, but it's not in the same league as LuaJIT.
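To make the "screw up the types in a subtle way" point concrete, here is a minimal sketch. The `ypix` and `yp_p2` definitions are assumed stand-ins (the real ones are in yeso.h and aren't shown in this thread), and `ypic_typo` is a hypothetical mistake invented for illustration. `ctypes` accepts both classes without complaint; the only visible difference is where it thinks `stride` lives in memory.

```python
import ctypes
from ctypes import POINTER, Structure, c_int, c_uint32

# Assumed stand-ins for illustration; the real definitions are in yeso.h.
ypix = c_uint32

class yp_p2(Structure):
    _fields_ = [('x', c_int), ('y', c_int)]

# The binding as written in yeso.py.
class ypic(Structure):
    _fields_ = [
        ('p', POINTER(ypix)),
        ('size', yp_p2),
        ('stride', c_int),
    ]

# Hypothetical typo: 'size' declared as a plain int instead of yp_p2.
class ypic_typo(Structure):
    _fields_ = [
        ('p', POINTER(ypix)),
        ('size', c_int),      # wrong type, but ctypes cannot know that
        ('stride', c_int),
    ]

# Neither class raises anything; the mistake only shows up as a shifted
# field offset, i.e. C and Python would disagree about where 'stride' is.
stride_ok = ypic.stride.offset
stride_bad = ypic_typo.stride.offset
print("stride offset:", stride_ok, "vs. with typo:", stride_bad)
```

Nothing checks these declarations against the header, so a cheap guard is to compare `ctypes.sizeof()` and the field offsets with what the C compiler reports for the same struct.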

I don't want to come off as too positive on Lua here; I think that as a language it has several fatal flaws. (I wrote in more detail on this two weeks ago at https://news.ycombinator.com/item?id=42519070.) But being able to invoke native code is actually one of its strong points.
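For reference, a rough `ctypes` sketch of the `unlink()` call from the LuaJIT transcript above. It assumes a POSIX system; the `unlink` wrapper function and the temp file standing in for `garbagefile` are just for illustration. The prototype has to be restated by hand through `argtypes` and `restype` rather than pasted in as C.

```python
import ctypes
import ctypes.util
import os
import tempfile

# Load libc. If find_library returns None, CDLL(None) falls back to the
# symbols already loaded into the current process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

# Hand-written equivalent of: int unlink(const char *pathname);
libc.unlink.argtypes = [ctypes.c_char_p]
libc.unlink.restype = ctypes.c_int

def unlink(path):
    """Delete path via libc's unlink(), raising OSError on failure."""
    if libc.unlink(os.fsencode(path)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path)

# Stand-in for `touch garbagefile`.
fd, garbage = tempfile.mkstemp()
os.close(fd)

unlink(garbage)
deleted = not os.path.exists(garbage)
print("deleted:", deleted)
```

If `argtypes` is left out, `ctypes` guesses the argument conversion from the Python values passed in, which is one of the places where a wrong type goes unnoticed until something crashes.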


The equivalent of this (and strongly inspired by LuaJIT's FFI) in the Python world is cffi, btw: https://cffi.readthedocs.io/en/stable/


Oh, thanks! That's a second thing in this thread I'm embarrassed to have not known already. Does it get native-like performance in PyPy the way LuaJIT's FFI does? I'll have to try it with Yeso to see if it's an improvement.


It should get pretty good performance, yes. Not sure how native-like we get with the JIT. Gut feeling would be a bit slower than gcc -O0? I would be very interested in your experience if you do try it.


Thanks! I guess now I've assumed the obligation. Probably I should look for you on Mastodon in order to tell you about my experience?



