I wonder if assembling an array of paths for the outlines might improve things. You can check whether a point is inside a path, use paths as hit areas, and apply fills. Notably, there would no longer be any need to keep various mask images in memory. (You wouldn't want any curves in these paths, just pixel outlines. It may become a bit tricky with complex paths and negative shapes, though, which may be addressed by composing a mask image on the fly.)
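To make the idea a bit more concrete, here is a rough sketch of what "pixel outlines as paths" could look like. It assumes the fill areas are available as a grid of integer labels. The names `outline_edges` and `contains` are made up for illustration and do not come from any existing code. Holes (negative shapes) are handled here with an even-odd test rather than a composed mask, which is just one possible choice.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Edge = Tuple[Point, Point]


def outline_edges(grid: List[List[int]], label: int) -> List[Edge]:
    """Collect the unit boundary edges (pixel-corner coordinates) of every
    pixel equal to `label`. Straight segments only, no curves."""
    h, w = len(grid), len(grid[0])

    def inside(x: int, y: int) -> bool:
        return 0 <= x < w and 0 <= y < h and grid[y][x] == label

    edges: List[Edge] = []
    for y in range(h):
        for x in range(w):
            if grid[y][x] != label:
                continue
            if not inside(x, y - 1):  # top side is a boundary
                edges.append(((x, y), (x + 1, y)))
            if not inside(x, y + 1):  # bottom side
                edges.append(((x, y + 1), (x + 1, y + 1)))
            if not inside(x - 1, y):  # left side
                edges.append(((x, y), (x, y + 1)))
            if not inside(x + 1, y):  # right side
                edges.append(((x + 1, y), (x + 1, y + 1)))
    return edges


def contains(edges: List[Edge], px: float, py: float) -> bool:
    """Even-odd hit test: cast a ray to the right of (px, py) and count
    how many vertical boundary edges it crosses."""
    crossings = 0
    for (x0, y0), (x1, y1) in edges:
        if x0 == x1 and x0 > px and min(y0, y1) <= py < max(y0, y1):
            crossings += 1
    return crossings % 2 == 1


# A ring of 1s around a single 0 pixel, i.e. a shape with a hole.
ring = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
ring_edges = outline_edges(ring, 1)
```

With this, `contains(ring_edges, 0.5, 0.5)` hits the ring while `contains(ring_edges, 1.5, 1.5)` falls into the hole. A real implementation would probably chain the edges into closed paths so they can also be used for fills.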
A potential drawback is that the paths could be several times larger than the original image (consider a checkerboard pattern, where each pixel is its own fill area, or the pathological spiral pattern I suggested in a comment below).
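A back-of-the-envelope estimate for the checkerboard case, as a sketch only. The storage sizes are assumptions picked for illustration (16-bit coordinates, a 4-byte RGBA pixel), and `checkerboard_path_cost` is a hypothetical helper, not part of any existing code.

```python
from typing import Tuple


def checkerboard_path_cost(
    width: int,
    height: int,
    bytes_per_coord: int = 2,  # assumption: 16-bit x and y values
    bytes_per_pixel: int = 4,  # assumption: RGBA source image
) -> Tuple[int, int, float]:
    """Worst case where every pixel is its own fill area, so each
    outline is a closed square with 4 vertices of 2 coordinates each."""
    pixels = width * height
    vertices = 4 * pixels
    path_bytes = vertices * 2 * bytes_per_coord
    image_bytes = pixels * bytes_per_pixel
    return path_bytes, image_bytes, path_bytes / image_bytes


path_bytes, image_bytes, ratio = checkerboard_path_cost(8, 8)
```

Under these assumptions an 8x8 checkerboard needs 1024 bytes of path data against 256 bytes of pixels, a factor of 4. The ratio gets worse with wider coordinates or a more compact (for example indexed) source image.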