Thanks for the feedback! I noticed this as well, and I might try the newer models that are coming out now. I also think the approximate nearest neighbor search might introduce some errors, which is probably worth checking. It also really helps to know the whole VQGAN+CLIP space, since you can optimize query strings in a similar way, for example by prepending "a picture of" or adding a qualifier at the end, like "unsplash", for a particular style.
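A rough sketch of the query-string tweak, in case it is useful: it just builds a few variants of a query to try against the search. The function name `build_query_variants` and the example query are made up for illustration, and the prefix and suffix lists are only the two examples mentioned above, not a tuned set.

```python
def build_query_variants(query, prefixes=("", "a picture of "), suffixes=("", ", unsplash")):
    """Combine a query with each prefix/suffix pair, skipping duplicates.

    Hypothetical helper: the default prefixes and suffixes are only the
    examples from the comment above, not a recommended list.
    """
    base = query.strip()
    variants = []
    for prefix in prefixes:
        for suffix in suffixes:
            candidate = f"{prefix}{base}{suffix}"
            if candidate not in variants:
                variants.append(candidate)
    return variants


# Example query chosen only for illustration.
variants = build_query_variants("a lighthouse at sunset")
for v in variants:
    print(v)
```

Each variant can then be sent through the search as usual, comparing the results by eye to see which phrasing works best for a given style.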
An FAQ is indeed a good idea!