Not sure about your comparison specifically, but in general the hard edges in comics should be an ideal case for AV1's spatial prediction. JPEG-XL eschewed that in favor of being inherently progressive; its splines promised to make up some of the difference but last I checked the encoder still doesn't use them.
Categorically, this is the main reason why even the main JPEG-XL dev agrees that AVIF is currently better with non-photo content [1].
I find that it depends on which quality you aim for. At low quality AVIF does a good job with line drawings; I suspect much of that is due to the 8-color local palette mode. When you raise the quality/size a tiny bit (so that an area needs 9+ unique colors), JPEG XL does an equal job with the edges but also starts getting all the subtleties right: lonely faint dots, noisy or weak textures. With AVIF, by contrast, it can be impossible to convince the encoder to store them at any quality setting.
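To make the "9+ unique colors in an area" threshold concrete, here is a rough sketch that counts distinct colors per tile and reports what fraction of tiles would fit in an 8-entry palette. The function name, the fixed 8x8 tile size, and the toy images are all assumptions for illustration; a real AV1 encoder picks its own block sizes and builds palettes per plane, so treat this as a back-of-the-envelope check rather than a model of any encoder.

```python
def palette_friendly_fraction(pixels, block=8, max_colors=8):
    """Fraction of block x block tiles with at most max_colors distinct values.

    `pixels` is a list of rows; each entry is any hashable color value
    (an int for grayscale, a tuple for RGB). Hypothetical helper, not an
    encoder API.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    total = fitting = 0
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Collect the distinct colors inside this tile (clipped at edges).
            colors = {
                pixels[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            }
            total += 1
            if len(colors) <= max_colors:
                fitting += 1
    return fitting / total if total else 0.0


# Toy "line art": a black diagonal on white, two colors overall.
line_art = [[0 if x == y else 255 for x in range(16)] for y in range(16)]

# Toy "faint texture": many slightly different values in every tile.
textured = [[(x * 7 + y * 13) % 64 for x in range(16)] for y in range(16)]

print(palette_friendly_fraction(line_art))
print(palette_friendly_fraction(textured))
```

On these toy inputs every line-art tile fits in 8 colors and no textured tile does, which is roughly the split described above: once an area needs more than 8 colors, a palette alone can no longer represent it exactly.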
Cartoon/anime collectors seem to be passionate about image quality and they may have observed the same as I have. At least I have learned a lot about image quality from anime fans during my image compression career.
Web devs don't care that much about quality; they are more interested in saving money on bandwidth and, sometimes, in lowering latency. E-commerce is yet another case, where image quality may turn into revenue and so becomes important again.
[1] https://twitter.com/jonsneyers/status/1550161314961555457