Java really sucks for this kind of work. Auto-boxing makes it really easy to accidentally end up with Objects, which will kill performance. Also, Java doesn't give you good access to bit manipulation, processor intrinsics, or other useful low-level tools.
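To make the auto-boxing point concrete, here's a rough sketch of how it sneaks in. The class and method names (BoxingDemo, sumBoxed, sumPrimitive) are made up for illustration. Both methods compute the same sum, but the first one allocates or looks up a Long on nearly every step, while the second stays on primitives the whole way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: names are illustrative, not from any library.
public class BoxingDemo {

    // Sums 0..n-1 through a List<Long>. Every add() boxes a long into a Long,
    // and the boxed accumulator is unboxed and re-boxed on each iteration.
    static long sumBoxed(int n) {
        List<Long> values = new ArrayList<>();
        for (long i = 0; i < n; i++) {
            values.add(i);       // auto-boxes long -> Long
        }
        Long total = 0L;         // boxed accumulator: the easy mistake
        for (Long v : values) {
            total += v;          // unbox both, add, box the result again
        }
        return total;            // unboxes on return
    }

    // Same sum using only primitives: no wrapper objects are created.
    static long sumPrimitive(int n) {
        long[] values = new long[n];
        for (int i = 0; i < n; i++) {
            values[i] = i;
        }
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Both versions agree on the result; only the allocation behaviour differs.
        System.out.println(sumBoxed(n) == sumPrimitive(n));
    }
}
```

Nothing in the boxed version looks wrong at a glance, which is the problem. How much slower it is depends on the JVM and on whether escape analysis manages to remove the allocations, so measure it with a proper benchmark harness before drawing conclusions.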
Depends what version you're using. In Java 17 the Vector API (jdk.incubator.vector) is still in the incubator, but it's usable if you enable the module with --add-modules jdk.incubator.vector. I haven't looked at the API to see what it gives you though.