Alternatively, if you're being confused by the quoted intermediate DCT precision, that's not relevant to the final output either. You cannot implement any transform, DCT especially, without intermediate values that have a greater range than the input or output. Even a simple average of two 8-bit values, (a+b)/2, needs 9 bits for the intermediate sum a+b.
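To make the averaging point concrete, here is a minimal sketch in Python. The function names are made up for illustration, and the `& 0xFF` mask only simulates an 8-bit accumulator; it is not how any particular JPEG codec does it.

```python
def average_widened(a: int, b: int) -> int:
    """Average two 8-bit values, keeping the sum at full width."""
    total = a + b  # can reach 510, which needs 9 bits
    return total // 2


def average_truncated(a: int, b: int) -> int:
    """Same average, but with the sum forced into 8 bits (simulated)."""
    total = (a + b) & 0xFF  # the ninth bit is thrown away here
    return total // 2


if __name__ == "__main__":
    a, b = 200, 100
    print((a + b).bit_length())     # 9 bits needed for the sum
    print(average_widened(a, b))    # 150, the correct average
    print(average_truncated(a, b))  # 22, wrong because the sum overflowed
```

Both inputs and the correct result fit in 8 bits, yet the computation only works if the intermediate sum gets the extra bit. The DCT has the same issue on a larger scale, since it sums many scaled samples.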
The JPEG spec does specify both 8 and 12 bit sample precision for lossy, but I don't think anyone ever implemented 12-bit since libjpeg never cared about it.
libjpeg does have support for it. Unfortunately, you need two copies of the library, one built for 8-bit and one for 12-bit, and you have to rename all the API functions in one of them so you don't get name collisions. I believe LibTIFF has a build configuration for this so that 12-bit JPEG data can be encapsulated in a TIFF file.