I rarely see files with incorrect file extensions, outside of collisions and user error.
I don’t see any downside apart from extensionless files, such as shell scripts that have the executable flag set and a shebang but no file extension, and that case could also be solved.
Edit: and if comprehensive correctness were truly desired, file magic could make a better guess, but that would be overkill imo.
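To make the extensionless case concrete, here is a rough sketch of what "extension first, shebang as a fallback" could look like. This is not how git does it; the two tables and the function names are made up for illustration, and only the driver names (cpp, python, ruby, bash, perl) correspond to git's builtin userdiff drivers.

```python
import os

# Hypothetical extension-to-driver table (an assumption for illustration).
# Only the driver names on the right are real builtin userdiff names.
EXT_TO_DRIVER = {
    ".c": "cpp",
    ".h": "cpp",
    ".cpp": "cpp",
    ".py": "python",
    ".rb": "ruby",
    ".sh": "bash",
}

# Hypothetical interpreter-to-driver table used for the shebang fallback.
INTERPRETER_TO_DRIVER = {
    "sh": "bash",
    "bash": "bash",
    "python": "python",
    "python3": "python",
    "ruby": "ruby",
    "perl": "perl",
}


def driver_from_shebang(first_line):
    """Return a driver name for a '#!' line, or None if it is not recognized."""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    interpreter = os.path.basename(parts[0])
    # Handle '#!/usr/bin/env python3' by looking at the word after 'env'.
    if interpreter == "env" and len(parts) > 1:
        interpreter = os.path.basename(parts[1])
    return INTERPRETER_TO_DRIVER.get(interpreter)


def guess_driver(path, first_line=""):
    """Guess by extension; consult the shebang only for extensionless files."""
    ext = os.path.splitext(path)[1].lower()
    if ext:
        return EXT_TO_DRIVER.get(ext)
    return driver_from_shebang(first_line)
```

Returning None stands for "no driver, fall back to the plain default", which is the safe outcome whenever the guess is uncertain.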
I found a 2011 patch/proposal to make it the default, which appears to have stalled: https://lore.kernel.org/git/20110825204047.GA9948@sigill.int...
With some discussion by the patch's author of possible downsides (https://lore.kernel.org/git/20110826025913.GC17625@sigill.in...):
> I think it could be a problem in the future if the builtin userdiff drivers started growing more invasive options, like automatically claiming to be non-binary (i.e., setting diff.cpp.binary = false by default). In other words, I think we have two options:
> 1. Builtin drivers like "cpp" can stay minimal, only setting funcname and color-words headers that aren't going to produce terrible results if we are wrong about detecting by extension.
> 2. We force the user to identify file types manually, so we can't be wrong. The "cpp" diff driver means "you are a text C file", and if a user mis-marks a binary file with that diff driver, they are the one who is wrong.
> So if it's an either/or situation, we should decide not only that extension auto-detection is a good feature, but that it trumps adding more advanced features to the builtin drivers in the future.
> Or we could decide that the extensions really are good enough, and if you really do have binary files named "foo.c", it's your problem to override the defaults with "*.c -diff".
There might be more recent discussion that I didn't find.