Technically speaking, this may be a case of undefined behavior. From my man page:
-u, --unique
        Unique keys. Suppress all lines that have a key that is
        equal to an already processed one. This option, similarly
        to -s, implies a stable sort. If used with -c or -C, sort
        also checks that there are no lines with duplicate keys.
...
-n, --numeric-sort, --sort=numeric
        Sort fields numerically by arithmetic value. Fields are
        supposed to have optional blanks in the beginning, an
        optional minus sign, zero or more digits (including
        decimal point and possible thousand separators).
When you use -n without a key field specification, the key is the whole line, and a whole line such as "15 aerodynamically" does not meet that description of a numeric field.
This sort does give me the intended output:
$ sort -k1,1n -k2 -u ~/tmp/sort.txt
1 a
5 which
10 exotically
15 aerodynamically
15 differentiation
20 electroencephalogram
Whether or not deduplication on keys is ideal behavior, it is what is specified here. What is not explicitly specified is what is considered to be the key when you try to sort a non-numeric line numerically.
This is the sort of problem that you get with duck typing: it does what you expect and intend, except in those corner cases where it doesn't.
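For anyone who wants to poke at this, here is a rough sketch that runs both invocations side by side. The sample lines are made up for illustration (the real ~/tmp/sort.txt is not shown in the thread), and the variable names are mine. I deliberately make no claim about what the whole-line variant prints, since that is the unspecified part; the keyed variant should keep both "15" lines and drop only the exact duplicate.

```shell
# Hypothetical sample data; the real ~/tmp/sort.txt is not shown in the thread.
tmp=$(mktemp)
printf '%s\n' '15 differentiation' '1 a' '15 aerodynamically' \
    '10 exotically' '1 a' > "$tmp"

# Whole line treated as the numeric key: lines that share a leading
# number may compare as equal, so -u may drop some of them.
whole_line=$(sort -n -u "$tmp")

# Numeric first field, then the rest of the line as a second key,
# so only lines equal on both keys count as duplicates.
keyed=$(sort -k1,1n -k2 -u "$tmp")

printf 'sort -n -u:\n%s\n\n' "$whole_line"
printf 'sort -k1,1n -k2 -u:\n%s\n' "$keyed"

rm -f "$tmp"
```

Comparing the two printed blocks on your own system shows directly which lines your sort treats as having equal keys.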
Excellent, but now I realise there are repeated lines, and I need to de-duplicate. So I use sort -u to do that
I would just pipe it to uniq, which is the solution ultimately proposed, because that makes more sense to me. I have never used the '-u' option of sort, nor would I have expected sort to have such an option (sort is for sorting, not for removing duplicate lines). Maybe that is because I'm more used to the "UNIX philosophy" than the GNU one?
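Roughly what I mean, as a minimal sketch. The input lines here are invented stand-ins for the poster's file, and the variable names are only for illustration. Because uniq compares whole adjacent lines, only exact duplicates are removed after the numeric sort.

```shell
# Hypothetical input standing in for the poster's file.
input=$(printf '%s\n' '15 differentiation' '1 a' '15 aerodynamically' '1 a')

# Sort numerically first, then let uniq drop adjacent identical lines.
deduped=$(printf '%s\n' "$input" | sort -n | uniq)

printf '%s\n' "$deduped"
```

This keeps sorting and de-duplication as two separate steps, so the question of what counts as a "key" never comes up.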