In this example the dichotomy is between String (which is guaranteed by the type system to be valid UTF-8) and OsStr (which might be in an unknown encoding or otherwise not decodable to valid Unicode).
This is exactly when you want a systems language to require explicit conversions, rather than converting things silently and possibly losing or corrupting data.
I understand the difficulty in this space; much of it is caused by forcing the Windows Unicode filesystem API onto Python as its world-view, rather than sticking to the traditional Unix bytes world-view. I'm unixy, so I'm completely biased, but I think adopting the Windows approach is fundamentally broken.
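To illustrate the two world-views (a quick sketch of my own, assuming a POSIX system and Python 3): the same directory can be listed through either API, and a name that isn't valid UTF-8 comes back as a surrogate-escaped str in one and as raw bytes in the other.

import os

# Unicode world-view (the default): names come back as str, and bytes that
# don't decode are smuggled through as lone surrogates via surrogateescape.
for name in os.listdir('.'):
    print(repr(name))

# Traditional Unix bytes world-view: pass a bytes path, get bytes names back.
for name in os.listdir(b'.'):
    print(repr(name))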
The problem there is overblown: it's basically all due to the idea that sys.stdin or sys.stdout might get replaced with streams that don't have a binary buffer. The simple solution is just not to do that (and it's pretty easy: instead of replacing them with a StringIO, replace them with a wrapped binary buffer; see the sketch after the code below). Then the code is quite simple:
import sys
import shutil

for filename in sys.argv[1:]:
    if filename != '-':
        try:
            f = open(filename, 'rb')
        except IOError as err:
            # Write the error on the binary stderr buffer so an undecodable
            # (surrogate-escaped) filename survives unmangled.
            msg = 'cat.py: {}: {}\n'.format(filename, err)
            sys.stderr.buffer.write(msg.encode('utf8', 'surrogateescape'))
            continue
    else:
        f = sys.stdin.buffer
    with f:
        shutil.copyfileobj(f, sys.stdout.buffer)
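If something really does need to swap out sys.stdout (capturing output in a test, say), here's a rough sketch of the wrapped-binary-buffer idea (my own illustration; the variable names are made up): wrap a BytesIO in a TextIOWrapper so .buffer still exists and the code above keeps working unchanged.

import io
import sys

# Instead of sys.stdout = io.StringIO(), which has no .buffer and breaks
# the code above, wrap a binary buffer in a text layer:
captured = io.TextIOWrapper(io.BytesIO(), encoding='utf8',
                            errors='surrogateescape', write_through=True)
sys.stdout = captured

sys.stdout.buffer.write(b'raw bytes go straight through\n')
print('ordinary text output still works')

sys.stdout = sys.__stdout__           # put the real stream back
print(captured.buffer.getvalue())     # everything that was captured, as bytes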
Python's surrogateescape'd strings aren't the best solution, but I personally believe that treating Unicode output streams as binary ones is even worse.
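To make that concrete (a minimal round-trip sketch): surrogateescape lets an undecodable byte survive the trip through str and back, which is why the error message above can include a weird filename without either raising or mangling it.

raw = b'report\xe9.txt'                        # not valid UTF-8
name = raw.decode('utf8', 'surrogateescape')   # 0xe9 becomes the lone surrogate '\udce9'
assert name == 'report\udce9.txt'

# A strict encode would raise UnicodeEncodeError; surrogateescape restores
# the original byte exactly, so no data is lost or corrupted.
assert name.encode('utf8', 'surrogateescape') == raw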