I actually had to work with Python strings at the C level recently, and their ap...

I actually had to work with Python strings at the C level recently, and their approach is pretty clever. IIRC, the runtime can take any common form of Unicode, and will store it. When you access that string, the accessor requests a specific encoding, and the runtime will convert if need be, and then store it in the string object.

So it handles the (very) common case of needing the same encoding multiple times (e.g. for all file paths on Windows), while not introducing too much overhead in memory or speed.

I could be mistaken on exact details though, especially since I recall there being multiple implementations even within py3.x.