5 makes perfect sense to me; the author's complaints seem kinda silly.
One place this makes sense: what do you expect to get if you do something like:
emoji = " "
print(emoji[:3])
Should this throw an error because there's only one displayed "character"? Should it return only a partial codepoint by returning only the byte data for the first 3 bytes?
Modern strings are complex objects that have evolved a bit past char[] or byte[].
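To make the two options concrete, here's a rough sketch. The emoji is my assumption (man facepalming with a skin tone modifier, spelled out as escapes so the code points are visible); swap in whatever you like. It compares slicing the str with slicing the UTF-8 bytes:

```python
# Assumed example emoji, written as escapes so each code point is explicit:
# FACE PALM, skin tone modifier, ZERO WIDTH JOINER, MALE SIGN, VARIATION SELECTOR-16
emoji = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"

# Slicing a str counts code points, so this keeps the first three of the five.
first_three = emoji[:3]
code_points = [f"U+{ord(c):04X}" for c in first_three]

# Slicing the UTF-8 bytes instead cuts the first (4-byte) code point short.
raw = emoji.encode("utf-8")[:3]
try:
    raw.decode("utf-8")
    bytes_ok = True
except UnicodeDecodeError:
    bytes_ok = False

print(code_points)  # ['U+1F926', 'U+1F3FC', 'U+200D']
print(bytes_ok)     # False
```

So the str slice gives you a valid (if visually different) string, while the byte slice gives you something that isn't even decodable.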
> Strings are just an array of unicode codepoints rather than "characters", so all I'm doing is asking for the first three of those codepoints.
"Ice trays are just a pile of molecules rather than "cubes", so all I'm doing is separating those molecules", he states as he activates the igniter.
> Substring is a broken operation? What's the justification for that idea?
You take a thing and you mangle it beyond recognition without regard for its purpose or meaning. That's like considering the jaws of life a normal part of opening a door to take a piss at work.
I think this is where the misunderstanding comes in. Python doesn't treat strings as char[] but as essentially unicode_codepoint[].
Whether this is a good idea on the whole is debatable; there's even a full PEP talking about the security concerns around doing it this way[1].
However, given this is how it works, the behaviour displayed makes complete sense to me and is the best of the bad choices presented by needing multi-byte strings.
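As a quick illustration of where the different numbers come from, here's a small sketch. The emoji is an assumed example (a facepalm ZWJ sequence with a skin tone modifier), not necessarily the exact one from the article:

```python
import unicodedata

# Assumed example: one displayed "character" built from five code points.
emoji = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"

# len() on a Python str counts code points.
codepoint_count = len(emoji)

# Other languages count encoded units instead, which gives other answers.
utf8_bytes = len(emoji.encode("utf-8"))
utf16_units = len(emoji.encode("utf-16-le")) // 2  # 2 bytes per UTF-16 code unit

# Names of the individual code points, with a fallback for unnamed ones.
names = [unicodedata.name(c, "<unnamed>") for c in emoji]

print(codepoint_count, utf8_bytes, utf16_units)  # 5 17 7
print(names)
```

None of those counts is "1", which is the number of things you actually see on screen; getting that requires grapheme cluster segmentation, which the str type doesn't do for you.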