> Python 3 requires you to do something more complicated when crap comes in. Or ...

zzzeek · on May 12, 2014

> (Or show me your repo and I show you all the bugs you now have)

this would be great. Show me! I'd love to know:

https://bitbucket.org/zzzeek/sqlalchemy/

I'm guessing you'd go for Mako first since it has the most unicode intense stuff going on (and it uses lots of your code).

the_mitsuhiko · on May 12, 2014

As an example, take the Mako CLI. You can call this an error or not, but under the C locale your command line will die with UnicodeErrors on Python 3 when you open a nonexistent file with a unicode filename, whereas Python 2 does the correct thing. It will also die with unicode errors in the same situation when your template renders any unicode characters. Again, that is something that probably works fine, and correctly, on Python 2.
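To make the failure mode concrete, here is a minimal sketch that simulates an ASCII-only output stream, which is what stdout amounted to under the C locale on the Python 3 versions being discussed. The helper name `write_like_c_locale` is made up for illustration and is not part of Mako; it only shows how strict ASCII encoding rejects non-ASCII text and how a lenient error handler avoids the crash.

```python
import io


def write_like_c_locale(text, errors="strict"):
    """Write text through an ASCII-only text stream and return the raw bytes.

    Hypothetical helper: it stands in for a stdout whose encoding is ASCII.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", errors=errors)
    stream.write(text)   # raises UnicodeEncodeError in strict mode on non-ASCII
    stream.flush()
    return raw.getvalue()


if __name__ == "__main__":
    print(write_like_c_locale("plain ascii"))
    # A lenient handler escapes the character instead of raising.
    print(write_like_c_locale("caf\u00e9", errors="backslashreplace"))
    try:
        write_like_c_locale("caf\u00e9")
    except UnicodeEncodeError as exc:
        print("strict mode failed:", exc.reason)
```

Whether escaping or failing is the right behavior for a command-line tool is exactly the judgment call being argued about in this thread.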

Or if you put unicode characters into your README.rst, you could no longer safely install Mako. Again, Python 3 only.

These are just two things I found on github.

Another easy one: Alembic READMEs can no longer safely contain unicode. They would break on Python 3, but work just fine on Python 2 because of the code in list_templates.

zzzeek · on May 12, 2014

The cmdline template runner isn't doing unicode in Py2K at the moment either; it crashes there too.

wbond · on May 12, 2014

I would not be surprised if you can construct contrived examples of how Python 3 can be broken. In my experience, writing real life code, I ship more stable software writing in Python 3 than Python 2.

I mostly work with subprocesses or directly reading data from socket connections, and I run all of my bytes through strict mode. If something doesn't decode properly, an error is returned. Currently I am working on an interactive way (inside of Sublime Text) to present to the user a way to see text in different encodings so they can help debug the issue on their own.
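A rough sketch of what "running bytes through strict mode" could look like is below. The function name `decode_strict` and the idea of trying a list of candidate encodings are assumptions made for illustration, not wbond's actual code; the only point is that a failed decode surfaces as an error instead of silently producing garbage.

```python
def decode_strict(data, encodings=("utf-8",)):
    """Decode bytes strictly, trying each candidate encoding in order.

    Returns (text, encoding_used). Raises ValueError if none of the
    candidates can decode the data without errors.
    """
    last_error = None
    for name in encodings:
        try:
            return data.decode(name, errors="strict"), name
        except UnicodeDecodeError as exc:
            last_error = exc  # remember why the last attempt failed
    raise ValueError(
        "could not decode bytes with any of %r" % (tuple(encodings),)
    ) from last_error


if __name__ == "__main__":
    # Bytes as they might arrive from a subprocess pipe or a socket.
    print(decode_strict(b"caf\xc3\xa9"))
    print(decode_strict(b"caf\xe9", ("utf-8", "latin-1")))
```

Returning the encoding that worked alongside the text gives the caller something to show the user when they need to debug a mislabeled source.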

So, yes, you need to write helper functions and have an interface to deal with properly handling encodings. This has been my experience in every language I've ever worked in. I can't imagine there is a way around it. Is this a reason Python 3 sucks compared to 2? Not in my experience. I had far more issues in Python 2 with encodings and not being sure what other libraries and packages had done with regard to handling unicode data. Hmm, so ftplib accepts unicode for filenames. Does it encode it? What encoding does it use? Oh, look at that, it has just been coercing to ascii because it can.

So yeah, writing a simple little toy command line app needs more boilerplate to deal with unicode. Any real app is going to need that and a ton more. And you are going to have to decide how to error with encodings, and how to let users identify encodings. And you are going to need to write a global exception handler for Python to capture unexpected exceptions and log them to a file so users can send crash reports. Yay, sys.excepthook!
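For reference, a global handler of the kind described above might be sketched like this. `install_crash_logger` and the log path are hypothetical names chosen for the example; `sys.excepthook`, `sys.__excepthook__`, and `traceback.format_exception` are the standard library pieces it relies on.

```python
import sys
import traceback


def install_crash_logger(log_path):
    """Install a sys.excepthook that appends uncaught tracebacks to log_path."""

    def hook(exc_type, exc_value, exc_tb):
        # Append the formatted traceback so users can attach it to a report.
        with open(log_path, "a", encoding="utf-8") as fh:
            fh.write("".join(
                traceback.format_exception(exc_type, exc_value, exc_tb)
            ))
        # Still show the default traceback on stderr.
        sys.__excepthook__(exc_type, exc_value, exc_tb)

    sys.excepthook = hook
    return hook


if __name__ == "__main__":
    install_crash_logger("crash.log")  # hypothetical log location
    raise RuntimeError("unexpected failure")  # ends up in crash.log
```

Opening the log with an explicit `encoding="utf-8"` keeps the crash report itself from depending on the locale.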

But anyway, I think it all comes back to the fact that I know what I am dealing with far more quickly with Python 3 than with 2. Again, maybe because I don't write apps that deal with local file paths (except abstracted through a subprocess).

Unfortunately, most of the code where I deal with crappy encodings from FTP servers and SVN is closed source. The open source stuff is at https://github.com/wbond.

muyuu · on May 13, 2014

> In my experience, writing real life code, I ship more stable software writing in Python 3 than Python 2.

Real life code is not the same for everybody.