Doesn't this say something about the language if it's been so difficult for people to migrate from 2 to 3? Yes, it was a major version change, but it doesn't bode well for arguments that Python is a good language to do long-term, maintainable, and large-scale development. Sure, now things are "settled" with 3, but Python continues to get more features. It just doesn't look good for the language if even doing a language version migration was a literal brick wall for so much of the community.
Let me be blunt here: Python sucks, but that's at the level of 'programming languages people complain about and programming languages nobody uses'. However the Python libraries absolutely rock and the ease with which you can get really performant number crunching code out of what is nominally an interpreted language is amazing. Statistics, machine learning, engineering, the notebooks etc, I would not pick anything else for work like that, it's a near infinite box full of powertools.
But it would be a mistake to consider Python anything but luxury glue no matter how much you've used it for production code, it isn't and never will be a bullet proof language, there are just too many ways in which the runtime can surprise you and the long term maintainability of any Python codebase is always going to be questionable, in part because you will most likely have dependencies on stuff that will silently disappear or change out from under you while you're not watching.
I've been writing Python code pretty much since it was first released and there isn't a single piece of code that I ever put together that worked longer than a few years out of the box without some breaking change. This as opposed to just about every other language that I've used, which I find very frustrating because Python could be just about perfect. And I'm still sore about significant whitespace ;)
Funny, for about every point I feel the opposite way :-). Python is an elegant language that hits the sweet spot where it is very expressive, and easy to do what you want, and at the same time has enough structure so you don't produce a mess all the time.
The libraries are very powerful, but a lot of them are hacks. Numpy and Pandas totally rely on magic and overloading the array indexing operator. You can't express a problem the natural way or it will be slow, you have to think the numpy way. I find it especially confusing if you create expressions with numpy arrays. Am I operating on them element wise, or creating a "cross product", or am I creating some kind of magic indexing object? Matplotlib is also annoying, there are so many ways to do things and half of it is object oriented, while half of it uses global state.
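To make the "which one am I doing?" confusion concrete, here is a minimal sketch (array names are made up for illustration) where three similar-looking expressions do three different things: element-wise arithmetic, a broadcast "every pair" product, and boolean-mask indexing.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

# Same shapes: plain element-wise multiplication.
elementwise = a * b

# Adding an axis makes `a` a (3, 1) column; broadcasting against the
# (3,) row `b` produces a 3x3 table of every pairwise product.
outer = a[:, None] * b

# A comparison returns a boolean array rather than a single bool...
mask = a > 1

# ...and indexing with that array selects the elements where it is True.
picked = a[mask]
```

Nothing in the syntax of `a * b` versus `a[:, None] * b` tells you which case you are in; you have to track the shapes in your head.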
One thing I like about Python is the web frameworks (Django, Flask). Async is also good, although it took almost 10 years for vanilla Python to catch up with what Twisted already had.
About code breaking I also have the opposite experience. I wrote a little Gtk app in the late 2000s and it worked with little or no changes for many Python and Gtk versions, thanks to the dynamism of Python. That would be completely impossible with a compiled language.
Interesting, and nice to have a contrasting datapoint.
I do think that "The libraries are very powerful, but a lot of them are hacks. Numpy and Pandas totally rely on magic and overloading the array indexing operator. You can't express a problem the natural way or it will be slow, you have to think the numpy way. I find it especially confusing if you create expressions with numpy arrays. Am I operating on them element wise, or creating a "cross product", or am I creating some kind of magic indexing object? Matplotlib is also annoying, there are so many ways to do things and half of it is object oriented, while half of it uses global state." is an illustration of my point: the language itself isn't powerful enough to do the job so you rely on a lot of libraries written in different languages to glue it all together.
An 'elegant' language would provide a way to do so without all of these leaky abstractions.
To me Clojure is such a language, but it doesn't have nearly the kind of ease of use that Python has.
> is an illustration of my point: the language itself isn't powerful enough to do the job so you rely on a lot of libraries written in different languages to glue it all together.
The language is 100% capable. Matplotlib's API is horrible and confusing because one of their design goals was/is to be similar to matlab (that's where the "mat" in its name comes from) to allow people who are comfortable with matlab to switch more easily.
Matlab's plotting facilities are a mess of global state, so matplotlib has copied it.
> An 'elegant' language would provide a way to do so without all of these leaky abstractions.
It would have to do something else, first: Exist.
In 10+ years of programming, I've yet to ever see an implementation of a complex abstraction that was comprehensive and complete enough that I didn't find a corner case requiring me to reach through it and deal with platform specific something-or-other.
You are right, you can't separate the language and the libraries, and a language would profit if it had a way to express these abstractions better. But if you added that kind of syntax to the Python language, I think it would become bloated.
I think I like Python, because coming from C, a lot of things that are tedious there are effortless in Python (Strings! Lists and dicts, libraries, and so on.)
I came from C as well and even though Python does make strings, dicts and so on easier the main difference between the two (compiled vs interpreted) is exactly why you get all those libraries in the first place: it is super hard to get good performance out of an interpreted language if you want to crunch numbers.
That's why each and every number crunching problem will eventually make use of the various escape hatches to call libraries written in a language that is performant.
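A rough sketch of the effect, using only the standard library. It assumes CPython, where the built-in `sum` is implemented in C, so the second function pushes the whole loop out of the interpreter. The function names are made up for illustration and the timings will vary by machine.

```python
import timeit

def python_loop_sum(n):
    # Every iteration here is executed by the bytecode interpreter.
    total = 0
    for i in range(n):
        total += i
    return total

def builtin_sum(n):
    # Same result, but the loop itself runs inside the C implementation.
    return sum(range(n))

if __name__ == "__main__":
    n = 1_000_000
    print("loop:   ", timeit.timeit(lambda: python_loop_sum(n), number=5))
    print("builtin:", timeit.timeit(lambda: builtin_sum(n), number=5))
```

NumPy and friends apply the same trick on a much larger scale: the Python code only describes the work, and the inner loops live in compiled code.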
As for strings: BASIC also had strings. That doesn't mean that having strings (or even dicts) is what makes a language elegant.
The problem really is that there are only so many sweet spots for programming languages and usually those come with limitations with respect to the domain you want to use them for. Pick any two: expressive, fast, easy to use.
NumPy/matplotlib make perfect sense for Matlab users (it's basically a port of Matlab semantics). If Matlab were free and didn't suck so much as a general purpose language, I wonder to what extent Python would have been adopted in scientific computing (Matlab is still pretty strong in some communities, to be fair).
Very strange, about the whitespace. I write C++ most of the time and I love significant whitespace. I like the combination of power and (...these days, relative...) simplicity of the language for all kinds of helper tools. Probably wouldn't want to use it to write something that is large and / or needs to run fast.
I've always found whitespace to be like variable names: restrictions coming from the language always seem to make it harder to use them to convey things to the next person who comes along.
The code typesetting is for the humans not the machines.
The best thing about significant whitespace is the reduction in pernickety code review comments about it compared to other languages. If you also agree to "just run black" to format code then that kind of bullshit drops to almost 0 in a team.
But if you agree on a certain formatting tool anyway, then you don't need significant whitespace to get that. That's one of the things I like most about Go. It shipped with gofmt from day one, so literally no one ever argues about coding style.
People totally do argue about go coding style. Since gofmt doesn’t do line breaks, where to break long lines is a constant struggle. You also have stricter formatters like gofumpt addressing certain problems not handled by gofmt, but the long line problem is still up in the air AFAIK.
The bad thing about significant whitespace is it breaks copy-pasteability of code, with no way to get broken formatting back without understanding what the code does.
That is a lot worse than formatting squabbles, especially for a glue language where you are expected to be able to copy snippets for short scripts all the time.
That and even with version control you will always have to keep an eye out for accidental deletion of the last bit of whitespace on any code block. It's a pretty easy mistake to overlook without context.
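A small illustration of that last failure mode (function names are made up for the example). Both versions are valid Python and differ only in the indentation of one line, yet they behave differently:

```python
def running_totals(values):
    total = 0
    log = []
    for v in values:
        total += v
        log.append(total)  # inside the loop: one entry per value
    return total, log

def running_totals_dedented(values):
    total = 0
    log = []
    for v in values:
        total += v
    log.append(total)      # lost one indent level: runs once, after the loop
    return total, log
```

No syntax error, no warning, and the returned total is the same in both cases, so only a test or a careful diff review would catch it.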
My biggest pain point with Python is C / Rust dependencies. So, not really Python per se, as I suspect any language relying on C ABIs will hit the same issues.
I started a project in 2016 and used pipenv to freeze all the dependencies. A few weeks ago, I could not get it working on a new machine. Apparently the new system python broke some of the libraries. I had also pinned the python version in pipenv, but it seems like pipenv can only install python versions that are not too old.
I really like the language, but all the tooling is seriously broken.
If you upgrade the runtime / compiler, you'll run into changes of behaviour. This applies to almost every implementation out there, and across the board once the codebase is large enough.
Pipenv has pyenv integration. It just doesn't work with python 3.6 on my machine. My guess is that it tries running pipenv itself with the old interpreter and is no longer compatible. In any case, faffing around with 25 different third-party solutions is a horrible developer experience.
Fwiw what I've settled on that seems the most stable is to use pyenv as the "outer layer", as it shims the python interpreter along with any venvs so you don't get conflicts / heisen-incompatibility. Lets you use anaconda envs as children as well. The real trick, I think, is to pick one system as the canonical master env controller and don't deviate.
I have no problem running stuff from a decade ago either on the same box and installation that it was on a decade ago. But virtual environments and pip freeze are closer to mothballing an old box than displaying long term code compatibility and functionality. Their main reason to exist is as a workaround precisely because there are such issues.
The whole idea that you need a multitude of environments to support applications written in the same interpreted language is where it breaks, that should not have been necessary if backwards compatibility had not been an issue.
Probably not. Java is going through the same exact thing with Java 8 right now. The issue is that humans don't plan ahead for when the software they use inevitably goes EOL, be it programming languages, libraries, or operating systems, and so only really do so when forced (sometimes only after a security incident). In the case of Python 2 and Java 8, the problem was/has been exacerbated by the organization supporting the software continuing to push the support window out further to "give time to upgrade" because in response everyone either said "I have time, I'll get to it later" or "I'll wait for the next newest version".
For Java 8 especially, the latter reason is particularly amusing, as Oracle's extended support is now slated to go to 2030 (16yrs after release, for comparison Py2.7 got 10yrs), but Java 21, the next Java LTS that should release next year, will be on Oracle's standard 8 year extended support cycle, and thus extended support only goes til 2031. So why would anyone decide, much less how would one justify to a corporate overlord, to spend all the time and resources to refactor their code, upgrade libraries (assuming the libraries themselves upgrade), update build systems, perform extensive regression testing, and everything else, just for a version that would be EOL one year after what they just upgraded from?
> So why would anyone decide, much less how would one justify to a corporate overlord, to spend all the time and resources to refactor their code, upgrade libraries (assuming the libraries themselves upgrade), update build systems, perform extensive regression testing, and everything else, just for a version that would be EOL one year after what they just upgraded from?
The longer you delay upgrades, the bigger the delta, the more work it will be to update. Instead of doing two medium sized updates, one today and another one in 2030, you'll be doing a huge one in 2030. That might be a trade off you might be willing to make today, but you might regret it in 2029 when you're halfway through a multiyear migration project where all feature work is on hold.
Most Java versions are good about backcompat, almost all the pain comes with Java 9 actually enforcing boundaries and breaking things that had merely been discouraged before (in the name of JDK modularization, which we otherwise aren’t interested in).
The string/bytes encoding led to some weird runtime problems in a codebase I maintain (the quality of which is probably not that great to begin with). Also changes in non-statically compiled languages will always be a bit tricky. In general, a comprehensive unit test suite is required to step in for the compiler, otherwise there's no confident refactoring. But python is sooooo ubiquitous it's crazy, and most code does not have unit tests to make up for missing compile time checks. The codebase I mentioned earlier certainly doesn't.
I’m pretty cautious about changing my own code, but even I am not paranoid enough to write tests to verify that freakin’ strings still have the behavior I rely on.
I tried adding type hinting to a program that I had to maintain once - because it had no unit tests and I was looking for the quickest way to make it less fragile. One tended to make a small change and then find that it broke something somewhere but finding that out was a long process.
FWIW I found it nearly useless - whereas biting the bullet and refactoring it so that it could be unit tested - even just a few tests - made a world of difference.
> At least not until the day they suddenly didn’t.
That's been happening more and more recently to me. Python 2 really was lax about letting you mix types and get shit done fast, but Python 3 is a different beast entirely. Really opinionated.
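A few examples of the kind of strictness meant here, as a sketch under Python 3 (the `outcome` helper is made up for illustration). Python 2 accepted the first two silently and gave `1` for the division:

```python
def outcome(fn):
    """Run fn and return its result, or the exception class name if it raises."""
    try:
        return fn()
    except Exception as exc:
        return type(exc).__name__

# Ordering comparisons between unrelated types are rejected in Python 3.
sort_mixed = outcome(lambda: sorted([3, "a", 1]))
compare_none = outcome(lambda: None < 1)

# str + int was already an error in Python 2; listed here for contrast.
add_mixed = outcome(lambda: "1" + 1)

# `/` is always true division in Python 3; use `//` for the old int behaviour.
true_div = 3 / 2
floor_div = 3 // 2
```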
The actual migration should you choose to do it was almost trivial. The official conversion script did most if not all the work. The changes were all cosmetic, like "print 42" changing to "print(42)".
I'd say the uproar was about the minorness itself. The syntax had these tiny but otherwise incompatible changes, and for many people, for no good reason. For example it was deemed that, as in the example above, print shall not be a statement anymore, but a function instead. Why? Unclear. Meanwhile, Python 3 offered few concrete improvements to many users.
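For reference, a short sketch of what the function form allows that the statement did not express as ordinary arguments. Whether that justified the break is exactly what is being argued about here. The `run` helper and the buffer are made up for the example.

```python
import io

buffer = io.StringIO()

# Separator, line ending and destination are ordinary keyword arguments.
print("a", "b", "c", sep="-", end="!\n", file=buffer)

# As a regular function, print can be passed around like any other callable.
def run(report=print):
    report("done", file=buffer)

run()
output = buffer.getvalue()
```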
So I think the 2->3 moan, of which I was a party, was about that: minor, unnecessary changes that introduced little to no benefit, and yet backwards-incompatible.
> The actual migration should you choose to do it was almost trivial. The official conversion script did most if not all the work. The changes were all cosmetic, like "print 42" changing to "print(42)".
If that were it, it'd have been over a lot sooner. For some packages, bytes vs Unicode strings was bad. For many more, layers of dependencies were an impediment. The trivial stuff wasn't the problem.
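A minimal sketch of the bytes vs Unicode split in Python 3 (the sample data and the `try_concat` helper are made up for illustration). Code that treated the two as interchangeable under Python 2 hits these at runtime, often far from where the data entered the program:

```python
data = b"caf\xc3\xa9"         # raw UTF-8 bytes, e.g. from a file or socket
text = data.decode("utf-8")   # explicit decode is required to get a str

def try_concat(left, right):
    """Return left + right, or the exception class name if that fails."""
    try:
        return left + right
    except TypeError as exc:
        return type(exc).__name__

mixed = try_concat(data, "!")  # bytes + str is rejected
same = (b"abc" == "abc")       # bytes and str never compare equal
first = data[0]                # indexing bytes yields an int, not a 1-char string
```

None of this is something an automatic conversion script can decide for you, because it depends on whether each value is meant to be text or binary data.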
But for example a big stumbling block was the scientific programming community, from which I also stem. There were no major syntactical reasons there and yet people were up in arms, for the reasons I describe above.
Most people I know migrated only when python2 literally wouldn't work for them anymore, like they needed a feature of a library only released for python3.
If this was a frequent event then yes that would be an issue. But Python 3 released in 2008, more than 14 years ago now, and since then there has been no sign of such a new big change. While the core developers briefly talked about making Python 3.10 into 4.0 that was scrapped, and currently the plan is just to continue with Python 3.x "forever".
And yet it says a lot that people stayed. I'm not sure it says whether the Python 3 move was necessary so much as it says something about developers in general.
It wasn’t so much the difficulty, but how little all that effort got you. It took them until about 3.7 to even reach performance parity. It was all stick no carrot.
Any time someone argues for a breaking change that will have oh so many benefits if we just get the whole ecosystem upgraded this one time, it should come with an explanation of how the new version will manage to make the same substantial changes without breaking any code again, or it shouldn't be taken seriously.