Python 3.9 is around the corner (lwn.net)
71 points by rbanffy on Sept 29, 2020 | 79 comments


Argh, Python is moving too fast. 3.8 has barely settled in with the walrus operator.

For people doing Docker-based stuff, always on the latest and greatest, it might be OK. However, for enterprise environments where OS versions do not move as fast and you have runtime dependencies everywhere, it isn't good.

We have to build C++ libs against specific Python versions, keep thousands of users' workstations on a sane global Python version where all dependencies (1k+ in-house/external packages) work well, etc. etc. (think CAD-style software / Qt UIs).

We were still in the process of migrating towards python3.7 and it was a huge effort for barely any real gain. Hopefully new libraries do not use 3.8/3.9 only features otherwise we're in for a bad time..


Python 3.8 will still be supported until 2024, that is, 5 years from its release. I guess that's fair enough.


> Hopefully new libraries do not use 3.8/3.9 only features

Ideally, for production environments, you'll want to pin version numbers for everything you can, so new packages don't intrude. I also like to keep testing environments with minimal dependencies and no pinning (I built `pip-chill` for that). Wouldn't that work for you?
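
For illustration, the split I have in mind looks roughly like this (package names and versions are just placeholders):

    # requirements.txt (production): everything pinned, transitive deps included
    requests==2.24.0
    urllib3==1.25.10

    # requirements-test.txt (testing): top-level deps only, unpinned
    requests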


The problem is vulnerabilities. Your vulnerability scanner says your pinned dependency has a vuln, and the only solution is to upgrade versions.

By and large, libraries don't backport vulnerability fixes. You have to update from 1.0.1 to 2.3.3 because they were only notified of the vuln after 2.3.2 was released. In practice, almost no open source libraries would also release a 1.0.2. Maybe some of the largest, most well-run libraries will, but enough don't that you will run into this problem.


> In practice, almost no open source libraries would also release a 1.0.2.

Well, if package consumers such as yourself don't put in the legwork to make that happen then yeah, don't expect others to do the work you personally need for free.

I'm sure that if you posted a pull request for that hypothetical 1.0.2 release then the project would gladly receive it with open arms.

People should keep in mind that open source software isn't a one-way street.


I don't think the parent comment argues that anyone should backport their changes for free. Their point is, upgrading everything is cheaper than backporting security fixes.


That works but doesn't solve the problem mentioned by the other comment. We also have to support a matrix of Python runtimes and dependencies on all of our machines. We can't pre-bake the environments. It's a fairly unique problem for HPC, I guess. We don't use pip; have a look at the Rez package manager, if you haven't yet, for an idea of the problem.

For our web services it is much less of a problem, as we can isolate the runtime from software installed on machines or from third-party software dependencies that we integrate with (think a Blender plugin; Blender comes with its own versions of dependencies).


Rez looks really great! Thanks for sharing.


> We don’t use pip

What's the problem?


Have you read about the problems Rez solves? Our environments are too dynamic and need to always be available. Our HPC cluster runs millions of different combinations of software runtimes on any given day...


My understanding is that it operates on a different level than Pip.


Hm, not really. It has build scripts in which you do whatever you want to build your package and put it somewhere, and then it adjusts PYTHONPATH dynamically at runtime.

You don't need pip if you're using Rez. You also don't use a setup.py; you use a package.py instead.
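
For those who haven't seen it, a Rez package is described by a package.py roughly like the sketch below (names, versions and paths are made up; the Rez docs have the real field list):

    # package.py -- hypothetical Rez package definition
    name = "mytool"
    version = "1.0.0"
    requires = ["python-3.7", "numpy-1.18"]

    def commands():
        # evaluated when Rez resolves an environment; extends PYTHONPATH at runtime
        env.PYTHONPATH.append("{root}/python")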


That helps avoid breaking an existing codebase.

But it doesn't help if there's a nice new library you really need that uses new 3.8 features, and you can't use it because you're stuck on 3.6, for example...


We can always offer a PR that allows the library to run on older versions. I am limited to 3.6 on many environments and would be extremely happy to know others on 3.6 have better library support.

Having said that, it's not something I feel. I know some things can get faster and some syntax can be improved, but, so far, I'm pretty OK with those envs running on 3.6.

As for me, locally I use 3.8. If I do any 3.8-ism, CI will catch it.


Genuinely asking, because you've described the sort of CI/CD problem that there are existing solutions to: the openSUSE Build Service, Red Hat's own tooling, and commercial services from several vendors designed to tackle these "operating system permutations" build problems... Is one release a year, with security fixes going back four versions, really too fast?

All the gory details about the release timeline are in PEP 602 - https://www.python.org/dev/peps/pep-0602/


I believe the biggest issue is having incompatible additions to the language... For example, any code written in 3.8 with the walrus operator will not work in Python 3.7.
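
For example (a trivial snippet; on 3.7 it fails to even parse, so a library shipping it can't be imported there at all):

    import re

    # Python 3.8+ assignment expression ("walrus operator"); SyntaxError on 3.7
    if (match := re.search(r"\d+", "build 42")) is not None:
        print(match.group())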

That is IMO not ideal. If it were security fixes / new features / improvements with no new syntax, then I would be OK.

I would say a lot of companies still have a very big legacy code base in 2.7... New services are created in Docker using the latest Python 3, but desktop applications aren't as easy to upgrade constantly.

Some companies also use NFS to store Python libraries (we release 100s of times per day) and do some wrangling of PYTHONPATH to allow applications to find dependencies. Having to support multiple (incompatible) versions of Python on a workstation isn't a good problem to have.


Backwards compatibility I understand, but which languages give forward compatibility?

I mean how would that with new features even work?


>Backwards compatibility I understand, but which languages give forward compatibility?

Python used to offer it frequently. from __future__ import ...
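
e.g. on Python 2.7 you could opt in to Python 3 behaviour ahead of time:

    # Runs on 2.7 but uses the Python 3 semantics for these features
    from __future__ import print_function, division, unicode_literals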


Code written in C89 can trivially link against code written in C18. This also mostly works in C++ as long as new features are not used at the interface level and you don't have an ABI mismatch (e.g. with libstdc++ std::string, although that's possible to work around with proper configuration). Any language exporting a C FFI should work just fine.


Hm, not exactly what I tried to convey. I guess I would love it if Python stopped trying to add new syntax so often.

Walrus operator, f-strings, etc. should happen maybe once every two years, ideally? Did we really need the walrus operator? The cost of adding it is making any library that uses it on Python 3.8+ incompatible with runtimes older than 3.8.


In an ideal world, library developers would properly use the python_requires [1] option, and dependency resolvers would correctly identify the latest version of a library that works with the installed version of Python. Unfortunately, I have found that is not always the case.

When I make public modules, I personally weigh whether using a new feature is worth losing users stuck on an older version. That sweet spot for me right now is targeting Python 3.6 and using backported modules where necessary (e.g. mypy-extensions). I'm a big type hint fan and type hinting became much more usable in >= 3.6.

[1] https://packaging.python.org/guides/distributing-packages-us...
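
For anyone who hasn't used it, the declaration goes in the package metadata, roughly like this sketch (the package name and version are made up):

    from setuptools import setup

    setup(
        name="mylib",               # hypothetical package
        version="1.2.0",
        python_requires=">=3.6",    # modern pip skips releases the running interpreter can't use
    )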


> We were still in the process of migrating towards python3.7 and it was a huge effort for barely any real gain.

Besides the pursuit for the latest and greatest, is there really anything forcing you to upgrade?

I mean, Python 3.7 will be around for at least 2 more years.

https://www.python.org/dev/peps/pep-0537/


The problem is the "huge effort", not the "barely any real gain": you are not automating enough. Building and packaging "1k+ in-house/external packages" should take longer than building only one, but it shouldn't be more difficult with proper scripting and build systems.


Yeah, of course :) We have terabytes of Python code written over the past 20 years that support our business. Good luck convincing the business to stop producing code that generates revenue in order to "modernize" all that stable code.


Are you migrating to 3.7 from 2.7, or 3.6?


I would say 90% of our code base is still in 2.7 and 10% in 3.6.

That’s for the desktop applications, we have a bunch of services in 3.8 already.


The Python packaging "solution" is to have a whole Python world for everything you want to run (aka insanity).


Yes, and snap/flatpak's idea of software installation is a whole container per software.

You have to choose one: isolated environments or dependency hell. Each has its own pros and cons.


Heard of virtualenv?


Yes... that's exactly what I was referring to. The problem is this: https://github.com/pypa/packaging-problems/issues/328


I've been thinking of offering a patch to `fdupes` to turn duplicates into CoW copies. ZFS and BtrFS have them covered, but I'm not sure about APFS and I bet NTFS doesn't.

In any case, it's really hard to get a workstation with less than a terabyte of storage in it. I have 2.9 gigs under mine and that's for 53 environments. I'm not sure that's a huge issue.


Sure, in some use cases it doesn't matter. In others (e.g. on an HPC cluster where the quota in your home directory is measured in GB) it does. CoW probably won't help on GPFS or Lustre. Arguably Python has no place in HPC, but a lot of people use it. And on my laptop I technically have the space, but I am loath to give it up to duplicate various packages many times.


It's somewhat dangerous, but you can always replace dupes with whatever links work on your filesystem. Lustre, IIRC, supports hard links. I'm not sure whether GPFS does, but I don't see any reason why it wouldn't. If that fails, symlinks should work. The Python environment directories shouldn't be writable anyway.


> The Python environment directories shouldn't be writable anyway.

Yeah, except people start putting together their own Python environments in their home directories because that's what third-party packages suggest and asking admins to install random packages globally for random one-off tasks is not really sustainable. I'm sure there are workarounds, but it would be nice if virtualenv had some automatic way of figuring this out.


Sorry. Not what I meant. Your programs shouldn't write to these directories while running anyway, so once deduplicated, they should remain that way.


Ah sorry! I see what you mean now. I guess it wouldn't be so hard to write a script to deduplicate a bunch of virtualenv environments, but I'm scared to be the one to go down that rabbit hole.


There is a version of fdupes that makes hard links. Maybe it's enough for what you need.


What would you suggest instead for HPC?


Well, generally you want something with better performance characteristics, which increasingly is not what gets chosen (see e.g. https://www.nature.com/articles/s41550-020-1208-y). Though even if your stuff is implemented in Fortran or C or C++ or on a GPU, Python is still a natural scripting language and is often the primary interface to some other piece of code, thanks to SWIG or boost::python or cppyy. But Python's fast rate of change and lack of backwards compatibility make it problematic for this use. Our HPC cluster recently forced everyone to upgrade to Python 3 and that caused all sorts of problems for everyone. I understand there isn't really a choice given the end of Python 2.7 support, but this has not been a great experience for anybody.


Yeah totally. Our use cases seem very similar. We still haven't transitioned to python3 at our HPC.

Well... we do have it available, but only around ~10% of the workloads make use of Python 3 and its libraries. We have created two different paths on our NFS, and if you're on Python 3 then you can't access Python 2, and vice versa.

We will be migrating everything to Python 3 eventually, but as you noted it is a big effort with a lot of potential for chaos.



I agree, but having said this: please consider subscribing to LWN. It's one of the few places with high-quality articles like this one, which is increasingly rare nowadays. Subscribing and supporting them directly helps keep projects like this alive.


For the record LWN editor corbet has said in the past that he's OK with LWN subscriber links being posted here:

https://news.ycombinator.com/item?id=1966033

Also, the LWN FAQ says that it's OK as it serves as marketing and helps get new subscribers:

https://lwn.net/op/FAQ.lwn#slinks

P.S. I am a long-time paying LWN subscriber...


Maybe for myself and others.. what is LWN explicitly? (Link)


Oh geez. Immediately upon asking, I see it in the parent post. Sorry.


> Type hints in Python are mostly for linters and code checkers, as they are not enforced at run time by CPython. PEP 585

When you think about why that is, it makes sense, but I still find it annoying that while my type hints are not enforced, if I type-hint to an object that doesn't exist, I still get a run-time error.

I kinda wish that these hints would just be interpreted as comments during runtime.


This doesn't solve your problem, but for anyone else reading who may not be aware - there is (at least one) library that does runtime checking with your type hints: https://typeguard.readthedocs.io/en/latest/userguide.html#us...
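
Roughly, with typeguard's decorator (a sketch; see the linked user guide for the exact API):

    from typeguard import typechecked

    @typechecked
    def greet(name: str) -> str:
        return "hello " + name

    greet(42)   # raises TypeError at call time instead of passing silently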


Since Python 3.7 you can postpone evaluation of the annotations[1].

You can do something like this:

  from __future__ import annotations

  from typing import Optional

  class Node:
      parent: Optional[Node]

[1] https://www.python.org/dev/peps/pep-0563/


There's a good pattern for this using "if typing.TYPE_CHECKING:"... Either you only import the types while type checking, or, under "if not typing.TYPE_CHECKING:", you bind something like Any (or your own custom placeholder) to all your imported type names; that should avoid most runtime errors I can think of.
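
Something like this, for the first variant (sketch):

    from typing import TYPE_CHECKING

    if TYPE_CHECKING:
        # Only imported by mypy and friends, never at runtime
        import numpy

    def f(x: "numpy.ndarray"):   # string annotation, so nothing is looked up at def time
        pass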


I'm not sure what you mean by "type-hint to an object that doesn't exist"?

The objects are not type-hinted; variables are. A variable in Python will always have a value (because it's just a label on a value, not a container), and that value will always be an object, and objects in Python always have a type.

If you mean that you are passing a variable that can have a value of None, this is what Optional[<type>] is for... Or maybe I'm missing something -- can you give an example?


    import numpy as np

    def f(x: numpy.array):  # NameError: "numpy" is not defined; the annotation is evaluated when the def runs
        pass

Since numpy doesn't exist in my namespace, I get a runtime error (NameError) for a part of my code that supposedly isn't used.

I just feel like this should throw errors when running mypy and not when running my script.


This is exactly what happens starting with Python 3.7, which implements PEP 563 [0]; you just need to enable it by importing annotations from __future__. Starting with Python 3.10 that will be the default.

[0] https://docs.python.org/3/whatsnew/3.7.html#pep-563-postpone...
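
Concretely, for the numpy example above (a sketch):

    from __future__ import annotations   # PEP 563: annotations are kept as strings
    import numpy as np

    def f(x: numpy.array):                # no longer evaluated at def time, so no NameError
        pass

    # mypy still flags the undefined name "numpy", which is where you wanted the error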



Perhaps it is just about the features I personally use, but I find Python releases are generally mostly cosmetic. Big changes are included, but they don't drive the releases.

2 -> 3 was mostly about Unicode and print becoming a function. There were a ton of operational improvements, but none of them were breaking changes that warranted the break.

Type hints are cute, but as others remarked, they are glorified comments. Mypy worked without them in 2.x. Not saying it's bad to have them, but that it is cosmetic.

From my point of view, the major improvements are on side tracks:

- multicore support via subinterpreters (multiprocessing et al. is clumsy and awful on Windows)

- some modicum of static safety, like constants and declaring new variables, to help with typos

You might say that these don’t fit in with Python, but I disagree. It would genuinely be a better language, in particular for its current use cases.

But no, Unicode and fancier comments / annotations it is.


> Type hints are cute, but as others remarked, they are glorified comments. Mypy worked without them in 2.x. Not saying it's bad to have them, but that it is cosmetic.

That's underselling them. They are definitely more than glorified comments and have unlocked a bunch of cool things like dataclasses and typed dispatch, as well as having a generally big impact on the ecosystem. Mypy worked around not having them in 2.x with what amounted to a hack (type comments), which nobody seriously liked using, for a number of reasons.
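
For instance, dataclasses are driven entirely by the annotations (a minimal example):

    from dataclasses import dataclass

    @dataclass
    class Point:
        x: float
        y: float

    # __init__, __repr__ and __eq__ are generated from the annotated fields
    print(Point(1.0, 2.0))    # Point(x=1.0, y=2.0)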


I'm certainly glad they are there. It's a better language with them. My point is, they are really just syntactic sugar.

Leaving mypy aside, attrs works perfectly well without annotations, and frankly I find it more powerful than dataclasses anyway.

As I said before, maybe it is just that my use-cases are so different to others' (data science vs. web etc.). But Python is screaming for proper multicore support, better static safety, and presumably other things too. Instead we get niceties.

And yes, of course, it's free, and it's great, and I'm not a contributor (not really in a position to contribute). So I'm not complaining. But also I gained next to nothing from upgrading from 2.7 to 3, and would gain nothing from going up the 3.x ladder. So it's hard for me to whip up enthusiasm about another version of type annotation goodness.


> Type hints are cute, but as others remarked, they are glorified comments. Mypy worked without them in 2.x. Not saying it's bad to have them, but that it is cosmetic.

This assertion is simply outright wrong. Type hints were standardized and, albeit optional, they provide a fixed target for implementations to provide the same service in a perfectly interoperable way.

You might not see their point because you chose not to follow best practices, but others among us are very glad Python finally added official support for them.


And why might you assume I don't use them?

I do, and they are useful. If I had to type them in as comments, I would still use them, and probably not suffer too much.

They are a good thing, but I really think Python has bigger issues ahead. I'm beginning to wonder if Julia will overtake it in scientific programming, a notion I would have considered ridiculous a mere year ago. And that, I think, would be a shame - good job for Julia, but for me, the real magic sparkle of Python is that it is a first-class choice at so many things, _including_ data science. If the herd moves on, and Python becomes yet-another-2nd-choice for data science, that would be a real blow to its ecosystem.


> this [article] will become freely available on October 1, 2020

Hope to see this resurface in 2 days!


> Developers should be aware of some features that are being deprecated and removed in 3.9, as well as some more deprecations that are coming in 3.10. Many Python 2.7 functions that emit a DeprecationWarning in version 3.8 "have been removed or will be removed soon" starting with version 3.9.


Subscription required. We can check this instead:

What’s New In Python 3.9

https://docs.python.org/3.9/whatsnew/3.9.html
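
A couple of the headline 3.9 additions from that page, for the impatient:

    # PEP 584: union operators for dict
    defaults = {"timeout": 30}
    overrides = {"timeout": 60, "retries": 3}
    merged = defaults | overrides             # {'timeout': 60, 'retries': 3}

    # PEP 616: str.removeprefix() / str.removesuffix()
    "py39-feature".removeprefix("py39-")      # 'feature'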


Like the muon. "Who ordered that?"


It's funny because that joke would work a lot better with tauthon.


What I like about regular, shorter release cycles (12mo vs 18mo) is that they help users exercise their "upgrade muscles".

For those that care, they can establish procedures and then practice them (and improve them) to update their dependencies. This helps immensely down the road when security patches are released or other dependencies have significant updates. It turns such updates into boring non-events.

Otherwise users only update their dependencies due to security threats or dead dependencies, and it always results in a stress-filled circus usually followed by outages & bugs.


I would counterargue that it adds stress for no good reason, and it leads to exactly the opposite result.

If I want to collaborate with someone we now need to agree on multiple versions of multiple libraries - if my code needs at least version 3.6 because I use f-strings but their code needs 3.5 because a library will otherwise break, we now have a stress point that we don't need.

Even worse, I might have to collaborate with someone who is not a programmer (very common in data science) and now I need to spend an hour guiding them through the update process plus time in the future to fix whatever the upgrade broke in the background. For a language that made "easy to use" a selling point, that is less than ideal. (Side note: it also makes Python the only language for which I regularly need to check the date in Stack Overflow answers, as those get outdated faster than, say, Bash scripts).

How does that get solved in practice? In my experience, by pinning everything to 3.5 or 3.6 and never updating.


> If I want to collaborate with someone we now need to agree on multiple versions of multiple libraries - if my code needs at least version 3.6 because I use f-strings but your code needs 3.5 because a library will otherwise break, we now have a stress point that we don't need.

The longer the interval between releases the more likely that kind of library incompatibility is, and the more likely an upgrade becomes too much trouble to be worthwhile. If you have to upgrade one library to be able to upgrade your Python version, you can probably manage to test that. If you have to upgrade four libraries you're more likely to give up and pin to the version you're currently on.


I think the kind of breaking changes people actually want to use are less frequent than incompatible changes in general. E.g. the new zoneinfo library is something where the reaction is "great, I can drop a dependency in 3-4 years", while f-strings were a lot more compelling to some people because they fix a papercut with long format calls. I don't see many libraries raising their minimum Python version for zoneinfo as a result, while libraries did for f-strings.
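
To make that concrete, a quick sketch of the two features mentioned:

    # Python 3.9's zoneinfo: nice, but mostly means dropping pytz/dateutil some day
    from datetime import datetime
    from zoneinfo import ZoneInfo
    dt = datetime(2020, 10, 5, 12, 0, tzinfo=ZoneInfo("Europe/Paris"))

    # Python 3.6's f-strings: the papercut fix libraries did raise their floor for
    name, count = "wheel", 3
    print(f"{name} x{count}")                 # vs. "{} x{}".format(name, count)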


Being able to withstand stress is a sufficiently good reason. When you need to deploy a security fix with a deadline of a week ago, it has to go smoothly; and it's going to go smoothly because everyone involved knows how to update everything reliably and efficiently.

> now I need to spend an hour guiding them through the update process

The update process should consist of deleting site-packages and Python binaries, installing Python, and running a simple script consisting mainly of "pip install x y z" steps. Or nowadays replacing a whole Docker coffin with an updated one. Can you elaborate on the required "guiding"?


Python's release policy is pretty sane as they try to overlap maintenance periods. Python 3.8 will still be supported for another 3+ years after 3.9 is released.


Three years is like tomorrow for breaking changes. I have come to appreciate slow-moving tools more and more the older I get.


There aren't any breaking changes in 3.9.


I quite value learning languages deeply. That isn't really possible when you're continually bridging multiple subtly incompatible python versions. Great for throwaway code, not for building anything of value.


This is true, but shorter intervals are also exhausting. It gets less tiresome with practice, but I really wish there were a stable environment to work in. Sigh.


You can always pin to an older version for both your runtime and your libs. That seems, naively, like it would produce exactly what you're looking for. Perhaps I'm mistaken.

Though this will tend to come at the cost of making upgrades and patches into significantly more work, as they will be larger and you less used to doing so. As the user said, "a stress-filled circus usually followed by outages & bugs".

What kind of stable environment to work in do you imagine?


The problem is when you depend on a SaaS like Google Cloud Storage or some such. If you stay pinned, eventually the service can break. I don't have a great solution; it's just such a grind to keep things updated.


I agree. It's such a pain when one cannot simply sit athwart the world and command "Stop!". I find the world can be a very inconvenient place at times.

From where I'm sitting, the best option on offer for coping with this inconsiderate world is to keep in practice by applying your updates on a regular cadence. It may be a grind, but it's one that gets easier the more often you do it. It certainly makes for smaller, more regular updates, which in my experience are often more manageable.


So skip a minor version here or there. Problem solved.


Indeed, IMHO it's also better to have more, smaller update cycles with their associated issues spread over time than to have all those issues at once, while external pressure is pushing to get that security fix or business-critical feature out.


This might work acceptably if you do most or all of your programming in Python anyway. However, if you try doing it as someone who works on a lot of different projects within that 12-18 months, using many different languages, then the kind of dependency hell that exists in ecosystems like Python and JavaScript gets old very fast. In that environment, having every language you use invent its own toolchain and conventions for package and dependency management is anything but helpful.

There is a lot to be said for having language tools that compile your code, compile or directly link in any external libraries from wherever you chose to put their files, and then simply give you back a single native executable file.



