Pipenv: Promises a Lot, Delivers Little (2020) (chriswarrick.com)
65 points by BerislavLopac on July 16, 2022 | 65 comments


I'm really sorry, but I can never quite understand what the deal is with these tools. I understand that virtual environments are important, but how are these tools any better than

    python3 -m venv .venv --prompt="foobar"
    . .venv/bin/activate
    pip install -r requirements.txt
?

Whenever I see all these other tools I just get the feeling that there's some big elephant in the room that everyone is battling against, but I've never come across it as a Python dev, and the moment I try to use/understand these tools I feel like they're against the "keep it simple, stupid" vibe that Python gives me.

There's auto-venv for the truly lazy (like me) but other than that I don't rely on any of these Env/requirement wrappers or tools in any of my projects (virtualenv-wrapper for work but that was in their setup guide).

Is it a legacy thing?


First of all, until a few years ago pip wasn't really resolving dependencies.

And even with the new dependency-resolution improvements, pip is way worse than Poetry's resolver in my experience.

Also, the requirements.txt file tends to get cluttered with non-top-level dependencies, which makes upgrading your dependencies a herculean task. With Poetry, instead, you define your top-level dependencies in your pyproject.toml and all the actual pinned dependencies are compiled into a poetry.lock file.

And finally, the user interface is much more modern. `poetry shell` is way simpler than `source venv/bin/activate`. You don't even need to run `poetry shell` because there's `poetry run`. And installing a new project is just `poetry install` instead of `python3 -m venv venv; source venv/bin/activate; pip install -r requirements.txt`.

Not to mention that it's way easier for me to understand where dependency conflicts come from in Poetry's console output, compared to even the newest pip versions.


> Also the requirements.txt file tends to get cluttered with non top level dependencies that make upgrading your dependencies an herculean task. Instead on poetry you define your top level dependencies in your pyproject.toml and all the actual pinned dependencies will be compiled into a poetry.lock file.

This has now been solved [1] in pip, so no third-party tool is needed. Put your pinned dependencies in constraints.txt and your top-level dependencies plus the line "-c constraints.txt" in requirements.txt.

[1] https://pip.pypa.io/en/stable/user_guide/#constraints-files
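A minimal sketch of that layout (the package names and pins are illustrative only):

```shell
# requirements.txt holds only the top-level deps plus a pointer to the pins;
# constraints.txt carries the fully pinned closure (e.g. from `pip freeze`).
cat > requirements.txt <<'EOF'
-c constraints.txt
django
EOF
cat > constraints.txt <<'EOF'
Django==4.0
asgiref==3.5.0
sqlparse==0.4.0
EOF
# pip install -r requirements.txt   # installs django, held to the pins above
```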


Constraints are not the same thing. Poetry locks the entire dependency closure. You can install with the `--no-deps` option, but then you have to specify exact versions of every dependency in requirements.txt.


What do you mean? If you do "pip freeze >constraints.txt", that locks the versions of all installed packages, no matter where they came from.

As an example, let's create a venv and install some older versions of Django and its dependencies (the current versions are sqlparse 0.4.2, asgiref 3.5.2 and Django 4.0.6):

  $ python3 -m venv env1
  $ ./env1/bin/pip install sqlparse==0.4.0 asgiref==3.5.0 django==4.0.0
...and also Flask just to complicate the constraints file for the example:

  $ ./env1/bin/pip install flask
Lock all dependency versions in constraints.txt:

  $ ./env1/bin/pip freeze >constraints.txt 
Create a requirements file that specifies just "django" and references the constraints file:

  $ echo '-c constraints.txt' >requirements.txt
  $ echo 'django' >>requirements.txt

  $ cat requirements.txt 
  -c constraints.txt
  django

  $ cat constraints.txt 
  asgiref==3.5.0
  click==8.1.3
  Django==4.0
  Flask==2.1.3
  itsdangerous==2.1.2
  Jinja2==3.1.2
  MarkupSafe==2.1.1
  sqlparse==0.4.0
  Werkzeug==2.1.2
Now we can create another venv with the exact same versions of Django and all its dependencies (but not Flask or its dependencies) using just pip and the requirements file:

  $ python3 -m venv env2
  $ ./env2/bin/pip install -r requirements.txt 
  $ ./env2/bin/pip freeze
  asgiref==3.5.0
  Django==4.0
  sqlparse==0.4.0


That’s a lot more than just

    poetry init
    poetry add x y z
    
Then later

    poetry install
And maybe

    poetry remove y


> And finally the user interface is much more modern.

I hear you on the pinning, but "UX" concerns like this drive me batty; I just use shell functions for this:

    venv() {
        # activate a virtualenv by name from a central folder
        . ${VIRTUALENV_FOLDER}/${1}/bin/activate
    }

    gitsu() {
        # push the current branch and set its upstream in one go
        branch=$(git rev-parse --abbrev-ref HEAD)
        git push --set-upstream origin ${branch}
    }

    git_add_conflicts() {
        # stage every file git reports as "both modified"
        git add $(git status | grep 'both modified:' | cut -d ':' -f2)
    }
It's so much easier to add a little function in your shell than to write a new tool.


But I would prefer something that just works without me having to do that. Chances are it'll also save me work in areas I'm not aware of yet.


Oh, me too. I love when a tool solves problems I didn't even know I had: I hit them, look for a solution, and poof, Vim (or whatever) has already solved it in a coherent way.

If every tool were like that, that would be one kind of world, but we live in one that's more interesting. There's some Mark Twain quote like "the world owes you nothing; it was here first", or maybe the Buddha saying "it's harder to soften the earth than it is to wear sandals". I like to think that sandal-making is a core engineering skill, whether you're a carpenter or a computer engineer.


Are you guys actually using either poetry or venv on production?

Separately, I'm curious to hear examples of where dependency-clutter actually caused a problem.


We use poetry just for development, but run Docker containers in prod. When the image gets built we just create a requirements.txt (poetry export --format requirements.txt --output requirements.txt), copy that into the image, and pip install. Because this is built using the poetry lock file, it'll always be exactly the same unless we specifically update something with poetry.

I used to work at a place that was just using requirements.txt files that only included our direct dependencies. There was a project that needed updating after not being touched for a couple of years. The requirements.txt didn't change, but when we built the project again, some of the transitive dependencies used a newer version, and a bug was introduced from one of those updates. A bunch of time was wasted tracking down the issue, pinning the old version of the transitive dependency, and figuring out the damage caused by the bug.

As a result, the requirements.txt was changed to also include transitive dependencies. We had vulnerability scanning on our code, and it found a severe issue with one of the transitive dependencies, but there wasn't a version of that library with the issue fixed yet. Time was spent looking into this to see how we could be impacted. As it turns out, it was a transitive dependency for a library that we no longer used and removed from the project months ago. When you create your requirements.txt by running pip freeze > requirements.txt, you don't have an easy way of knowing which library requires which transitive dependency.

There's ways you can fix this using multiple requirements.txt files, but at that point it's a lot easier to use poetry, especially if you want to keep your development dependencies separate.
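One way the multiple-files pattern tends to look (file names and pins are hypothetical):

```shell
# Base pins for production; a dev overlay pulls them in via -r and adds
# test tooling on top, keeping the two sets separate.
cat > requirements.txt <<'EOF'
django==4.0
EOF
cat > requirements-dev.txt <<'EOF'
-r requirements.txt
pytest==7.1.2
EOF
# Prod:  pip install -r requirements.txt
# Dev:   pip install -r requirements-dev.txt
```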


Venv, yes. My Dockerfile's build stage creates a venv and installs all our dependencies (some of which need compilers because they are source wheels that need to link with C++ libraries). Then I copy the venv over to the release stage and install only the release dependencies.

Maybe that could work with a `pip --user` install, but at least the venv guarantees that all Python packages and data will be in one self-contained folder.
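A sketch of that two-stage pattern (the base images and paths are my own assumptions, not the poster's actual Dockerfile). One caveat: a venv hard-codes its own path, so it must land at the same location in both stages, as it does here:

```dockerfile
# build stage: compilers and headers available for source wheels
FROM python:3.10 AS build
RUN python -m venv /opt/venv
COPY requirements.txt .
RUN /opt/venv/bin/pip install -r requirements.txt

# release stage: slim image, the whole venv copied over as one folder
FROM python:3.10-slim
COPY --from=build /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
```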


The main problems with just "pip install -r requirements.txt" are that you might get a different result the next time you run it if a (transitive) dependency has released a new version, and that upgrading dependencies is an error-prone manual task. Both pipenv and poetry try to solve these problems.


Upgrading dependencies is also a way for me to resolve issues, so I don't fight it.


This is solved in pip nowadays with constraint files [1], so you can put your top-level dependencies in requirements.txt and all the transitive ones in constraints.txt:

  pip install -r requirements.txt -c constraints.txt
[1] https://pip.pypa.io/en/stable/user_guide/#constraints-files


That's a feature: you should be using the latest minor/patch release when developing. If you really need to lock a dependency to a specific version, it is possible, but you shouldn't be doing it. If a dependency is constantly breaking on minor/patch releases, you shouldn't be using it.

I think it's a lot of JavaScript developers who end up having this problem, and all I can say is that the Python package ecosystem is not the same. Please don't lock your dependencies to specific minor/patch versions, and please don't use so many dependencies that it becomes tedious to deal with changes. Especially when you're writing a library: if you're locking to anything other than a major release or a minimum minor/patch revision when producing a library, things have gone very wrong.


I really couldn’t disagree with this more. Reproducibility is far more important than integrating unknown bug fixes, and upgrading dependencies should be a manual process during which you review the changes and verify that they don’t break your system.

My time and the time of my peers is too valuable to be debugging minor incompatibilities between a library 5 layers down the transitive dependency stack because versions aren’t pinned. It’s too important that if I go back to a 3 year old checkout of the code that it still runs and isn’t a wild goose chase of running down bugs and library incompatibilities.

You can always upgrade overzealously when dependencies are pinned, falling back to the workflow you’ve described, but the inverse isn’t true. If your tools don’t allow you to pin, you’re signing yourself up for churn at unknown and unpredictable times.


> My time and the time of my peers is too valuable to be debugging minor incompatibilities between a library 5 layers down the transitive dependency stack because versions aren’t pinned. It’s too important that if I go back to a 3 year old checkout of the code that it still runs and isn’t a wild goose chase of running down bugs and library incompatibilities.

That happens very rarely with our dozens of transitive dependencies. What are these packages that cause trouble often?


Nothing causes trouble often, because the class of problems is eliminated by pinning.

Still, it doesn’t have to happen often to be a serious hindrance to either developer productivity or your business. Imagine trying to roll out a fix for a production outage only to find a random transitive dependency has started breaking your build. Now your outage is extended to the duration of debugging and fixing an unrelated and avoidable problem.


>That happens very rarely with our dozens of transitive dependencies

Because you don't do ML. My requirements.txt is useless in 6 months because the transitive dependencies all have incompatible versions of common libraries by now.


All we do is ML and image processing. We use huge libraries like PyTorch, ones with insane packaging like OpenCV, and things are so stable I haven’t felt the need to introduce version pinning yet.

Maybe you’re using recent less-stable projects? I found that the main ML packages rarely break things without months of deprecation warnings.


> That's a feature, you should be using the latest minor/patch release when developing.

You’re not wrong, but developers still need precise control about which version is pinned so they have a chance to figure out whether a bug is in their code or a regression in a dependency.

> If a dependency is constantly breaking on minor/patch releases you shouldn't be using it.

A more common example would be a dependency of decent quality, but there will still be bugs because there’s no such thing as bug-free software. And when the inevitable issue comes up, it’s super useful to have a switch that lets you pin down the regression and helps you figure out whether it’s on you or on the upstream project to fix the issue.

> Please don't lock your dependencies to specific minor/patch versions

Even if you allow a version range of "*" in your dependency file, it may still be a good practice for your app to have the resolved version number pinned in a dependency lock file, and even commit that lock file to version control.

> Especially when you're writing a library. If you're locking to anything other than a major release or a minimum minor/patch revision when producing a library things have gone very wrong.

You’re referring to the dependency file, not the dependency lock file, right? I couldn’t agree more. At the same time, it may still make sense for some libraries to have a lock file during the development process, and even commit that lock file to version control. You just can’t include that lock file in your release.


I don't agree. Repeatability is more important. Every developer working on a project should have the same version of each dependency. The testing server should have the same version. Every deployment should have the same version.

I don't want my tools "helpfully" upgrading me to a different version, I don't care if it's "minor". It's different and that's a potential source of heisenbugs. Any change to dependencies should be an explicit action and it should be a commit in SCM.

By all means show an ugly warning message to nag people to upgrade but computers live to serve us, not the other way around. The moment you start devolving these decisions to a computer you've lost control of your core competency, which is ultimately what code is running on the machine.


This simply doesn’t work on a large, shared codebase. Locking dependency versions is a must. You can’t just roll the dice that no breaking changes have been introduced by dependencies.


pip freeze > stable-reqs.txt ??



This thread is just about pip freeze not including hashes, and while it's true that hashes would provide a lot of reassurance, you can still pin transitive dependencies' versions without them.


This kinda works, but it isn't a cross-platform solution.


You don't need virtual environments when using npm or composer or cargo.

They also create lock files with hashes by default.

pip still doesn't use lock files. It's subpar compared to other ecosystems.


I worry that the inclusion of lockfiles will make semver less relevant. You shouldn't need lock files if you're using semver properly.

If your ecosystem is healthy, then pinning exact versions with lock files shouldn't be done; every dev should get the latest patch or minor release when they run your program.

Using lock files for libraries should absolutely never happen; you should at most fix your dependencies to the latest major version.


> You shouldn't need to use lock files if you're using semver properly.

I think this issue is unrelated to semver.

I've had so many issues over the years with Python packages, even very polished libraries like Flask and Celery.

For example, you could install something like Flask 2.0 but end up with a major difference in versions of its sub-dependencies depending on when you installed it.

That's because Flask has its own dependencies defined like this:

        "Werkzeug >= 2.2.0a1",
        "Jinja2 >= 3.0",
        "itsdangerous >= 2.0",
        "click >= 8.0",
The above means installing Flask 2.0 could one day in the future install Jinja 4 or Click 10 unless you lock your entire dependency tree.

I've also had all sorts of things break because Flask installed Jinja 3.1 which wasn't a problem 6 months ago when Jinja 3.0 was the latest release. I've also had cases where installing a specific version of Celery in the past worked but failed in the future because it didn't lock one of its sub-dependencies down well enough (vine) which caused a breaking change.

This stuff happens all the time and it's a nuisance. I never experienced issues like this with Ruby and other languages that have the idea of a lock file built into their package manager. IMO it's a desperately needed feature that should be built into pip.


Semver promises are insufficient in a world where supply chain attacks are increasingly common. Pulling untested and invalidated code in at every project build is how that transitive dependency on a package that was taken over for a small window wrecks your development team. You should never be pulling in new code by surprise, it should always be something I’m aware of and signing up for.

The Rust ecosystem is good evidence that locking doesn’t kill semver. Semver is still widely used and has all of its meaning.


> You shouldn't need to use lock files if you're using semver properly.

You mean "if all your dependencies are using semver properly". Yes, you're right, and you've found the problem.


I've experienced multiple minor and patch (according to semver) updates that broke APIs and behavior, and I'd guess most devs have as well.

I think semver makes sense to humans. I can derive a lot of meaning from it when I see n.n.n. But when it comes to the software supply chain, it's just too rickety. Frankly, when you lose customers after a new deploy broke one of your dependencies, "but the dependency author didn't respect semver" isn't an excuse.

I say this as someone who strongly pushed `dependency>=1.3.2,<1.4` until that happened to me. My argument was "security updates", and now I just don't care. The software supply chain is too chaotic, and you have to be defensive against it.


I agree, but they also include hashes.


The difference between virtual environments in Python and node_modules in JavaScript is minor I would say. It’s the same concept. I believe composer has something similar. It’s a directory where packages go and they are not part of the system wide installation.

I agree that lock files are useful and it’s a pity that pip does not offer them.


I just use venv but don’t pin dependencies. We often develop new packages, and pinning every dependency’s version sounds like unnecessary effort.

With sensibly written setup.cfg files, we just “pip install” all our packages with no issues. Pip’s dependency resolver has come a long way since 2019.


What do you list in your requirements.txt? If it's just the top level dependencies, you could have versions change of your transitive dependencies which can matter and break things. If it's all dependencies, you lose the context of what you need because you use it and what you need just because it's a dependency of a dependency.

Some projects used to have two files for this, effectively managing a lockfile manually with pip freeze, but then it's nice to have a wrapper around this pattern, and that's where the first generation like pip-tools came from; stuff like poetry/pipenv aims to streamline that even more (and avoid manual use of companions like pyenv for specific Python versions).
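The pip-tools flow described above looks roughly like this (the package names are just examples; the compile/sync commands are shown commented out rather than run):

```shell
# Top-level deps live in requirements.in; pip-compile expands them into a
# fully pinned requirements.txt, and pip-sync makes the venv match it exactly.
printf 'flask\ncelery\n' > requirements.in
# pip-compile requirements.in    # writes requirements.txt with every pin
# pip-sync requirements.txt      # installs/removes packages to match it
```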


I think the same.

Some tools, like Poetry, do add value to your development process (namely, better management of dependencies).

pipenv really does sound like something for people who are too lazy to activate their envs before running stuff.


Totally agree on this.


It's a bit of an old post, and yes Pipenv is not the go-to tool anymore. pip-tools is okay for people that really, really love their requirements.txt; otherwise we tend to go with Poetry at work.

Any folks having a good experience with PDM https://github.com/pdm-project/pdm ?


After a series of bad experiences with Poetry, I switched the packages I maintain to PDM. Although I have hit a few minor snags, the maintainer and other users in the project's GitHub discussions have never failed to help with a fix, workaround, or advice. It's pleasant to use, and the only feature I cared about that was in Poetry but not PDM (a publish-to-PyPI command) has proven easier to do with the Twine tool anyway.

My situation is I think unusual, in that I need to use a private pypi repo which requires mTLS for both fetching and publishing. Had it not been for that I suspect I'd still be using Poetry, but given the experiences I've had with PDM I wouldn't switch back even if the situation with my repo changed.


See my above comment, something to be aware of.


How is the maturity/stability of Poetry these days? I despise Pipenv, and was hoping to push for a switch to Poetry at my place of work a couple years ago, but I ran into blocking bugs across multiple versions (latest N versions affected by bug A, prior M versions affected by bug B). Had to chalk it up to "not yet mature enough" and resign myself to the absurd lock times and countless terrible behaviors of Pipenv.


We use poetry for all python projects. I haven't seen an actual poetry internal bug in quite a while, but using poetry effectively does require one to keep some things in mind that are probably non-obvious to newcomers:

1. Poetry's default assumption on packages respecting semver simply does not hold up in reality. There are very few packages actually sticking to semver. Thus the `^x.y.z` default version range is quite often too loose. I've found that using `~x.y.z` for most packages is far more stable.

2. Imho, `poetry update` is a footgun. Without a specifier, it will attempt to update the entire dependency tree. Not only is this slow, but together with 1) it's all too likely one ends up with dependencies that are actually incompatible at runtime. I'd much rather have a `poetry update --all` flag instead for the rare instance I do want to update everything. The default behaviour should be to require a list of packages to update.

3. There are some common packages that cause very long resolution times if they are not restricted. Case in point: boto3. Even if one doesn't use boto3 oneself, it's very likely a transitive dependency. Many packages simply specify `'*'` as their version dependency (they shouldn't, but it's the unfortunate reality many do). This will cause poetry to consider every possible boto3 version. With hundreds of versions - boto3 has a release every other day - this gets unwieldy fast. So I often end up specifying boto3 myself with some sensible range in my toml file, even when it's not a direct dependency of my own project.

4. The data science ecosystem needs particular attention. Best to simply pin those, as every pandas update is guaranteed to break something. ABI changes to numpy are a particular nightmare. This is again due to too many packages simply specifying `'*'` for their numpy dependency. Which is further complicated by the fact that most don't distinguish between build-time dependencies and run-time dependencies. The numpy ABI is only forward compatible, hence one should build with the oldest supported numpy[0].

[0]: https://pypi.org/project/oldest-supported-numpy/
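For reference, points 1 and 3 in pyproject.toml terms (the version numbers here are illustrative, not recommendations):

```toml
[tool.poetry.dependencies]
python = "^3.10"
requests = "~2.28.1"  # ~ allows only 2.28.x -- safer when semver isn't honored
boto3 = "~1.24"       # restricting boto3 keeps resolution times manageable
```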


I'll take the behavior of 2 over Pipenv's "any re-lock aggressively updates everything" approach. What's the point of lock files if you have to peg everything in order to have stability / control over versions??


Updating dependencies (#2) does seem needlessly painful. I have wondered if I am missing some obvious workflow.


Seems very good, except they do seem to have some long-running pre-releases going, which seems troubling. Just ship it already!


PDM is based on PEP 582, which is only a draft, and you will hit edge cases where it's not supported, or projects that refuse to support it because of this. I'd avoid it for that reason personally.

virtualenvs are much better supported.


You can opt out of PEP 582, and in PDM 2.0 (just released) it becomes opt-in.


What’s the point of using it over Poetry then? Why do we need another?


Is there a reliable source that tells what the current recommended way of doing this in Python is?

I’ve been using Anaconda and finding that it installs incompatible versions of jupyter and ipython dependencies, and that a lot of tools I need only work from pip, so I’ve been wondering if maybe I should just switch to the “pythonic way of doing it” — and I have absolutely no idea what that is.


Slight digression, but I'm always baffled by how unbelievably slow Conda is at resolving dependencies as soon as you have more than a handful of packages installed. Mamba basically proves it's a solvable issue, but you'd think the Anaconda folks would be prioritizing it.


Today, we (and I mean me and my company) use poetry for prod code delivery and it just works. Poetry uses pip and virtualenv under the hood. It's worth understanding virtualenv regardless.

For every project you're developing, there will be virtualenv which has all the dependencies that project needs (which may be different than what's installed in the system, and different than what other projects may need).

"python -m venv init project/venv" will create it. "rm -rf project/venv" will delete it.

Usually the virtualenv goes somewhere well known, like "project/venv". Sourcing the activate script ( "source project/venv/bin/activate" ) changes your shell environment to use the virtualenv instead of the system python environment. Once activated "pip install package" installs to the activated virtualenv. "deactivate" will turn off the virtualenv.

This is syntactic sugar for what's really happening behind the curtains. There's a copy of python in the virtualenv, "project/venv/bin/python", which runs in the virtualenv regardless of whether the virtualenv is activated or not. "activate" just adds "project/venv/bin" to the start of the PATH; "deactivate" removes it.

Regardless, you can always see which python you're using by typing "which python". The system python (/usr/bin/python) will use system packages; the venv python (project/venv/bin/python) will use the virtualenv's packages.

This allows you to have different virtualenvs to try out things like new versions of python, say. And each virtualenv is isolated from every other virtualenv.
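A compact, runnable version of the above (`--without-pip` just keeps the sketch self-contained; a real venv would normally include pip):

```shell
python3 -m venv --without-pip demo-venv   # create: it's just a directory tree
. demo-venv/bin/activate                  # prepends demo-venv/bin to PATH
which python                              # now resolves inside demo-venv/bin
deactivate                                # restores the previous PATH
```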

And poetry is just a nice wrapper around this process that also figures out total project dependencies and creates a "lock" file to freeze all deps to a particular version for consistent releases. Pipenv is basically the same thing.

"pip install --user poetry" will install it in your home directory.

It's not recommended to use conda and pip together, basically the recommendation is to use one or the other, though I've heard miniconda is better in this regard. YMMV.


I have seen the Hypermodern Python series of blog posts referenced here a few times. I'm a big fan of that approach personally.

I'd also be curious what others recommend!

https://cjolowicz.github.io/posts/hypermodern-python-01-setu...


(May 2020) so this piece is out of date.

I use pipenv on one of my projects, including (in combination with direnv) for setting up my local dev environment on macOS and as part of CI/CD for deploying into a Docker container based on one of the Python images. The project doesn't have a huge number of dependencies, but pipenv has worked well. The only time I fought with it was over its PyUp safety checks, but "pipenv check" is opt-in anyway.

For most of my other projects, I install from either a setup.py or requirements file using pip, often driven by a Makefile. Again, deploying into Linux, but using macOS for local development, again with direnv.

In 20 years of professional Python development on mostly small-to-medium size projects, I've never had any issues I can recall getting packages installed.

Conversely, I semi-regularly fight with cocoapods, npm and gems.


Seems like instead of focusing on Python's execution speed, what we've all been saying (for a long time) is that what we really need in a language like Python is simplified, centralized, pythonic ("one way to do it") tooling. Python isn't popular because it executes close to C speeds; it's popular because it generally removes common obstacles for the developer. Package management is one such obstacle, which many open-source repos are trying to solve because Python proper hasn't solved it (or attempted to with enough focus).


This is in 2020 and should be marked as such.


I've started to get away from virtual environments and just use Docker containers where all the pip dependencies can live as first class citizens. Also eliminates having to source the activate file or wonder if I'm calling the correct version of Python when I want to ssh in and run some code by hand in the container.


Containers are basically language agnostic virtual environments done right. Having python or js specific solutions is really a hack.


Lots of folks are saying pipenv is outdated (and I agree), but it is still listed on the packaging docs [0] with little warning, and as the second tool on the official list of recommended tools [1]. If the community wants to deprecate this extremely undermaintained tool, we need to stop pointing people at it.

0: https://packaging.python.org/en/latest/key_projects/

1: https://packaging.python.org/en/latest/guides/tool-recommend...


My journey went from pipenv to poetry to pipenv back to pip and then to pip-tools (pip-compile and pip-sync).

While both pipenv and poetry were nice and had a lot of comfort built in, they broke in nasty ways that were hard to debug, so I traded the comfort for less complexity.


At work, we have large projects that use lots of technologies, including for example Java, Erlang, Rust, Kotlin, Groovy, even a little bit of C code, which tends to be problematic... and Python.

Our developer environment is well polished and can be started up by installing a couple of packages and then running a single command... very nice, except for the Python part, which is extremely tiny compared to the other stacks.

We used pipenv, but that broke very often, so we moved to Poetry... it looked like it worked for a while, but as we got more people, especially ones using Linux and M1 Macs, they had to spend lots of time fixing issues with Poetry/Python... something to do with CPython dependencies downloaded for the wrong architecture.

We've had zero, absolutely zero issues with all the other stacks. We're trying to find solutions, but IMHO we'll just have to bite the bullet and rewrite the small Python component we have in any of the many other languages we use without issues.


If you don’t like a tool then don’t use the tool.

Or fix what you perceive is wrong with it.

Don’t demand the tool do what you want or insist on a certain release cadence unless the developers are on your payroll.

I admittedly started skimming around halfway through but still want a refund on the time I wasted trying to determine if there was more to TFA than a whinefest.


What is the currently accepted way of nailing down a specific python version if not using something like pipenv?



