I hate having to individually update my Wordpress install, my Rubies, my system packages, my IDE (be it Eclipse or Android Studio), a separate TeXLive install from the OS packages, even Vim now has its own package management with Pathogen...
There must be some way to _unify_ this proliferation of software update mechanisms.
edit: It would be ironic in the extreme if Windows of all platforms manages to get this out the gate before the Linux community considering how awesome stuff like apt-get, yum, and emerge are... Oh well, just goes to show that open-source giveth with the one hand and taketh with the other.
Nix runs on any Linux distro because all packages are stored in /nix, so it's not trampling all over your files in /usr, /bin, etc. It also runs on OS X (packages work, but are a bit less well tested, and binary substitutes are currently not as easily available as they should be). In the past it has been made to run on BSD and on Windows through Cygwin, but those ports need a lot more work before being production-ready.
It's also significantly more flexible than other package managers due to how it can handle multiple versions of the same package in an elegant way. This makes unifying diverse package sets an actual possibility. Apt, yum, etc. are too limited to unify package sets in this way, and creating another layer above them in order to integrate them is hacky, ugly, and unlikely to work well.
They are the only package managers I know of capable of unifying package management with a single set of tools. Unifying multiple distinct package managers is a fool's errand.
I don't see how they can. Guix/Nix don't run on Windows, so they can't replace npm or pip or RubyGems, first off. But I don't even see anything about Guix/Nix that makes them especially capable of unifying Linux package management. (Heck, given that there are two of them, I don't see how either can be said to unify even the specific kind of Linux package management they do.)
Correct. On Windows you can't do any better than a swarm of language package managers. Without the ability to create a custom Windows distribution starting from a systems level package manager, there's no hope in having anything better.
Different package managers make different design decisions that make them incompatible with each other without sacrifice. Guix/Nix focus heavily on reproducibility and not relying on any third party binaries. These features would have to be thrown away if it unified pip, npm, etc. because they make no such guarantees.
Every package manager works differently, and trying to accommodate all of them with a unifying tool would require a lot of time wasted writing interfaces between them all. I don't see a way to do it without lowering the feature set to the least common denominator and settling for that. The real solution is to elevate our system package managers to the point of handling the important use cases that currently only language package managers provide, such as virtualenv/Bundler-style management (i.e. installing packages somewhere besides /). Nix/Guix accommodate all such use cases, which is why I promote them.
> These features would have to be thrown away if it unified pip, npm, etc. because they make no such guarantees.
Again I disagree. For example, in this case it would just mean that when installing from Nix repos you get the reproducibility etc. guarantees, installing from upstream repos gets you the vanilla version, and installing from OS repos gets you an integrated/patched version. But the installation process (from the user's point of view) could still be unified, because regardless of what happens behind the scenes the high-level stuff is pretty much the same; nix-env -i/apt-get install/pip install could definitely be unified under one umbrella tool.
> I don't see a way to do it without lowering the feature set to the least common denominator and settling with that.
And the common denominator would probably cover 80-90% of uses. For the rest you'd still have the option to delve deeper and use some implementation-specific tools if necessary.
If we've any hope to unify package management, we need to get to the essence of what package management is. It's really quite simple though - it's the ability to say that one piece of software depends upon another, and to have a piece of software which can automatically resolve the dependencies (which form a DAG). To construct our DAG we need a list of nodes (the packages), and a list of edges (the dependencies of a package).
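To make that concrete, here's a minimal sketch of such a resolver in Python (the package names and graph are invented for illustration): a depth-first post-order walk of the dependency DAG yields an order in which dependencies always come before the packages that need them.

```python
# Minimal dependency resolver over a DAG: given a package and a
# mapping of package -> direct dependencies, return an install order
# where every dependency precedes its dependants.
# Assumes the graph is acyclic (no cycle detection for brevity);
# all package names here are hypothetical.

def resolve(package, deps):
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in deps.get(name, []):
            visit(dep)          # install dependencies first...
        order.append(name)      # ...then the package itself

    visit(package)
    return order

deps = {
    "wordpress": ["php", "mysql"],
    "php": ["libxml"],
    "mysql": ["libssl"],
}
print(resolve("wordpress", deps))
# -> ['libxml', 'php', 'libssl', 'mysql', 'wordpress']
```

Every real package manager does some variant of this walk; the disagreements start with how the nodes are identified, which is exactly the problem explored below.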
If we say that packages are basically just binary blobs of data (say, a .tar.*), then we might construct our database to identify (key) our package payloads. I'll use some pseudo pgsql for illustration purposes.
CREATE TABLE packages
(
payload bytea NOT NULL,
package_name character varying NOT NULL,
CONSTRAINT pk_package PRIMARY KEY (package_name)
);
CREATE TABLE package_dependency
(
dependant character varying NOT NULL,
dependency character varying NOT NULL,
CONSTRAINT package_dependency_dependant_dependency_key UNIQUE (dependant, dependency),
CONSTRAINT package_dependency_dependant_fkey FOREIGN KEY (dependant)
REFERENCES packages (package_name),
CONSTRAINT package_dependency_dependency_fkey FOREIGN KEY (dependency)
REFERENCES packages (package_name)
);
Simple. But we're missing a bit here. We need to update software, so a name is not sufficient to identify a dependency. Let's add that.
CREATE TABLE packages
(
payload bytea NOT NULL,
package_name character varying NOT NULL,
version integer NOT NULL,
CONSTRAINT pk_package PRIMARY KEY (package_name, version)
);
CREATE TABLE package_dependency
(
dependant character varying NOT NULL,
dependency character varying NOT NULL,
dependant_version integer NOT NULL,
dependency_version integer NOT NULL,
CONSTRAINT package_dependency_pkey UNIQUE (dependant, dependency, dependant_version, dependency_version),
CONSTRAINT package_dependency_dependant_fkey FOREIGN KEY (dependant, dependant_version)
REFERENCES packages (package_name, version),
CONSTRAINT package_dependency_dependency_fkey FOREIGN KEY (dependency, dependency_version)
REFERENCES packages (package_name, version)
);
Cool, now that we have a composite key we've got sufficient information to identify a dependency, right? Well no, we now have the problem that the same piece of software with the same version could be distributed by different vendors (with different dependency chains/configurations, etc). We had to modify the original solution to get here rather than extend it. Let's modify it again!
CREATE TABLE packages
(
payload bytea NOT NULL,
package_name character varying NOT NULL,
version integer NOT NULL,
vendor character varying NOT NULL,
CONSTRAINT pk_package PRIMARY KEY (package_name, version, vendor)
);
...
Great, now given a combo of package_name, version and vendor, we can uniquely identify a dependency without worry. All problems solved?
What now if a vendor has a customer with different needs, and must distribute two different derivations of the same package name and version? Do we add another field for "configuration", and if so, what type do we make it? Do we just rename the package and lose the relationship that exists between them? It should be blindingly obvious by now that we're just trying to add structure where it isn't really present, and we're making the solution to the problem more and more complicated.
Now take into account the possibility that Package Manager A implements dependencies using a tuple of (package_name, version, configuration), and Package Manager B implements dependencies using a tuple of (package_name, version, vendor). If we want to unify these models under a "one true package manager", then our OTPM needs to model things using (package_name, version, optional[vendor], optional[configuration]), and so forth. Multiply by N package managers with their own individual quirks and you get a "unified" model which is barely unified at all; the only structure to it is really our initial solution - packages with names.
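To see how degenerate that "unified" key gets, here's a sketch (Python; the field and package names are invented) where every manager-specific discriminator becomes an optional field, so the only part all backends actually share is the name and version:

```python
# Sketch of the "one true package manager" key after unifying two
# backends with incompatible identity models. Field names and the
# example packages are hypothetical.

from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class UnifiedKey:
    # The only fields every backend agrees on...
    package_name: str
    version: int
    # ...plus one optional field per backend quirk.
    vendor: Optional[str] = None         # used only by "Package Manager B"
    configuration: Optional[str] = None  # used only by "Package Manager A"

# Both keys claim to identify "openssl 1", but each discriminates
# using a field the other backend cannot interpret.
a = UnifiedKey("openssl", 1, configuration="fips")
b = UnifiedKey("openssl", 1, vendor="acme")
print(a == b)  # -> False
```

Every additional backend adds another optional column, and no backend can meaningfully compare the others' fields, which is the "barely unified at all" outcome described above.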
Here's a reduction in complexity: instead of using a name, which may be ambiguous and therefore requires us to constantly add fields and change the underlying model, let's propose some means of creating identifiers that are guaranteed unique. We can basically go back to our initial model:
CREATE TABLE packages
(
identity uuid NOT NULL,
payload bytea NOT NULL,
CONSTRAINT pk_package PRIMARY KEY (identity)
);
Now, if we want to integrate "Package Manager B" into this new model, we can extend our system. (Note keyword extend, not modify). We can implement a new table which references the base model.
CREATE TABLE pmBpackages
(
identity uuid NOT NULL,
package_name character varying NOT NULL,
version integer NOT NULL,
vendor character varying NOT NULL,
CONSTRAINT pmBpackages_identity_fkey FOREIGN KEY (identity)
REFERENCES packages (identity)
);
And as far as pmB is concerned, this is just an implementation detail - we can hide it from any users and just present the legacy view to them.
CREATE OR REPLACE VIEW "pm_B_view" AS
SELECT package_name, version, vendor, payload
FROM pmBpackages NATURAL JOIN packages;
So here's a challenge. Begin with the schema for "Package manager A" as implied above, and try to implement "Package manager B" by extension (not modification). The first point of struggle might be to notice that you need to invent "configuration" values, since "Package Manager A" requires them as part of the key, and they're NOT NULL.
Hopefully it becomes obvious now why trying to build another model on top of other overcomplicated models is the real fool's errand, because none of them so far have understood the essence of the problem.
I've glossed over how we might guarantee uniqueness for package identities so far. The solution is to use a cryptographic hash of the payload, under the (fair) assumption that a modern hashing algorithm is sufficiently collision resistant. There's nothing language/framework/operating-system specific about SHA-1 or whatever the choice of algorithm.
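As a sketch of that identity scheme (using SHA-256 rather than SHA-1, with made-up payloads standing in for real package tarballs):

```python
# Content-addressed package identity: the ID is a hash of the payload
# itself, so uniqueness needs no naming authority, vendor field, or
# configuration field. The payloads below are placeholders.

import hashlib

def package_identity(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()

tarball_a = b"pretend this is a .tar.gz from vendor A"
tarball_b = b"pretend this is the same package, patched by vendor B"

print(package_identity(tarball_a))
# Identical payloads always hash to the same ID; any change at all
# (a patch, a different build configuration) yields a new ID.
```

Two vendors shipping different derivations of "the same" package simply get two different identities, which is exactly the disambiguation the name/version/vendor/configuration columns were straining to provide.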
I think you are massively overthinking this. Dependency tracking and all that jazz are just implementation details of individual backends, the unifying tool does not need to be aware of that at all.
Let's say we define a basic API with two verbs: SEARCH and INSTALL. When the user wants to install a package, the unifying tool first queries backends with SEARCH to see if they have such a package; after resolving which source/backend to install it from (a process that might involve user interaction), it invokes INSTALL on that backend. Nowhere in this process does the unifying tool need to know what black magic the backend did to get the package installed.
SEARCH is the reason package management is so broken. If a developer intends a particular piece of software to be installed as a dependency, he should convey exactly that to users, rather than vague criteria for which the user might hopefully get the right thing. The search needs to return only one result - the right one. If several package managers keep a package of the same name and version, we shouldn't need to keep adding criteria to narrow down our search until we get the right one (and someone could later add a package which meets all those criteria after you've published).
The heart of the problem is making a SEARCH which will always return one result, and to do this we either need to add N criteria which are collectively guaranteed to be unique - or to just stick a unique identifier there to begin with and treat the rest as information queryable from the content.
The use of a hash as an identifier solves several other problems, such as having multiple repositories where each one could host packages of the same name. While we continue to rely on search without some means of uniquely identifying packages, we keep the social burden of making sure our repositories don't conflict with other people's (usually the "official distro" repo - but when we're talking of unifying packages, that's a lot of repositories which could conflict, unless we add our "vendor" flag into the mix, etc).
Is it possible to unify package management un-intrusively or not? Call the un-intrusive path A and the intrusive path B.
A) If we were to do it in such a way that did not require us to modify all the existing package managers out there, then how do we do that? I think the solutions you are coming up with (a database of unique hashes and so on) might be a way forward. But don't we still have the problem of communicating with the underlying package managers?
B) What sort of intrusion? Would package managers have to adhere to some kind of standard or API or expose a minimum amount of surface area? How would you get package manager maintainers to sign up to something like that? A summit? Who would fund such a summit? Then we'd still need something to implement that API. Somebody elsewhere mentioned PackageKit, would this fit the bill? If not, is there something else that would? And would we still need to track installs and so on with something like what you're proposing.
Remember this would ideally need to work on Debian-like (.deb, apt-get) and RPM-like (.rpm, yum) and Gentoo (ebuilds, emerge) systems and so on and so forth ad nauseam ... I mean, I'm running Ubuntu, so after the Great Request for the Unification of Managing Packages Summit (GRUMPS) I want to still be using Synaptic, but I want to be able to drill down into sub-package managers. See what I mean?
Again I think you are overthinking this. It is perfectly fine for SEARCH to return multiple results; it is intended mostly for interactive use anyway. It won't impact dependency resolution in any way - that would continue to work as it does now. In other words, each backend does its own dependency resolution without any regard for any other system.
It's a disaster. The proper way to do it is distro packages, but for some reason every language, framework, ecosystem and individual developer wants to reinvent this particular wheel. I really don't understand why.
It's because individual tools usually don't want to be tied to assumptions made by one particular distro. I actively avoid using distro packages for 3rd party development libraries and such, especially when a good tool for accessing upstream sources (eg pip) is available.
I use packages for certain tools and platforms, and libraries if I feel the library is really something I want to be a standard part of the system environment. For example, I am more likely to use the distro package of a python library (if available) if I'm planning to use the library for a system administration task than if I am planning to use it for application development. I'm also likely to use distro packages for things like apache, nginx, postfix, unless I have some case-specific reason not to.
One technical reason is that I might use two different versions of the same library in different projects, and apt-get only allows me to have one at a time. I think npm and gem are brilliant in this regard.
Best of both worlds: Docker. I consider Docker an application packager.
You know, it is still simpler to make your own deb or rpm than to build an entirely different package system.
It is more a case of these different package systems being introduced on platforms that lack a native one. Then, through a combination of laziness/not wanting to build another package and recycling the already-built binaries, they gained traction on Linux systems too.
And there is a reason why distro packages move slower - people who have them deployed in production do not like breaking changes. If you want bleeding-edge packages, use bleeding-edge repos.
There's no need to unify the installation of those domain-specific package managers - those packages just need to provide proper machine-readable metadata to make it possible to build distribution packages out of them, so there's no need to create another 10 package data silos outside of the system package manager's control.
The very idea of placing dependencies in a local location (or, even worse, in a global location not controlled by the system package manager) is so rotten to the core that it simply deserves to die.
What was once supposed to be a tool giving developers easy access to dependencies has now crept into the area of operations, and deploying uncontrollable stuff (Docker containers, 3rd party package managers, statically linked applications, manually built packages, …) into production systems has become the norm.
Actually, Composer is what's usually used these days. Composer is per-project (like npm), though, so there's no point unifying it into a global package manager.
Have you ever tried ninite (https://ninite.com/) ?
Not perfect at all - but at least it works for some of the freeware utilities / programs I'm using on my windows PC.
Simply run the generated exe to install the program. Run it later to update the program.
Hum, you could be right. I tried to install Discourse from source a while back on CentOS and gave up. A while later I tried the container method (Docker I believe) and bingo. So much magic going on under the hood though, dunno how I feel about it...
This kind of reminds me of just statically linking everything. We have not done that because it is a waste of resources but containers are all right then. :)
(OK, I'm aware of the differences, but I don't see containers as a salvation for client-side deployment.)
> There must be some way to _unify_ this proliferation of software update mechanisms.
There are many ways. The problem is there are many different environments that all use different methods for different reasons.
To a solo developer with personal control over the entire stack, the operating system is merely one more tool in the toolbox. That dev can pick any distro he wants and then install and configure anything he wants (as root). To a postdoc researcher in a lab using the university's shared compute cluster, leveraging the OS might not be such an obvious and easy choice. Then of course there is the whole issue that Wordpress, Ruby, system software, and IDEs are all developed by very different groups of people.
But back to your question: configuration management tools can help with this problem. They do require some end-user investment at this stage, in no small part due to the reasons I identified above (there's not yet an ideal default that works for everyone).
Generally, configuration management tools encourage you to declare your software, modules, packages, and requirements in an abstraction layer and then have the config management tool handle the messy details of whether to use apt or yum or gems. The catch right now is that you will typically have to handle those decisions (to some extent) yourself.
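That abstraction layer can be sketched roughly like this (Python; the distro-to-manager mapping is simplified and purely illustrative, not how any particular config management tool is implemented): you declare what you want installed, and the tool translates it into the native package manager invocation.

```python
# Toy version of the dispatch a config management tool performs:
# declarative "ensure package X" -> native package manager command.
# The mapping below is deliberately simplified for illustration.

PACKAGE_MANAGERS = {
    "debian": ["apt-get", "install", "-y"],
    "ubuntu": ["apt-get", "install", "-y"],
    "centos": ["yum", "install", "-y"],
    "gentoo": ["emerge"],
}

def install_command(distro, package):
    """Return the command a tool would run to install `package`
    on `distro`, without the user ever naming the manager."""
    try:
        return PACKAGE_MANAGERS[distro] + [package]
    except KeyError:
        raise ValueError(f"no known package manager for {distro}")

print(install_command("ubuntu", "nginx"))
# -> ['apt-get', 'install', '-y', 'nginx']
```

The "catch" mentioned above lives in that mapping: someone still has to decide, per platform and per package, which manager and which package name to use.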
Aha! Lightbulb moment. I thought that Puppet and co. (chef and so on) were about pure configuration across multiple machines. I didn't realise they were about deployment as well. In that case I guess the problem is sort of solved but then I need to start thinking a layer higher than I have been.
It would be nice if the solution to this proliferation still allowed me to think at the level of [synaptic/apt-get/dpkg] on Debian/Ubuntu/... or [yum/rpm] on Redhat/CentOS/Suse/... and so on. Do you think this is unreasonable of me?
http://leto.electropoiesis.org/propaganda/plugins-and-packag...