Reproducible Builds in Go

		Reproducible Builds in Go (go-talks.appspot.com)
		233 points by Spiritus on April 29, 2015 \| hide \| past \| favorite \| 69 comments

davecheney on April 29, 2015 | [–]

Hi HN,

There was a video recorded at this meetup, but I'm not in control of that.

In the mean time, the getting started document [1] is the contents of the "demo time" slide.

1. https://github.com/constabulary/gb/blob/master/getting-start...

joewalnes on April 29, 2015 | | [–]

Dave thanks for doing this. Not just for building the tool but for a very well thought out and convincing presentation.

I'm completely with you. My own solution has similar ideals (per project versioned deps) though a different solution. https://github.com/joewalnes/go-getter/blob/master/README.md. Honestly I'm just glad others are finally talking about the elephant in the room.

politician on April 29, 2015 | | | [–]

Thanks for building this; I completely agree that the `go` build tool is the problem, but settled on Goop as a stop-gap solution.

ghodss on April 29, 2015 | | [–]

One major limitation of this approach is that any project that wishes to vendor or lock their dependencies can no longer be used as a dependency for another project. From the gb GitHub:

> A project is the consumer of your own source code, and possibly dependencies that your code consumes; nothing consumes the code from a project.

This seems to imply that any code outside of a project (i.e. the code inside vendor/src) has no recourse for indicating the versions of their dependencies. This is nice in that it simplifies the problem, but to completely remove the ability for any and all libraries to indicate the versions of their dependencies seems unnecessarily restrictive. If I build a library for others to use, and I have a dependency, I want to be be able to lock to a specific version, or at least give my preference for a version.

Of course, this creates its own issues - what do you do when two libraries depend on two different versions of the same library? (Also known as the diamond dependency problem.) This is where the Go culture helps, where as long as you pick a later version, things are likely to work. But I'd rather have the tooling let me detect the two versions that the two libraries want, show that there is a mismatch, and give me the ability to override and pick one (probably the later one). Instead, the gb approach eliminates the ability for the libraries to even have the ability to indicate what version they would prefer, which makes it even more difficult to get a bunch of libraries that share dependencies to work correctly together.

godep (https://github.com/tools/godep) seems to have the best compromise: vendor dependencies without path rewriting (though with GOPATH rewriting), but also keep track of their versions in a Godep.json file. You can gracefully pick between conflicting versions upstream if need be.

eikenberry on April 30, 2015 | | [–]

+1 for godep w/o import path rewriting. It isn't addressed in the presentation at all, only using it with path rewriting is mention and all the problems associated with it are with the path rewriting.

IMO it is the best solution right now with just 1 issue, that it is not included with go. This is a pain with CI systems (e.g. Jenkins) where you have plugins to provide go itself, but have to figure out a way to get godep around to run your build. Right now I'm punting and just doing a go-get godep, then using it to build my project. I'm not happy with that though.

endymi0n on April 29, 2015 | | [–]

Sad to say so, but this really makes a lot of sense - and it's dead simple, something I really appreciate as a Gopher.

Suddenly you're able to have full control over a project, use private repos effortlessly (even with SSH protocol) and most importantly for me, not check in all the dependencies and their changes like with Godep, which just keeps up messing up the history although having nothing to do with the project.

Just Git submodules and you're done.

sagichmal on April 29, 2015 | | [–]

    > Just Git submodules and you're done.

It would be a disaster if popular Go projects started using submodules, and consequently required users and contributors to play the submodule init/update/sync dance. Please, please -- prefer subtrees.

davecheney on April 29, 2015 | | | [–]

gb doesn't mandate any particular DVCS or DVCS strategy -- I understand it's a deeply personal issue for most.

People are smart, if they like the tool, they'll figure out how to integrate it into whatever DVCS they use.

ansible on April 29, 2015 | | | | [–]

I prefer subtrees to submodules. Not that subtrees are a cakewalk, in fact I wrote up a tutorial using them. [1]

The main issue with submodules (for me anyway) is that it seems to be all too easy to revert a change in a submodule if you aren't careful. It is not very people-friendly because you're just looking at hashes, and it isn't clear which one is newer and which is older.

[1] https://github.com/jamesgraves/example-go-app

DannoHung on April 29, 2015 | | | | [–]

Submodules should be the right solution. Subtrees are weird as hell even if they require less interaction.

I wonder if anyone in the git team has considered doing any updates on the porcelain for them.

edit: BTW, even though syncs may be separate (and let's be honest, they should be really infrequent). You can do the update/init thing in one command: git submodule update --init --recursive

nulltype on April 29, 2015 | | | [–]

This seems weird to me. While it's true that I don't want dependencies to mess up my git diff, I can't say that a dependency change has "nothing to do with the project". A change in dependencies almost always means a change in the behavior of my project.

I guess what you're saying is that submodules means that the diffs are better somehow, but I can't remember ever enjoying using submodules. This whole thing seems more like a git problem than anything else.

cakoose on April 29, 2015 | | [–]

On slide 9 it says that enhancing the syntax for "import" would be a backwards-incompatible syntax change. Sure, old compilers wouldn't accept newer code, but newer compilers would accept old code without breaking the behavior.

Is Go's definition of "backwards compatible" different from the usual one?

politician on April 29, 2015 | | [–]

Slightly OT, but it's kind of strange to suggest enhancements to the import statement when the spec explicitly states that an import statement contains an opaque string.

It's the go build tool that add semantics to the values of the import statement, not the language.

laumars on April 29, 2015 | | | [–]

The issue is more with older code not compiling on newer compilers if the syntax is changed. Go works on the basis of no breaking changes between major versions.

I guess theoretically it could be changed in such a way that both the older and newer syntax would be valid on newer compilers, but that adds additional complexity which is also an anti-Go idiom.

So sadly we're left with two alternatives: 1) code rewriting / generation, or 2) 3rd party compilers.

cakoose on April 29, 2015 | | | [–]

But it looks like Go has done similar things in the past. For example, the slicing operation gained an additional optional argument (https://golang.org/doc/go1.2#three_index). It's not exactly the same thing, but it seems roughly in the same league of difficulty.

Based on what I'm reading, versioning is a serious pain point. Sure, you always have to make sure the payoff is worth the additional complexity (universally, not just in Go) but the author seems to categorically disqualify this option by incorrectly saying that it would break backwards compatibility.

laumars on April 29, 2015 | | | [–]

> But it looks like Go has done similar things in the past. For example, the slicing operation gained an additional optional argument (https://golang.org/doc/go1.2#three_index). It's not exactly the same thing, but it seems roughly in the same league of difficulty.

It's not really same in my opinion. Slices already accepted multiple parameters - or even none. So adding another optional parameter doesn't affect the syntax. Where as import names are all stored in a single quoted string, so you'd have to rethink the entire structure of the import syntax (which leaves you with two import different methods of expressing imports - which is messy) or use some nasty kludge of including versioning within the import string.

What's more versioning is only half the story here. There's other issues with Go's imports that this author aims to resolve - such as the dependency on a static repo. Because import paths are based directly on the repo path (eg github.com) it makes it a pain to prototype code in a private repository before pushing changes public. And then there's issues with dependency on any specific resource (you can't use mirrored repos with go get), plus a whole slew of related issues regarding hard-coded repo paths.

So versioning is only half the story.

> Based on what I'm reading, versioning is a serious pain point. Sure, you always have to make sure the payoff is worth the additional complexity (universally, not just in Go) but the author seems to categorically disqualify this option by incorrectly saying that it would break backwards compatibility.

Well I've already argued that any clean solution would break backwards compatibility and any kludge would be messy and rather short sighted (plus potentially may not address the other issues with go get and import()).

So I think it's a little misjudged for you to say he's "incorrect" in his opinion in the way that you have.

noselasd on April 29, 2015 | | | [–]

From what I gather the commenter is saying that a new compiler could accept this syntax:

      import "github.com/pkg/term" "{hash,tag,version}"

And this (old) syntax:

       import "github.com/pkg/term"

Meaning the language would be backwards compatible. But you're saying that this wouldn't be backwards compatible - why ?

laumars on April 29, 2015 | | | [–]

Your method doesn't follow the same format as the rest of Go's syntax (eg comma delimited lists) which could break any 3rd party pre-parsers or other such tooling. So there's the potential break backwards compatibility there.

Plus lumping everything within the second set of quotes means you don't have idiomatic styling nor a simple method for the gofmt to validate since you have a list handled as a string. So it's not really a clean fix in my opinion (which was one of the other points I raised).

Whatever we bake into the language will be set in stone for all of the foreseeable future - regardless of whether it's a good change or bad change. So I tend to think it makes more sense to use 3rd party tooling to address this problem for now, and then bring whatever method seems to work the best into the official language specifications for Go 2.0. Other people's opinions might differ, but that's why I personally prefer the approach this author it taking.

cakoose on April 30, 2015 | | | [–]

> Your method doesn't follow the same format as the rest of Go's syntax (eg comma delimited lists) which could break any 3rd party pre-parsers or other such tooling. So there's the potential break backwards compatibility there.

The new slicing syntax could have broken 3rd party pre-parsers or other tooling. Yet that change was considered backwards-compatible.

laumars on April 30, 2015 | | | [–]

It could have done, but there's fewer tooling that parse slices (particularly at that point in time) than there are which parse import strings (there's lots and lots of tools that already do that!).

However the other key point is one I made earlier: slices already support a number of optional parameters so the language semantics didn't change even if the syntax of slices were expanded. adding 1 additional optional parameter to a property that already supports multiple optional parameters really isn't comparable to the import example which had a list of values represented in a new and completely non-idiomatic way (Ie initially without punctuation and then nested inside a string. Nothing else in Go follows that pattern, let alone the import strings already).

YZF on April 29, 2015 | | [–]

The link isn't working for me.

What we do is keep a copy of all packages we use and then assemble everything into a workspace as part of the automated build. Nothing is pulled from GitHub at build time. All those copies are versioned so you can have different projects use different versions. The dependencies are specified via some pre-existing build infrastructure but essentially there is one file in the project listing all the dependencies and where to find the packages.

Developers can still have their own personal workspace and build with different versions but the "real" build is always reproducible.

I think many people using Go or wanting to use Go in commercial/production environments are finding a little bit of impedance mismatch between the Go view of the world (the workspace and packages) and what their existing tooling does. It's not too terrible to deal with but it would be nice if we had a little more flexibility. go get could support some nicer/automated version of the above where you have the ability to download/mirror packages to a shared drive or to another source control system and pull from there to your workspace based on specific dependency versions. This is fairly minor friction though.

Aissen on April 29, 2015 | | [–]

Got my hopes up for deterministic builds, but then I read on slide 5:

     Out of scope: compiler doesn't produce byte for byte comparable binaries

davecheney on April 29, 2015 | | [–]

There are others working on that problem.

I probably shouldn't have mentioned it; if I hadn't, you wouldn't have thought about it, and then nobody would have had their hopes dashed.

Aissen on April 29, 2015 | | | [–]

English isn't my first language, so when I read "reproducible" I thought "deterministic"; I also didn't think you'd be working on the "vendoring" problem since I thought it was a solved issue (which it isn't for all the reasons in the post). Anyways, thanks for your work.

davecheney on April 29, 2015 | | | [–]

I think the compilers that produce elf output are almost deterministic, at least when producing a fully static binary (no CGO).

I don't think that holds for mach-o or PE binaries, but I'm really speaking off the cuff here, as I said, others care about this and are working on it separately.

ComputerGuru on April 29, 2015 | | | | [–]

However, English is my first language, and I thought exactly the same.

silon3 on April 29, 2015 | | | | [–]

Does JIT (or late/deploy time optimize/link) help here? Would it be easier to make deterministic builds when optimizer is not involved?

davecheney on April 29, 2015 | | | [–]

Nope, this is about the source that goes into the compiler.

AnkhMorporkian on April 29, 2015 | | [–]

I love the presentation, but the GIL problem and Go's package problems aren't really even comparable. The GIL is an unfortunate consequence of the CPython implementation that can't really be solved without some serious hacks or computationally expensive workarounds. The Go "problem" is almost entirely an implementation problem; something evidenced by a simple fixed implementation.

Honestly, the entire presentation was nearly lost on me by the weird comparison at the beginning. I recovered, but seriously, very strange comparison for anyone who understands what the GIL represents.

Edit: And really, was github the best target to go to for availability when it came to developer downtime on the edited XKCD? Github's downtime is pretty near 0. I think I remember roughly an hour of downtime in 2014 in the super early morning, and nothing before that until 2012 maybe.

shadowmint on April 29, 2015 | | [–]

To be fair, the comparison isn't completely off.

The GIL was an implementation detail that was initially considered not a particularly big deal, but turned out to be a Really Big Deal later because no one thought about the problem from the beginning.

The lack of a package ecosystem for go and 'go get is good enough' is exactly the same; it was for quite some time simply considered to not really be a big problem. ...but it turns out, when you're doing complicated things, not having repeatable builds really is a Big Problem.

...but yes, they're in different leagues. Solving this one won't be anywhere near as troublesome.

davecheney on April 29, 2015 | | | [–]

Everyone knows the GIL as Python's big problem; dependency management is Go's big problem -- it's an analogy, that's it, nothing more.

davecheney on April 29, 2015 | | | [–]

> And really, was github the best target to go to for availability when it came to developer downtime on the edited XKCD? Github's downtime is pretty near 0. I think I remember roughly an hour of downtime in 2014 in the super early morning, and nothing before that until 2012 maybe.

There was something about 2 days of DDOS earlier this year, but again the specifics aren't important -- the important part is, if you are in charge of delivering a product written in Go, you're going to look like an ass if your build fails because some random part of the internet is having a bad day.

muraiki on April 29, 2015 | | | [–]

I interpreted the comic more as a statement about how important Github is to development than that it actually goes down all the time. So if it were down, that'd be a fine excuse to do nothing. :)

davecheney on April 29, 2015 | | | [–]

The important message on that slide is not the picture, but the words below it.

vanmount on April 29, 2015 | | [–]

I'm very excited about GB. GVM made working on multiple projects already a lot easier, but dependencies are still a huge pain... I don't know if GB is the best answer to the problem, but at least it's a big step forward and I'll definitely try it out.

eternalban on April 29, 2015 | | [–]

Go would hugely benefit from a hierarchical /.go in $GOPATH. This would allow for a robust versioning scheme, and allow the language to scale in the future.

davecheney on April 29, 2015 | | [–]

Can you explain your idea in more detail ?

eternalban on April 29, 2015 | | | [–]

Think /.git, Dave.

   /.go
       vendors # map simple import name to explicit (ugly) path
       ... # other stuff for additional e.g. generate features

Hierarchical: Think OO-ish shadowing of .go directives and metainfo.

Is this clear?

davecheney on April 29, 2015 | | | [–]

> Is this clear?

not really

eternalban on April 29, 2015 | | | [–]

    # ----------------------------------
    # general layout of the land
    # ----------------------------------
    $GOPATH/src/.go
    $GOPATH/src/.go/config
    $GOPATH/src/.go/vendors # for all projects
    
    # ----------------------------------
    # $GOPATH/.go/vendors sketch
    # ----------------------------------
    
    lib/mq = github.com/lib/pq@master 
    mgo = labix.org/v2/mgo
    ... etc
    
    # ----------------------------------
    # specific project 1
    # uses the global version of package labix.org/mgo
    # uses a project specific package hotness
    # ----------------------------------
    
    $GOPATH/src/mygofoo
    $GOPATH/src/mygofoo/foo.go
    $GOPATH/src/mygofoo/.go
    $GOPATH/src/mygofoo/.go/vendors
    
    # ----------------------------------
    # foo.go fragment
    # ----------------------------------
    
    package mygofoo
    
    import (
        "mgo"       // version spec'd in global .go 
        "lib/pq"    // version spec'd in project specific .go
        "hotness"   // version spec'd in project specific .go
        ...
     )
    ...
    
    # ----------------------------------
    # $GOPATH/src/mygofoo/.go/vendors sketch
    # ----------------------------------
    
    lib/mq = github.com/lib/pq@experimental // lets pretend
    hotness = bitbucket.org/wiz/hotness@master
    ... etc
    
    # go tool
    
    > cd $GOPATH/src/mygofoo
    > go build -vendors
    using:
        mgo -> labix.org/v2/mgo
        lib/pq -> github.com/lib/pq@experimental
        hotness -> bitbucket.org/wiz/hotness@master
    >    
    
    # go tool can obviously allow for command line mods of the .go

iddqd on April 29, 2015 | | [–]

I think I'll stick to using submodules and a Makefile with GOPATH set to the vendor directory. Hasn't failed me yet.

tshannon on April 29, 2015 | | [–]

Might be overkill, but for my large projects, I simply host my own gogs [1] instance and vendor in all my dependencies to that instance. I get reproducible builds, and I can bring in any updates I want manually when I'm ready for me.

1.http://gogs.io/

humanfromearth on April 29, 2015 | | [–]

TL;DR: Don't use `go get` if you want reproducible builds.

It's early for me to tell, but looks like gb is something like a mix between a gvm (virtual env) and godep (vendorizing).

I'm skeptical about things like "don't use the standard, use my thing", but I will try it anyway.

sagichmal on April 29, 2015 | | [–]

The author has reasonable credibility/authority in the community. Definitely worth a look.

caiusdurling on April 29, 2015 | | [–]

Reminds me of developing rails apps way back in the day before RVM gemsets or bundler came along. You'd vendor all the gems (& specific version for your app) into `vendor/gems` ala http://nubyonrails.com/articles/2005/12/22/freeze-other-gems...

vidarh on April 29, 2015 | | [–]

I still vendor all the gems, only I use Bundler to do it.

It's pretty much crazy not to, in my opinion. Stuff gets removed, and networks are unreliable, so locking/resolving dependencies to specific versions is insufficient.

caiusdurling on April 29, 2015 | | | [–]

Sorry yes, I meant it reminded me of vendoring gems _by hand_. I think everyone is vendoring gems via Bundler these days, because it would be insanity not to!

shocks on April 29, 2015 | | [–]

It'd be nice if we could somehow use semver. I'd like to get patches automatically...

edit; if you downvoted, maybe comment and explain why?

drewolson on April 29, 2015 | | [–]

How is this different from godep[1] other than being the root of its own GOPATH and having a different naming convention for the location of vendored dependencies?

[1] - https://github.com/tools/godep

humanfromearth on April 29, 2015 | | [–]

@davecheney

Thanks for your work.

If I publish my project on github and use gb with it, then people who want to contribute have to install gb as well.

Do you think that in the future there is a chance to do the building of a gb project without the gb itself?

davecheney on April 29, 2015 | | [–]

Maybe, but it's not something I'm focusing on.

stormbrew on April 29, 2015 | | [–]

If you're going to vendor everything anyways, I don't really see why you need a new tool. But IANAG, so maybe I missed something?

davecheney on April 29, 2015 | | [–]

It's vendoring without rewriting, http://go-talks.appspot.com/github.com/davecheney/presentati...

aikah on April 29, 2015 | | [–]

Yet another problem the go team refuses to see. Either the go team says "not our problem" then people are stuck with doing things multiple and incompatibles ways , or the go team says "we already solve the problem, vendoring" and the go team doesnt understand the problem at first place. That's the recurrent pattern with the Go team. Everything cannot be dealt with in userland, or you'll end up with fragmentation. And No , vendoring doesn't solve the problem.

nothrabannosir on April 29, 2015 | | [–]

But... almost every sentence in that comment is "not applicable" at best.

* the major players all solve this specific problem outside the lang core

* Dave Cheney is so involved with the project, you could easily consider him part of the "go team." Both in contributions and community pull.

* This very article is not about a problem, it's about a solution. It's literally Dave Cheney saying: wow, look at this important problem, and look at all the reasons why it's important. Here is a solution and all its pros and cons. Regardless of how you feel, the timing of your comment simply could not have been further off.

* More generally: Go is one of the most batteries-included languages I know. From the stdlib to the tooling (which is still part of the official release), it's all in there. When did you last try to create platform and architecture independent binaries without any prior knowledge in, I don't know; C? Even Python?

I really want to like this comment, believe me I do. But it is so out of tune with the article, I can't.

This article is almost literally about the opposite of everything you just wrote.

LamaOfRuin on April 29, 2015 | | | [–]

> * Dave Cheney is so involved with the project, you could easily consider him part of the "go team." Both in contributions and community pull.

Wait, is he not officially part of the go team?

PopsiclePete on April 29, 2015 | | | [–]

By "Go team", do you mean "Salary paid by Google"? Then no, I believe he works for Canonical...but he's pretty "core" in all other aspects.

davecheney on April 29, 2015 | | | [–]

Honest question: Why does vendoring not solve the problem ?

kungfooguru on April 29, 2015 | | | [–]

I bet it works great for Google or anyone else who keeps everything in one big repo and isn't developing open source applications.

Titanous on April 29, 2015 | | | [–]

Vendoring works great for us, we keep everything in one big repo and it's open source. The problem is that the current tooling for managing this is horrible.

nulltype on April 29, 2015 | | | [–]

What is horrible about managing it? The biggest problem I encountered with vendoring is getting rid of git submodules.

Titanous on April 29, 2015 | | | [–]

We currently use godep, and it breaks very easily due to tags and differing architectures and OSes (among other things). In addition to being unreliable, it's pretty slow. I'm not aware of any other tools that do import rewriting+vendoring and work better than godep.

nulltype on April 30, 2015 | | | [–]

Oh, do you check in the rewritten packages into git?

Titanous on April 30, 2015 | | | [–]

Yeah, that's generally what "vendoring" means.

pas256 on April 29, 2015 | | [–]

The number of parallels between Go and Ruby keeps me entertained. Isn't this just rvm and bundler all over again?

duaneb on April 29, 2015 | | [–]

Well, thankfully there's nothing like the nightmare 1.8.6 -> 1.9 switch, and it has loads of tools working with static code that ruby could never have.

ahmetmsft on April 29, 2015 | | [–]

Can somebody please share a mirror? App Engine instance seems like it has exceeded the quota. (lol)

GutenYe on April 29, 2015 | | [–]

yet another try......

_ak on April 29, 2015 | | [–]

I attended the talk. Wait for the video, I personally found the Dave's arguments and reasoning convincing.