Please stop breaking the build (danluu.com)
72 points by luu on March 15, 2015 | 42 comments


This isn't mentioned until later in the article, but it seemed important.

> The worst thing about regular build failures is that they’re easy to prevent. Graydon Hoare literally calls keeping a clean build the “not rocket science rule”, and wrote an open source tool (bors) anyone can use to do not-rocket-science.

http://graydon2.dreamwidth.org/1597.html


A member of the Rust community, barosl, wrote Homu (bors 2.0) which Rust has been using in place of the first bors for a few months now. It's even better than the already-great first version! I'm seriously considering setting it up for my own projects, since I believe it has support for using Travis CI as a testing backend, in addition to buildbot (which Rust and Servo both use).

https://github.com/barosl/homu

Example interaction: https://github.com/rust-lang/rust/pull/23381#issuecomment-80...

Docs: http://buildbot.rust-lang.org/homu/


Interesting. For the D programming language we use something similar but a little different: once a reviewer approves a pull request, they toggle an "auto-merge" flag in our CI tool, and a bot merges the pull if all tests pass (after merging with master, of course).
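As a rough illustration (not the actual D tooling), the core of such a bot can be a short script that merges the approved pull request into a scratch copy of master, runs the tests, and only pushes if they pass; the PR number, the pull/N/head refspec, and the make targets below are placeholders:

    #!/bin/sh
    # Hedged sketch of an auto-merge step, not the actual D CI tooling.
    # PR=1234 and the pull/N/head refspec are GitHub-style placeholders.
    set -e
    PR=1234
    git fetch origin                            # refresh origin/master
    git fetch origin "pull/$PR/head:pr-$PR"     # the approved pull request
    git checkout -B merge-candidate origin/master
    git merge --no-ff "pr-$PR" -m "Auto-merge PR #$PR"
    make test                                   # stop here if anything fails
    git push origin merge-candidate:master      # publish only a green result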


Which has been doing amazing things for Servo and Rust for years now!


I'd very much like to have some more information about how the article author gathered this data - are these all master branches of said projects, numbered releases?

How does the data account for variation in development procedures (e.g. some projects use master as their bleeding edge branch)?


> How does the data account for variation in development procedures (e.g. some projects use master as their bleeding edge branch)?
That shouldn't matter. You should always be able to successfully check out, build, and pass all tests on master. Otherwise, how are people expected to get development done? If you aren't able to build and run tests because you need to be fixing something that someone else committed broken, it will be a lot harder to ever make real progress.

Even if it's "bleeding edge", master should always build and always pass at least the minimal "fast" test suite; some projects may have more thorough "full" test suites that take many hours to run, and requiring that those be run on every change would slow down development too much, but at the very least, the quick "sanity check" test suite should always pass before something can land on master.


Also consider that if it's checked on a commit-by-commit basis that doesn't somehow collapse merges, then you're actually checking every commit that was made during incremental work on a different branch. If you're doing TDD or any test-first methodology, that will often mean you intentionally commit failing tests and then fix them.

Also, if it's on a commit-by-commit basis, the percentage-of-time comparison is completely invalid.


The author is talking about broken builds, not broken tests.


In this context they're synonymous. A broken build is defined by the tests failing.


The data is not really relevant, for the reasons you state, plus a failed build does not equal published artifacts. How long were the builds in that state? Etc.


> Web programmers are hyper-aware of how 10ms of extra latency on a web page load has a noticeable effect on conversion rate
I wish this was true, but I'm really skeptical that the average "web programmer" even uses the phrase "conversion rate" on a monthly basis.


I know neither your comment nor mine is strictly related to the topic, which is about not breaking builds, but the effect of latency on a web page makes only a tiny contribution to conversion rates and other user-satisfaction metrics compared to other factors, like stability and features. Hence, I think it's not _important_ for most web programmers to be hyper-aware of latency, as long as it's reasonable.


Is 99.9% or 3h/month really the "professional" norm?

I remember a larger project where we struggled for a long time to get anywhere beyond 20%. Eventually we got a lot better, but nowhere near 99%.

I also wonder whether it is worth it. Are all developers really blocked when the build fails? Some, yes, but if I could, I would design the production chain so that most would not be.

Isn't there a trade-off in how much to invest in build tooling and automation vs. functionality? Does building an MVP include building a perfect software production pipeline?


> I also wonder whether it is worth it? Are all developers really blocked when the build fails?

In our setup, building the deployment artifacts is contingent on tests passing, so if the build isn't passing, no one can deploy.


But you could build and test the candidate master branch before it actually becomes the master branch...


Three nines is way beyond what I have seen. At most places I have worked, the build was broken for an hour or two each day.

There was one place where the official builds had been broken for so long that people stopped paying attention to them. The new dev director put a stop to that, to his credit.


Wait, what? How? Can't you just build things before you merge them into master?


Welcome to distributed programming. Just because it works for one developer doesn't mean it works for the others, at least not until all the conflicts are resolved. One could have a build system which didn't permit a change to land on the master branch until everything compiled and all tests passed, but GitHub isn't set up that way.

The build may also break because of a change in an external dependency.


> One could have a build system which didn't permit a change to land on the master branch until everything compiled and all tests passed, but GitHub isn't set up that way.

If you use pull requests instead of commits, you can set up such a system to automatically run make test.
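As a sketch, the per-pull-request job on the CI side can be as small as checking out the proposed merge and running the test target; the pull/N/merge ref here is GitHub's precomputed test-merge ref, and PR_NUMBER stands in for whatever variable your CI tool provides:

    #!/bin/sh
    # Minimal per-PR CI step (sketch): build the PR as it would look
    # once merged into master, and let a non-zero exit mark it red.
    set -e
    git fetch origin "pull/$PR_NUMBER/merge"   # GitHub's test-merge ref
    git checkout --detach FETCH_HEAD
    make test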


> The build may also break because of a change in an external dependency.

It shouldn't unless you've been lax in how you specify your dependencies, or the library maintainers have been sloppy in what they claim for backwards-compatibility.

Obviously in the real world, both of those things happen.


Package repositories go down too (Maven, RubyGems, NPM, etc...).

Unless by 'lax' you mean 'didn't check in your dependencies'...I don't think I'd call that lax, though.


At my work, we run a local mirror/proxy for dependencies. It's polite because we're not hitting those external services as frequently, and it has the advantage that we aren't reliant on them to produce builds.


Am I missing something, or isn't this problem easily fixed by enforcing testing before push? As long as the appropriate tests are available, it seems like this could be solved.


It's not necessarily that simple, but it is simple. You need to do two things:

1) Integrate the master branch (or whatever your guaranteed-good branch is) with the code you're about to push. This prevents integration conflicts from causing the build to fail.

2) Test it on a reference machine. This prevents environment assumptions from causing the build to fail. (Such as installing new software or setting an environment variable, but forgetting to make it part of the build.)

These are both easy to do. My preference is to push to a testing branch on the integration machine, merge in the master branch, run the tests, then merge the testing branch back into the master branch. (There's a bit more to it than that, to cover edge cases, but that's the gist.)
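A bare-bones sketch of that flow on the integration machine might look like this (branch names and the test command are placeholders, and it ignores the edge cases mentioned above):

    #!/bin/sh
    # Sketch of the testing-branch flow: nothing reaches master unless
    # it built and passed tests after being merged with master here.
    set -e
    git checkout testing      # developers push their work to this branch
    git merge master          # surface integration conflicts now, not later
    make test                 # run on the reference machine
    git checkout master
    git merge testing         # only reached if the build and tests passed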

Sadly, most teams and CI tools aren't set up to do this--although, as the article says, it's not rocket science. In fact, I'm surprised it's not obvious that you should do it this way.


Doesn't help if, e.g., you have multiple reference platforms - many open source projects support a range of platforms (e.g. OS X, Linux, Windows) - which is much harder than in-house software with a standard platform. You really have to stage commits then.


While both of those are nice, even doing something as dumb as having a git pre-commit hook that runs 'make' (or whatever is the equivalent) catches an amazing amount of errors.
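For example (a sketch; substitute your real build command for make), an executable .git/hooks/pre-commit like this refuses commits that don't build:

    #!/bin/sh
    # .git/hooks/pre-commit: abort the commit if the project doesn't build.
    if ! make; then
        echo "Build failed; commit aborted (git commit --no-verify skips this check)" >&2
        exit 1
    fi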


This was mentioned as a solution in the article.

http://graydon2.dreamwidth.org/1597.html


I agree. I thought that this was the whole point of having a CI system.

I don't have a huge amount of experience in open source projects, so maybe they do it differently. Anywhere I've ever worked, not breaking the build and not causing regressions were prerequisites to getting any pull request serviced. That's how you keep master from breaking.

On the other hand, building and testing in a clean environment costs money. A company will pay for server time if they believe it's cheaper than developer time (it is). Maybe open source projects just don't have those resources.


The only way I have been able to get away from Ben's Law is to have an automatic system that does builds before code can get in. This can be server side, user side, pre-commit hooks, pull-request bots, or anything that catches build breakage before it can get into the main branch. But without something automatic in place, the project is destined to fall to Ben's Law.

Ben's Law - when every developer is committing to the same branch, the odds that a commit will break the build increase as more developers contribute to the project.

http://benjamin-meyer.blogspot.com/2014/01/bena-law.html


It's not completely clear to me how the build start times were used to calculate uptime, especially since a popular use case for Travis/GitHub integration is to trigger a build upon PR. If that triggered build fails, the PR is marked as such and normally the content wouldn't get merged. This is designed to prevent the exact problem described, but are these Travis PR-triggered builds filtered out of the uptime input dataset?


Why do builds break on our team? Because we have no idea how to add a pre-commit hook to the Perforce IntelliJ client so that it builds and tests the whole thing locally before pushing changes to the repo.


Why not have local pre-commit hooks? Even if you are not using git-p4, cobbling together some sort of frontend tool to p4 submit that first runs make etc., which everyone would use, would be a big win for your team.

http://benjamin-meyer.blogspot.com/2010/06/managing-project-...


Everybody uses IntelliJ. We couldn't find a way to add local pre-commit hooks there, and we didn't have the resources to build our own custom tool for such a basic operation.


You are using Perforce, so just write a bash script called p4 that, if you are doing a commit, first runs the checks and then the real commit, and otherwise passes every command through to the real p4.
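Something along these lines, placed ahead of the real client on PATH; the /usr/local/bin path and the make targets are assumptions:

    #!/bin/sh
    # Thin wrapper named p4: run the build and tests before a submit,
    # and pass every other subcommand straight through to the real client.
    REAL_P4=/usr/local/bin/p4        # assumed location of the real binary
    if [ "$1" = "submit" ]; then
        if ! (make && make test); then
            echo "Build or tests failed; refusing to submit" >&2
            exit 1
        fi
    fi
    exec "$REAL_P4" "$@"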


Your problem reminds me of the "Why would you use Emacs over IntelliJ" thread.


and your team isn't disciplined enough to do it manually ("nothing is stopping us" is an explanation of why things sometimes happen, but it is a bad excuse)


As I heard it, one of the late-nineties projects at Microsoft had a big problem with build breakages. They finally instituted a policy of tracking down every build breakage, and conducting a little humiliation ritual of awarding the offender a "big sucker" award at the weekly project meeting. This award had to be displayed on the developer's office door for a week or something.

You could dodge the award even if you did break the build by showing that you had run the full unit test suite before you submitted your code.


Yeah. We considered that, but that's a horrible social solution to a purely technical problem. It's like beating your child so he won't put his fingers in the wall socket, instead of childproofing it.


One shouldn't make a big deal about it, but something like "whoever breaks the build on an important branch (and doesn't immediately notice and revert it) brings cookies/makes a coffee run/... for everybody in the same office" can work if it happens too often. You are right, occasional mistakes happen, so it has to be something simple.


Personally, I don't think any team is disciplined enough to ALWAYS do it manually, and then there's the confounder of "it works on my machine".


I wouldn't say undisciplined, just human. If it's possible not to do something, then from time to time you won't do it.


I'd love to see the numbers on Haskell. I believe (and would hope) that it would fare well.



