One thing that seems difficult with bisect is this series of steps:
1. "Hey... this bug didn't use to happen. Where'd that get introduced?"
2. OK, this unit test will tell me when the bug is fixed.
3. Now let's automate bisect to tell me where this test first failed, even though I just wrote it.
How do you do that? I think as long as you DON'T commit your unit test, bisect will carry it along to every commit it checks out. But you'll have to make sure you commit all your other changes, or else they'll be carried along too: Conflict-city.
Also, if you do commit your unit test (a "bad" habit of mine), I have no idea how I'm supposed to work with this. I end up copying the unit test by hand to each commit it tries to test.
In summary, it seems like bisect was built with manual testing in mind. I know you can automate running a shell script, but I don't write my automated tests in shell script.
EDIT: Keep in mind, not all projects are interpreted. Some are compiled.
I've successfully used "git bisect" in situations in which, as part of every test, I had to "git stash pop" some local changes, then do the test, then remember to "git stash save" them before continuing to bisect. (Alternatively, "git apply" the stash, then "git reset --hard" to blow away the changes.) That's how you can handle situations where test code has to be added to the program to demonstrate the problem, and that test code doesn't exist in old versions.
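Roughly, each step looked something like this (sketch; a hypothetical "./run_test.sh" stands in for whatever actually demonstrates the problem):

```
# With the new test code sitting in the stash, each bisect step is:
git stash pop                # bring the uncommitted test code into the working tree
./run_test.sh                # hypothetical test driver; use your real test command
result=$?
git stash                    # re-stash the test code so the tree is clean again
if [ "$result" -eq 0 ]; then
    git bisect good
else
    git bisect bad
fi
```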
Yeah, not a bad method. And I suppose if I've already committed my unit test, I could use a "git reset --soft" to un-commit it and then stash it. Next time I use bisect, I'll try this. Thanks
The whole idea of bisection is that you have an objective behavior that has changed. If the test (which is, by definition, the tool used to measure that objective behavior) is not constant across the range of changes, then you simply don't have a problem that can be solved by bisection. You're breaking rule one ("This bug" isn't a single behavior).
Obviously in practice you'd probably just use whatever the most recent version of the test is.
Really I think your problem is that you're overspecifying rule 2. It's not about identifying a "unit test" per se, it's about identifying any specific process to detect the bug. If the tests in your tree work, great. If not, you may have to do a little development to make one that is automatable.
I think you misunderstood. The test is brand new. As in, you find a problem that nobody has identified before and want to see how/when it was introduced.
The easy way to do this is to put your test external to the tree. Then bisect as normal and run your test.
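Something like this, say, with a hypothetical "~/repro.sh" living outside the repo:

```
git bisect start
git bisect bad HEAD           # current revision shows the bug
git bisect good v1.0          # hypothetical last-known-good tag
git bisect run ~/repro.sh     # exit 0 = good, 125 = skip, other non-zero = bad
```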
This is probably not uncommon for kernel tests, as you often have a test which could be "does this random laptop resume from sleep correctly?" Not really easy to automate that. And doesn't live in your tree.
Right, my point was more that often everything will be working great on the main developer's laptop. I did not think it was uncommon to have someone come out of the weeds with a specific laptop that doesn't work. Or worse, it works if you sleep using the "sleep" button, but not if you close the lid.
So, yes, you may be able to automate it later, once you fully understand the specifics that cause it not to wake up. Possibly it's specific to how it went to sleep with the lid closed. Until then, it's nice to be able to do ad hoc tests when necessary.
In the past, I've committed the unit test like you, branched, rebased the unit test back to some appropriate point in time, and then run git bisect. This avoids applying/resetting at each step, and allows for some trickiness like changing your test to deal with an interface that mutates over the range of commits you're testing. The hash of the errant commit will be different after rebasing of course, but it's easy enough to match it up with the one in the original branch.
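Roughly (sketch; <known-good-commit> is a placeholder for wherever you rebase the test back to, and "./run_test.sh" stands in for the real test command):

```
git checkout -b bisect-with-test           # work on a throwaway branch
git rebase -i <known-good-commit>          # interactively move the test commit to the front
git bisect start HEAD <known-good-commit>  # bad = HEAD, good = the old point
git bisect run ./run_test.sh               # hypothetical test driver
```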
I wish there were a guide or some official "this is how to manage your git repository along with your tests so bisect will work wonders" - IIRC, bisect gets tripped up by commits that won't build and also some merges... would be nice to know what exactly to do about those once they're in your history.
In cases like this I usually copy the relevant test file to somewhere outside the repo, and refer to that path directly in the bisect cmd. That way you don't have to worry about anything conflicting with the new test that you need for the bisect.
E.g. for a Rails app, I might end up using a command like `rspec ~/Desktop/my_new_test_spec.rb`.
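And if you want git to drive the whole thing, "git bisect run" will happily take that same command, since rspec exits non-zero when the spec fails:

```
git bisect run rspec ~/Desktop/my_new_test_spec.rb
```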
You generally create the test in a new file that won't be affected by checking out older code. If you need to change some part of your build system or test system, have the test be a patch that can be applied, run the build, run the tests, and check the results.
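A bisect-run script for that patch approach might look roughly like this (with "~/test.patch" and "make test" as stand-ins for whatever you actually use):

```
#!/bin/sh
# Hypothetical script for `git bisect run`: apply the test as a patch,
# build, run the tests, then undo the patch before the next step.
git apply ~/test.patch || exit 125                  # patch doesn't apply here: skip this commit
make || { git apply -R ~/test.patch; exit 125; }    # build failure: also skip
make test                                           # stand-in for your real test command
status=$?
git apply -R ~/test.patch                           # reverse the patch so the tree is clean again
exit $status
```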
As for other uncommitted changes: just stash those. "git stash" is a tool you should use very often, any time you have uncommitted work that you need to clear out before doing some other git operation.
Sounds challenging. If it's something that happened a lot, you could keep the test suite in a separate repo with API levels, such that tests that need newer APIs go in a higher-numbered folder (think database migrations in web frameworks), and create a simple run script.
The idea of bisect is that you only need to do log_2(n) steps to find a bug in "n" commits. Even if you have to do manual testing, it's still normally pretty fast, unless you have hundreds and hundreds of commits.
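(Quick sanity check on the arithmetic: 2^10 = 1024, so ~1,000 commits is about 10 steps, and 2^14 = 16,384 covers even ~15,000 commits in about 14.)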
You're not considering the time it takes to manually test.
* How much time does it take to do a clean build?
* How much time does it take to do a deploy?
* How much time does it take to reproduce the issue manually?
Add those together and multiply them by the number of commits bisect makes you check. It can add up.
Absolutely, and I'm not claiming that bisect is useful for debugging every regression. But it's saved me a LOT of time on a number of occasions, and shouldn't be dismissed out of hand just because it relies on manual testing. Even if going through 6 or 7 bisect revisions takes a few hours, it can sometimes be completely worth it.
No: you would cherry pick the change needed to bring in the test code, run the test and decide whether it is "good" or "bad". Then before running "git bisect good/bad" you would eliminate the cherry-picked change with "git reset --hard HEAD^".
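In other words, something like this at each step (with <test-commit> as a placeholder for the commit holding your test, and "make test" standing in for whatever demonstrates the bug):

```
# At each bisect step:
git cherry-pick <test-commit>    # bring the committed test into the revision under test
make test                        # stand-in for whatever demonstrates the bug
git reset --hard HEAD^           # drop the cherry-picked commit again
git bisect good                  # or `git bisect bad`, depending on the result
```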
One tip for the module configuration: kbuild has "make localyesconfig" which looks at your currently-loaded modules and converts them to CONFIG_FOO=Y builtins.
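For example (the LSMOD= form lets you point it at an lsmod dump from the affected machine, if I'm remembering the kbuild docs right):

```
lsmod > /tmp/lsmod.txt
make LSMOD=/tmp/lsmod.txt localyesconfig
```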
It's a pity that while the bisect did point in the general direction of where the bug was in 3.15, it didn't actually find the right bug (the one it found had already been fixed and a different one introduced that led to the same errant behavior).
With something as complex as the kernel, I also wonder about automatic bisect being led astray by unrelated changes. The bisect script might fail for a different reason and zero in on the wrong commit.
If I'm not mistaken, it didn't find the right patch series. The article says the function changed in 3.15 and the underlying reason for his issue was different and not part of that series.
Good article. It might be worth using "git bisect skip" if a build fails. I suspect (not sure, it might just halt) that at the moment a build failure would end up counted as "git bisect bad", which might point you at the wrong commit.
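For what it's worth, when you automate with "git bisect run", exit code 125 means "can't test this commit, skip it", so a small wrapper keeps build failures from being counted as bad (sketch; "make test" stands in for the real test):

```
#!/bin/sh
make || exit 125    # build failed: skip this commit instead of calling it bad
make test           # zero exit marks the commit good, non-zero marks it bad
```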
I downvoted for tone, not because ve's wrong. I'm reasonably confident that someone who can write this blog post, understands binary search. I wouldn't have downvoted a post that simply said, for example:
> Binary search over 15,000 commits would take about 14 tries to find the failed one.
although I don't think that's an important point to be making. (As erikb said, that's still a lot of commits, and that was only the first run.)
Building 14 kernels and manually checking whether they contain the bug sounds like a lot of work to me. In that regard I think his post might not be a reasonable criticism. I think criticism, and being wrong, is everybody's right, though. Therefore I wouldn't downvote a post for something like this, but others seem to disagree.
This is part of why the Linux kernel needs a kernel debugger. I've worked with other kernels where we can actually observe what is going on using tools (DTrace on Illumos), and problems like this become something that can be tackled directly, without having to go through all this nonsense.
That's the one. I'd be worried if there was more than one. Like you said, more power to them. Pretty cool it exists and honestly wouldn't mind trying it sometime.