One thing that seems difficult with bisect is this series of steps:
1. "Hey... this bug didn't use to happen. Where'd that get introduced?"
2. OK, this unit test will tell me when the bug is fixed.
3. Now let's automate bisect to tell me where this test first failed, even though I just wrote it.
How do you do that? I think as long as you DON'T commit your unit test, bisect will carry it along to every commit it checks out. But you'll have to make sure you commit all your other changes, or else they'll be carried along too: Conflict-city.
Also, if you do commit your unit test (a "bad" habit of mine), I have no idea how I'm supposed to work with this. I end up copying the unit test by hand to each commit it tries to test.
In summary, it seems like bisect was built with manual testing in mind. I know you can automate running a shell script, but I don't write my automated tests in shell script.
EDIT: Keep in mind, not all projects are interpreted. Some are compiled.
I've successfully used "git bisect" in situations in which, as part of every test, I had to "git stash pop" some local changes, then do the test, then remember to "git stash save" them before continuing to bisect. (Alternatively, "git apply" the stash, then "git reset --hard" to blow away the changes.) That's how you can handle situations where test code has to be added to the program to demonstrate the problem, and that test code doesn't exist in old versions.
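Roughly, each step looked something like this (sketch; a hypothetical "./run_test.sh" stands in for whatever actually demonstrates the problem):

```
# With the new test code sitting in the stash, each bisect step is:
git stash pop                # bring the uncommitted test code into the working tree
./run_test.sh                # hypothetical test driver; use your real test command
result=$?
git stash                    # re-stash the test code so the tree is clean again
if [ "$result" -eq 0 ]; then
    git bisect good
else
    git bisect bad
fi
```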
Yeah, not a bad method. And I suppose if I've already committed my unit test, I could use a "git reset --soft" to un-commit it and then stash it. Next time I use bisect, I'll try this. Thanks
The whole idea of bisection is that you have an objective behavior that has changed. If the test (which is, by definition, the tool used to measure that objective behavior) is not constant across the range of changes, then you simply don't have a problem that can be solved by bisection. You're breaking rule one ("This bug" isn't a single behavior).
Obviously in practice you'd probably just use whatever the most recent version of the test is.
Really I think your problem is that you're overspecifying rule 2. It's not about identifying a "unit test" per se, it's about identifying any specific process to detect the bug. If the tests in your tree work, great. If not, you may have to do a little development to make one that is automatable.
I think you misunderstood. The test is brand new. As in, you find a problem that nobody has identified before and want to see how/when it was introduced.
The easy way to do this is to put your test external to the tree. Then bisect as normal and run your test.
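Something like this, say, with a hypothetical "~/repro.sh" living outside the repo:

```
git bisect start
git bisect bad HEAD           # current revision shows the bug
git bisect good v1.0          # hypothetical last-known-good tag
git bisect run ~/repro.sh     # exit 0 = good, 125 = skip, other non-zero = bad
```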
This is probably not uncommon for kernel tests, as you often have a test which could be "does this random laptop resume from sleep correctly?" Not really easy to automate that. And doesn't live in your tree.
Right, my point was more that often everything will be working great on the main developer's laptop. I did not think it was uncommon to have someone come out of the weeds with a specific laptop that doesn't work. Or worse, it works if you sleep using the "sleep" button, but not if you close the lid.
So, yes, you may be able to automate it later, once you fully understand the specifics that cause it not to wake up. Possibly it's specific to how it went to sleep with the lid closed. Until then, it's nice to be able to do ad hoc tests when necessary.
In the past, I've committed the unit test like you, branched, rebased the unit test back to some appropriate point in time, and then run git bisect. This avoids applying/resetting at each step, and allows for some trickiness like changing your test to deal with an interface that mutates over the range of commits you're testing. The hash of the errant commit will be different after rebasing of course, but it's easy enough to match it up with the one in the original branch.
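Roughly (sketch; <known-good-commit> is a placeholder for wherever you rebase the test back to, and "./run_test.sh" stands in for the real test command):

```
git checkout -b bisect-with-test           # work on a throwaway branch
git rebase -i <known-good-commit>          # interactively move the test commit to the front
git bisect start HEAD <known-good-commit>  # bad = HEAD, good = the old point
git bisect run ./run_test.sh               # hypothetical test driver
```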
I wish there were a guide or some official "this is how to manage your git repository along with your tests so bisect will work wonders" - IIRC, bisect gets tripped up by commits that won't build and also some merges... would be nice to know what exactly to do about those once they're in your history.
In cases like this I usually copy the relevant test file to somewhere outside the repo, and refer to that path directly in the bisect cmd. That way you don't have to worry about anything conflicting with the new test that you need for the bisect.
E.g. for a Rails app, I might end up using a command like `rspec ~/Desktop/my_new_test_spec.rb`.
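And if you want git to drive the whole thing, "git bisect run" will happily take that same command, since rspec exits non-zero when the spec fails:

```
git bisect run rspec ~/Desktop/my_new_test_spec.rb
```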
You generally create the test in a new file that won't be affected by checking out older code. If you need to change some part of your build system or test system, have the test be a patch that can be applied, run the build, run the tests, and check the results.
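A bisect-run script for that patch approach might look roughly like this (with "~/test.patch" and "make test" as stand-ins for whatever you actually use):

```
#!/bin/sh
# Hypothetical script for `git bisect run`: apply the test as a patch,
# build, run the tests, then undo the patch before the next step.
git apply ~/test.patch || exit 125                  # patch doesn't apply here: skip this commit
make || { git apply -R ~/test.patch; exit 125; }    # build failure: also skip
make test                                           # stand-in for your real test command
status=$?
git apply -R ~/test.patch                           # reverse the patch so the tree is clean again
exit $status
```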
As for other uncommitted changes: just stash those. "git stash" is a tool you should use very often, any time you have uncommitted work that you need to clear out before doing some other git operation.
Sounds challenging. If it's something that happened a lot, you could keep the test suite in a separate repo with API levels, such that tests that need newer APIs go in a higher-numbered folder (think database migrations in web frameworks), and create a simple run script.
The idea of bisect is that you only need to do log_2(n) steps to find a bug in "n" commits. Even if you have to do manual testing, it's still normally pretty fast, unless you have hundreds and hundreds of commits.
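(Quick sanity check on the arithmetic: 2^10 = 1024, so ~1,000 commits is about 10 steps, and 2^14 = 16,384 covers even ~15,000 commits in about 14.)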
You're not considering the time it takes to manually test.
* How much time does it take to do a clean build?
* How much time does it take to do a deploy?
* How much time does it take to reproduce the issue manually?
Add those together and multiply them by the number of commits bisect makes you check. It can add up.
Absolutely, and I'm not claiming that bisect is useful for debugging every regression. But it's saved me a LOT of time on a number of occasions, and shouldn't be dismissed out of hand just because it relies on manual testing. Even if going through 6 or 7 bisect revisions takes a few hours, it can sometimes be completely worth it.
No: you would cherry pick the change needed to bring in the test code, run the test and decide whether it is "good" or "bad". Then before running "git bisect good/bad" you would eliminate the cherry-picked change with "git reset --hard HEAD^".
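In other words, something like this at each step (with <test-commit> as a placeholder for the commit holding your test, and "make test" standing in for whatever demonstrates the bug):

```
# At each bisect step:
git cherry-pick <test-commit>    # bring the committed test into the revision under test
make test                        # stand-in for whatever demonstrates the bug
git reset --hard HEAD^           # drop the cherry-picked commit again
git bisect good                  # or `git bisect bad`, depending on the result
```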
One tip for the module configuration: kbuild has "make localyesconfig" which looks at your currently-loaded modules and converts them to CONFIG_FOO=Y builtins.
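For example (the LSMOD= form lets you point it at an lsmod dump from the affected machine, if I'm remembering the kbuild docs right):

```
lsmod > /tmp/lsmod.txt
make LSMOD=/tmp/lsmod.txt localyesconfig
```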
It's a pity that while the bisect did point in the general direction of where the bug was in 3.15, it didn't actually find the right bug (the one it found had already been fixed and a different one introduced that led to the same errant behavior).
With something as complex as the kernel, I also wonder about automatic bisect being led astray by unrelated changes. The bisect script might fail for a different reason and zero in on the wrong commit.
If I'm not mistaken, it didn't find the right patch series. The article says the function changed in 3.15 and the underlying reason for his issue was different and not part of that series.
Good article. It might be worth using "git bisect skip" if a build fails. I suspect (not sure, it might just halt) that at the moment a build failure would end up counted as "git bisect bad", which might point you at the wrong commit.
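For what it's worth, when you automate with "git bisect run", exit code 125 means "can't test this commit, skip it", so a small wrapper keeps build failures from being counted as bad (sketch; "make test" stands in for the real test):

```
#!/bin/sh
make || exit 125    # build failed: skip this commit instead of calling it bad
make test           # zero exit marks the commit good, non-zero marks it bad
```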
I downvoted for tone, not because ve's wrong. I'm reasonably confident that someone who can write this blog post, understands binary search. I wouldn't have downvoted a post that simply said, for example:
> Binary search over 15,000 commits would take about 14 tries to find the failed one.
although I don't think that's an important point to be making. (As erikb said, that's still a lot of commits, and that was only the first run.)
Building 14 kernels and manually checking whether they contain the bug sounds like a lot of work to me. In that regard I think his post might not be a reasonable criticism. I think criticism, and being wrong, is everybody's right, though. Therefore I wouldn't downvote a post for something like this, but others seem to disagree.
This is part of why the Linux kernel needs a kernel debugger. I've worked with other kernels where we can actually observe what is going on using tools (DTrace on Illumos), and problems like this become something that can be tackled directly, without having to go through all this nonsense.
That's the one. I'd be worried if there was more than one. Like you said, more power to them. Pretty cool it exists and honestly wouldn't mind trying it sometime.