the stuff i work on is in the area of machine learning, so most published work involves one or more well-known data sets.
i would argue that the two are the same in this case.
the lxcs provide all the source code i write [plus, of course, the compiled version], all third-party libraries, all scripts used to run and evaluate the experiments, and, where licensing permits, the data as well.
it's still not perfect, but for my area, i honestly think it is the best and most accountable way to do things that i have seen.
well, the idea is that you should be able to run it on any data set you have and get good results relative to other solutions. but that is an open question with any research.
the point of the docker/lxc aspect is to provide a simple working environment to facilitate replication and validation.
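just to sketch what i mean: a docker recipe for one of these experiment environments might look roughly like the following. the base image, paths, and script names are placeholders rather than anything from a specific paper.

    # sketch of an experiment container; base image, paths, and
    # script names are hypothetical placeholders
    FROM python:3.11-slim

    # pin the exact third-party libraries so the dependency
    # versions travel with the paper
    COPY requirements.txt /exp/requirements.txt
    RUN pip install --no-cache-dir -r /exp/requirements.txt

    # source code, run/evaluation scripts, and (where licensing
    # permits) the data itself
    COPY src/ /exp/src/
    COPY scripts/ /exp/scripts/
    COPY data/ /exp/data/

    WORKDIR /exp

    # one entry point: `docker run <image>` re-runs the experiments
    # and produces the reported numbers
    CMD ["bash", "scripts/run_experiments.sh"]

anyone who wants to check the results then only needs docker installed, not my whole toolchain.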
so in comparison to the status quo, which is basically 'write a paper, include some high-level equations and the results', i think this is a step in a better direction.
I tend to worry that an error in the code will be baked into the theory for generations.
I don't deal with much scientific code myself, but at one point I worked with a proof-of-concept cryptographic library from a reasonably well-respected researcher. The code behaved correctly from the outside, but when I dug into it, it deviated wildly from the published specification.
Replicate the experiments, or just repeat the results?