
I'm not sure it's a crisis. Nobody who understands research expects every study to replicate, so what is a reasonable standard? 40% having weaker effects isn't a surprise to me, a complete amateur, for whatever that's worth. The point of research is to push into novel territories of knowledge; like new tech at startups, that's necessarily going to include many failures.

If it's a critique of the scientific method(s), I'm all for very serious work on improving it. But I think the question is, how? Does someone know something that works better that they are holding back from us? Has any method in the history of humanity produced better results (or made more sense)? Should all science stop until it's perfect? What should we do going forward?

EDIT: I meant to add: Maybe social sciences are just really hard. They are a bit less deterministic than many aspects of physics, for example, where gravity is so predictable that a small perturbation thousands of light years away can tell us what's happening there.



>Nobody who understands research expects every study to replicate

I guess I don't understand research then, because it seems reasonable to me to expect most studies to replicate. Why not?


Well, for one, because the standard for publication in many fields is “there’s a less-than-5% chance that we observed these effects because of coincidence”.

Combine that with people not publishing negative-results studies, and all it takes is someone doing 20 studies (or worse, 20 analyses on the same data set) in order to find that 1-in-20 chance occurring...by chance.
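To put a rough number on the "20 analyses" case: assuming the tests were fully independent (analyses of the same data set are correlated, so this is only an upper-end sketch with my own numbers, not anything from the article), the chance of at least one spurious p < 0.05 is already well over half:

    # Back-of-the-envelope sketch: 20 independent tests at alpha = 0.05,
    # chance that at least one comes up "significant" with no real effect.
    alpha = 0.05
    n_tests = 20
    p_any_false_positive = 1 - (1 - alpha) ** n_tests
    print(round(p_any_false_positive, 2))  # ~0.64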

Of course, this has little to do with research itself and much more to do with the standards researchers hold each other to.


While you're right about the 5%-or-less chance, it's still a 5% chance we're aiming for. If we end up with 40% over a large number of papers, something went wrong.

Not publishing the negative results shouldn't affect this number. They're not included in the 5% chance in the first place.


This is a common misunderstanding of p-values.

There is a (less than) 5% chance of observing an effect at least this large, if the effect is not actually there. That's a statement about the data assuming no effect; it says nothing directly about the chance that a published positive result is real.

I think the difference is best illustrated by an example:

If 20 researchers investigate a hypothesis that turns out to be false, on average 1 of them will report evidence for it, and that result will likely not replicate. Thus, if for every true hypothesis investigated, 20 false hypotheses are investigated, and the 19 researchers who found nothing do not report their results, then about half of the reported positive results will not replicate.
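To check that arithmetic, here's a quick toy simulation (my own sketch, with assumed numbers: a base rate of 1 true hypothesis per 20 false ones, a 5% false-positive rate, perfect power on true effects, and negative results never published):

    # Toy simulation (illustrative assumptions, not data from the thread):
    # what fraction of published positive results are false when only
    # positive results get published?
    import random

    random.seed(0)
    n_hypotheses = 100_000
    true_rate = 1 / 21   # assumption: 1 true hypothesis per 20 false ones
    alpha = 0.05         # chance a false hypothesis yields a "positive" result
    power = 1.0          # assumption: true effects are always detected

    published_true = published_false = 0
    for _ in range(n_hypotheses):
        is_true = random.random() < true_rate
        detected = random.random() < (power if is_true else alpha)
        if detected:  # negative results are never published
            if is_true:
                published_true += 1
            else:
                published_false += 1

    share_false = published_false / (published_true + published_false)
    print(round(share_false, 2))  # ~0.5: about half of published positives

Dropping the perfect-power assumption only makes the published record look worse, since fewer true positives make it through while the false positives keep coming.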


That's a circular argument with respect to what the parent asked, though.

"Research is OK to have ac40% non-replication rate" because "the standard for publication in many fields is low"


What is not being considered here at all is that different sciences are different.

First, you have purely theoretical sciences: "math" (even though quite a bit of this work is actually not in math, but in physics, economics, philosophy or computer science; there are others). Obviously these results aren't just replicate-able, they would never get published if they weren't. Furthermore, replicate-ability is absolute, because the work is theoretical.

Then you have positivist sciences, essentially physics and chemistry. These are sciences where you can actually experiment, and therefore you can set tight bounds (like the 5- or 6-sigma bounds in physics). Replicate-ability is not absolutely guaranteed like in maths, but it's going to work.

Then you have statistical sciences, like medicine, climate science, and experimental economics, where due to practical limitations (you need actual people who might die as a result of the study, we don't have many test planets, ...) the number of samples is extremely limited, and in general far more limited than the complexity of the system would require according to actual statistics.

So replicate-ability is going to take a further BIG hit.

And then you have the sciences where we are studying the existence of a phenomenon in the first place: biology, social sciences, psychology, political sciences, language studies, parts of experimental economics. In these sciences something gets studied because it exists. We study Siberian tigers because they exist. We read Hamlet because it exists. Replication? What do you even mean? Replicate-ability is essentially zero, and everything is just based on judgement of famous characters.

Add to that that quite a few of these sciences (social sciences, political sciences, English) are up-front biased. They start from the point that they want to find/push X. This can be a value system, a way of thinking, or even spelling or grammar. Needless to say, these studies can be criticized from a neutral perspective.

But critically, thinking about it, you will realize: it would not make sense for social studies or English studies to be neutral.

The further you move down the line, the weaker the field's demands on replication become, and the more problems replicate-ability has.

And yes, by the time you hit social studies, the norm is far past the point where a physicist would be fired for scientific fraud.

That does not make social sciences useless, it just means that people's evaluation of scientific results needs to go deeper than "oh scientist said X" (or even "consensus is Y").


> something gets studied because it exists

That's an interesting way to look at it

> Replicate-ability is essentially zero, and everything is just based on judgement of famous characters.

That doesn't fit what we see even in the linked article and the older story about psychology replication: All these experiments had methods that could be repeated (that's what the replication project people did), the effects of almost all the experiments were replicated, and for about half the research the effects were as strong as before. That's a long way from methods that are "just ... judgement of famous characters" and "essentially zero" replication.

> quite a few of these sciences (social sciences, political sciences, English) are up-front biased. They start from the point that they want to find/push X

What is that based on? Is there someone in those fields that has written about it? Have you done any work in them?

Having some minor but direct familiarity with these fields, though from an outside perspective, I don't agree. Political science, for example, aims to identify political phenomena and how they function. Every experiment in any science starts with a hypothesis, the theory the experimenter sets out to prove, and generally experimenters have a career based on promoting a certain model or idea. That is true of mathematicians and chemists too.

Also, I wouldn't group English, one of the humanities, with the social sciences.


The entire premise behind so-called scientific experimentation is reproducibility. The demonstration of consistent causes and effects is what allowed humans to move past superstition.

This is why economics, psychology, etc. are called "soft" sciences. This absolutely is a crisis, because by definition ALL of these experiments are supposed to be replicable and reproduced before they can scientifically be taken as truths of any certainty, and we're seeing that many modern fields, particularly medicine, are operating based on potentially dangerously misleading experimental results.

And, since this is HN, possibly even maliciously misleading.


The crisis, then, is that people take results as truth before they've been replicated.

The more reasonable expectation is not that the results will be successfully replicated, but that the experiment can be replicated. In other words: That there is enough detail that someone else can try to confirm the results.

The problem is we've come to rely on first publication of results far too much. A first publication of a result should be seen as "here's an interesting result; someone please confirm." Not as "here are some new facts for you."


I’m no academic but I’ve read a few papers specifically because I wanted to implement what they talk about. None have been useful for this purpose. I don’t know how you can possibly replicate them if there is insufficient detail on the “apparatus”.


Or maybe human psychology is too complicated, with too many factors influencing results, to be easily reproducible. The crisis is then more in the expectations we have, and in initial small-scale experiments being treated as "truth" and expected to work across cultures, social groups and social situations. An apple always falls down, but how people respond is sensitive to a lot of context.

Some of the crisis is fraud or shoddy science (the prison experiment). Some of it is the simplistic expectation that since experiments in basic physics are simple, psychology and sociology should be equally simple.

Lastly, these sciences are called soft because they use less math and deal with fuzzier issues.


I think the critique and crisis stem more from research that presents optimistic or potentially embellished results than from failed experiments or experiments with small sample sizes. There seems to be pressure to generate "novel" research that presents profound results rather than admitting an experiment yielded nothing. Admitting that nothing came from a study would prevent other people from wasting hours, days and years trying to replicate or build on something that isn't true or doesn't seem to work.


A lot of resources are poured into science and what science recommends (sometimes). Given the stakes, I’d prefer a track record that’s more than slightly better than flipping a coin.


> I’d prefer a track record that’s more than slightly better ...

So would I, but do you know of a better alternative? What if this is the best option we have for now?

> slightly better than flipping a coin

Also, to be clear, that doesn't accurately depict the results of the replication study. Almost every replication produced the same effects as the original experiment; in about half the replications, the effects were weaker. It's not that half the experiments were wrong; they were less conclusive.


Ah, that is better at least.


> Maybe social sciences are just really hard.

More likely is that they're not actually science.


Sounds snarky, but perhaps this is true, in that economics cannot be science in the Popperian sense, since our experiments are usually too noisy to reasonably falsify any hypothesis.

Still, I don’t think we’ve come upon a better method of knowledge acquisition in this realm.


I think there are profound limits to what can be understood from the application of natural scientific methods to the social world, because it's an open and non-linear system in which the units - people, groups - reflect on and learn from their experience. Also, while in the natural world you can easily create operational codings of variables because they follow deductively from proven theories (e.g. temperature), you cannot do that for the social world. There is no single and authoritative concept of 'power', for instance.

All that said, there is no one scientific model. Evolutionary biology has never pretended to be anything like physics. It's an open and non-linear system. It cannot make predictions, and nearly every generalisation it makes is either relatively trivial or has a great many exceptions. For the most part, it's a historical science.



