In my other life, I submitted a paper this week. It's not a bad paper - it shows something new, but like too many papers being published today, it's incremental and generally forgettable. It's not something that will be read much in 10 years.
I love reading old papers. They are from a time when authors were under less pressure to produce by volume. They are consequently more theoretical, thoughtful and broad than most papers published today, because the authors had the luxury of time to sit with their results and place them in context.
As I've pointed out earlier, the competitive academic environment tends to foster bias in publications: when trying to distinguish oneself amongst the fray of other researchers, one looks for sexy and surprising results. So do the journals, which want to publish things that will get cited the most. And so do media outlets, vying for your attention.
Jonah Lehrer's new piece on the "decline effect" in the New Yorker almost gets it right. The decline effect, according to Lehrer, is the phenomenon of a scientific finding's effect size decreasing over time. Lehrer dances around the statistical explanations of the effect (regression to the mean, publication bias, selective reporting and significance fishing), and seems all-too-willing to dismiss these over a more "magical" and "handwave-y" explanation:
"This is largely because scientific research will always be shadowed by a force that can’t be curbed, only contained: sheer randomness"
But randomness (along with the sheer number of experiments being done) is the underlying basis of the other effects he wrote about and dismissed. The large number of scientists we have doing an even larger number of experiments is not unlike the proverbial monkeys randomly plunking keys on a typewriter. Eventually, some of these monkeys will produce some "interesting" results: "to be or not to be" or "Alas, poor Yorick!" However, it is unlikely that the same monkeys will produce similar astounding results in the future.
Like all analogies, this one is imperfect: I am not implying that scientists are merely shuffling through statistical randomness. What I am saying is that given publication standards favoring large, new, interesting and surprising results, any experiment meeting those standards is very likely an outlier, and its effect size will regress to the mean. This cuts two ways: although some large effects will get smaller, some experiments that were shelved for having small effects would probably show larger effect sizes if repeated in the future.
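This selection effect is easy to see in a toy simulation (the numbers here are my own made-up parameters, not from any real study): give thousands of labs noisy estimates of the same modest true effect, let journals publish only the estimates that clear a "large and surprising" threshold, then replicate the published studies without any filter.

```python
import random

random.seed(1)
TRUE_EFFECT = 0.2   # the real, modest effect size
NOISE_SD = 0.5      # sampling noise in each lab's estimate
THRESHOLD = 0.8     # only "sexy" results this large get published
N_LABS = 100_000

# Every lab measures the same true effect, plus noise; journals keep
# only the estimates that exceed the publication threshold.
published = [e for e in
             (random.gauss(TRUE_EFFECT, NOISE_SD) for _ in range(N_LABS))
             if e > THRESHOLD]

# Each published result is replicated once: same noise, but no filter.
replications = [random.gauss(TRUE_EFFECT, NOISE_SD) for _ in published]

print(f"published mean effect:   {sum(published) / len(published):.2f}")
print(f"replication mean effect: {sum(replications) / len(replications):.2f}")
```

The published literature reports an effect several times larger than the truth, while the unfiltered replications land near 0.2 on average: the "decline" is built into the selection, with no magic required.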
This gets us back to my penchant for old papers. With more time, a researcher could run several replications of a study and find the parameters under which the effects could be elicited. And these papers often predate null-hypothesis significance testing, so the effects tend to be larger: they needed to be visually obvious from a graph. (A colleague once called this the JFO test, for "just f-ing obvious." It's a good standard.) That standard guards against many of the statistical sins outlined by John Ioannidis.
This is also why advances in bibliometrics are going to be key for shaping science in the future. If we can formalize what makes a paper, and a scientist's body of work, "good", then (hopefully) we can go about doing good science, rather than voluminous science.