Monday, June 1, 2015

Scientific fraud doesn't make sense to me

Having just read about an alleged data manipulation scandal in political science in the USA, I wonder once more why people are committing scientific fraud. I get plagiarism in the humanities, although of course the risk of getting caught is growing ever greater considering the ease with which text searches can be conducted these days. Still, I realise why people might be tempted into doing it, especially if they are not pursuing a career in scholarship, just want a few extra letters to appear on their business card, and are unwilling or unable to make the necessary effort. Inventing or manipulating empirical data as a career scientist, however, I don't really understand.

To me there seem to be four factors involved in somebody's decision whether to commit fraud: (1) ethics, (2) the rewards of getting away with it, (3) the likelihood of getting caught, and (4) the consequences of getting caught. Honest people will, of course, not commit fraud, so let's limit our considerations to the dishonest minority. Likely there are people out there in pretty much every profession who will behave badly if they think they can get away with it and if the rewards are sufficiently high, even if they are vastly outnumbered by honest colleagues.

That leaves factors 2, 3 and 4. I will now make another simplifying assumption, but one that I consider to be very realistic: if somebody is found to have fabricated data, their career in science is pretty much over. They can forget about getting serious grants or about being hired by a serious institution. In other words, the consequences of getting caught are always catastrophic. Factor 4 is thus invariant across all possible scenarios, and what really matters are factors 2 and 3, the reward of not getting caught and the likelihood of getting caught. What we are searching for is a scenario in which the former is high and the latter is low, because that would provide the incentive to commit fraud.

The thing is that, at least to me, these two factors appear to be strongly positively correlated. On the one hand, somebody might fabricate a spectacular result in an extremely important research field, such as "newly discovered substance cures cancer". That would make the rewards of getting away with it high - fame, big research grants, perhaps a high profile job. But on the other hand, these being spectacular results in an important research field, many other people will by definition try to reproduce the results and to build on them. And then the whole house of cards comes crashing down. High potential reward means high risk, and in fact I would say near certainty, of discovery.

The alternative is to fabricate unspectacular results that people are unlikely to try and replicate, e.g. a technical report on the structure of a new substance that has no known uses. But then, what is the point of making stuff up if nobody is going to care? Low risk means low potential reward, but that also means that even a low likelihood of catastrophic consequences looms large over the minuscule benefits of cheating.

And this is what I don't get. Even assuming that somebody is crooked and has no qualms about fabricating data, what do they hope to get out of it? Why do they think they can get away with it? If there is any significant advantage to what they are doing compared with doing honest science they will be caught sooner or later, and then they will fall much further than they have ever risen.

And the weirdest thing is, to even get away with it in the short term they must be highly proficient at scientific writing and at understanding study designs, because otherwise they could not make up convincing-looking results and publish them. So why not just do honest science if they have that capability?

Or am I just incapable of figuring out their true motivation? Perhaps everybody who does that just hopes that nobody will catch them...


  1. This is something that also continues to puzzle me. Although I know there are times when a result seems just so close that filling in the blanks could be tempting, especially to those under pressure to perform, the elaborate lengths required and the near certainty of discovery in many cases seem prohibitive.
    However, maybe for some it genuinely is a calculated risk: it seems as though even IF you are found out, the penalties are not predictable or necessarily career-ending:

    Research has also looked at the estimated cost of identifying, investigating, and prosecuting scientific fraud, and it is not insignificant. I could see some researchers reasoning that even IF they are suspected, the cost to the institution of taking action may be prohibitive until its hand is somehow forced, and that they might get away with it.

    To me, the most pertinent question is whether incidences of fraud have increased, as anecdotal evidence seems to suggest, or whether there is an observation bias. If the rate really is increasing this is a worrying trend, so why? Is it due to competitiveness and the growing stakes associated with immediate success in science; a result of better tools to identify fraud; or just an increase in the number of cases in proportion with the increase in research and published work? I tend to lean towards the "pressure to produce" but have nothing to back this up, although there is a suggestion that multiple pressures, but especially the pressure to publish, are implicated:

    Finally, while fraud is reprehensible, and it is disturbing to think that unreliable data may be undermining our own research, the toll that exposure takes on the people involved and the way that these cases are handled often leave much to be desired. There have been several instances recently of not only the perpetrator, but even otherwise innocent associates taking their own lives after being faced with exposure.
    The joys of being human, eh?

    1. You are right to point out the complexity of the issue and that my simplifying assumption on the consequences may not be correct. Still, I think that the basic idea - the correlation between risk and reward - makes sense.

      At one end you have high reward but high risk because people will try to replicate, and at the other end I expect making up plausible-looking data to be precisely as labour intensive as doing genuine research.

      Vicki told me you are going to the Smithsonian. If that is so, congrats!

    2. Yes, I start in DC Jan 2016. I will also be visiting Canberra over Christmas and would like to bend your ear about the state of Compositae phylogenetics if you have some time.

    3. No problem, we are not travelling this time. Send me an eMail!