Who is at fault when people fake scientific research?
Presently, the best answer would be ‘any person who was found to be directly responsible for the specific fraudulent acts’.
That determination is made by a long, unreliable, and usually secret process, typically carried out by a university. And it’s obvious that this doesn’t work.
One alternative is to make everyone involved at fault.
Yes, all of them.
Permit me to explain why, through the medium of cats.
Cats And How They Can Eat You
I love cats.
I have always loved cats.
I got a cat when I was five. I have had at least one cat for 35 years.
About the same time, when my parents made the decision to sponsor a zoo animal, I was very upset that they chose the crimson rosella, a pretty but not even slightly endangered Australian bird. I had just discovered caracals.
I have volunteered at a cat shelter for the last ten years.
My life plan is to make a bag, then cross-breed domestic cats until they are 25-30lbs, and capable of taking out low-flying aircraft. This is not a serious plan, except that it absolutely is.
I am just happier with a small bag of violence in the room.
Given this, and a substantially more Floridian upbringing, I am exactly the kind of silly gonad who would try to rescue a cougar, keep a cheetah in my house, or have a leopard in the back yard. I find predators sufficiently compelling that I could probably be tempted into doing something monumentally stupid with them.
So, let’s say I had a cougar in my backyard.
Because there is a world where that is possible, and it is this one.
(It is perfectly crazy that there are places where this is legal, and even crazier that you can own primates (which are far more dangerous). Every time a chimp bites off someone’s face, I’m amazed they didn’t do it sooner.)
If you were to break into my yard to try to steal my barbecue, Clarabelle (and she would have a name like that, maybe Nancy, or Shiva) could very straightforwardly wound, maim, or kill you. Papa didn’t raise no coward.
Laws are different everywhere. If you are on my property, you are likely guilty of criminal trespass. But within most instantiations of civil law, strict liability applies to me keeping 150lb of murder in my yard. This means if you cross my fence in a balaclava, with a pocket full of lockpicks, and a big cartoon bag with SWAG written on it, when Esmerelda takes your arm off at the elbow I am still responsible.
This applies equally. If the yard is full of dummy training arms filled with boar meat, and I own mycougareatspeople.com, and I have catfished the burglar on NextDoor by complaining about my lax backyard security and posted my address, I am responsible. If there is a 20ft fence around the yard, and then a giant breed-appropriate cage, with a real lock on the outside, and a big sign in three languages that says DO NOT ENTER! LIBRA THE COUGAR WILL EAT YOU - then I am responsible.
Strict liability is a legal doctrine that applies in a wide variety of crimes and torts. It holds responsibility entirely separate to negligence, recklessness, or intent.
If my cougar chews part of you off, it’s on me, no matter what. As a legal principle it makes sense, as there is no way to make a cougar truly safe. If she was a regular dog, the legal mechanics would change, but a cougar is a huge fist filled with knives. If I wish to own it, I have to accept an expanded concept of responsibility for it.
Strict liability is a well-understood legal principle, and pops up in a number of places - defective products that harm people, statutory rape, drug possession, ultrahazardous activities (like storing bulk explosives in your leaky garden shed), and, of course, athletic drug testing.
This can be rolled back, but usually only if you can prove that (a) you didn’t take a banned substance knowingly, and (b) the positive test is not indicative of any performance benefit. Basically “Did you ‘take a contaminated supplement’ Mr. Armstrong? Well, prove it or you’re guilty.”
Anyway.
Should we also apply strict liability to cases of scientific research fraud?
It Wasn’t Me, Yer Honour
Here’s how it would work:
You are part of a team which publishes a scientific paper
This paper is investigated for anomalies, usually by an outside party
The anomalies are investigated locally at the institution where the first and/or senior author works
The work is deemed fabricated, falsified, or plagiarized
Everyone is responsible, and everyone is penalised
That’s it.
I fully expect this to horrify people, but we’ll get to that.
This Is So Much Easier
The assignment of responsibility for such a scenario is what makes the investigations so long, what ties everyone in knots, and what underlies the secrecy and tomfuckery of a university investigation. We feel compelled to assign culpability - the data is fake, but who faked it?
The level of responsibility, of course, varies greatly. If you are an author on a paper, it is entirely possible that you are hoodwinked by your co-authors, that you are essentially manipulated into a position where you append your name to a bad paper. It is also possible that you faked all the data yourself, and took advantage of everyone else.
At the end of an investigation, an investigatory committee may scratch about amateurishly through a few years of your old emails, then decide your culpability is somewhere between 0% and 100% inclusive.
I have previously mentioned that these investigations are typically slow, unnecessarily secretive, rule-bound, incompetent, cynical, and useless. Because they are.
However, this should not kill our sympathy for the investigators - they are essentially doing a complicated service task, on top of their regular jobs, with no training, no compensation, and under pressure that ranges between substantial and tremendous. This is an awful job. Then, we massively compound its complexity by confronting them with shit academic record keeping, cases which are sometimes years or decades old, proprietary file formats, complicated scientific or data issues, unreliable or unavailable witnesses, and incredibly generous internal deadlines which kill momentum - and then we essentially ask them to read people’s minds. Even with good evidence to inspect, this is potentially a quagmire.
And when it is concluded, we ask them to make a determination without a well-grounded or formal system of culpability and assign a consequence.
Of course they weasel out of it and slap wrists! They wish for the process to stop, for everyone to get on with their lives. They find a penalty that will probably make the perp think ‘ooh, I got a good deal’, that’s sufficiently harsh-sounding that the university thinks ‘ooh, I can claim my fanciful ideals still apply and shut this shit down because it’s bad for business’, and that gets them the hell off the endless committees.
Strict liability circumvents this parade of milquetoastery. The determination that is made about a paper’s accuracy is empirical, and confined to ‘is there an error in process?’
This decision is usually much more straightforward than determining why the problems exist. Is it busted? Yes. In the forensic metascience community, we can often make this determination without access to the data or materials behind a paper. Open data often makes this easier. In a formal investigation, wielding the power to compel access to the necessary internal records, it is easier still.
Making who is responsible the crux of the investigation is so very much harder. As an example, I would immediately point to CUNY’s investigation of Hoau-Yan Wang, which has to be seen to be believed.
The committee has found evidence highly suggestive of deliberate scientific misconduct by Dr. Wang for 14 of the 31 allegations. However, we were unable to objectively assess the merit of the allegations due to the failure of Dr. Wang to provide underlying, original data or research records and the low quality of the published images that had to be examined in their place. For the majority of the publications identified in the allegations, we recommend that editorial action be taken to demand verifiable original data and determine, based on these, if the misconduct described in allegations has indeed taken place.
Finally, our investigation has revealed long-standing and egregious misconduct in data management and record keeping by Dr. Wang. It appears likely that no primary data and no research notebooks pertaining to the 31 allegations exist. It is for this reason that it was not possible for this committee to objectively determine how figures were created from the experiments described in the publications cited in the allegations. Dr. Wang has therefore failed to provide the data and research records necessary for the committee to directly address the concerns surrounding the published work under identified in the allegations. Thus, the integrity of Dr. Wang’s work remains highly questionable.
The full report is worth reading, because it is liberally saturated in the same impotent academic anger leaking through the paragraph above.
Dr. Wang was accused of some very serious and repeated charges of academic misconduct, but he managed to string investigators along for about a year and a half, frustrating them at every turn - when they asked questions about data, he had lost the files, or had cleaned them out ‘because COVID’, or had set them on fire, or had lent them to his second cousin, or turned them into paper planes, or literally anything else except actually allow an investigation into the work he did or admit culpability.
This led to the investigators becoming deeply frustrated and epically pissed off. Because they were always interested in reconstructing the process of why.
One reason that bad researchers pursue this infinitely annoying strategy is that you can usually downgrade a charge of intentional research manipulation to one of bumbling incompetence by just not playing along. It is actually self-protective to be a bad scientist, or even to cosplay as one - the investigators end up accusing you of that instead.
People who are rolled up for research integrity issues may be dishonest and mean, but that does not make them stupid. They will delay, lie, obfuscate, time-waste, dither, deliberately misunderstand, and bumble.
And the more they see that tactic being successful in the wider world, downgrading perpetrators from devil to merely fool, the more it will be tried.
Basically, we’re getting played.
It is so, so much easier to not engage in this pantomime.
The Public Service Is The Reliability Of The Research
There are two broad reasons to investigate a serious research integrity problem.
(1) so the scientific record can be updated (and the update needed ranges variously between ‘of no consequence whatsoever’ to ‘massive and globally-relevant’)
(2) to meet the internal requirements of the local institution, establishing errors in process that need to be corrected, punishments, mitigation.
Basically, (1) is the problem as it exists for other people, and (2) as it exists for a local institution.
Obviously, it does not matter to scientists who merely wish to cite or read a paper why it is not accurate enough to exist. It just shouldn’t. The entire public service, the entire ‘correction of the record’, can begin and end with the problem.
Reduces the mockery of authorship
For any given research integrity problem caused by a single author, there are usually innocent bystanders.
On one hand, it is rational to feel sorry for them. Papers can be too complicated for one author to review in entirety, remits can be quite strict (‘you are handling experiments 4 and 5, and that’s all’), etc.
On another, it is a little comical to feel it outlandish to insist on personal responsibility for something to which you make a strong personal contribution.
This tension is where we get the Quantum Author, who is simultaneously sufficiently important to be involved, a key member of the team, a necessary inclusion within the list of named authors…
… but if questioned in the aftermath of some terrible problem with the work, devolves into a peripheral bit-player who claims never to have read any of the words or have seen the data at all.
The Quantum Author ‘writes’, but has never written.
It would be very interesting to put some downwards pressure on gift authorship, the phenomenon of adding ‘non-contributing’ authors as a favour. This is an often bemoaned, demonstrably unethical, completely common, and almost undetectable grift. It is also self-generative. The more networks you have, and the more people under you, the more access you obtain to this sort of publication. It is yet another instantiation of the Matthew Effect.
But would you be inclined to so casually become involved in the ‘writing’ of a paper you never wrote if you were held responsible for its reliability? God forbid, in your authorship you may be forced to act like an author.
Higher standards
Academic scientists usually have a view of themselves as detail-driven empiricists. People who are serious. People who supply exhaustive detail.
This is not completely wrong, but can be far from true. Academics often supply detail relative to the task, and if the task is publishing, detail can be scant. There is no penalty for failing to establish full confidence in both the processes and outcomes of a set of experiments.
One look at the commercial world will let you know just how deficient academic culture can be in this regard. Pharma has a long history of being suspicious of academic hires because the people are sloppy, and their record keeping, data stewardship and respect for the repeatability of a process sucks. They can tolerate training new hires as working scientists, because they can be taught to use established processes, but have a strong tendency to hire ‘within industry’ for critical roles or any form of team management. The FDA, or any equivalent regulatory body, is much more exacting than peer review, and they play for table stakes!
Likewise, tech and biotech companies (and investors) have entrenched cynicisms around many research areas which are presented in the public eye as having ‘great promise’. They are often blanket skeptical of entire research groups, entire effects, or whole classes of experiments, nowhere moreso than when ‘famous’ professors become associated with them. Generally, they are shiny little boosters who have shown themselves to be dishonest. The public perception of a lot of ‘famous’ researchers is often the EXACT OPPOSITE when smart people are handing out VC, M&A, and strategic partnership money.
I do wonder if making authorship contain default responsibilities would improve that. ‘Making sure the paper isn’t made up and/or ridiculous’ should probably form part of any responsible due diligence.
We also need to bear in mind that substantial problems with published papers are often trouser-wettingly obvious, because they are so often detected by non-authors with a small amount of training - and usually ad hoc training at that, a combination of natural skepticism, good numerical abilities, and a few self-taught skills. How do we formalise this skepticism as a body of knowledge? Make people do it, especially authors.
Lower Consequences
(I have, in all of the above, ignored the elephant in the room. And in writing this, having started by annoying the defenders of the status quo, I might be capable of annoying the reformers of science as well. Never let it be said I’m not even handed.)
If you are an author on a paper that has retractable research integrity problems, and it is genuinely not your fault, it is completely fair to say that you suffer consequences in excess of your culpability.
So, as much as I am a card-carrying member of the hang-them-by-their-toes brigade, we might have to lower the punishments we consider. If we are to discover, investigate, and retract a lot more papers then we must inevitably hand out ‘reduced sentences’.
In fact, we may have to reconsider the fact there is a punitive model in the first place.
Perhaps it is enough to retract the work. It would only be a moderate amount of work to make an accessible, public, searchable database to identify any given researcher by their retractions. The easiest way for many of us would be simply having the ability to search it on PubMed. If we are playing a giant reputation game, maybe the actual metrics needed to form that reputation should be more accessible.
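The mechanics of such a database would be trivial. As a minimal sketch of the lookup side - assuming a hypothetical flat CSV export with `authors` and `doi` columns, which are my invention and not any real schema:

```python
import csv
from collections import defaultdict

def build_retraction_index(csv_lines):
    """Build an author -> retracted-DOI index from CSV text lines.

    Assumes columns named 'authors' (semicolon-separated) and 'doi';
    both names are illustrative, not a real database schema.
    """
    index = defaultdict(list)
    for row in csv.DictReader(csv_lines):
        for author in row["authors"].split(";"):
            # Normalise names so lookups are case- and whitespace-insensitive.
            index[author.strip().lower()].append(row["doi"])
    return index

def retractions_for(index, author_name):
    """Case-insensitive lookup: all retracted DOIs recorded for an author."""
    return index.get(author_name.strip().lower(), [])
```

The hard part is not the code, it is name disambiguation and getting institutions to feed the thing - which is exactly why surfacing it through something like PubMed would matter more than the database itself.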
What would a different punitive model look like? I am not sure, but I keep returning to a comparison to the legal system. 98% of criminal cases are plea-bargained in the US legal system. There is neither the time nor the willingness to prosecute them in jury trials.
Does this come with systemic biases? Does it deliver fairness? Does it kind of suck? Yes, no, and yes, respectively.
But it is also inevitable. There is only so much court time, only so many juries, so many resources. So, we only get so much redress for any individual infraction, but with the benefit of it being applied more broadly.
If we are going to retract more papers, editors cannot be drowned in an expanding and rapidly accruing pool of retractions, which is what we will get. The process of investigating a paper is a ship that is often not launched by universities because we know that its voyage will be an arseache. It is easier to simply leave the ship in dry dock, conduct an investigation that concludes any problems ‘do not rise to the level of research misconduct’, and then claim that the whole decision is protected because it is a ‘personnel matter’.
Why I Wrote This
This piece started after I read the recent (and entirely too hagiographic) piece on Dan Ariely in Business Insider.
Some choice quotes:
Duke finally completed its investigation in January, Ariely told Business Insider. He said the university concluded that data from the honesty-pledge paper had been falsified but found no evidence that Ariely used fake data knowingly. Duke said its policy is to keep such investigations — even their existence — confidential, declining to comment further. While Ariely has resumed his work at Duke, he said he couldn't comment on specific disciplinary actions he may be facing.
This is wrong. The honesty-pledge paper had fictitious data added to it according to a rule. It would only be FALSIFIED if it worked from real data which was misused or distorted. In this case, data was added according to a simple decision rule. It was FABRICATED.
And I’m sure you can see the echoes of everything you’ve just read in ‘found no evidence that Ariely used fake data knowingly’.
But Ariely says that professors are judged too harshly for past errors and that academic institutions are too afraid of mistakes.
But this is not a ‘mistake’. A mistake is an accident. This is a deliberate deception. It is the difference between hitting someone when you swerved to avoid an accident, and going full Carmageddon into the local mall. Actually, the only thing we know for certain is that we were deceived.
To use the word ‘mistake’, you have to mean ‘I made a mistake in trusting either an unknown fabricating party and/or the data they appended to me.’
He hasn't personally opened data files in two decades because of his disability, he told BI, holding his stiff fingers up to the laptop's camera for inspection.
This is my least favourite part of the whole piece, and I find it hard to believe that a responsible journalist wouldn’t follow up on it.
Plenty of researchers with disabilities do not use them as a crutch to avoid performing basic scrutiny on their own work. I’m sure that, in a 35-person lab, having someone else prepare basic data inspections, walkthroughs, graphs, and descriptive statistics is not only completely possible, but is also traditionally how busy managers in charge of workgroups inspect the quality of their lab’s output!
When I was a CSO, I would often have 30 or 60 minutes to do due diligence on data we were collecting between other meetings - so, we developed a toolkit to inspect the relevant time series we were collecting, set expectations of what a review of it would look like, then we would graph it all and stare at it, analyze it, poke and prod, and I never touched a keyboard. Someone a lot younger than me who was much better at Python would drive!
(This was also part of training them to understand the signals of data fidelity and ingrain good habits so they could learn to be independent!)
This does not make me a hero, and was not hard, as it is the absolute bare bones of basic diligence.
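For the avoidance of doubt about what ‘bare bones’ means here: the review pass I am describing needs nothing fancier than the sketch below. The summary fields and the flat-run check are generic data-fidelity signals I am using for illustration, not the actual toolkit we built:

```python
import statistics

def longest_flat_run(values):
    """Length of the longest run of identical consecutive values.

    Long flat runs are a classic red flag for copy-pasted rows or a
    stuck sensor in time-series data.
    """
    best = run = 1
    for prev, cur in zip(values, values[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def quick_look(values):
    """Bare-bones descriptive summary for a fast eyeball review."""
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "sd": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
        "longest_flat_run": longest_flat_run(values),
    }
```

Run that over every incoming series, graph it, and stare. Most of the value is in forcing the look to happen at all.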
Academia has also developed high-tech tools to root out misconduct. Any teacher or journalist can run a paper through one of the dozens of plagiarism detectors available online. And academics have developed statistical tools to pinpoint instances of data massaging, including practices that may have been standard a decade ago but are now frowned upon.
Again, inaccurate. Plagiarism detectors come in two forms: shit but free, and good but expensive. And statistical tools to pinpoint ‘data massaging’ are not somehow crystal balls that point to instances of academic cultural change, they generally point to data that IS IMPOSSIBLE. Trust me on that one, I invented some of them. It has never been within the academic ethos to publish data that isn’t possible.
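To make ‘impossible’ concrete: one family of these checks is granularity testing of means. A sum of N integer responses is itself an integer, so for a given N the mean can only take certain values; a reported mean that no integer sum can produce cannot have happened. A toy sketch of the idea - not the published implementation:

```python
def grim_consistent(reported_mean, n, decimals=2):
    """GRIM-style granularity check: can a mean of n integer responses,
    rounded to `decimals` places, actually equal the reported value?

    Toy sketch only; a real check must handle rounding conventions and
    composite scales.
    """
    target = round(reported_mean, decimals)
    nearest_sum = int(round(reported_mean * n))
    # Test the integer sums adjacent to mean * n; if none rounds back to
    # the reported mean, no integer-valued dataset of size n produces it.
    for total in (nearest_sum - 1, nearest_sum, nearest_sum + 1):
        if round(total / n, decimals) == target:
            return True
    return False
```

For example, a mean of 3.48 from 17 integer responses is arithmetically impossible - 59/17 rounds to 3.47 and 60/17 to 3.53, with nothing in between. No cultural norms, no judgement calls, just arithmetic.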
During a February meeting, one professor asked the university's academic council why it wouldn't publicly discuss the Ariely investigation to "reassure the Duke community — and indeed the world — that Duke takes academic fraud seriously."
This, at least, is a bit funny. Duke takes academic fraud seriously, that’s why they paid Uncle Sam 112.5 million dollars for overlooking it. It’s just their version of taking it seriously differs entirely to ours.
But, my individual gripes with the details aside, the overall futility of yet another one of these ‘investigations’ struck me harder than usual. It just feels as if, with no simple system of assigning responsibility, no-one will ever assume any.
A Conclusion
Academia presently has a system where you can ‘author’ multiple fraudulent studies, and if those studies waste enormous amounts of money or kill people, this is completely fine - as long as you can prove that investigators can’t definitively prove that you were directly responsible.
I realise that’s a convoluted sentence, but it’s important. If there is uncertainty in the provenance of research misconduct, it is harnessed. And, scientific record-keeping being what it is - and what it used to be, which is even worse - there is always uncertainty.
This incredibly high bar for culpability means that, effectively, there may be no consequences for the publication of dangerous, irresponsible, or fraudulent work. And, as there probably should be consequences, it feels entirely more ethical to make everyone involved directly responsible for what they contributed to.
And that brings me to my final questions: do you trust your co-authors? Under this new model, would you treat data from unknown sources differently? Would you refuse to accept ‘gift’ publications because you couldn’t give the work appropriate scrutiny?
You just might. And, just maybe, you should have been doing that all along.
James:
This all reminds me of the detective-story gimmick where a crime is committed by a pair of identical twins. The idea is that even if there's clear evidence that one of the twins did it, it's impossible to prove which one did it, and so neither can be convicted of the crime.
How would you avoid discouraging reporting of fraud by a co-author, as in the Pruitt cases? It is already hard to ask a junior researcher to investigate and ultimately retract their own paper. If this also opened them to penalties it sure wouldn't make things easier.
One could say that all of Pruitt's co-authors should have scrutinized his data much more suspiciously, and this is true. But one can also say that people will fail to do this from time to time, and under your scheme, if they catch this failure later they are disincentivized (assuming there are penalties beyond retraction) from speaking up about it.
You hint that maybe there are no penalties beyond retraction. I don't know that I could stomach that in cases like, say, Didier's, where lives were literally at stake. I'm not sure I could stomach it in general. It's an awfully serious crime, research fraud.