Let’s do the ballistics quickly.
Claudine Gay was the President of Harvard until two days ago. She resigned under public pressure subsequent to her congressional testimony about antisemitism on US campuses a month ago, and a separate and distinct fuss about plagiarism in her work starting in mid-December, outlined here and here.
I have no insight into the issue of college antisemitism that anyone else would benefit from. But if we’re going to have a big plagiarism fight, well, let me get my brass knuckles.
Why? Because I am a forensic metascientist.
Under the ten-dollar words, that means I study scientific error by analyzing published scientific documents and data. As you might expect, studying errors crosses over into the area of research integrity, which I have thought about and published on consistently for about a decade.
Scientific screwups are really interesting, because they allow you to make inferences - sometimes rich inferences! - about the evidence-gathering process. Can I trust this research? Is it slipshod, badly analyzed, or just fake? If it’s fake, how was it invented? If it’s real-but-terrible, why?
Lots of other people do this work, too, and we form a loosely-affiliated global group of people who largely know each other, and largely aren’t paid to do it. No university has a Department of How Other People Got Things Wrong, or an Institute For Screwups. No government confers proper funding. We do it, in our own time and at the risk of people hating us, because it needs to be done.
The agenda for doing the work is very straightforward: it is to make science better, and not squander what remains of the Enlightenment on fools and bastards. It is to hold both the modern university and scientific publishing systems to account, which is always a good time as both prefer that their dirty laundry remain unwashed.
Importantly, there is no broader enmity towards any other group. Frankly, forensic metascientists do not have the resources to be specific. We would make terrible ideologues. It would require an order of magnitude more time, money, and people to start analyzing scientific output according to broad pre-defined criteria.
(For instance, if we were interested in investigating ‘Chinese academics working in the US’, like the Department of Justice was a few years ago.)
To do that, we would be starting from scratch for each individual paper or researcher. We do not have this luxury, and usually act on the basis of information received - that is, another scientist finds a problem in a published paper during the course of their own work, and brings it to us. Or, in the course of doing other things, we stumble over something that might be worth looking into with a variety of computational and statistical techniques.
(Which we also build. This is why it’s time-consuming.)
In other words, no-one is ‘targeted’, and the whole thing is completely stochastic. The work we take an interest in is rarely more defined than ‘whoever shows up’. Without institutional support, money, and broader scientific/public interest, that is the limit of what can be done.
The people whose work is analyzed, though, rarely see it that way. If you listen to them, they are always - predictably - the victim of trolls, wasters, and sociopaths who chose them specifically. I have a little list of the common reasons:
People from non-English speaking or less developed countries claim to be victimized because you’re using unnamed powers of cultural imperialism to make them the little guy. Never mind if they are a powerful tenured professor, and you are a graduate student, you are a mean neo-Colonialist.
Non-White people are victimized because racism. Never mind who the critics are, or what they believe.
Women are victimized because sexism, and this somehow applies to both male and female forensic metascientists.
Successful old white men, of course, are being victimized because you’re just jealous (in truth, this is the most common complaint, the idea that someone reading their research and proving it is compromised is somehow lesser). This one is interesting because the power imbalance is the other way around - rather than ‘these people are punching down’, the vibe is very much ‘these nasty little peasants are ganging up on me due to their inherent resentment’.
… and none of these people are above hoisting a flag around ‘mobbing’ or ‘internet trolls’, if a lot of researchers are simultaneously discussing a litany of their mistakes on the internet.
As you might have figured out by now, this is all horseshit. We have an overwhelming interest in bad science, vanishingly little in bad scientists. The latter make good newspaper stories, and the coverage is useful for drawing attention to the wider problems in scientific publishing, but to the work of doing science-about-science, the author of bad research is almost immaterial. I analyze most papers without googling the authors, because it doesn’t matter. Country, university, personal characteristics, I don’t care. You might as well tell me the author’s shoe size, it’s equally relevant to the quality of their work.
So what would happen if you threw this ethos out?
What if, instead of this disinterested version, we used research integrity tools as part of broader political point-scoring?
My thesis here is: it would suck, and it would have the potential to be very, very dangerous.
And this brings us neatly to the Claudine Gay debacle.
The Claudine Gay Debacle
Setting a Baseline
I have no love for Harvard as an institution.
This is a reasonably common opinion to hold in Boston, where the likelihood of meeting someone who has worked there in any given crowded bar approaches 1.
They are legendarily unpleasant to their academic staff: their usual MO is to hire a horde of pre-tenure assistant professors, crab-bucket as much work out of them as possible, and then shitcan most of them while expecting them to be grateful. Their justification for this is that you should be good and goddamn lucky they’d talk to you in the first place, and now you’ve worked for them, you can work anywhere.
(Actually, at that point, most researchers are happy to do so.)
Their administration is similarly unforgiving in other matters. Reportedly, they are even worse to their custodial staff, which is why they keep striking.
At the same time, they have many rusted-on faculty members who in a rational world would be wiping sticky fingerprints off the walls at a Starbucks. I will not name them, because what I really want to say about them is litigable. It’s easier to name research integrity cases where gigantic screwups happened, the ones you might have already heard of: Marc Hauser, Piero Anversa, Lee Rubin, Francesca Gino.
All of these cases involved a combination of institutional inaction and mismanagement.
[Yes, even in the recent Gino case. As much as I see massive, glaring problems in Gino’s work (and her coauthors, but that’s a story for another day) there are still legitimate questions to be asked about their handling of her case, and I strongly suspect that the university handled the zealous prosecution of her about as well as they handled the zealous minimization and hand-waving present in the other cases. Due process exists for everyone.]
They also get entirely too much attention in the debates about higher education and research culture more broadly, oxygen thieves in a high-nitrogen environment. They are hierarchical, exceptionalist, and arrogant.
Now, I hear the hooting loons of Twitter dot com ask, to this litany of sins… are they also woke?
My best answer is not really.
From my previous:
Find the most painfully progressive institution that comes to mind. The kind the American right likes to haul over the coals because they once put a puppy in a tent on exam day to cheer their students up, and called it Emotional Support Safe Space Therapy Dog, which is marketing, and not Look At The Puppy, It’ll Be Fun And It Only Cost $200, which is what it is.
Leaf through the background of their most senior management. You will find far more Marks and Spencer than Karl Marx.
This is why you need to be very discerning when people shit on about The Incredible Wokeosity Of The Modern University. ‘The university is a left-wing institution! good lord, all the diversities! the cancels!’
This is sometimes true - there are people being occasionally screamed off a podium - but (a) fewer than you think, and (b) in any immediate macro sense the phenomenon is a distraction at best. It is occurring at the same time these institutions are nearly 50 years into a program of brutal workforce casualisation and bean-counting, balanced against a carefully maintained structural oversupply of academics, and while universities are fighting staff industrial action to a standstill.
(If you want to read the whole thing, it’s here.)
This situation often results in a glaring contradiction for me personally.
In the newspapers and other content middens, you’d think these institutions were somewhere between a commune and the Zimmerwald Conference, where Lenin stood up and proclaimed that the central problem with the Great War wasn’t all the dead people, but that it wasn’t socialist enough.
But my knowledge of and dealings with large private universities who think they are God’s gift to knowledge are completely different. As a general experience, I’d say it combines the inefficiency and process-driven frustrations of government with the high tolerance for fuckery and sheer meanness of the modern corporation.
Now.
Yes, two things can be simultaneously true.
But I think it’s more true to say that a fancy modern university is a big neoliberal institution wearing a rainbow flag pin. And, like a lot of institutions, they are hyperconscious about their ‘brand’, and so they take their positioning on social and identity issues very seriously.
It’s silly, really. Harvard has a 50 billion dollar tax-advantaged endowment, is designed primarily to educate the Tim Nice-But-Dim children of its donors, and hoovers up whoever else tests well and might look good on the brochures.
All this is to say:
(a) I don’t have much of a dog in this hunt, and Harvard looking bad certainly doesn’t bother me. I don’t think they’re evil, and I know (and like!) a lot of people who work there, but on an institutional level I don’t feel the slightest affront if they get punched in the mouth.
(b) I think claims that universities are woke bastions of infinite woke-osity confuse their marketing and rhetorical games with their structural purpose, and it’s quite silly to make that claim about Harvard in particular.
The Allegations
When a story like this drops about someone who isn’t the President of Harvard, generally metascientists help each other out - you check my work, and later I’ll check yours. Often, we wall ourselves off from the precise nature of problems to see if someone else can reproduce the same problems.
… I’m pretty sure that didn’t happen in this case.
I’m not going to bury the lede here: these allegations are a shitshow, and as a professional accusation-maker, I am offended by their lack of professionalism.
To be clear, there are instances of what I read as plagiarism here, but not many. There are also examples which very clearly are not, and also some marginal examples. Throwing all the good, bad and indifferent examples together into a big bag presumably to be able to claim ‘50 COUNTS OF PLAGIARISM!’ seriously undermines the credibility of the accuser.
(Note: not the individual accusations. Those stand on their own merits.)
This is not how you bring a plagiarism charge if you want to be taken seriously. The evidence as it was provided is deliberately polemical, and it’s a ball-ache to have to take it seriously, which we do.
But we’ll get to what it all means later. Let’s start with the mechanics. I’ve chosen two cases each that (a) are silly, (b) I think misunderstand academic practice, and (c) are a lot less silly. I’m not doing all 40, I’d lose the will to live.
Some Silly Examples
This can only be described as wildly unproblematic, and falls under the heading of ‘there are only so many ways to describe the same phenomenon’. I’ve chosen it first because it forces us to start by defining the space between paraphrasing and patchwriting, and their relationship to plagiarism.
(And patchwriting, by far the most obscure technical term of the three, is quite relevant to the public discussion around this mess.)
All definitions in this area are quite clear, it’s just their application that sucks. Paraphrasing is expressing an idea written or said elsewhere in what is clearly your own voice. Plagiarism is the duplication of material written or said elsewhere without attribution and with the intention to deceive. Between them lies patchwriting: “restating a phrase, clause, or one or more sentences while staying close to the language or syntax of the source”.
Patchwriting is generally treated as a misuse of source material but not as plagiarism. As in, it’s not as bad but you’re still not supposed to do it. There has been a multi-decade fuss about how this ‘lesser crime’ should be treated in higher education. What can make a big difference to any given example is having a citation present.
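To make the paraphrase / patchwriting / duplication spectrum a bit more concrete, here is a minimal sketch of one crude way to put a number on "staying close to the language of the source": count how many of a passage's three-word runs also appear in the source. To be clear, this is an illustration under my own assumptions, not the method used by anyone in this case. The function names, the choice of three-word runs, and the toy sentences are all hypothetical.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(candidate, source, n=3):
    """Fraction of the candidate's word n-grams that also occur in the source."""
    cand = shingles(candidate, n)
    if not cand:
        # Candidate is shorter than n words, so there is nothing to compare.
        return 0.0
    return len(cand & shingles(source, n)) / len(cand)


# Toy sentences, invented for illustration only.
source = "the quick brown fox jumps over the lazy dog"
verbatim = "the quick brown fox jumps over the lazy dog"
patchwritten = "the quick brown fox leaps over the lazy dog"
paraphrased = "a fast fox cleared the sleepy hound in one bound"

for label, text in [("verbatim", verbatim),
                    ("patchwritten", patchwritten),
                    ("paraphrased", paraphrased)]:
    print(f"{label}: {containment(text, source):.2f}")
```

A verbatim copy scores 1, a genuine rewording scores at or near 0, and patchwriting lands somewhere in the middle. What a score like this cannot see is the part that actually decides the question: whether the source was cited, and whether anyone intended to deceive.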
And in this case, as it says above, the article has no citations but Suggested Readings. So that makes it bad, right?
But the reason it doesn’t have citations is because it isn’t a research document.
It’s from Origins: Current Events in Historical Perspectives, “a free, on-line magazine that includes articles, podcasts, short “milestones,” and Top Ten lists produced by some of America’s leading historians … designed for use in high school and college classes in American history, world history, current events, and contemporary politics.”
NONE of the equivalent articles have citations. Here’s the 1993 document by Dr. Gay. And for comparison, here’s the last thing they published, a charming historical retrospective on electric cars in the early 20th century. No citations, source material at the end, likely almost entirely a paraphrased summary of other works.
It’s like complaining a Scientific American article doesn’t have research citations, or The Atlantic, or the CBS news.
These are the freaking acknowledgements, man!
Maybe it’s weird to copy someone else’s heartfelt messages to ‘Sandy’, but this isn’t plagiarism. This is like criticizing someone for copying their wedding vows - seriously, no-one cares, and presumably the sentiments still apply.
Not Silly But Misunderstanding Academic Practice Examples
In general, claims about academic plagiarism get very thin when we are discussing the methods section of an academic document. They get especially thin when we discuss statistical approaches!
The flexibility of language usually allows infinite capacity to synthesize new examples of any given idea. But in describing technical tasks precisely, this flexibility dies. If we have a variable A and a variable B, and they have an interaction term C, and we expect C to have a precise kind of covariance structure which informs D, you will actively fuck your ability to describe it if you paraphrase it.
Doing this badly is pretty small potatoes. As a demonstration, I tried it myself: I read A&S 2006 above and then summarized it from memory. That looks like this:
Theoretically, an interaction should be present between fund transfer and state legislature; a state under Democratic control will transfer more money to more Democratic counties, and presumably symmetrically, a state under Republican control will transfer more money to more Republican counties.
Now, maybe I missed some nuance there, but the similarity feels high. I think our friend here, who as we can tell is quite motivated to include even low-level and borderline accusations, would ding me for plagiarism too.
This is very, very clearly attributed and isn’t plagiarism. It is duplicated text, of course, but we have a separate category of minor academic naughtiness for something like this, which we call ‘inappropriate citation’. On a scale of 1 to 10 of dishonesty, it gets about a -4. So, yes, by some field-specific standards it needs quotation marks. I would use them myself for the second section from p.383 because it’s a self-contained quote. But we are really fiddling at the margins here, gang.
Markedly less stupid.
OK, I’m uncomfortable with this.
It’s uncited and minimally re-written. That’s all that needs saying, really.
This is boilerplate-y text, but it’s also trending out of patchwriting and into duplicated for me. It’s … somewhat borderline. That much is arguable. But it isn’t egregious or silly to include it as an example of duplicated text, not at all.
A Summary
For some reason, I have written this as a dialogue.
Alright, Mr. Research Dickhead. If someone brought this case to you, would you work on it as a research integrity case?
That’s DOCTOR Research Dickhead to you. And, no, I wouldn’t work on it. The sheer amount of examples (borderline or otherwise) suggests some fairly slapdash citation practices, but in any context this isn’t something I could get excited about. If I had to pick a single word, it would be ‘sloppy’.
But you said there’s plagiarism here?
Yes, some. Less than was represented, more than zero. And you always need to add ‘so far’ to a statement like that.
So far?
You bet. One of the ironclad rules of research integrity investigation is that there’s always one more example. Smoke is a great fire signal. I don’t know how much of Gay’s bibliography these angry internet men have looked at, and there are no internal investigative documents published, which is something a proper forensic metascientist always tries to do. So, if this is only half the papers, what’s in the other half? Maybe some of the thousands of people screaming about this would like to spend their time finding out instead?
Oh. So it warrants further investigation, eh? So you’re saying it’s serious?
Not really. If this was a regular academic and not the President of Harvard, a committee would sit on it for six months (four of those would be spent trying to find a common meeting date) and then conclude that it was definitely bad practice and also definitely not worth their time with regards to setting up a formal sanction.
If you think this is bad, you cannot imagine the sheer depths of burning tomfuckery possible in academia. I have seen so, so much worse.
You want to see what duplicated text looks like?
Now, THAT’S duplicated text.
Old Man Gloom dug this one out on the Sternberg investigation - guy wrote a book chapter that was almost entirely made verbatim out of six other papers and book chapters. There are levels to this game.
And you would struggle to believe the kind of mendacity and awfulness that other people can persist through in similar roles. If there’s a difference here, it’s the absolutely colossal and disproportionate amount of public scrutiny, not the ‘crime’.
Oh, really. Great. Making excuses. Another woke-ademic wokily defending one of his own.
I’ll stack my record at interrogating the work of powerful researchers against anyone else on the planet. I’m not going to try to convince you. Google me.
Alright, then, isn’t it different if she’s the President of Harvard? Don’t we expect more from such people?
Presumably, yes. This role has a strong symbolic component, and we expect purity from our symbols, as tiresome as that occasionally might be. The President of Stanford resigned just six months ago on the back of research integrity issues, but that was for something we knew was MUCH worse than this.
Symbolic roles come with symbolic falls. You’ve seen it happen to a million politicians, and if you’ve been paying attention to the VroniPlag project…
The what?
The giant European project that’s been investigating plagiarism in the work of hundreds of politicians for more than a decade! Do you think plagiarism detection was just discovered??
Oh.
Well. What happens in those cases is often similar to this - the story starts feeding on itself, and the institution can’t contain the arseache. So, they ask you nicely to resign, and solve everyone’s problem (including yours)… then they pay out your contract when no-one’s looking, and you zip off to The Seychelles for a while with the free money. It’s just in those cases the institution is often parliament.
Well, as an American on the internet, I didn’t know that other countries exist. Oh well. But hey, a paid holiday is not a bad deal for a diversity hire, eh?
Yeah, I’ve seen people complaining about that [note: a lot, Twitter sucked before and it sucks more now] and frankly I think it’s racist. You weren’t on the hiring committee, you don’t know what her qualifications are, you don’t know who else they were considering, you in fact don’t know anything at all about the mechanics of how a black woman got this job. You’ve just made a race-based assumption. What do we call those again?
That was hostile. Does this make you triiiiiiiiiiiiiggered?
Nah. Well, not the case itself, which is marginally interesting. But I’m really, really annoyed at how it happened, and how it’s playing out.
Speaking of which…
Why This Bothers Me So Much
I went on holiday in early December, and ignored the world for a while. I got a suntan and drank wine and bitched at my family. Standard Australian Christmas.
What brought me immediately back to earth was the absolute hurricane of nonsense around this.
So forgive me if this next part is … mildly dyspeptic.
Supporting the politicization of research integrity work is extremely dangerous.
The moment you make research integrity accusations in service of something other than research integrity, you shit in the well of goodwill that slowly builds up and allows public discussion of these issues to thrive. As I covered in the introduction, people who have their dirty academic laundry aired in public ALWAYS claim their critics have some horrible agenda, and we literally never do. As this is quite clearly the case, it means metascientists usually get to mount strong criticism in public and have it assessed on its merits.
But in this case, there is a stated, outward, open agenda. I’m not getting into the details of it, I don’t care. Just the fact that there’s a stated agenda at all has, understandably, made many people cynical. I have seen people who I know personally to have an excellent and nuanced understanding of how research integrity works convinced this whole case is nothing at all without looking at the evidence simply because it was publicized by flint-hearted ideologues, and boosted in part by racists and overconfident Reply Guys.
And yes, as we’ve established, this case is much thinner than it presents itself, but you can’t pretend that means nothing, and you especially can’t pretend that it means nothing by definition. You’re giving them what they want, which is the ability to play hypocrisy wedge games.
It’s been a really interesting last twelve months for research integrity, and it feels like there’s dramatically more public and media interest than there used to be. People are starting to organise differently, think differently. I feel a wellspring of energy around the subject.
So right now, I am boiling at the arse at the idea that a heuristic of ‘research integrity case means culture war shit, safe to ignore’ might ever be established. This is too important to leave it to the ruiners to play their games with.
Please, in cases like this: actually read the specifics of the allegation. Even if they’re clawed straight out of Satan’s steaming guts. It’s too important to just cheerlead. People you don’t like are capable of being right, even if they’re dishonest and tiresome, even if they’re only minimally right.
Don’t read the headlines, and definitely don’t read the tweets. Please look past whatever agenda is on the table, and look at the evidence.
Research integrity is too important to become another culture war football.
Thank you for this refreshing statement! I love the term "forensic metascientist", sounds much better than what the press calls me: plagiarism hunter. And thanks for mentioning VroniPlag Wiki. However, the group does not only document plagiarism in politicians' dissertations. Of the 218 currently published documentations only 20 are from people considered politicians. More than double (58!) are or were academics: Professors, researchers, teachers. An overview of the cases (with links to the forensic research results) can be found here: https://vroniplag.fandom.com/de/wiki/%C3%9Cbersicht
Indeed, I find some of the findings to date in Gay's thesis serious and others not so much. The situation where a reference is given but the quotation marks are "forgotten" is called a "pawn sacrifice" in Germany. Often the text overlap (one of my euphemisms for the p-word) continues on past the reference mark.
Referencing is not rocket science: Mark where the text or ideas from others begins, where it ends, and where it comes from. So either directly quote: "...." (Snafu 2024) or indirectly paraphrase with Snafu notes that ..... (2024, p. 42). Snafu continues to decry .... (2024, p. 43). And so on.
Looks like the problem of politicizing research integrity work is already happening: https://www.businessinsider.com/bill-ackman-wife-neri-oxman-mit-dissertation-plagiarism-2024-1?amp
The link points to an article alleging plagiarism by the wife of one of the Harvard donors who uncovered Gay's problems. Obviously, the implication is that the donor is being hypocritical (which he may be). Yet in the middle of the article, it says, "Similarly, in most of the other instances BI identified in which Oxman lifted passages from other works, she cited the author but did not put quotation marks around the plagiarized material," which is more the inappropriate citation that you described, rather than plagiarism proper. 🤦♂️