How To Stop The Next 10,000 Bullshit Papers
Some remarkably un-radical proposals
Introduction
My papers in forensic metascience are kind of boring. They are generally meant to establish narrow and usually simple conceptions of statistical or methodological malfeasance.
I think they are important, and that means I do not mind them being boring.
I do not react in the slightest when someone tells me ‘I tried to read it, but it all seemed a bit abstract and I don’t care’.
You are not required to care. I am not your mum — you will feel no inexplicable pull to love me, no matter what I said on your wedding day. I often tell people not to read my work. It will do nothing for either of us.
On the other hand, there are other papers about scientific nonsense that you really should read. Generally, they’re not in the forensic metascientific tradition (can I say we have one of those now?), but they start from similar observations about how we find inconsistencies and problems within the scientific literature.
Then they do what I don’t like to do (unless I’m paid), which is ‘spend a great deal of time applying them to something important’.
This is about one of those papers.
10,000 papers about nothing
This paper by Matt Spick and pals is a scientometric analysis of the recent explosion of poor-quality papers which draw on information from publicly accessible databases. How much is an explosion? Somewhere between 7,000 and 11,000 papers which all use free data resources and automated tools to gin up extra papers on topics that sound superficially plausible but are probably a waste of everyone’s time.
This is probably a new and growing paper mill product — rather than fabricating research from scratch, it seems our friends in the bullshit-for-money business have an increasing commercial focus on drawing spurious conclusions from real data.
Now, ‘spurious’ is a load-bearing adjective in that sentence. How are we to know if they are spurious without reading them all?
The answer isn’t very satisfying to me right now, and it’s ‘we’ve read enough of them to notice a strong patternicity in their problems’. I would prefer something more empirical, like:
ingesting all of them and finding some textual features that connect them (a sketch of this follows below)
same, but evaluating their reference and author network for a similar signal
analyzing a much higher percentage of them for relevance and accuracy
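For a flavour of what that first option could look like in practice, here is a minimal sketch, assuming you have already collected the abstracts into a list of strings. The example abstracts, the 0.8 threshold, and everything else about it are placeholders, not a validated screening method.

```python
# A minimal sketch of the first option above: pairwise textual similarity over
# a pile of abstracts. Assumes scikit-learn is installed; the `abstracts` list
# is placeholder text standing in for the real scraped corpus, which is the
# genuinely hard part to assemble.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

abstracts = [
    "Association of serum marker X with outcome Y in NHANES 2017-2018",
    "Association of serum marker X with outcome Z in NHANES 2017-2018",
    "Machine learning prediction of outcome Y from a public adverse event registry",
]

tfidf = TfidfVectorizer(stop_words="english", ngram_range=(1, 3))
X = tfidf.fit_transform(abstracts)

similarity = cosine_similarity(X)                  # papers x papers matrix
near_twins = (similarity > 0.8).sum(axis=1) - 1    # how many near-duplicates each paper has
print(near_twins)
```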
You might have noticed that all of these things are hard and annoying.
In research integrity, this is when the work often stops — because very few people are paid to do it. You need engineers, lawyers, and money. But at least this is a tractable problem! There are many ways to analyze ten thousand papers.
But coming over the top rope like Macho Man Randy Savage, there is one more obvious factor: you can make any big data set say anything.
Analytical flexibility is a bit like a prescription opiate — people understand it, but they do not understand it, because they have never tried it.
Happily, you don’t have to pawn your jacket to try analytical flexibility. Try it sometime. It’s upsetting how easy it is. Start with the data and work outwards. Kitchen sink the bastard. Invent new cut-offs. Pick the wrong model. Throw it down the stairs, shake it off, and try it again. Imagine someone paid you when you were finished.
It’s so very, very possible.
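If that still sounds abstract, here is a toy demonstration. The data below are pure random noise, every variable name is invented for the illustration, and the only thing being exercised is the flexibility. It still coughs up a publishable-looking p-value a worrying fraction of the time.

```python
# A toy demonstration of analytical flexibility: pure noise, plus enough
# cut-offs and subgroup choices, will often hand you p < 0.05.
# Nothing here is real data; all variable names are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 500
exposure = rng.normal(size=n)          # fake biomarker
outcome = rng.normal(size=n)           # fake outcome, unrelated by construction
age = rng.integers(20, 80, size=n)     # fake covariate to slice on

best_p = 1.0
for cutoff in np.quantile(exposure, [0.3, 0.4, 0.5, 0.6, 0.7]):   # invent new cut-offs
    for lo, hi in [(20, 80), (20, 50), (50, 80), (40, 65)]:       # invent subgroups
        mask = (age >= lo) & (age < hi)
        high = outcome[mask & (exposure >= cutoff)]
        low = outcome[mask & (exposure < cutoff)]
        p = stats.ttest_ind(high, low).pvalue
        best_p = min(best_p, p)

print(f"best p-value found in noise: {best_p:.3f}")   # frequently below 0.05
```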
So what are we going to do about it? And who is ‘we’?
How to stop the next 10,000 papers about nothing
There are two suggestions given above:
(1) some kind of access controls around these databases
(2) pre-registering analysis
These are the logical answers, but they’re both challenging for their own reasons.
(1) is fraught, because NHANES, BioBank, FAERS, etc. are open resources designed to be fully accessible to anyone for any reason (or, in some cases, any justifiable reason if you’re a researcher, etc.), and we generally regard that as a net good, regardless of what people do with them. The regulation that exists around these databases would need to change, and then we would also need to institute those access controls, which is work. We have switched from 10,000 papers to 10,000 evaluations of data requests, probably by small teams of people who now bear the burden of oversight.
Watch that last sentence carefully; it’s about to have a friend.
(2) is probably the most obvious answer — make sure any analysis from an API-accessible database is first registered before any kind of access or analysis takes place, in the manner of a registered report.
An RR is peer reviewed twice: once before any work is done, to evaluate the proposed analysis for meaningfulness, quality, etc., and once after the work is done, to ensure that the proposed analysis was followed exactly.
We have switched from 10,000 papers to 10,000 registered reports, which is 20,000 individual peer reviews, probably by small teams of people who now bear the burden of oversight.
Yep, it’s that again.
Or let’s say we toned this down, and made the criterion for publication ‘any pre-registration’ instead of a Registered Report. That is, you make and publish a public analysis plan.
I don’t suggest pre-reg as a solution to a lot of problems, because it’s evolved into something where some people do it well, and most people don’t do it well enough for it to matter. Analytical flexibility can persist through even a fairly well-constrained pre-description of an analysis. And, of course, a lot of people don’t follow their pre-registration plan anyway, because reviewers don’t read them.
So, for this to be a solution, we have switched from 10,000 papers to 10,000 pre-registration documents which need to be read and incorporated into review.
You see the pattern at this point.
Every time you move a burden of oversight around, someone has to do it. When we propose any modification to a publication scheme, we should think very carefully about who receives that hot potato. Especially when some form of oversight is moved from ‘lots of different journals and peer review pools’ to ‘a small group of people with specific expertise who don’t currently bear the burden of it’.
Don’t get me wrong, neither of these is a BAD idea.
They’re just complicated, and they’re insufficiently radical for me, a grump who spends half the night drinking Cynar 70, listening to old Iniquity albums, and writing screeds like this.
But at least I have some alternative proposals. I have three, actually. One is obvious, and boring, and already being implemented. The others are more expensive and annoying, but they do have some advantages: they’re likely to be a little more future-proof, and they don’t have scaling problems.
Here’s the easy one.
Make sure editors know to nuke these products on sight.
I’m sure it’s perfectly legitimate to produce new NHANES papers, but only if they are included in the context of other work. In the meantime, train all your editors. “Naked NHANES paper? BOOM.”
Very short loop.
And related:
Require that any analysis mining a public data resource with an accessible API be accompanied by other empirical observations
Self-explanatory. We have enough reviews and re-analysis in the world. Using public data entirely for yet another also-ran paper is a river run dry. Some castle in the sky built out of a dataset you can call in a few lines of code should be the start of an empirical process, not the end of one.
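To put a number on ‘a few lines of code’: this, give or take, is the entire data-acquisition step of one of these papers. A minimal sketch, assuming pandas is installed and that the CDC still serves the 2017–2018 demographics file at its usual address; treat the URL and variable names as illustrative rather than gospel.

```python
# The whole data-acquisition step of a "naked NHANES paper", more or less.
# Assumes pandas is installed; the URL is the standard CDC location for the
# 2017-2018 demographics file (DEMO_J) and may move, so treat it as illustrative.
import pandas as pd

DEMO_URL = "https://wwwn.cdc.gov/Nchs/Nhanes/2017-2018/DEMO_J.XPT"

demo = pd.read_sas(DEMO_URL, format="xport")        # one call, whole file
print(demo.shape)                                   # thousands of participants, dozens of columns
print(demo[["RIDAGEYR", "RIAGENDR"]].describe())    # age and sex, ready to cross-tabulate
```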
[IRRITATING UPDATE: I had this in draft from last week, and I hope you can believe me given the amount of goddamn text that has been written above, but events move faster than I can write. Turns out many publishing groups have done exactly this (without me ever having the opportunity to look clever about it, I’ll just be over here trying to convince myself now). From the article:
This is obviously a good thing and I am not surprised, because it is an obvious solution and it is free. Anyway, good decision. If I’d known, I wouldn’t have started this. But, now the hard bit.]
Improve the technical infrastructure behind these resources so getting an analysis from a free data resource is no more complicated than getting money out of an ATM
Leaving a big pile of partially structured, mostly labeled data in public is, obviously, a great thing. With a bit more structure, you could have something really interesting — the ability to put a more structured analysis query into the data platforms, and retrieve an answer to a question straight from the source. This would involve a significant amount of back-end work, but would reduce the idea of a nonsense paper to a single figure that took 60 seconds to retrieve from the source. Instantly unpublishable on its own, forever.
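To make the shape of that concrete, here is an entirely hypothetical sketch of what a structured query against such a platform could look like. None of the names, fields, or endpoints below correspond to anything real; it’s just the rough contract I have in mind.

```python
# Entirely hypothetical: what a structured analysis query against an
# NHANES-style platform could look like if the back-end work described
# above were done. No real endpoint or client library is being used here.
from dataclasses import dataclass, field

@dataclass
class AnalysisQuery:
    outcome: str                                # e.g. "systolic_bp"
    exposure: str                               # e.g. "serum_cotinine"
    covariates: list = field(default_factory=list)
    model: str = "survey_weighted_linear"       # only platform-validated model names allowed
    cycles: tuple = ("2017-2018",)

def run_query(query: AnalysisQuery) -> dict:
    """Pretend server-side execution: the platform runs the model against the
    canonical data, applies the correct survey weights, and returns the estimate
    plus a provenance record. A 'paper' that is nothing more than this dictionary
    has nowhere to hide."""
    return {
        "estimate": None,                       # filled in by the (imaginary) platform
        "ci_95": None,
        "n": None,
        "query": query,                         # the full specification travels with the answer
        "data_version": "as published by the platform",
    }

answer = run_query(AnalysisQuery(
    outcome="systolic_bp",
    exposure="serum_cotinine",
    covariates=["age", "sex", "ethnicity"],
))
```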
But this comes up against a persistent bias we see across every technical field: the standards, the assurances, the reliability metrics, the weights-and-measures stuff and its maintenance over time. All of it is very boring, and no-one wants to talk about it.
The same neophilia that goes into creating scientific work, where we rush forward towards the New and Shiny, goes into the attitude we have towards the tools we build. This isn’t a great failing; it’s human nature. Ask anyone who’s made a resource public — the public show up, and they can be unreasonably demanding and annoying.
If something like this is built, it needs to be maintained. And that’s generally where we screw up, and foresight abandons us. Building persistent resources is hard.
What public scientific resources do you wish existed that are reasonably achievable? I think I have catalogues at this point, but I’d like to know other people’s. Leave a comment if it’s something that really lights a fire under your etceteras.