Scientific fraud: the new scheme

The image of the researcher working alone, cut off from the scientific community, is a myth. Research depends on constant exchange: first, to understand the work of others, and second, to make one's own results known. Reading and writing articles published in scientific journals or conference proceedings is therefore at the heart of researchers' work.

When writing an article, it is essential to cite the work of colleagues, whether to describe the context, to acknowledge sources of inspiration, or to explain differences in methods and results. Being cited by other researchers, when it happens for “good reasons,” is therefore one measure of the importance of one's own findings. But what happens when this citation system is manipulated? Our latest study reveals a malicious method of artificially inflating citation counts: “hidden references.”

Behind the scenes of the manipulation

The world of scholarly publishing, how it operates, its potential abuses and their causes are recurring topics in popular science. Here, however, let us focus on a new type of abuse affecting citations between scholarly articles, which are supposed to reflect the intellectual contribution and influence of the cited article on the citing article.

Scientific citations rely on a standardized reference system: in the text of their article, authors explicitly give, at a minimum, the title of the cited article, the names of its authors, the year of publication, the name of the journal or conference, page numbers, and so on. This information appears in the article's reference list (the bibliography) and is also recorded as additional data, not visible in the text of the article, called metadata, notably when a DOI (digital object identifier), a unique identifier for each scientific publication, is assigned.

References in a scientific publication give authors a simple way to justify their methodological choices or to recall the results of earlier studies. The references listed in every scientific article are in fact the visible expression of the iterative and collaborative nature of science. However, some unscrupulous parties have evidently added extra references, invisible in the text but present in the article's metadata, when the article was registered by the publisher. The result? Citation counts for certain researchers or journals soar for no good reason, because these references are not present in the articles that supposedly cite them.

A new type of fraud, discovered by chance

It all started with Guillaume Cabanac, who posted a post-publication review report on PubPeer, a site where scholars discuss and analyze publications. He noticed an inconsistency: an article from a journal published by Hindawi, probably fraudulent because it contained “tortured phrases”, had received far more citations than downloads, which is highly unusual. The post caught the attention of several “scientific sleuths”, and a responsive team formed, made up of Lonni Besançon, Guillaume Cabanac, Cyril Labbé and Alexander Magazinov.

We tried to find the articles citing the original article using a scholarly search engine, but Google Scholar returned no results, whereas others (Crossref, Dimensions) found some. It turns out that Google Scholar and Crossref or Dimensions do not use the same process to retrieve citations: Google Scholar uses the actual text of the scientific article, while Crossref and Dimensions use the article metadata supplied by publishers.
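To make the discrepancy concrete, here is a toy model of the two counting processes. It is only a sketch: the identifiers and data are invented for illustration, and it does not reproduce how Google Scholar, Crossref or Dimensions actually work internally. It simply assumes that one counter reads reference lists extracted from the article text, while the other reads reference lists deposited as metadata.

```python
from collections import Counter


def count_citations(reference_lists):
    """Count how many citing articles reference each cited item.

    reference_lists maps a citing-article ID to the list of items it cites.
    """
    counts = Counter()
    for cited_items in reference_lists.values():
        # A citing article counts once per cited item, even if listed twice.
        counts.update(set(cited_items))
    return counts


# Hypothetical data: the same two citing articles, seen through two sources.
from_full_text = {"citing-1": ["ref-a"], "citing-2": ["ref-b"]}
from_metadata = {
    "citing-1": ["ref-a", "target"],  # "target" exists only in the metadata
    "citing-2": ["ref-b", "target"],
}

text_counts = count_citations(from_full_text)
metadata_counts = count_citations(from_metadata)
print(text_counts["target"], metadata_counts["target"])  # prints: 0 2
```

In this toy case, the text-based count for the target article is zero while the metadata-based count is two, which mirrors the kind of gap we observed between the search engines.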

To understand the extent of the manipulation, we then examined three scholarly journals that appeared to cite the Hindawi article heavily. Here is our three-step approach:

  • First, we listed the references explicitly present in the HTML or PDF versions of the articles;

  • We then compared these lists with the metadata recorded by Crossref, an agency that assigns DOIs and stores their metadata, and discovered that some additional references had been added there that do not appear in the articles;

  • Finally, we checked a third source, Dimensions, a bibliometrics platform that uses Crossref metadata to count citations. Here again, we found inconsistencies.

The result? In these three journals, at least 9% of recorded references were “hidden references”. These additional references appear only in the metadata, not in the articles, which skews citation counts and gives some authors an unfair advantage. Conversely, some references that are actually present in articles are “lost” in the metadata.
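The comparison at the heart of this approach can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: it assumes that each reference can be identified by its DOI and that the two lists (references seen in the article, references registered in the metadata) have already been collected. The function names and the DOIs below are made up for the example.

```python
from dataclasses import dataclass


def normalize_doi(doi: str) -> str:
    """Lower-case a DOI and strip a resolver prefix so that variants compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


@dataclass(frozen=True)
class ReferenceAudit:
    extra_in_metadata: frozenset      # registered, but absent from the article
    missing_from_metadata: frozenset  # in the article, but not registered
    extra_share: float                # fraction of registered references that are extra


def audit_references(in_article, in_metadata) -> ReferenceAudit:
    """Compare the references visible in an article with those in its metadata."""
    article = frozenset(normalize_doi(d) for d in in_article)
    metadata = frozenset(normalize_doi(d) for d in in_metadata)
    extra = metadata - article
    missing = article - metadata
    share = len(extra) / len(metadata) if metadata else 0.0
    return ReferenceAudit(extra, missing, share)


# Made-up DOIs, for illustration only.
audit = audit_references(
    in_article=["10.5555/a1", "https://doi.org/10.5555/A2", "10.5555/a3"],
    in_metadata=["10.5555/a1", "10.5555/a2", "10.5555/x9", "10.5555/x8"],
)
print(sorted(audit.extra_in_metadata))      # ['10.5555/x8', '10.5555/x9']
print(sorted(audit.missing_from_metadata))  # ['10.5555/a3']
print(f"{audit.extra_share:.0%}")           # 50%
```

Here the share is computed relative to the references registered in the metadata, which matches the way the percentage above is expressed. In practice, many references have no DOI, so a real audit would also need to match references by title, authors and year.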

Implications and potential solutions

Why does this discovery matter? Citation counts strongly influence research funding, academic promotions and institutional rankings. They are used differently across institutions and countries, but they always play a role in decisions of this kind.

Manipulating citations can therefore lead to unfairness and to decisions based on faulty data. More worrying still, this discovery raises questions about the integrity of systems for measuring scientific impact, which have been under scrutiny for several years already. Indeed, many researchers have already pointed out that these measures can be manipulated and, more importantly, that they generate unhealthy competition among researchers, who may be tempted to take shortcuts in order to publish faster or to obtain better results that will be cited more. The most serious consequence of these measures of researcher productivity is above all the waste of effort and scientific resources caused by the competition they create.

To combat this practice, the Invisible College, an informal group of scientific sleuths to which our team contributes, recommends several measures:

  • Strict metadata verification by publishers and agencies like Crossref.

  • Independent audits to ensure data reliability.

  • Increased transparency in the management of references and citations.

This study highlights the importance of accurate and trustworthy metadata, since metadata too can be manipulated. It is also worth noting that Crossref and Dimensions have confirmed the study's findings, and that the publishing house that manipulated the metadata entrusted to Crossref and, as a side effect, to bibliometrics platforms such as Dimensions, appears to have made some corrections. While corrective action is awaited, which is sometimes very slow or never comes at all, this discovery is a reminder of the need for constant vigilance in the academic world.

