Apr 13, 2023
Ensuring image integrity in scientific publishing is critical to maintaining trust and credibility. While intentional fraud occurs, most image-related issues stem from honest mistakes, such as duplications and mislabeling.
Carefully crafted written content, while integral to communicating science, will only take researchers so far when explaining scientific data. High-quality microscopy images enable researchers to present figures in a visually appealing and effective way. Despite their importance, when reviewing scientific papers, we spend less time and attention on images than on written content. Here we will explore how the scientific community can proactively improve image integrity checks to maintain credibility.
Scientific publishing is built on trust, so researchers, editors and publishers have an ethical obligation to ensure all data shared in published papers is valid. Extensive checks during the submission process are integral to maintaining the credibility of all parties involved.
To streamline the process, researchers use tools to proactively check their work, such as grammar or plagiarism checking software, which editors can also use to validate submissions. However, the same can’t always be said for checking images, even though microscopy images, western blots and similar figures are not just included to illustrate a point: they contain valuable data that must be accurate.
Why do we need to talk about image integrity?
Retractions enable journals to remove published information that is significantly incorrect and could negatively affect future research. If someone reports an issue with a figure in an article post-publication, the credibility of the authors, editors and publisher can come into question. Errors or issues will be investigated, which can damage the reputation of all involved and harm the journal’s credibility and impact factor, no matter the outcome.
There is a common misconception that any errors reported are evidence of scientific misconduct. As a result, when speaking about issues and retractions, we often hear professors say “it’ll never happen to me”. However, even the most careful labs can unintentionally include mistakes if their image integrity checks are not carried out effectively.
In fact, according to leading image data integrity analyst Jana Christopher, the percentage of manuscripts flagged for image-related problems ranges from 20 to 35 per cent [1]. She also found that the percentage of manuscripts subsequently retracted ranges from one to eight per cent, suggesting that not all issues found require retraction.
Fraud or mistake?
Image integrity issues fall into two broad categories: unintentional duplications and deliberate manipulations. The emergence of paper mills, profit-oriented and potentially illegal organisations that produce fabricated or manipulated manuscripts, together with reports of purposeful manipulation, is a worrying development in academia, and some publishers are putting up defences to prevent such manuscripts from being published.
The topic of scientific misconduct has gained attention over the last few years. In summer 2022, members of the scientific community claimed that one of the most important studies in the history of Alzheimer’s research may contain manipulated images [2]. In the 16 years since its publication, hundreds of scientists and pharmaceutical companies around the world have devoted resources to developing drugs based on its findings, which are now under investigation and could turn out to be wrong.
While deliberate deception and misconduct do occur, research suggests that intentional manipulation accounts for only a small portion of the image issues found in papers. This was evidenced during a trial that ran from January 2021 to May 2022, in which the American Association for Cancer Research (AACR) used Proofig’s automated image integrity software to screen 1,367 papers accepted for publication [3]. Of those, 208 papers required author contact to clear up issues such as mistaken duplications, and only four papers were withdrawn. In almost all cases (204 of the 208), there was no evidence of intentional image manipulation; it was simply an honest mistake.
Understanding which issues occur and why, and drawing a clear distinction between honest mistakes and deliberate deception, is therefore integral to reducing retractions. The scientific publishing community should then find an efficient way to analyse every manuscript before publication to maintain integrity.
How do image issues occur?
The statistics from Jana Christopher and other image integrity experts show that mistakes can happen to anyone. Consider a principal investigator or senior researcher who believes “it’ll never happen to me” and whose image duplication error rate is lower than average, at perhaps 10 per cent per manuscript. If they publish 30 manuscripts, there is still a greater than 95 per cent chance that at least one of their articles contains an image duplication, because the probability of avoiding errors in all 30 manuscripts is only 0.9 to the power of 30, or about 4 per cent.
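A few lines of Python make the arithmetic concrete. The 10 per cent rate and 30 manuscripts are the illustrative figures from above, and the calculation assumes errors occur independently across manuscripts:

```python
# Probability that at least one of n manuscripts contains an image
# duplication, assuming an independent per-manuscript error rate p.
p, n = 0.10, 30
at_least_one = 1 - (1 - p) ** n
print(f"{at_least_one:.1%}")  # 95.8%
```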
According to the International Journal of Cancer [4], the most frequently found issues in papers are duplicated panels arising from copy and paste, magnification errors in microscopy images, and inappropriate splicing together of gel sections. Duplication refers to any reuse of the same image in different parts of a paper without disclosing it. This can occur when an image is used the same way twice, or when it has been altered, for example by changing its rotation, size or scale. The image could also have been flipped or cropped during duplication, or researchers may capture two images of the same specimen that partially overlap. Figure 1, for example, shows an image duplication in which the duplicated section was unintentionally rotated and flipped while capturing different parts of the slide.

Researchers take steps to reduce the risk of introducing these errors, but the risk is not completely avoidable, particularly during experimentation. It can be especially difficult for researchers who convey their data using microscopy images to ensure that there are no overlaps. For example, when looking at a sample under the microscope, researchers may want an image of the entire specimen to use in their research. Depending on the magnification, the researcher must move the microscope stage left to right and up and down to document every section of the slide. Unfortunately, the microscope will not tell the researcher if the captured images overlap. Additionally, when the researcher changes the magnification, they may accidentally capture the same part of the sample twice.
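One way to catch such overlaps computationally is phase correlation, which measures how well one tile matches a shifted copy of another. The sketch below is a minimal illustration using only NumPy, not a validated tool; real microscopy stitching software uses more robust variants of the same idea:

```python
import numpy as np

def overlap_score(tile_a: np.ndarray, tile_b: np.ndarray) -> float:
    """Peak of the phase-correlation surface between two equally sized
    greyscale tiles. A sharp peak near 1.0 means one tile is close to a
    shifted copy of the other; unrelated tiles score near zero."""
    fa = np.fft.fft2(tile_a - tile_a.mean())
    fb = np.fft.fft2(tile_b - tile_b.mean())
    cross = fa * np.conj(fb)
    cross /= np.abs(cross) + 1e-12  # keep phase information only
    return float(np.abs(np.fft.ifft2(cross)).max())

# Hypothetical demo: the same field captured twice at different stage positions.
rng = np.random.default_rng(0)
base = rng.random((256, 256))
shifted = np.roll(base, (40, -25), axis=(0, 1))
print(overlap_score(base, shifted))                 # ~1.0, overlap suspected
print(overlap_score(base, rng.random((256, 256))))  # ~0.01, unrelated tiles
```

Running each newly captured tile against earlier ones and warning when the score exceeds some threshold would alert the researcher at capture time, long before figure assembly.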
Researchers are unlikely to use every image they capture in their paper, often because they collect thousands before choosing those that best convey the information. To keep track of their images, researchers will often label the files with descriptions of the sample. For example, a scientist studying the efficacy of a treatment on the pancreas might label a file with the name of the organ, the magnification, the date, the slide number, and whether the sample was taken before or after treatment.
Filling in this data every time an image is captured can be time consuming, so researchers may opt to copy the previous file name and change just a few of the details. While this enables researchers to file their images more quickly, it can lead to unintentional mistakes. Forgetting to update a field or entering an incorrect number can produce duplicated or erroneous file names. When the researcher comes to include images in the paper, they may be unaware of the error and unintentionally include these duplications.
Researchers often work on papers for multiple years and collaborate with different research groups. Unless the images are properly labelled and managed throughout the process, it can be difficult to distinguish between files, meaning there is no guarantee that the researcher, or someone else in the team, will not unintentionally include an image that was used earlier in the paper.
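A small housekeeping script can catch at least the byte-identical cases before figures are assembled. The following is a hypothetical sketch rather than an established tool: it groups files by a hash of their contents, so the same capture saved under two different names is flagged. Edited or re-exported copies would need image-level checks like those discussed below, and the folder name experiment_images is assumed purely for illustration:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

IMAGE_SUFFIXES = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}

def find_identical_files(folder: str) -> list[list[Path]]:
    """Group image files by the SHA-256 hash of their raw bytes and return
    the groups with more than one file, i.e. the same capture stored
    under several names."""
    groups = defaultdict(list)
    for path in Path(folder).rglob("*"):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            groups[hashlib.sha256(path.read_bytes()).hexdigest()].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]

for paths in find_identical_files("experiment_images"):
    print("Identical content:", ", ".join(p.name for p in paths))
```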
Proactivity is key
While the investigation and retraction process enables publishers to remove mistakes and fraud from journals and try to preserve credibility, by this point the damage is done. Firstly, investigations can take up to two years, putting pressure on the researchers and significantly reducing their chances of winning funding, conducting research or publishing elsewhere. Then, the impact factor and credibility of the journal may suffer because it did not find the image issues before publication. Additionally, while the article is under investigation, other research groups may try to build on the erroneous results, which can compound the damage and confusion, wasting a great deal of money and time. Therefore, no matter the result of the proceedings, researchers and publishers must work hard to rebuild their reputations.
Addressing the problem before submission and during editorial review could therefore give the industry more opportunities to catch mistakes, resolve issues with much less stress and maintain reputations.
Researchers, editors, publishers and any other parties involved can take steps to proactively check images before and during the submission process to prevent publishing mistakes. However, if they check images by eye, there is still a chance that issues will go unnoticed. Automating image integrity checks could give researchers and editors a much more efficient and reliable way of proactively reviewing papers.

Enter AI
Advances in artificial intelligence (AI) and computer vision have led to the development of valuable tools for scientists. Researchers and publishers can now use online tools to check their content for grammar, readability and plagiarism. Similarly, publishers and researchers can now use software to automate the image checking process.
AI software can automatically scan every image in a research paper, completing checks in one to two minutes. Proofig’s software, for example, checks each image against itself and the others in the paper, looking for and outlining any anomalies that might be caused by duplications or manipulations. The software then generates a report outlining the potential issues for review, such as in Figure 2, where the software detects a potential duplication and rotation across two separate microscopy slides. With these tools, researchers can confidentially scan papers and check sub-images before submitting their work to a publication, enabling them to detect and resolve any issues before sharing the content publicly.
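To illustrate the underlying idea, though not Proofig’s actual, proprietary algorithm, a basic duplicate check can be sketched with perceptual hashing: each image is reduced to a small binary fingerprint, and two images are flagged when their fingerprints nearly match under any rotation or flip. Everything here, including the 8x8 grid and the five-cell tolerance, is an illustrative assumption:

```python
import numpy as np
from PIL import Image

def hash_grid(path: str, size: int = 8) -> np.ndarray:
    """Average hash: downscale to a size x size greyscale grid and mark
    which cells are brighter than the grid's mean."""
    small = np.asarray(Image.open(path).convert("L").resize((size, size)),
                       dtype=float)
    return small > small.mean()

def dihedral_variants(grid: np.ndarray):
    """All 8 rotations and mirror images of the grid, so a duplicate
    that was rotated or flipped still matches."""
    for k in range(4):
        rotated = np.rot90(grid, k)
        yield rotated
        yield np.fliplr(rotated)

def likely_duplicate(path_a: str, path_b: str, tolerance: int = 5) -> bool:
    """Flag two images for human review if their fingerprints differ in
    at most `tolerance` cells under some rotation or flip."""
    a = hash_grid(path_a)
    return any(np.count_nonzero(a != variant) <= tolerance
               for variant in dihedral_variants(hash_grid(path_b)))
```

As with the editorial tools described here, a match is only a flag for human review: near-identical fingerprints can also arise from two genuinely similar but distinct images.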
Additionally, during the review process, editors can use image integrity software to streamline their checks, enabling them to review more papers in less time without compromising accuracy or putting the journal’s impact factor at risk.
By choosing AI software built for checking images in life sciences research, such as microscopy images, fluorescence-activated cell sorting (FACS) plots and western blots, as seen in Figure 3, both researchers and editors can quickly analyse papers and detect issues. These tools are built to detect all forms of image edits, including rotations, changes of scale, flips, crops, full and partial overlaps, cloning, and combinations of these.
Editors are already benefitting from using AI tools as part of their editorial processes to identify potential image reuse before publication. The AACR recently conducted a study to see how it could streamline image proofing using AI tools [5]. The study found that the tool identified real duplications in 14 per cent of the 1,367 papers it reviewed. The team also compared the time taken and the number of findings when reviewing manuscripts manually versus with the AI software, finding that the mean analysis time with the software was 4.4 minutes, compared to eight minutes manually. Using the software also enabled editors to identify twice as many suspect images as an editor analysing a paper without it.
As AI becomes more sophisticated, editors can further streamline the image review process. However, these tools are not intended to replace the role of the editor; the review process will always require some form of human judgement. While machine learning can help editors and researchers streamline the process, the AI is there to analyse images, make comparisons and flag potential issues. The user must then investigate the flagged issues themselves, looking in more detail to understand whether the image contains permissible amendments, and whether a problem is an honest mistake or a manipulation.

Collaboration is also key to improving image integrity checks. Current AI software has been developed and validated with integrity experts to detect issues within a single paper and its images. However, developers are still working on tools that can find duplications across different papers. By building a database of the images in most, if not all, existing published papers, and providing the necessary access and permissions, developers can train AI to compare new submissions against past papers, further improving the accuracy of the image review process.
Data accuracy is vital in scientific publishing, and images play an integral role in presenting this data to readers. Continuing to report potentially fraudulent content and investigating these claims will be crucial to maintaining credibility, but when we look deeper, this only represents a fraction of the overall issue. It is also important to increase awareness of how easily and frequently unintentional issues can occur. Then we must provide guidance and tools to help proactively solve these issues before submission or publication to ensure that the only data we publish is data we can trust.
References
[1] Jana Christopher (Image Data Integrity Analyst at FEBS Press): https://ukrio.org/research-integrity-resources/expert-interviews/jana-christopher-image-integrity-analyst/
[3] https://www.theregister.com/2022/09/12/academic_publishers_are_using_ai/
[4] https://onlinelibrary.wiley.com/pb-assets/assets/10970215/FAQ%20Image%20Integrity-1591612755797.pdf