Biomedical visualization specialists haven’t come to terms with how or whether to use generative AI tools when creating images for health and science applications. But there’s an urgent need to develop guidelines and best practices because incorrect illustrations of anatomy and related subject matter could cause harm in clinical settings or as online misinformation.
Researchers from the University of Bergen in Norway, the University of Toronto in Canada, and Harvard University in the US make that point in a paper titled, “‘It looks sexy but it’s wrong.’ Tensions in creativity and accuracy using GenAI for biomedical visualization,” scheduled to be presented at IEEE’s Vis 2025 conference in November.
In their paper, authors Roxanne Ziman, Shehryar Saharan, Gaël McGill, and Laura Garrison present various illustrations created by OpenAI’s GPT-4o or DALL-E 3 alongside versions created by visualization experts.
Screenshot from the paper. Top row: incorrect GPT-4o or DALL-E 3 images; bottom row: images created by BioMedVis illustrators
Some of the examples cited diverge from reality in subtle ways. Others, like “the infamously well-endowed rat” in a now-retracted article published in Frontiers in Cell and Developmental Biology, would be difficult to mistake for anything but fantasy.
Either way, imagery created by generative AI may look nice but isn’t necessarily accurate, the authors say.
“In light of GPT-4o Image Generation’s public release at the time of this writing, visuals produced by GenAI often look polished and professional enough to be mistaken for reliable sources of information,” the authors state in their paper.
“This illusion of accuracy can lead people to make important decisions based on fundamentally flawed representations, from a patient without such knowledge or training inundated with seemingly accurate AI-generated ‘slop,’ to an experienced clinician who makes consequential decisions about human life based on visuals or code generated by a model that cannot guarantee 100 percent accuracy.”
Show me a pancreas, and MidJourney is like, here is your pile of alien eggs!
Co-author Ziman, a PhD fellow in visualization research at the University of Bergen, told The Register in an email, “While I’ve not yet come across real-world examples where AI-generated images have directly resulted in harmful health-related outcomes, one interview participant shared with us this case involving an AI-based risk-scoring system to detect fraud and wrongfully accused (primarily foreign parents) of childcare benefits fraud in the Netherlands.
“With AI-generated images, the more pervasive issue is the use of inaccurate imagery in medical and health-related publications, and scientific research publications broadly. While the potential harm isn’t immediately apparent, the increased use of inaccurate images like this, and problems like reinforcing stereotypes in healthcare, to communicate health and medical information is troubling.”
Ziman said that the larger problem, echoed in a series of interviews discussed in the paper, is the way inaccurate imagery affects how the public sees scientific research. She pointed to the “well-endowed rat” and how it was featured on The Late Show with Stephen Colbert.
“Satirical criticism by such public figures (that people may tend to trust more than ‘legitimate’ news sources) can throw into question the legitimacy of the scientific research community at large, and the public can come to distrust (even more) or not take seriously what they hear coming out of the scientific research community,” said Ziman.
“Think of the consequences then for public health communications as during COVID, vaccine campaigns, etc. And bad actors now have greater ease of quickly creating and sharing misleading but convincing-looking imagery.”
Ziman said that while AI-generated medical images are often shared in the biomedical visualization (BioMedVis) community for laughs and criticism, practitioners have yet to figure out how to mitigate the risks.
Toward that end, the authors surveyed 17 BioMedVis professionals to assess how they see generative AI tools and how they use those tools in their work. The survey respondents, referred to by pseudonyms in the paper, reported a wide range of views about generative AI. The authors grouped them into five personas: Enthusiastic Adopters, Curious Adapters, Curious Optimists, Cautious Optimists, and Skeptical Avoiders.
Some of the respondents appreciated the abstract and otherworldly aesthetics of images generated by AI models, saying the images helped advance conversations with clients. Others (about half) were critical of GenAI style, agreeing with “Frank,” who said the generic look in those images is boring.
Irrelevant or hallucinated references remain a problem, as do invented new terms, such as the ‘green glowing protein.’
Th
