Study says some AI chatbots provide racist health information: Why this is dangerous

Tech giants like Google, Microsoft and OpenAI have talked about how generative AI can help healthcare systems. However, a study has found that large language models (LLMs), the technology underlying chatbots, may propagate harmful, race-based medicine as they are integrated into healthcare. This suggests that AI in the medical field has a long way to go.
Published in the Nature Portfolio journal npj Digital Medicine, the study assessed whether four commercially available LLMs propagate harmful, inaccurate, race-based content when responding to eight different scenarios.
The questions related to those scenarios were “derived from discussions among four physician experts and prior work on race-based medical misconceptions believed by medical trainees.” These questions checked for race-based medicine or widespread misconceptions around race.
The researchers assessed the LLMs with nine different questions, each asked five times, for a total of 45 responses per model.
Models tested in the study
Researchers said that they tested the questions with GPT-3.5 and GPT-4 by OpenAI, Bard by Google, and Claude, developed by Google-backed Anthropic.
“All the models have failures when asked questions regarding kidney function and lung capacity – areas where longstanding race-based medicine practices have been scientifically refuted,” the results of the study showed.
Additionally, the researchers noted that some equations generated by the models were fabricated, which is in line with the problem of hallucinations, where models generate false or made-up information, an issue that tech giants themselves have highlighted.
Why this is dangerous
While noting that the responses are shaped by the data the models are trained on, the researchers said that because the training process for these models is not transparent, it is impossible to know why the models succeed on some questions while failing on others.
“All models had examples of perpetuating race-based medicine in their responses. Models were not always consistent in their responses when asked the same question repeatedly,” they said. The researchers added that these LLMs could potentially cause harm by perpetuating debunked, racist ideas.
The researchers advised that medical centres exercise extreme caution when using LLMs for medical decision-making. Meanwhile, tech companies can further train their models, since these systems can be refined through human feedback.