
Hidden dangers in seeking medical advice from LLMs

Last year, ChatGPT passed the US Medical Licensing Exam (USMLE) and was reported to be “more empathetic” than real doctors. ChatGPT currently has around 180 million users; if a mere 10% of them have asked it a medical question, that’s already a population twice the size of New York City using ChatGPT like a health care provider. There’s an ongoing explosion of medical chatbot startups building thin wrappers around ChatGPT to dole out medical advice. But ChatGPT is not a health care provider, and using ChatGPT for medical advice is not only against OpenAI’s usage policies, it can be dangerous.
In this article, I identify four key problems with using existing general-purpose chatbots to answer patient-posed medical questions. I provide examples of each problem using real conversations with ChatGPT. I also explain why building a chatbot that can safely answer patient-posed questions is very different from building a chatbot that can answer USMLE questions. Finally, I describe steps that everyone can take (patients, entrepreneurs, doctors, and companies like OpenAI) to make chatbots medically safer.
Notes
For readability I use the term “ChatGPT,” but the article applies to all publicly available general-purpose large language models (LLMs), including ChatGPT, GPT-4, Llama2, Gemini, and others. A few LLMs specifically designed for medicine do exist, like Med-PaLM; this article is not about those models. I focus here on general-purpose chatbots because (a) they have the most users, (b) they are easy to access, and (c) many patients are already using them for medical advice.
In the chats with ChatGPT, I provide verbatim quotes of ChatGPT’s responses, with ellipses […] to indicate material omitted for brevity. I never omitted anything that would have changed my assessment of ChatGPT’s response. For completeness, the full chat transcripts are provided in a Word document attached at the end of this article. The words “Patient:” and “ChatGPT:” are dialogue tags added afterwards for clarity; they were not part of the prompts or responses.