This is part one of a two-part series exploring the potential opportunities and risks of using ChatGPT and large language models in healthcare. This blog focuses on the clinical perspective, while the second will look more closely at the technology and the considerations for using it in a healthcare setting.
For nearly a decade now, I have been focused on making routinely collected data in healthcare more powerful, actionable and usable. Given that 80% of these data reside in unstructured formats, the majority of my focus has been on using natural language processing (NLP) to transform them.
Over this time, many trends and companies have come and gone as NLP has matured from “niche” to “must have”. Conversations with healthcare organizations have moved from “what is it?” to “what methods would you use?”.
NLP has been a dynamic and evolving field, but nothing has disrupted the industry like the arrival of ChatGPT. For the first time ever, social media feeds are flooded with NLP, across industries. My friends, who for the most part have struggled to understand what I do (it was much easier when I was on the wards in hospital!), are sending me articles and LinkedIn posts on the subject, which shows just how far the word has spread.
Interesting and exciting potential use cases pop up almost daily. Given my background, my interest in healthcare and the way social media algorithms work, most of the use cases I see are in healthcare. So, what is real, tractable and revolutionary, and what is noise?
In this blog, I offer my perspective – first of all, as a clinician, and then as a specialist in applying NLP to medical records.
A clinical perspective on ChatGPT
First and foremost, much of the buzz around ChatGPT is about its ability to seem “human”. In the 10 years I trained and practiced as a doctor, one thing that was central to good clinical practice was mutual trust and respect between doctor and patient. For the foreseeable future, that is something that can only be fostered through human-to-human interaction. The bond a clinician makes with the person they are caring for is the key that unlocks much of the patient history.
Three use cases where ChatGPT could support clinicians in their work:
Augmenting administrative document creation.
Clinician burnout is a real threat to the healthcare system in most countries, and much of it is caused by non-clinical tasks such as writing claims appeal letters to insurers. Any application of AI that helps reduce clinician burnout is undoubtedly a good thing. With some carefully selected prompts to ChatGPT about the patient’s condition, doctors can proofread a fully authored letter rather than write one from scratch, saving them significant time. There are already examples of this happening, although don’t believe the references it cites! (See below.)
Improving patient access to healthcare information.
Chatbots have grown in popularity in recent years and are used across healthcare as the first step in managing patient questions about their health plan or their condition. The added sophistication that ChatGPT offers over traditional chatbots makes this an area of significant potential.
Improving clinical documentation.
This is certainly a more nuanced and challenging task, but from playing around with it, ChatGPT does a very good job of creating realistic patient summaries. If clinicians can input key findings elicited from the patient history and examination, and ask ChatGPT to create a summary of the encounter, there is an opportunity for the AI to remind the clinician of other relevant questions that may increase the specificity of a diagnosis. This smart documentation assistant could revolutionize clinical documentation improvement activities, which are a vital part of the revenue cycle at providers and help them stay on the right side of the balance sheet. It would also reduce the administrative burden of claims denials, as documentation and supporting evidence would be present more often.
The downsides of ChatGPT for healthcare
Misinformation and accuracy
Ask ChatGPT just about anything and it will always answer with two characteristics: articulacy and conviction. From what I, and many others, have seen, ChatGPT gives a perfectly articulated wrong answer with just as much conviction as it gives the right one. This is worrying in any industry, and quite terrifying in healthcare. The bizarre blend of excitement and fear that I am feeling about ChatGPT was very nicely summed up by Dr Faust in this article. In his experiments with ChatGPT, the AI fabricated a causal relationship between the oral contraceptive pill and costochondritis, a common cause of chest pain. When asked for evidence, it then fabricated a publication, citing a credible journal and real authors, to justify its claim. The fabricated article was a perfect and convincing lie, and as good an illustration as I have seen that we are nowhere near a panacea yet. ChatGPT has highlighted the huge importance of expert review when AI is used to automate tasks previously done by humans. How human review and NLP are currently being combined in best practice will be covered in the next blog in this two-part series.
Potential for bias
With ChatGPT trained on “the internet”, it is inherently subject to the biases that pervade this data source. There is finally a huge push in healthcare towards health equity: the notion that everyone has a fair and just opportunity to attain their highest level of health. Unfortunately, the biases that exist in medical literature, the study of disease and the documentation of best practice permeate the internet, so when ChatGPT is asked medical questions, its answers reflect them. The most prominent example circulating on social media is ChatGPT’s response when asked to define a good scientist based on race and gender. The results are alarming. I must add that when I tried to repeat these tests, ChatGPT now responds that predicting a person’s likelihood of being a good scientist based on race and gender is unethical, which is a much better answer. I tested this a little further and asked ChatGPT to write a script that used gender and ethnicity as predictors of renal function. It gave the same refusal, calling the request unethical. In this instance, the filters are not helping. Ethnicity is an important risk factor in renal disease, and it is not unethical to consider it when designing disease progression models. It is, in fact, unethical to do the opposite!
ChatGPT has not been trained on real patient data, so it lacks real medical context. For the power of ChatGPT to be realized in healthcare, it will need to be trained on real healthcare data. This raises significant privacy concerns around the sharing and reuse of identifiable patient data. Appropriately de-identifying adequate volumes of free-text medical data to train these models is no small undertaking. And given the way that ChatGPT generates its responses, it is very important that no protected health information is ever presented back to end users.
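To give a flavor of why de-identifying free text is harder than it sounds, here is a minimal Python sketch. It is purely illustrative: the patterns, the labels and the example note are all made up for this post, and this is not how a production de-identification system works. It simply shows that pattern matching catches the easy identifiers and can miss others.

```python
import re

# Hypothetical, deliberately simplistic patterns for illustration only.
# Real protected health information takes many more forms than these.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# An invented note, not real patient data.
note = "Mr Smith (MRN: 483920) seen 03/02/2023, call 555-201-3344 with results."
print(redact(note))
# prints: Mr Smith ([MRN]) seen [DATE], call [PHONE] with results.
```

The record number, date and phone number are masked, but the patient’s surname passes straight through because no pattern describes it. Names, addresses and rare conditions mentioned in free text are exactly the kind of identifiers that simple rules like these can miss.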
As you can see, while ChatGPT has the potential to help clinicians work more efficiently, there is still a long way to go before this technology is a real game changer in healthcare.
In my next blog, I will look in more detail at using ChatGPT as an NLP engine, along with the limitations and considerations to bear in mind when deploying NLP and AI in healthcare.
In the meantime, if you’d like to learn more about NLP in healthcare, check out this introductory webinar or get in touch.