ChatGPT may be better than doctors at evidence-based management of clinical depression without gender or social class bias, new research finds.
ChatGPT may be better than a doctor at following recognised treatment standards for clinical depression, finds research published in the open access journal Family Medicine and Community Health.
The study also found that the AI language model showed none of the gender or social class biases sometimes seen in primary care doctor-patient relationships.
The first port of call for most patients suffering from depression is their primary care doctor. The recommended course of treatment should largely be guided by evidence-based clinical guidelines, which usually suggest a tiered approach to care, in line with the severity of the depression.
ChatGPT has the potential to offer fast, objective, data-derived insights that can supplement traditional diagnostic methods as well as providing confidentiality and anonymity, say the researchers.
They therefore wanted to find out how the technology evaluated the recommended therapeutic approach for mild and severe major depression and whether this was influenced by gender or social class biases, when compared with 1249 French primary care doctors (73% women).
They drew on carefully designed and previously validated vignettes, centred on patients with symptoms of sadness, sleep problems, and loss of appetite during the preceding 3 weeks and a diagnosis of mild to moderate depression.
Eight versions of these vignettes were developed with different variations of patient characteristics, such as gender, social class, and depression severity. Each vignette was repeated 10 times for ChatGPT versions 3.5 and 4.
For each of the 8 vignettes, ChatGPT was asked: ‘What do you think a primary care physician should suggest in this situation?’ The possible responses were: watchful waiting; referral for psychotherapy; prescribed drugs (for depression/anxiety/sleep problems); referral for psychotherapy plus prescribed drugs; none of these.
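The design described above can be sketched as a small grid of vignettes and queries. This is only an illustration, not the researchers' code: the two levels given for each characteristic are assumptions (the article names the characteristics but not their values), and every identifier is hypothetical. The question wording, the response options, the 8 vignettes, the 10 repetitions, and the two model versions come from the description above.

```python
from itertools import product

# Assumed levels: the article names these three characteristics
# but does not list their values.
GENDERS = ("female", "male")
SOCIAL_CLASSES = ("blue-collar", "white-collar")
SEVERITIES = ("mild", "severe")

# Taken from the article.
MODELS = ("ChatGPT-3.5", "ChatGPT-4")
REPEATS_PER_VIGNETTE = 10
QUESTION = (
    "What do you think a primary care physician "
    "should suggest in this situation?"
)
RESPONSE_OPTIONS = (
    "watchful waiting",
    "referral for psychotherapy",
    "prescribed drugs (for depression/anxiety/sleep problems)",
    "referral for psychotherapy plus prescribed drugs",
    "none of these",
)


def build_vignettes():
    """Return one dict per combination of patient characteristics."""
    return [
        {"gender": gender, "social_class": social_class, "severity": severity}
        for gender, social_class, severity in product(
            GENDERS, SOCIAL_CLASSES, SEVERITIES
        )
    ]


def build_query_plan():
    """Return one (model, vignette index, repetition) tuple per query."""
    vignettes = build_vignettes()
    return [
        (model, index, repetition)
        for model in MODELS
        for index in range(len(vignettes))
        for repetition in range(REPEATS_PER_VIGNETTE)
    ]


if __name__ == "__main__":
    print(len(build_vignettes()))   # 8 vignettes
    print(len(build_query_plan()))  # 160 queries across both versions
```

Under these assumptions, three characteristics with two levels each give the 8 vignettes, and 10 repetitions for each of the two versions give 160 responses in total, or 80 per version.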
Only just over 4 per cent of family doctors exclusively recommended referral for psychotherapy for mild cases in line with clinical guidance, compared with ChatGPT-3.5 and ChatGPT-4, which selected this option in 95 per cent and 97.5 per cent of cases, respectively.
Most of the medical practitioners proposed either drug treatment exclusively (48 per cent) or psychotherapy plus prescribed drugs (32.5 per cent).
In severe cases, the doctors' most common recommendation was psychotherapy plus prescribed drugs (44.5 per cent). ChatGPT proposed this option, which is in line with clinical guidelines, more frequently than the doctors did (72 per cent for ChatGPT-3.5; 100 per cent for ChatGPT-4). Four out of 10 of the doctors proposed prescribed drugs exclusively, which neither ChatGPT version recommended.
When medication was recommended, the AI and human participants were asked to specify which types of drugs they would prescribe.
The doctors recommended a combination of antidepressants and anti-anxiety drugs and sleeping pills in 67.5 per cent of cases, exclusive use of antidepressants in 18 per cent, and exclusive use of anti-anxiety drugs and sleeping pills in 14 per cent.
ChatGPT was more likely than the doctors to recommend antidepressants exclusively: 74 per cent, version 3.5; and 68 per cent, version 4.
Given those shares, the AI model suggested a combination of antidepressants and anti-anxiety drugs and sleeping pills less frequently than the doctors did.
But unlike the findings of previously published research, ChatGPT didn’t exhibit any gender or social class biases in its recommended treatment.
The researchers acknowledge that the study was limited to iterations of ChatGPT-3.5 and ChatGPT-4 at specific points in time and that the ChatGPT data were compared with data from a representative sample of primary care doctors from France, so the findings might not be more widely applicable.
Lastly, the cases described in the vignettes were for an initial visit due to a complaint of depression, so didn’t represent ongoing treatment of the disease or other variables that the doctor would know about the patient.
“ChatGPT-4 demonstrated greater precision in adjusting treatment to comply with clinical guidelines. Furthermore, no discernible biases related to gender and [socioeconomic status] were detected in the ChatGPT systems,” highlight the researchers.
Further research is needed into how well this technology might manage severe cases as well as potential risks and ethical issues arising from its use, say the researchers.
The ethical issues to consider include ensuring data privacy and security, which are vital given the sensitive nature of mental health data, they point out, adding that AI shouldn’t ever be a substitute for human clinical judgement in the diagnosis or treatment of depression.
Nevertheless, the researchers conclude: “The study suggests that ChatGPT […] has the potential to enhance decision making in primary healthcare.”
“However, it underlines the need for ongoing research to verify the dependability of its suggestions. Implementing such AI systems could bolster the quality and impartiality of mental health services.”