Health Technologies

Study exposes the dangers for patients using AI-driven search engines

Patients shouldn’t rely on AI-powered search engines and chatbots for accurate and safe information on drugs, researchers have warned.

In a new study published by the BMJ, researchers found a considerable number of answers on drug treatments were wrong or potentially harmful.

Furthermore, the complexity of the answers provided might make it difficult for patients to fully understand them without degree-level education, the researchers add.

In February 2023, search engines underwent a significant shift thanks to the introduction of AI-powered chatbots, offering the promise of enhanced search results, comprehensive answers, and a new type of interactive experience, explain the researchers.

While these chatbots can be trained on extensive datasets from the entire internet, enabling them to converse on any topic, including healthcare-related queries, they are also capable of generating disinformation and nonsensical or harmful content, they add.

Previous studies looking at the implications of these chatbots have primarily focused on the perspective of healthcare professionals rather than that of patients.

To address this, the researchers explored the readability, completeness, and accuracy of chatbot answers for queries on the top 50 most frequently prescribed drugs in the US in 2020, using Bing Copilot, a search engine with AI-powered chatbot features.

To simulate patients consulting chatbots for drug information, the researchers reviewed research databases and consulted with a clinical pharmacist and doctors with expertise in pharmacology to identify the medication questions that patients most frequently ask their healthcare professionals.

The chatbot was asked 10 questions for each of the 50 drugs, generating 500 answers in total. The questions covered what the drug was used for, how it worked, instructions for use, common side effects, and contraindications.

Readability of the answers provided by the chatbot was assessed by calculating the Flesch Reading Ease Score which estimates the educational level required to understand a particular text.

Text that scores between 0 and 30 is considered very difficult to read, necessitating degree-level education. At the other end of the scale, a score of 91–100 means the text is very easy to read and appropriate for 11-year-olds.
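For readers curious how such a readability score is computed, the Flesch Reading Ease formula depends only on average sentence length and average syllables per word. The sketch below is an illustrative implementation, not the tooling the researchers used; the syllable counter is a crude vowel-group heuristic, so scores will differ slightly from professional readability software.

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count groups of consecutive vowels,
    # subtracting one for a trailing silent 'e'.
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Flesch Reading Ease: higher scores mean easier text.
    # 0-30 is "very difficult"; 91-100 is "very easy".
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))
```

Short, plain sentences score high on this scale, while long sentences full of polysyllabic clinical terms (such as "contraindications") quickly drive the score down toward, or below, the "very difficult" band the study's answers fell into.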

To assess the completeness and accuracy of chatbot answers, responses were compared with the drug information provided by drugs.com, a peer-reviewed and up-to-date drug information website for both healthcare professionals and patients.

Current scientific consensus, and likelihood and extent of possible harm if the patient followed the chatbot’s recommendations, were assessed by seven experts in medication safety, using a subset of 20 chatbot answers displaying low accuracy or completeness, or a potential risk to patient safety.

The Agency for Healthcare Research and Quality (AHRQ) harm scales were used to rate patient safety events and the likelihood of possible harm was estimated by the experts in accordance with a validated framework.

The overall average Flesch Reading Ease Score was just over 37, indicating that degree-level education would be required of the reader. Even the most readable chatbot answers still required a high (secondary) school education level.

Overall, the highest average completeness of chatbot answers was 100 per cent, while the mean across all questions was 77 per cent. Five of the 10 questions were answered with the highest completeness, while question 3 (What do I have to consider when taking the drug?) was answered with the lowest average completeness of only 23 per cent.

Chatbot statements didn’t match the reference data in 126 of 484 (26 per cent) answers, and were fully inconsistent in 16 of 484 (just over three per cent).

Evaluation of the subset of 20 answers revealed that only 54 per cent were rated as aligning with scientific consensus. And 39 per cent contradicted the scientific consensus, while there was no established scientific consensus for the remaining six per cent.

Possible harm resulting from a patient following the chatbot’s advice was rated as highly likely in three per cent and moderately likely in 29 per cent of these answers. And a third (34 per cent) were judged as either unlikely or not at all likely to result in harm, if followed.

But irrespective of the likelihood of possible harm, 42 per cent of these chatbot answers were considered to lead to moderate or mild harm, and 22 per cent to death or severe harm. Around a third (36 per cent) were considered to lead to no harm.

The researchers acknowledge that their study didn’t draw on real patient experiences and that prompts in different languages or from different countries may affect the quality of chatbot answers.

“In this cross-sectional study, we observed that search engines with an AI-powered chatbot produced overall complete and accurate answers to patient questions,” they write.

“However, chatbot answers were largely difficult to read and answers repeatedly lacked information or showed inaccuracies, possibly threatening patient and medication safety,” they add.

A major drawback was the chatbot’s inability to understand the underlying intent of a patient question, they suggest.

“Despite their potential, it is still crucial for patients to consult their healthcare professionals, as chatbots may not always generate error-free information.

“Caution is advised in recommending AI-powered search engines until citation engines with higher accuracy rates are available,” they conclude.
