OpenAI’s ChatGPT outperformed physicians in diagnosing patients’ medical conditions in a small randomized clinical trial. The study was conducted between November 29 and December 29, 2023, at three academic medical centers in the U.S., and its results were published in November 2024 in the peer-reviewed journal JAMA Network Open.
The study sought to determine whether access to a large language model (LLM) could improve physicians’ diagnostic reasoning compared with conventional resources.
Fifty doctors were recruited to participate in the clinical trial—26 attending physicians and 24 resident physicians, all U.S.-trained and specializing in family medicine, internal medicine, and emergency medicine. The doctors were divided into two groups of 25 members. Each group was given 60 minutes to review up to 6 clinical vignettes or medical case reports. One group had access to generative AI chatbots, and the other had access to conventional online resources.
Although the study found no significant difference in diagnostic performance between doctors using chatbots and those using conventional resources, what “shocked” Dr. Adam Rodman of Beth Israel Deaconess Medical Center in Boston was that ChatGPT on its own scored an average of 90 percent in its medical diagnosis. The doctors with access to ChatGPT scored 76 percent, two percentage points higher than the group using conventional resources, at 74 percent. That puts ChatGPT alone 16 percentage points ahead of the conventional-resources group.
At first, the participants were not convinced by the AI chatbot’s diagnostic reasoning. “They didn’t listen to AI when AI told them things they didn’t agree with,” Dr. Rodman told the New York Times. He found this out by examining the data more closely, including the message logs between the doctors and ChatGPT.
The results indicate that further research of this kind could help the medical field take advantage of AI’s potential to improve clinical diagnosis. Diagnostic errors happen and can harm patients. Medical AI can be an effective tool to assist doctors because it is capable of human-like responses, complex problem solving, and clinical reasoning, and it can provide a detailed review of a patient’s medical history. At this stage, though, as the study suggests, it is best to keep humans involved rather than letting computers replace doctors.
The post ChatGPT Proves Superior to Doctors in Disease Diagnosis by 16% appeared first on eWEEK.