Generative AI Diagnosis in Neurology Enhances Non-Specialist Accuracy

By João L. Carapinha

June 5, 2026


Generative AI diagnosis is moving into neurology with notable results. In a new study, ChatGPT-4o reached 65.5% accuracy for the correct leading diagnosis in challenging polyneuropathy cases, statistically matching non-specialist neurologists (63%) while trailing specialists (74%). The model also outperformed non-specialists on two other measures: it included the correct diagnosis in its differential list more often, and it selected appropriate confirmatory tests at a higher rate.

AI Rivals Non-Specialists on Real Patient Data

Researchers analyzed 100 consecutive cases from two tertiary centers in Milan, converting them into standardized English summaries containing demographics, symptoms, exam findings, electrophysiology, and initial labs. Given a zero-shot chain-of-thought prompt, ChatGPT-4o produced one leading diagnosis, two differential diagnoses, and one recommended test for each case. The same information was given to 19 peripheral-nerve specialists and 17 non-specialists, and performance was measured against diagnoses confirmed after at least 12 months of follow-up.
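To make the scoring concrete, the sketch below shows one way the two headline measures could be computed: whether the leading diagnosis matches the confirmed one, and whether the confirmed diagnosis appears anywhere in the answer (leading or differential). The `CaseAnswer` structure, the `score` function, and the exact-string matching rule are assumptions made for illustration; the article does not describe how the study matched answers to confirmed diagnoses.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CaseAnswer:
    """One answer per case, mirroring the format described above (hypothetical structure)."""
    leading: str               # the single leading diagnosis
    differentials: List[str]   # the two differential diagnoses
    test: str                  # the one recommended confirmatory test


def score(answers: List[CaseAnswer], confirmed: List[str]) -> Tuple[float, float]:
    """Return (leading-diagnosis accuracy, inclusion rate).

    Assumption: a diagnosis counts as correct only on an exact label match.
    """
    n = len(confirmed)
    top1 = sum(a.leading == c for a, c in zip(answers, confirmed))
    included = sum(
        c == a.leading or c in a.differentials
        for a, c in zip(answers, confirmed)
    )
    return top1 / n, included / n
```

With this rule, an answer whose leading diagnosis is wrong still counts toward the inclusion rate if the confirmed diagnosis is among its two differentials, which is why the inclusion rate can never be lower than the leading-diagnosis accuracy.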

Where the Model Excels and Where It Stumbles

ChatGPT-4o was particularly strong at compiling broader and more accurate differential lists than non-specialists, and its test selection was 15 percentage points better. Its leading-diagnosis sensitivity reached 57.3% with 72.7% precision. Most errors stemmed from overlooking provided clinical details or over-relying on laboratory values, while hallucinations accounted for roughly one-third of mistakes. After seeing the AI output, non-specialists changed their initial assessments in 21.8% of cases, producing measurable gains in accuracy, sensitivity, and F1-score.
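For readers less familiar with the F1-score, it is the harmonic mean of precision and sensitivity (recall). The sketch below applies that standard formula to the two figures quoted above. The resulting value is an illustration only: the article does not state the model's F1, and if the study averaged these metrics across diagnostic categories, its reported F1 may differ from this calculation.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the text: 72.7% precision, 57.3% sensitivity.
f1 = f1_score(0.727, 0.573)
print(f"{f1:.3f}")  # prints 0.641
```

Because the harmonic mean is pulled toward the lower of the two inputs, the result sits closer to the 57.3% sensitivity than a simple average of the two figures (65.0%) would.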

Practical Value in Specialist-Scarce Settings

These findings suggest generative AI diagnosis could help reduce unnecessary testing and speed up referrals in primary or secondary care where neurologist access is limited. Because polyneuropathy already consumes substantial healthcare resources and is often diagnosed only after long delays, using this technology as a calibrated second opinion may shorten time-to-etiology and lower downstream costs. Specialist performance remained largely unchanged after reviewing the AI suggestions, which reinforces that the greatest benefit lies in supporting less-experienced clinicians rather than replacing expert judgment.

This head-to-head evaluation of generative AI diagnosis offers a clear roadmap for integrating large language models into neurologic workflows, provided future prospective trials confirm real-world impact on patient outcomes and resource use.

