by Melissa Rochon, Judith Tanner, James Jurkiewicz, Jacqueline Beckhelling, Akuha Aondoakaa, Keith Wilson, Luxmi Dhoonmoon, Max Underwood, Lara Mason, Roy Harris, Karen Cariaga
Introduction: Surgical patients frequently experience post-operative complications at home. Digital remote monitoring of surgical wounds via image-based systems has emerged as a promising solution for early detection and intervention. However, the increased clinician workload from reviewing patient-submitted images presents a challenge. This study utilises artificial intelligence (AI) to prioritise surgical wound images for clinician review, aiming to efficiently manage workload.
Methods and analysis: The study ran from September 2023 to March 2024. Its phases comprised compiling a training dataset of 37,974 images, creating a testing set of 3,634 images, developing an AI algorithm using ‘You Only Look Once’ (YOLO) models, and prospectively testing the algorithm against clinical nurse specialists’ evaluations. The primary objective was to validate the AI’s sensitivity in prioritising wound reviews, alongside assessing intra-rater reliability. Secondary objectives focused on specificity, positive predictive value (PPV), and negative predictive value (NPV) for various wound features.
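For readers less familiar with these outcome measures, the following minimal sketch shows how sensitivity, specificity, PPV, and NPV are conventionally derived from a 2x2 comparison of AI output against a reference rating. The class name, function name, and counts are hypothetical and chosen purely for illustration; they are not the study's data or code.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from comparing AI output with the reference rating."""
    tp: int  # AI flagged, reference flagged
    fp: int  # AI flagged, reference did not
    tn: int  # AI did not flag, reference did not
    fn: int  # AI did not flag, reference flagged


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the four standard diagnostic accuracy measures."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # flagged cases correctly caught
        "specificity": c.tn / (c.tn + c.fp),  # non-flagged cases correctly passed
        "ppv": c.tp / (c.tp + c.fp),          # AI flags that were correct
        "npv": c.tn / (c.tn + c.fn),          # AI non-flags that were correct
    }


# Hypothetical counts for illustration only (not taken from the study).
counts = ConfusionCounts(tp=89, fp=30, tn=270, fn=11)
metrics = diagnostic_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the classifier itself, whereas PPV and NPV also depend on how common the feature is in the tested images, which is one reason the latter can differ between subgroups.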
Results: The AI demonstrated a sensitivity of 89%, exceeding the target of 85% and proving effective in identifying cases requiring priority review. Intra-rater reliability was perfect, achieving 100% consistency in repeated assessments. Detection of wound characteristics varied across skin tones; sensitivity was notably lower for incisional separation and discolouration in darker skin tones. Specificity remained high overall, with some results favouring darker skin tones. The NPV was broadly similar for light and dark skin tones, although slightly higher for dark skin tones at 95% (95% CI: 93%-97%) compared with 91% (95% CI: 87%-92%) for light skin tones. Both PPV and NPV varied, especially in identifying sutures or staples, indicating areas needing further refinement to ensure equitable accuracy.
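The abstract reports 95% confidence intervals but does not state how they were calculated. As a sketch of one common approach for a proportion such as NPV, the Wilson score interval can be computed as below. The choice of the Wilson method, the function name, and the example counts are all assumptions made for illustration, not details taken from the study.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts for illustration only (not taken from the study).
low, high = wilson_interval(successes=50, total=100)
print(f"proportion 0.50, 95% CI: {low:.3f} to {high:.3f}")
```

The interval narrows as the number of images in a subgroup grows, so intervals for smaller subgroups are expected to be wider even when the underlying performance is the same.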
Conclusion: The AI algorithm not only met but surpassed the expected sensitivity for identifying priority cases, showing high reliability. Nonetheless, the disparities in performance across skin tones, especially in recognising certain wound characteristics like discolouration or incisional separation, underline the need for ongoing training and adaptation of the AI to ensure fairness and effectiveness across diverse patient groups.