Original research
Automated detection of large vessel occlusion using deep learning: a pivotal multicenter study and reader performance study
    1. Department of Neurology, Daejeon Eulji University Hospital, Daejeon, Daejeon, Korea
    2. Artificial Intelligence Research Center, JLK Inc, Seoul, Korea
    3. Department of Neurology, Chonnam National University Medical School, Gwangju, Korea
    4. Department of Radiology, Seoul National University Bundang Hospital, Seongnam, Gyeonggi-do, Korea
    Correspondence to Dr Wi-Sun Ryu; wisunryu{at}gmail.com; Professor Joon-Tae Kim; alldelight2{at}jnu.ac.kr

    Abstract

    Background To evaluate the stand-alone efficacy of artificial intelligence (AI) software for detecting large vessel occlusion (LVO) on CT angiography (CTA), and its effect on the diagnostic accuracy of early-career physicians.

    Methods This multicenter study included 595 ischemic stroke patients from January 2021 to September 2023. Standard references and LVO locations were determined by consensus among three experts. The efficacy of the AI software was benchmarked against standard references, and its impact on the diagnostic accuracy of four residents involved in stroke care was assessed. The area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity of the software and readers with versus without AI assistance were calculated.

    Results Among the 595 patients (mean age 68.5±13.4 years, 56% male), 275 (46.2%) had LVO. The median time interval from the last known well time to the CTA was 46.0 hours (IQR 11.8–64.4). For LVO detection, the software demonstrated a sensitivity of 0.858 (95% CI 0.811 to 0.897) and a specificity of 0.969 (95% CI 0.943 to 0.985). In subjects whose symptom onset to imaging was within 24 hours (n=195), the software exhibited an AUROC of 0.973 (95% CI 0.939 to 0.991), a sensitivity of 0.890 (95% CI 0.817 to 0.936), and a specificity of 0.965 (95% CI 0.902 to 0.991). Reading with AI assistance improved sensitivity by 4.0% (2.17 to 5.84%) and AUROC by 0.024 (0.015 to 0.033) (all P<0.001) compared with readings without AI assistance.

    Conclusions The AI software demonstrated a high detection rate for LVO. In addition, the software improved the diagnostic accuracy of early-career physicians in detecting LVO, which may streamline stroke workflow in the emergency room.

    • CT Angiography
    • Stroke
    • Thrombectomy

    Data availability statement

    The data that support the findings of this study are available upon reasonable request and in compliance with local and international ethical guidelines.

    This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

    WHAT IS ALREADY KNOWN ON THIS TOPIC

    • Advancements in stroke imaging have extended the endovascular treatment window for patients with large vessel occlusion (LVO), with CT angiography and artificial intelligence (AI) tools improving detection and treatment efficiency.

    WHAT THIS STUDY ADDS

    • We demonstrated the efficacy of the AI algorithm using multicenter datasets. Additionally, we showed that the AI algorithm improved the sensitivity of LVO detection among early-career physicians.

    HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

    • AI software may streamline stroke workflow by improving the diagnostic accuracy of early-career physicians.

    Introduction

    Advancements in stroke imaging and procedural devices have extended the window for endovascular therapy (EVT) in patients with large vessel occlusion (LVO).1 Recent randomized clinical trials have established a new standard of care for patients with LVO who arrive at the hospital within 6 to 24 hours of their last known well time.2 3 The triage process for these clinical trials primarily relies on magnetic resonance (MR) perfusion or CT perfusion to identify clinical or tissue mismatches.2 3 However, the majority of primary stroke facilities worldwide lack widespread access to these advanced imaging techniques.4 Recent studies have emphasized using more accessible imaging methods like CT angiography (CTA). The CT for Late Endovascular Reperfusion (CLEAR) trial demonstrated comparable clinical outcomes between patients selected using non-contrast CT with CTA and those selected with CT or MR perfusion.5 Furthermore, a sub-study conducted by the HERMES (Highly Effective Reperfusion Evaluated in Multiple Endovascular Stroke Trials) collaboration has expanded on this notion within the early time frame (0–6 hours), showing that the rates of favorable functional outcomes were similar between patients who underwent CT perfusion and those who did not.6

    Initially, 66% of EVT candidates were directed to centers incapable of performing EVT,7 despite having better chances of favorable outcomes at EVT-capable sites. Therefore, it is imperative for non-EVT-capable centers to reliably and promptly detect LVOs at all times—24 hours a day, 7 days a week—facilitating the rapid transfer of patients to EVT-capable facilities. However, the scarcity of vascular experts poses a challenge for many non-EVT-capable centers. Even in EVT-capable centers, an automated tool for screening CTA to detect the presence of LVOs could enhance operational efficiency, optimize staffing, and reduce the time from patient arrival to the initiation of the procedure by facilitating prompt LVO detection.

    This multicenter study aimed to demonstrate the efficacy of a fully automated, deep learning-based software (JLK-LVO, JLK Inc., Republic of Korea) in detecting LVO in patients with acute ischemic stroke. The algorithm’s performance was evaluated against a standard reference established through consensus among stroke experts. Furthermore, we examined whether the implementation of artificial intelligence (AI) software enhances the diagnostic accuracy of early-career physicians compared with their performance without AI assistance.

    Methods

    Study design and data source

    This study is reported in accordance with the STARD (Standards for Reporting of Diagnostic Accuracy Studies) reporting guidelines. From January 2021 to September 2023, we retrospectively included patients with ischemic stroke who were admitted to two university hospitals within 7 days of symptom onset and underwent CTA for evaluation of vessel status. Among 603 eligible patients, we excluded eight subjects with poor image quality (severe metallic artifacts, n=4; insufficient contrast filling, n=1; source image thickness >2 mm, n=3), leaving 595 patients for analysis (online supplemental figure 1). Stroke subtypes were determined using an MR-based stroke classification algorithm by the attending physician at each hospital, as previously described.8

    Sample size estimates

    The primary objective is to test whether the sensitivity and specificity of the software are comparable to prespecified criteria based on prior studies.9–13 For the sensitivity analysis, an LVO-positive sample size of 270 patients will have 90% power to detect, with a one-sided test at a 2.5% significance level, a prespecified lower bound of 0.7489. For the specificity analysis, an LVO-negative sample size of 320 patients will have 90% power to detect, with a one-sided test at a 2.5% significance level, a prespecified lower bound of 0.9155. This sample size accounts for a 2% dropout rate.
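
    The acceptance rule implied by these criteria can be sketched as follows: the lower limit of a two-sided 95% confidence interval (equivalent to a one-sided 2.5% test) must exceed the prespecified bound. This is a minimal sketch under stated assumptions. The Wilson score interval is used here only because it has a closed form; the article does not say which interval the authors used. The counts 236/275 and 310/320 are back-calculated from the sensitivity and specificity reported in the Results and are not taken from the article.

```python
import math


def wilson_interval(successes, total, z=1.959964):
    """Two-sided Wilson score interval for a binomial proportion."""
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half, center + half


def meets_criterion(successes, total, lower_bound):
    """True when the lower confidence limit exceeds the prespecified bound."""
    lower, _ = wilson_interval(successes, total)
    return lower > lower_bound


# Assumed counts, back-calculated from the reported proportions.
sensitivity_ok = meets_criterion(236, 275, 0.7489)  # LVO-positive cases
specificity_ok = meets_criterion(310, 320, 0.9155)  # LVO-negative cases
print(sensitivity_ok, specificity_ok)
```

    An exact (Clopper-Pearson) interval would give slightly different limits, so the bounds printed by this sketch should not be expected to match the published confidence intervals digit for digit.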

    Definition of large vessel occlusion

    In this study, LVO was defined as arterial occlusion involving the intracranial segment of the internal carotid artery (ICA), the M1 segment of the middle cerebral artery (MCA-M1), or the M2 segment of the MCA (MCA-M2). The intracranial ICA refers to the segment of the ICA extending from the petrous part to the bifurcation of the MCA and the anterior cerebral artery (ACA).14 The MCA-M1 segment is defined as the portion of the MCA from the MCA-ACA bifurcation to the MCA branching point. The MCA-M2 segment refers to the portion of the MCA ascending vertically along the Sylvian fissure from the MCA branching point.14 In our study, occlusions at the intracranial ICA or MCA-M1 were categorized as intracranial LVO, whereas occlusions at the MCA-M2 without proximal arterial involvement were designated as isolated MCA-M2 occlusions. In cases where the MCA divided early, we employed a functional rather than a traditional definition: the artery segment closest to its origin was designated as the M1 segment, and branches further downstream were classified as M2 segments.15 To ascertain the presence of LVO, two experienced vascular neurologists, each with at least 5 years of experience, reviewed the CTA source images, maximum intensity projection (MIP) images, and three-dimensional (3D) rendering images, in addition to patients’ MRI scans and symptom data. In instances of labeling disagreement, a third reviewer made the final decision.
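
    The grouping rule described above can be written as a short function. This is an illustrative sketch only: the segment labels ("ICA" for the intracranial ICA, "M1", "M2") and the function name are hypothetical and do not come from the study's software.

```python
def classify_occlusion(occluded_segments):
    """Map a set of occluded segments to the study's LVO categories.

    occluded_segments -- iterable of hypothetical labels: 'ICA', 'M1', 'M2'
    """
    segments = set(occluded_segments)
    # Any intracranial ICA or M1 involvement counts as intracranial LVO,
    # even when an M2 branch is occluded as well.
    if segments & {"ICA", "M1"}:
        return "intracranial LVO"
    # M2 occlusion without proximal involvement is kept as its own group.
    if "M2" in segments:
        return "isolated MCA-M2 occlusion"
    return "no LVO"


print(classify_occlusion(["M1", "M2"]))
print(classify_occlusion(["M2"]))
```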

    Deep learning-based software

    Source images of CTA with a slice thickness between 0.5 and 2 mm were processed using commercially available deep learning-based software (JLK-LVO, JLK Inc., online supplemental figure 2).16 17 In brief, an automated algorithm selects slices from source images to construct MIP images. The vessel segmentation involves a two-dimensional U-Net based on the Inception Module,18 which is trained to segment vessels in axial MIP images. Following this, an LVO detection algorithm combines the vessel masks into a compressed image, which is used to train an EfficientNetV2 model.19 The saliency map is visualized when the probability of LVO exceeds 10%. The side of the LVO is determined by the software based on the saliency map.

    Study design

    Stand-alone performance

    LVO probability scores predicted by the software were used to evaluate the standalone performance metrics of the software, including the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). To construct the confusion matrix, we set the threshold for LVO probability at 50%. We defined a true positive when the predicted side of the LVO from the AI software matched the expert consensus, except in cases of bilateral occlusion. If the diagnosis was correct but the side of the LVO was discordant between the AI software and the expert consensus, we designated the case as a false negative. For bilateral occlusions, a true positive was defined when the AI software-generated heatmap was present on both sides, with the smaller side being at least 50% of the larger side.
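
    The side-matching rules above can be expressed as a per-case decision function. The following is a minimal sketch, not the study's code. Assumptions: the 50% threshold is treated as inclusive, the "size" of each side of the heatmap is passed in as a hypothetical pair of numbers (the article does not specify how it was measured), and a bilateral case that fails the 50% rule is counted as a false negative.

```python
def classify_case(prob, predicted_side, reference_side,
                  heatmap_extent=None, threshold=0.5):
    """Return 'TP', 'FP', 'TN' or 'FN' for one CTA case.

    prob            -- software LVO probability score (0 to 1)
    predicted_side  -- 'left', 'right' or None (side taken from the saliency map)
    reference_side  -- 'left', 'right', 'bilateral' or None (expert consensus)
    heatmap_extent  -- hypothetical (left, right) heatmap sizes for bilateral cases
    """
    predicted_positive = prob >= threshold  # assumption: inclusive threshold

    if reference_side is None:
        return "FP" if predicted_positive else "TN"
    if not predicted_positive:
        return "FN"
    if reference_side == "bilateral":
        left, right = heatmap_extent or (0.0, 0.0)
        larger, smaller = max(left, right), min(left, right)
        both_sides = smaller > 0 and smaller >= 0.5 * larger
        # Assumption: a bilateral case failing the rule is counted as a miss.
        return "TP" if both_sides else "FN"
    # Unilateral LVO: a correct diagnosis on the wrong side is a false negative.
    return "TP" if predicted_side == reference_side else "FN"


print(classify_case(0.9, "right", "left"))
```

    Counting a wrong-side detection as a false negative makes the reported sensitivity conservative, because a case only counts as detected when the occlusion is also localized to the correct side.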

    Reader assessment study

    A retrospective, crossover-designed trial was conducted to assess the efficacy of software in aiding diagnostic decisions for detecting LVO in CTA. The multi-reader study involved four residents specializing in the care of ischemic stroke from the fields of radiology, neurology, neurosurgery, and emergency medicine. Before the evaluation, the reviewers were divided into groups A and B. Group A was provided with original CTA images, including MIP and 3D rendering images, along with the AI software’s results (with AI assistance; online supplemental figures 1 and 3). Conversely, group B received only the original CTA images, including MIP and 3D rendering images (without AI assistance). The software presented segmented and merged vessel images with a heatmap and an LVO probability score. A custom image viewer (MEDIHUB Stroke, JLK Inc., stroke.medihub.ai) was utilized to visualize the CTA and the software’s output. The reviewers were blinded to the standard reference confirmed by expert consensus. A second assessment was conducted after a washout period of 4–5 weeks,20 with study subjects shuffled and randomly allocated new study numbers. In this second assessment, group A reassessed the CTA images without the aid of the software’s findings (without AI assistance), while group B reassessed the CTA images with the software’s results (with AI assistance). Readers determined the presence and the side of LVO and assigned a confidence score for the presence of LVO using a 5-point scale.

    Statistical analysis

    Using the t-test or rank sum test for continuous variables, and the χ2 test for categorical variables as appropriate, we compared baseline characteristics stratified by the presence of LVO. To evaluate the accuracy of the software in diagnosing LVO, we computed the AUROC, as well as sensitivity, specificity, PPV, and NPV. A 1000-repeat bootstrap analysis was employed to calculate the 95% confidence intervals (95% CI) for all parameters. The standard error of the AUROC was computed using the DeLong method.21 The cut-off for the LVO probability score used in the analysis was set at 0.5. We conducted an additional analysis to determine the optimal threshold that would yield the maximum Youden index (sensitivity+specificity−1). Given that the software is primarily intended for screening LVO, we also computed specificity, PPV, and NPV at the fixed sensitivity level of 0.90.
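
    The two threshold analyses described above (maximum Youden index and specificity at a fixed sensitivity) can be sketched in a few lines. The scores and labels below are toy values invented for illustration, not study data, and the function names are hypothetical. A score at or above the threshold is treated as a positive call, which is an assumption.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_youden_threshold(scores, labels):
    """Return (threshold, Youden index) maximizing sensitivity + specificity - 1."""
    best = None
    for t in sorted(set(scores)):
        se, sp = sens_spec(scores, labels, t)
        j = se + sp - 1
        if best is None or j > best[1]:
            best = (t, j)
    return best


def specificity_at_sensitivity(scores, labels, target=0.90):
    """Highest threshold whose sensitivity reaches the target, with its specificity."""
    for t in sorted(set(scores), reverse=True):
        se, sp = sens_spec(scores, labels, t)
        if se >= target:
            return t, sp
    return None


# Toy data (hypothetical): 1 = LVO per reference standard, 0 = no LVO.
scores = [0.02, 0.05, 0.10, 0.20, 0.36, 0.40, 0.60, 0.80, 0.95]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]
print(best_youden_threshold(scores, labels))
print(specificity_at_sensitivity(scores, labels))
```

    Scanning only the observed score values is sufficient, because sensitivity and specificity change only when the threshold crosses one of them.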

    For subgroup analysis, we repeated the analysis separately for isolated MCA-M2 occlusion and intracranial LVO, with patients without LVO serving as the control group for both subgroups. Considering the EVT time window, we also repeated the analysis after excluding patients whose onset to CTA time exceeded 24 hours. Finally, we repeated the analysis after stratifying subjects by stroke subtype.

    The Obuchowski-Rockette method was used for analyzing multireader multicase (MRMC) studies, along with the MRMCaov library,22 for all analyses of improvement of diagnostic performance with versus without AI assistance. This method tested the null hypothesis that the average AUROC of the readers without AI assistance was equal to that with AI assistance. The Obuchowski-Rockette method accounts for the fact that, in an MRMC study, the same cases are evaluated by each reader. Consequently, the error terms are assumed to be equi-covariant among readers and cases, rather than independent. We also calculated sensitivity, specificity, accuracy, and their differences between with and without AI assistance. A P value <0.05 was considered statistically significant. The analyses were conducted using R version 4.2.3, STATA software (version 16.0, College Station, TX) and MedCalc (version 17.2, MedCalc Software, Ostend, Belgium, 2017).

    Results

    Patient characteristics

    Among 595 patients, 275 (46.2%) were diagnosed with LVO. Specifically, 213 patients had intracranial LVO, and 62 patients exhibited isolated MCA-M2 occlusion. Details regarding the occlusion sites are presented in online supplemental table 1. The mean±SD age was 68.5±13.4 years and 332 (56%) were male. The median time interval from the last known well time to CTA was 46 hours (IQR 11.8–64.4). Patients with LVO were older, had shorter time intervals between the onset of symptoms and imaging, and presented with more severe strokes compared with those without LVO (table 1).

    Table 1

    Baseline characteristics stratified by the presence of large vessel occlusion

    Stand-alone performance of AI software

    The software achieved an AUROC of 0.961 (95% CI 0.945 to 0.976) (online supplemental figure 4A). At a cut-off point of 0.50, the sensitivity, specificity, PPV, and NPV were 0.858 (95% CI 0.811 to 0.897), 0.969 (95% CI 0.943 to 0.985), 0.959 (95% CI 0.927 to 0.980), and 0.888 (95% CI 0.850 to 0.919), respectively (table 2). The highest Youden index was observed at the optimal cut-off point of 0.362, yielding sensitivity, specificity, PPV, and NPV values of 0.880 (95% CI 0.836 to 0.916), 0.953 (95% CI 0.924 to 0.974), 0.942 (95% CI 0.906 to 0.967), and 0.902 (95% CI 0.866 to 0.932), respectively. At a given sensitivity of 0.90, the specificity was 0.916 (95% CI 0.825 to 0.968).
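
    The four rates reported at the 0.50 cut-off are consistent with a single 2×2 table, which the following worked example makes explicit. The cell counts are back-calculated from the published proportions (275 patients with LVO, 320 without) and are an assumption, not figures given in the article.

```python
# Assumed 2x2 counts at the 0.50 cut-off, back-calculated from reported rates.
tp, fn = 236, 39   # patients with LVO: detected / missed (236 + 39 = 275)
tn, fp = 310, 10   # patients without LVO: correctly cleared / flagged (310 + 10 = 320)

sensitivity = tp / (tp + fn)   # reported as 0.858
specificity = tn / (tn + fp)   # reported as 0.969
ppv = tp / (tp + fp)           # reported as 0.959
npv = tn / (tn + fn)           # reported as 0.888

for name, value in [("sensitivity", sensitivity), ("specificity", specificity),
                    ("PPV", ppv), ("NPV", npv)]:
    print(name, value)
```

    PPV and NPV depend on the prevalence of LVO in the sample (46.2% here), so they would differ in a population where LVO is less common even if sensitivity and specificity were unchanged.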

    Table 2

    Diagnostic performance of software detecting large vessel occlusion

    When limiting the analysis to intracranial LVO, the AUROC was 0.961 (95% CI 0.940 to 0.975) (online supplemental figure 4B). The sensitivity and specificity at the cut-off point of 0.5 were 0.873 (95% CI 0.821 to 0.915) and 0.969 (95% CI 0.943 to 0.985), respectively (online supplemental table 2). Restricting the analysis to isolated MCA-M2 occlusion, the software demonstrated an AUROC of 0.928 (95% CI 0.895 to 0.961) (online supplemental figure 4C), with a sensitivity of 0.692 (95% CI 0.578 to 0.792) and a specificity of 0.960 (95% CI 0.932 to 0.978).

    Considering the EVT time window, the analysis was restricted to those whose symptom onset to imaging was within 24 hours (n=195), of which 109 (55.9%) demonstrated LVO. The software exhibited an AUROC of 0.973 (95% CI 0.939 to 0.991) (online supplemental figure 5). With the cut-off point set at 0.5, the sensitivity and specificity were 0.890 (95% CI 0.817 to 0.936) and 0.965 (95% CI 0.902 to 0.991), respectively.

    After excluding 14 patients (2.4%) who lacked information on stroke subtype, 70 patients (11.8%) were classified as having large artery atherosclerosis, 161 (27.1%) as having cardioembolism, 205 (34.5%) as being undetermined, 139 (23.4%) as having small vessel occlusion, and six (1.0%) as having other-determined strokes. In the large artery atherosclerosis group, where 42 (60%) had LVO, the software achieved an AUROC of 0.953 (95% CI 0.874 to 0.989), with a sensitivity of 0.833 (95% CI 0.686 to 0.930) and a specificity of 1.000 (95% CI 0.877 to 1.000) at a threshold of 0.5 (online supplemental figure 6). In the cardioembolism group, where 110 (68.3%) had LVO, the software achieved an AUROC of 0.965 (95% CI 0.924 to 0.988), with a sensitivity of 0.909 (95% CI 0.839 to 0.956) and a specificity of 0.980 (95% CI 0.896 to 1.000) at a threshold of 0.5. In the undetermined stroke group, where 108 (52.7%) had LVO, the software exhibited an AUROC of 0.954 (95% CI 0.916 to 0.978), with a sensitivity of 0.815 (95% CI 0.729 to 0.883) and a specificity of 0.969 (95% CI 0.912 to 0.994).

    Reader assessment study

    For all readers, the sensitivities of reading with AI assistance versus without were 91.82% (95% CI 90.04% to 93.37%) and 87.81% (95% CI 85.73% to 89.68%), respectively, with a mean difference of 4.00% (95% CI 2.17% to 5.84%, P<0.001) (table 3). The specificities were 95.70% and 96.30%, respectively, with no statistically significant difference. Reading with AI assistance yielded higher accuracy, with a mean difference of 0.76% (95% CI 0.01% to 1.50%, P=0.049). The average AUROC with AI assistance (0.967, 95% CI 0.956 to 0.983) was significantly higher than that without AI assistance (0.944, 95% CI 0.906 to 0.977, P<0.001), a difference of 0.024 (95% CI 0.015 to 0.033) (figure 1).

    Figure 1

    Diagnostic performance of readers without versus with artificial intelligence (AI) assistance. Average reader receiver operating characteristic curves for detecting large vessel occlusion under two reading conditions: with and without AI assistance. Average area under the receiver operating characteristic curve (AUROC) was computed across four readers participating in the study using the Obuchowski-Rockette method, which accounts for the multireader multicase study design.

    Table 3

    Performance of readers with versus without artificial intelligence assistance

    Discussion

    In this multicenter study, fully automated AI software for detecting LVO achieved a sensitivity of 86% and a specificity of 97%. For isolated MCA-M2 occlusion, the AI software attained a sensitivity of 69% and a specificity of 96%. Additionally, the reader assessment study demonstrated that AI assistance significantly improved the sensitivity of LVO detection among early-career physicians in stroke care. To the best of our knowledge, this is the first study to demonstrate both the stand-alone efficacy and the enhancement in reader performance using AI assistance across multicenter datasets in detecting LVO on CTA.

    A few AI software packages have been implemented in clinical practice. RAPID LVO (iSchemaView, Menlo Park, CA), which primarily relies on vessel density threshold assessment, showed high sensitivity in a pooled cohort from two stroke trials, with a sensitivity of 95% and a specificity of 79%.23 However, RAPID LVO requires different threshold settings based on the occlusion site (ICA, MCA-M1, or MCA-M2), which may pose challenges for early-career physicians in interpreting the results. Furthermore, recent studies have demonstrated a lower PPV for RAPID LVO, which may contribute to alarm desensitization, leading to missed alarms or delayed responses.24 Viz LVO (Viz.AI, San Francisco, CA) and CINA LVO (Avicenna.ai, La Ciotat, France), which use end-to-end deep learning algorithms to detect LVO, showed high sensitivity and specificity in recent studies, with sensitivity ranging from 76% to 94%.11 13 25 In the present study, as a stand-alone tool, the software achieved a sensitivity of 87% and a specificity of 97% in detecting intracranial LVO, which is largely comparable or superior to other software packages.

    Recent studies have begun to shed light on the potential benefits of EVT for patients with MCA-M2 segment occlusions, broadening the traditional focus from proximal LVO to include more distal vessels. A meta-analysis revealed that patients with MCA-M2 occlusions treated with EVT showed improved functional outcomes at 90 days compared with those receiving standard medical therapy alone.26 Nevertheless, detecting isolated MCA-M2 occlusions remains challenging, even for experienced clinicians. In a study involving 520 patients with ischemic stroke, of whom 16% had LVO and 40 patients had isolated MCA-M2 occlusion, experienced neuroradiologists missed 26% of MCA-M2 occlusions at initial CTA evaluation.27 Additionally, the study highlighted that non-neuroradiologists had a fivefold higher risk of missing LVO in a multivariable analysis.27 In our study, the software achieved a sensitivity of 69% in detecting isolated MCA-M2 occlusion, which is higher than that of other automated LVO detection software packages in patients with isolated MCA-M2 occlusion.11 13 23–25 This disparity may have resulted from the inclusion of isolated MCA-M2 occlusion as LVO in the training dataset for the software.

    The reader assessment study of our research highlights another critical aspect of AI in stroke care: the improvement of diagnostic accuracy in early-career physicians. The observed improvement in sensitivity among early-career physicians when using AI assistance not only validates AI’s role as a diagnostic aid but also addresses a significant challenge in stroke care—the scarcity of vascular experts, especially in areas with limited resources. By offering a high degree of sensitivity and specificity in LVO detection, the software can democratize access to high-quality stroke diagnostics, ensuring more patients are correctly identified for EVT, regardless of their geographical location or the immediate availability of stroke specialists.

    Our study is subject to limitations inherent in its retrospective design and the potential for selection bias. Additionally, the real-world efficacy of the software may vary due to differences in imaging equipment, protocols, and patient demographics across healthcare settings. Furthermore, the prevalence of LVO associated with intracranial arterial stenosis is notably higher in Asian populations,28 potentially introducing complexity into our dataset for deep learning analysis. This complexity arises because LVOs associated with intracranial arterial stenosis typically exhibit an increased number of collateral vessels29 and present with less distinct LVO characteristics when contrasted with LVOs resulting from cardioembolic occlusions. These factors underscore the need for prospective, multicenter studies to further validate our findings and explore the integration of the software into diverse clinical workflows. Despite the lack of clinical evidence supporting EVT in basilar artery occlusion, the next version of the software should address EVT-amenable arterial occlusions in the posterior circulation, given the severity of these cases.

    In conclusion, this multicenter study supports the clinical efficacy of the software for detecting LVO in patients with acute ischemic stroke. Furthermore, we demonstrated that AI-assisted reading significantly increases the sensitivity of LVO detection among early-career physicians. By facilitating the early identification of patients eligible for EVT, the software has the potential to improve the clinical outcomes of patients with stroke.

    Ethics statements

    Patient consent for publication

    Ethics approval

    For this diagnostic study, the recruitment, use, analysis, and prospective testing of radiographic images were approved by the institutional review board of Chonnam National University Hospital (CNUH-2023-311) and Daejeon Eulji University Hospital (2023-08-007).

    Supplementary materials

    • Supplementary Data

      This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

    Footnotes

    • JGK, SYH and Y-RK contributed equally.

    • Contributors J-TK and W-SR had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. JGK, SYH and Y-RK contributed equally to this manuscript. Concept and design: W-SR, DK, J-TK. Acquisition, analysis, or interpretation of data: W-SR, J-TK, JGK, Y-RK. Drafting of the manuscript: W-SR, JGK, SYH, Y-RK, ML, LS, J-TK. Critical review of the manuscript for important intellectual content: W-SR, LS, J-TK, ML. Statistical analysis: SYH, W-SR. Obtained funding: DK. Administrative, technical, or material support: HH, DK, ML. Guarantor: J-TK.

    • Funding This study was supported by the Multiministry Grant for Medical Device Development (KMDF_PR_20200901_0098), funded by the Korean government and JLK Inc.

    • Disclaimer The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

    • Competing interests SYH, HH, DK, ML and W-SR are employees of JLK Inc, Seoul, Republic of Korea.

    • Provenance and peer review Not commissioned; externally peer reviewed.

    • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.