Article Text
Abstract
Background Artificial intelligence (AI) software is increasingly applied in stroke diagnostics. However, the actual performance of AI tools for identifying large vessel occlusion (LVO) stroke in real time in a real-world setting has not been fully studied.
Objective To determine the accuracy of AI software in a real-world, three-tiered multihospital stroke network.
Methods All consecutive head and neck CT angiography (CTA) scans performed during stroke codes and run through an AI software engine (Viz LVO) between May 2019 and October 2020 were prospectively collected. CTA readings by radiologists served as the clinical reference standard test and Viz LVO output served as the index test. Accuracy metrics were calculated.
Results Of a total of 1822 CTAs performed, 190 occlusions were identified, 142 of which were internal carotid artery terminus (ICA-T), middle cerebral artery M1, or M2 locations. Accuracy metrics were analyzed for two groups: ICA-T/M1 and ICA-T/M1/M2. For the ICA-T/M1 versus the ICA-T/M1/M2 group, sensitivity was 93.8% vs 74.6%, specificity was 91.1% vs 91.1%, negative predictive value was 99.7% vs 97.6%, accuracy was 91.2% vs 89.8%, and area under the curve was 0.95 vs 0.86, respectively. Detection rates for ICA-T, M1, and M2 occlusions were 100%, 93%, and 49%, respectively. As expected, the algorithm offered better detection rates for proximal than for mid/distal M2 occlusions (58% vs 28%, p=0.03).
Conclusions These accuracy metrics support Viz LVO as a useful adjunct tool in stroke diagnostics. Fast and accurate diagnosis with high negative predictive value mitigates missing potentially salvageable patients.
- brain
- CT angiography
- device
- stroke
- technology
Data availability statement
Data are available upon reasonable request. N/A.
Introduction
Treatment for large vessel occlusion (LVO) stroke has significantly improved in recent years with the advent of endovascular therapy.1 This has fostered changes in stroke systems of care to offer timely access to both intravenous thrombolysis and endovascular therapy. The triage of LVO stroke requires rapid completion, interpretation, and communication of neuroimaging. Historically, interpretation and communication required prioritized attention of a radiologist, and any unavailability or alternate priority could potentially lead to treatment delays.2 In addition, there could be multiple downstream delays in management, since the care of an acute stroke entails a multidisciplinary approach requiring multiple team notifications. Consequently, timely triage of LVO stroke has been challenging.
In recent years, artificial intelligence (AI)-driven software has emerged to streamline acute stroke triage by enabling faster diagnosis and treatment time.3–5 Limited literature to date has shown impressive accuracy of AI-based algorithms in detecting LVO stroke on CT angiography (CTA) scans in retrospective cohorts.6–8 Initial real-world application of commercially available AI software has led to faster transfer times between primary and comprehensive stroke centers and more timely and predictable notification of neuroendovascular teams.9 10
Prior accuracy studies have been limited to comprehensive stroke centers and by scans that were retrospectively analyzed. The diagnostic accuracy of such tools in real time in a real-world hospital network with different levels of stroke center classifications for stroke remains unknown. In this study of diagnostic test accuracy, we aimed to define the diagnostic accuracy of the Viz LVO AI triage software in a series of prospectively collected patients screened for LVO stroke in a large, tiered healthcare network.
Methods
Study design and patient sample
Institutional review board approval with waiver of consent was obtained for prospective collection of data for patients screened for LVO stroke in all hospitals with Viz LVO deployed, in a large, seven-hospital healthcare network that includes primary, thrombectomy capable, and comprehensive stroke centers. Viz LVO was used in the real-world setting, in real time, to triage patients for LVO stroke. At all stroke centers in the network, all stroke code CT and CTA scans are immediately transmitted to the Viz.ai cloud for analysis. The stroke team at the respective hospital is alerted for any stroke code. The stroke and neurointerventional teams on call both receive an alert when an LVO is detected by Viz LVO. The neurointerventional team may also be alerted to potential LVO stroke based on clinical suspicion. In this study, the accuracy of the software was analyzed.
The study population was a consecutive series of patients evaluated under stroke codes who underwent CTA for which a Viz LVO reading was obtained. For the purposes of this study, compliant with the Standards for Reporting Diagnostic Accuracy (STARD), Viz LVO served as the index test, while the CTA scan read by neuroradiologists and emergency radiologists was considered the clinical reference standard (‘gold standard’) test with which we compared Viz LVO.11 Catheter angiography was deemed impractical as the clinical reference standard test, since only a minority of cases underwent angiography, typically for mechanical thrombectomy of an LVO observed on CTA. No data were missing from our initial dataset. No extra risk was imposed on the study subjects by using the index test, because it involves only electronic analysis of already acquired images. Radiology interpreters were blinded to the Viz LVO readings in all cases.
AI software
Viz LVO is an AI-driven software, cleared by the Food and Drug Administration (FDA) in early 2018 to perform computer-assisted triage of patients with suspected LVO. Using a convolutional neural network algorithm, the software is built to detect LVOs on CTA scans and to automatically alert the stroke treatment team. Viz LVO delivers CTA and CT perfusion scans to a mobile application, where the notified personnel can view them with real-time 3D manipulation. In instances where a vessel’s endpoint is shorter than the threshold or where partial occlusion is suspected, an LVO alert is sent.7 Viz LVO does not provide the reader with the precise location of the occlusion; instead, it offers only a positive or negative LVO notification.
Based on CTA scans, we used surgical anatomic definitions to specify the exact occlusion location. Therefore, M1 was defined as the horizontal part of the middle cerebral artery (MCA) and M2 was defined as the sylvian part of the MCA. The internal carotid artery (ICA) terminus (ICA-T) was defined as the supraclinoid segment of the ICA. The CTA impressions were mined from the final radiologist reports, from which the reference determination of occlusion was made. All impressions that were positive or indeterminate for occlusion were reconciled by an adjudicating reviewer to verify the final location. The adjudicating reviewer was either a fully trained endovascular neurosurgeon (TS) or a neuroradiologist (GL). The index test had no indeterminate results, as Viz LVO offers binary characterization of either positive or negative for LVO.
Viz LVO is trained to detect LVOs of the intracranial circulation in the supraclinoid ICA (ophthalmic, choroidal, and communicating segments) and the M1 (horizontal part) of the MCA but does not assess posterior circulation, infraclinoid ICA, or extracranial carotid occlusions. All occlusions in locations other than the ICA-T, M1, or M2 were excluded from our analysis. A single case with prior M1 stent placement and possible M1 occlusion was also excluded. Finally, CTA scans that were non-diagnostic for any reason, including severely degraded examinations owing to motion or poor contrast injection, were also excluded. This methodology is consistent with similar previous studies.7 12 13 Because Viz LVO implements iterative updates to the application’s algorithm at regular intervals (eg, every 2–3 months), our initial Viz LVO dataset readings were produced by a mix of progressively updated algorithms.
Statistical analysis
Descriptive statistics were reported for the baseline characteristics of our study population, including means and SD for continuous variables and frequencies with percentages for categorical variables. All univariate analyses were conducted using a χ2 test of independence.
The calculated accuracy metrics included sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio, negative likelihood ratio, overall accuracy, and area under the receiver operating characteristic curve.
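All of the metrics listed above follow from a standard 2×2 confusion table. The sketch below is illustrative only: the counts (TP=76, FP=145, FN=5, TN=1487) are approximations back-calculated from the reported ICA-T/M1 results, since the exact table is not stated in the text.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy metrics from a 2x2 confusion table."""
    sens = tp / (tp + fn)  # sensitivity (true positive rate)
    spec = tn / (tn + fp)  # specificity (true negative rate)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),                        # positive predictive value
        "npv": tn / (tn + fn),                        # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),  # overall accuracy
        "lr_pos": sens / (1 - spec),                  # positive likelihood ratio
        "lr_neg": (1 - sens) / spec,                  # negative likelihood ratio
    }

# Illustrative counts back-calculated from the reported ICA-T/M1 metrics
# (approximate; the exact 2x2 table is not given in the text).
metrics = diagnostic_metrics(tp=76, fp=145, fn=5, tn=1487)
```

With these approximate counts, the function reproduces the reported ICA-T/M1 values to rounding: sensitivity 93.8%, specificity 91.1%, PPV 34.4%, NPV 99.7%, accuracy 91.2%, positive likelihood ratio 10.56, negative likelihood ratio 0.07.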
Finally, a logistic regression analysis adjusted for both age and gender was conducted in order to predict the effect of Viz LVO on LVO detection. ORs with 95% confidence intervals were used as output. A p value of less than 0.05 was used to determine statistical significance throughout this study. Data were analyzed using SAS 9.4 (SAS Institute, Cary, North Carolina, USA).
Results
Between May 2019 and November 2020, a total of 1822 stroke code patients had a CTA scan read by Viz LVO. Among the 190 CTA scans that demonstrated an occlusion, 142 (74.7%) were located in the ICA-T or in the M1/M2 segments of the MCA. The other 48 (25.3%) occlusions were excluded from our analysis, since Viz LVO is not trained to detect them (figure 1). The mean age of the qualifying LVO population (n=142) was 72 years (SD=16.2, range 29–101), with just over half female (75, or 52.8%). There were 12 (8.5%) ICA-T, 69 (48.6%) M1, and 61 (43%) M2 LVOs (table 1). Viz LVO detected 100% (12/12) of ICA-T, 93% (64/69) of M1, and 49% (30/61) of M2 occlusions (p<0.01; table 2). The M2 LVOs were further subdivided into proximal (n=43) and mid/distal (n=18) occlusions (table 3). There was a statistically significant difference in the algorithm’s detection rate of proximal versus mid/distal M2 occlusions: it correctly detected 58% (25/43) of the proximal and 28% (5/18) of the mid/distal occlusions (p<0.03; table 3).
Given the specific FDA indications for Viz LVO and the relatively poor performance in detecting M2 occlusions, we analyzed the algorithm’s accuracy separately for two different groups. The first included ICA-T and M1 occlusions (according to the FDA indications for Viz LVO), while the second included ICA-T, M1, and M2 occlusions. The M2 occlusions were completely excluded from the first analysis and were not assigned to the LVO-negative arm of our study, so that specificity would be unaffected. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy were 93.8%, 91.1%, 34.4%, 99.7%, and 91.2% for the ICA-T/M1 group and 74.6%, 91.1%, 42.2%, 97.6%, and 89.8% for the ICA-T/M1/M2 group, respectively. The positive likelihood ratio was 10.56 and the negative likelihood ratio was 0.07 for the ICA-T/M1 group, and 8.4 and 0.28, respectively, for the ICA-T/M1/M2 group (table 4).
In addition, we performed a logistic regression analysis adjusted for both age and gender, which demonstrated that Viz LVO can strongly predict LVO in both first (OR=155, 95% CI 61.4 to 391.3, p<0.01) and second groups (OR=29.93, 95% CI 19.54 to 45.84, p<0.01; table 5). Interestingly, age slightly affected the diagnostic accuracy, while gender did not. Importantly, the 95% CIs of the two groups did not overlap, demonstrating a statistically significant difference in the algorithm’s accuracy between these two groups. ROC analysis yielded an area under the curve (AUC) of 0.95 for the first and an AUC of 0.86 for the second group (figure 2).
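For intuition, an adjusted OR of this magnitude can be sanity-checked against the crude (unadjusted) OR obtained from a 2×2 table via the standard log-odds method with a Wald confidence interval. A minimal sketch, using hypothetical counts back-calculated from the reported ICA-T/M1 metrics (approximate, and without the age and gender adjustment of the actual model):

```python
import math

def crude_odds_ratio(tp, fp, fn, tn, z=1.96):
    """Crude (unadjusted) odds ratio with a Wald 95% CI on the log-OR scale."""
    or_ = (tp * tn) / (fp * fn)
    se = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)  # SE of ln(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Approximate counts for the ICA-T/M1 group (back-calculated, illustrative only)
or_, lo, hi = crude_odds_ratio(tp=76, fp=145, fn=5, tn=1487)
```

With these illustrative counts the crude OR is about 156 (95% CI roughly 62 to 392), in the same range as the reported adjusted estimate.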
Discussion
AI-driven software used in stroke diagnostics has previously been shown to be accurate and to improve treatment times and the reliability of neuroendovascular notifications.3 9 Various recently approved AI tools are commercially available. Nevertheless, their actual accuracy rates in the real-world clinical setting remain unexplored. In this study, Viz LVO demonstrated high sensitivity and NPV as an LVO stroke triage tool for ICA-T and M1 occlusions, which carry the strongest evidence of benefit from treatment with thrombectomy. Importantly, the reported accuracy metrics reflect the triaging ability of this AI software, assessed prospectively in a setting comprising all levels of care, including primary, thrombectomy capable, and comprehensive stroke centers. Examples of false negative cases (missed occlusions) are included in figure 3. The relatively low PPV can be explained by the very low prevalence of LVOs identified in our dataset (5% in the first vs 8% in the second group). As expected, PPV is lower in the first group, which has a lower LVO prevalence. This also explains the very high NPV we observed in our analysis. Sensitivity is higher in the first group. High sensitivity in combination with very high NPV suggests that this AI software is a powerful adjunct in triaging LVO stroke. Delays from CTA acquisition to endovascular team notification can result in treatment delays, potentially portending worse postoperative clinical outcomes.
Previously published data from our institution have shown that the median time from CTA acquisition to neuroendovascular team notification was 24 min (IQR 12–47).14 In addition, we have previously shown that door-to-neuroendovascular team notification times have been significantly shortened (25.0 min (IQR=12.0) vs 40.0 min (IQR=61.0); p=0.01), with less variation (p<0.05), since Viz LVO was deployed at our health system.10 Future research will seek to identify differences in treatment times and in short-term and long-term clinical outcomes before and after Viz LVO implementation. This study differs from prior reports in that it includes insights into LVO detection rates across a large, tiered healthcare network in real time.
In a smaller single-institution study at a comprehensive stroke center by Yahav-Dovrat et al, Viz LVO was tested on stroke code and non-stroke code patients. Within the stroke code population (n=404) there were 72 ICA-T and M1 occlusions (prevalence=17.8%). The authors reported 82% sensitivity, 96% NPV, and 89% accuracy.7 As expected, PPV in that study was higher than in ours, since the prevalence of LVOs was considerably higher than the 5% prevalence in our ICA-T and M1 group. Noteworthy is the fact that our study population comes from a health system consisting of different levels of stroke centers. Primary stroke centers may have a lower threshold for calling a stroke code, possibly explaining the lower prevalence of LVOs in our study population. Different thresholds for stroke code initiation in different institutions might, therefore, affect the PPV and/or NPV of Viz LVO, in distinction to sensitivity and specificity, which are related to the test itself. The true diagnostic accuracy of a test may differ across studies, a phenomenon attributed to variation in groups, methods, observers, gold standard tests, and other parameters. In a study by Luijten et al, the diagnostic performance of StrokeViewer (NICO.LAB, Amsterdam, The Netherlands, v2.1.22) was tested retrospectively, in a research setting, in the MR CLEAN Registry and PRESTO studies separately.15 The authors provided all accuracy metrics for the PRESTO study (72% sensitivity, 78% specificity, 0.75 AUC). However, owing to the LVO-only inclusive nature of the MR CLEAN Registry, only specificity (89%) could be calculated. The methodology of that study did not allow for calculation of clinical treatment metrics.
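The dependence of PPV on prevalence can be made explicit with Bayes' theorem: PPV = (sens × prev) / (sens × prev + (1 − spec) × (1 − prev)). A small sketch, hypothetically holding our ICA-T/M1 sensitivity and specificity fixed while varying prevalence:

```python
def ppv_at(sens, spec, prev):
    """Positive predictive value at a given disease prevalence (Bayes' theorem)."""
    true_pos = sens * prev           # P(test+ and disease+)
    false_pos = (1 - spec) * (1 - prev)  # P(test+ and disease-)
    return true_pos / (true_pos + false_pos)

# Same test characteristics applied at two prevalence settings:
low = ppv_at(sens=0.938, spec=0.911, prev=0.047)   # ~our ICA-T/M1 prevalence
high = ppv_at(sens=0.938, spec=0.911, prev=0.178)  # ~Yahav-Dovrat et al prevalence
```

At roughly 5% prevalence the PPV is about 34%, while the same test characteristics at 17.8% prevalence give a PPV near 70%, which illustrates why PPV differs across study populations even when sensitivity and specificity do not.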
Lastly, the LVO prevalence is 100% in the MR CLEAN Registry and 21.8% (141/646) in the PRESTO study, indicating that the study population may stem primarily from highly specialized centers, probably limiting the validity of the calculated accuracy metrics to comprehensive stroke centers. Our study attempts to challenge the index test in a real-world healthcare system that is inclusive of multiple types of populations.16
The poor M2 occlusion detection rates of Viz LVO have been corroborated by previous research, but without a detailed breakdown of the location of the M2 occlusion. In the work by Luijten et al, the M2 detection rate of the StrokeViewer software differed significantly between the MR CLEAN Registry and PRESTO studies, being 72% (95% CI 64% to 78%) in the former and 49% (95% CI 37% to 62%) in the latter (p value not reported).15 However, the breakdown of M2 occlusion location is unknown for both of these studies. In this work, we showed that Viz LVO detects proximal M2 occlusions at a significantly higher rate than mid/distal M2 occlusions (p<0.03). In a study by Amukotuwa et al the M2 occlusion detection rate was reported as 90%, in a mixed sample of patients stemming from high-tiered hospitals and clinical trials.17 However, all 60 of the M2 occlusions were proximal, and 15 (25%) of them were identified in the context of more proximal (ICA and/or M1) tandem occlusions, calling the validity of the reported metrics into question. Canon’s AUTO-Stroke Solution LVO detection software is trained on LVOs, including proximal M2 occlusions, with a reported M2 occlusion detection rate of 80% in a small-sample retrospective study.18 Accumulating data support clinical benefit when M2 occlusions are treated with thrombectomy.19 Therefore, detection of M2 occlusions by AI tools might be one of the most important goals, given the relatively decreased accuracy of multiple algorithms for detection in this location.8 13 15 17 18 20
Several small-sample studies have pooled patients from both clinical trials and specialized hospitals, in an effort to validate different AI tools (RAPID) in a retrospective fashion, including17 or excluding8 M2 occlusions from the final patient pool. Following a similar methodology to ours, another single-institution study using RAPID CTA reported 92% sensitivity, 81% specificity, and 97% NPV for all ICA-T, M1, and M2 occlusions.12 In addition, other, smaller-sample preliminary studies of Viz LVO reported sensitivities of 90.1% and 82% and specificities of 82.5% and 94%, respectively.21 22 Sheth et al developed and validated the accuracy of a novel convolutional neural network (DeepSymNet) for the detection of LVO and non-LVO strokes.6 The two primary endpoints were accuracy in LVO detection and accuracy in ischemic core detection. For these purposes, the authors assessed both CTA and CT perfusion for all 297 subjects of the study. The AUC for LVO detection was 0.88 for all cases and 0.88 vs 0.9 for the 224 patients who had had a stroke (ischemic core ≤30 vs ≤50 mL, respectively). In contrast to other studies,8 13 15 17 18 20 we provide the largest consecutive series of real-time, real-life assessment of the diagnostic accuracy of Viz LVO, in a multiple-tiered healthcare system.
Future applications of AI in stroke diagnostics may include the ability to detect LVOs on a non-contrast CT scan alone by recognizing hyperdense clots, thereby potentially obviating CTA scans.23 The addition of clinical assessment, such as the National Institutes of Health Stroke Scale score, to the AI algorithm may improve accuracy and other metrics.23 Stib et al developed and tested AI software capable of detecting LVOs on multiphase CTA, which achieved an AUC of 0.89, compared with an AUC of 0.74 when single-phase CTA was used.24 These results also showed that specific combinations of two phases (out of the three phases acquired in the multiphase CTA technique) yield statistically significantly better accuracy than analysis of a single-phase CTA scan alone. The ability of Viz LVO to automatically detect LVOs on CTA, notify the appropriate teams about the findings, and make images readily available may save substantial time by avoiding multiple human steps in the workflow.
Limitations
Our study population was pooled from multiple hospitals of different tiers, thereby reflecting patients with suspected stroke in the general population. The ground truth for CTA scans was taken from the reading of a single radiologist. Further studies may be enhanced by independent interpretation by two blinded, experienced readers, though this would be difficult to do in a prospective manner. Another limitation is that Viz LVO is trained to recognize only anterior circulation occlusions, specifically ICA-T and M1. We therefore excluded occlusions in other locations, such as M3 occlusions and those in the posterior circulation. Lastly, the Viz LVO outputs used in this study were obtained from algorithms of varying software versions. Consequently, our results might not reliably reflect the algorithm’s current performance. Further research to compare how the algorithm’s accuracy has changed over time is ongoing.
Conclusion
In this work, the diagnostic accuracy of Viz LVO was tested in real life and in real time across one of the largest prospective consecutive cohorts to date, in a multiple-tiered healthcare system. Viz LVO is a promising AI-driven software that can reliably detect ICA-T and M1 LVOs with impressive NPV, sensitivity, and overall accuracy. It is a useful adjunct in triaging patients with an LVO stroke at varying levels of stroke centers.
Supplemental material
Ethics statements
Patient consent for publication
Ethics approval
This study involves human participants and institutional review board approval was obtained by the Human Research Protection Program at the Icahn School of Medicine at Mount Sinai (ISMMS) (19-00956). Informed consent was waived owing to the retrospective nature of the study (19-00956).
References
Supplementary materials
Supplementary Data
This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.
Footnotes
Twitter @moreyjr917, @faresmarayati, @rdeleacymd, @chriskellnerMD
Contributors JTF and StM conceptualized the project. StM and JM were responsible for gathering and organizing the data. StM and DC conducted the analysis. BND and AD provided the radiology reports. GL and TS served as adjudicating reviewers. StM and JM drafted the manuscript. All authors made edits and approved the final manuscript. StM and JTF revised the manuscript. JTF serves as the guarantor author for this project.
Funding JTF and CPK have received research funding from Viz.ai.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.