Abstract
Background Endovascular thrombectomy improves outcomes and reduces mortality for large vessel occlusion (LVO) and is time-sensitive. Computer automation may aid in the early detection of LVOs, but frequent false positive alerts may lead to alarm desensitization. We compared Viz LVO and Rapid LVO for automated LVO detection.
Methods Data were retrospectively extracted from Rapid LVO and Viz LVO, which ran concurrently on CT angiography (CTA) images from January 2022 to January 2023, and compared with radiologist interpretation. We calculated diagnostic accuracy measures and performed a McNemar test to look for a difference between the algorithms' errors. We collected demographic data, comorbidities, ejection fraction (EF), and imaging features and performed a multiple logistic regression to determine whether any of these variables predicted incorrect classification of LVO on CTA.
Results A total of 360 participants were included, 47 of whom had large vessel occlusions. Viz LVO and Rapid LVO had a specificity of 0.96 and 0.85, a sensitivity of 0.87 and 0.87, a positive predictive value of 0.75 and 0.46, and a negative predictive value of 0.98 and 0.97, respectively. A McNemar test on correct and incorrect classifications showed a statistically significant difference between the two algorithms' errors (P=0.00000031). A multiple logistic regression showed that low EF (Viz P=0.00125, Rapid P=0.0286) and a Modified Woodcock Score >1 (Viz P=0.000198, Rapid P=0.000000975) were significant predictors of incorrect classification.
Conclusion Rapid LVO produced a significantly larger number of false positives, which may contribute to alarm desensitization, leading to missed alarms or delayed responses. EF and intracranial atherosclerosis were significant predictors of incorrect classification.
- CT Angiography
- Technology
- CT
- Stroke
- Thrombectomy
WHAT IS ALREADY KNOWN ON THIS TOPIC
Endovascular thrombectomy has been shown to decrease disability, improve functional outcomes, and reduce mortality over standard medical management for anterior large vessel occlusion (LVO) and is time-sensitive. Implementation of automated LVO detection software on CT angiography has been shown to decrease time to puncture and other workflow metrics. Automated system alarms that produce many false positives have been shown to lead to alarm fatigue, delayed reads, and the ignoring of these systems altogether.
WHAT THIS STUDY ADDS
This study compares the diagnostic accuracy of two commercially available LVO detection software packages, Viz LVO and Rapid LVO, implemented in two community-based comprehensive stroke centers. It also looks for predictors of incorrect classification, such as low ejection fraction or intracranial atherosclerosis, to give algorithm developers insights for improving the accuracy of their systems and preventing undue burden on providers. Finally, it examines how these algorithms are tuned to reduce false negatives at the expense of false positives, and how false positives can negatively impact the practitioners who use these systems.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
This study provides important insight into predictors of these false values for algorithm developers seeking to improve the accuracy of their algorithms. It also helps providers understand how these algorithms are tuned so they can properly interpret the results. Our results show that these two software packages favor preventing false negatives, given the risks of a missed LVO, at the expense of more false positives. Large numbers of false positives can lead to alarm fatigue and overwhelm providers.
Introduction
Large vessel occlusions (LVOs) are responsible for the majority (90%) of mortality and severe disability in patients who experience ischemic strokes.1 There is some variation in terminology, but initial studies primarily defined LVOs as occlusions of the intracranial internal carotid artery (ICA) or the first segment of the middle cerebral artery (M1), and a subset of these studies also included the second segment of the middle cerebral artery (M2) and the first or second segment of the anterior cerebral artery (A1 or A2).2–4 Endovascular thrombectomy has been shown to decrease disability, improve functional outcomes, and decrease mortality over standard medical management for anterior LVOs and is time-sensitive.5 6 Previous studies have suggested worse outcomes with increased time to reperfusion and have even estimated the pace of neural circuitry loss in human ischemic stroke over time.7 8 LVOs can produce large infarcts if timely intervention is not performed to salvage ischemic brain tissue. Hence, finding LVOs early using CT angiography (CTA) is paramount.9–11

Given the time-sensitive nature of these interventions, computer automation may aid in the early detection of LVOs, and multiple software packages have been released for this purpose.12 13 Previous studies have shown that these software systems can process an image in less than 8 min and can decrease the door-to-needle time through earlier notification of the neurointerventionalist.13–15 Implementation of these systems has also been shown to decrease workflow metric times such as CTA to stroke team notification time and CTA to angiosuite arrival time, and in a spoke and hub model it has been shown to decrease transfer times.16–18 Still, false values may lead to alarm fatigue, delayed reads, and the ignoring of these systems altogether.19

In this article, we evaluate two software packages that use different algorithms, Viz LVO and Rapid LVO, for the automated detection of anterior LVOs. The Rapid LVO algorithm performs elastic image registration for bone removal, followed by multiscale vessel enhancement filtering, and then compares the hemispheric density of the sum of all voxels.15 In simpler terms, the software warps the CTA onto a known reference image, removes bone based on the location and brightness of the voxels, and then runs an algorithm to detect the vessels and compare their brightness and density against the other side of the brain. Viz LVO utilizes a three-dimensional convolutional neural network to segment images, and the extracted features are fed into a random forest classifier to detect LVOs.20 Viz LVO removes scans with metal or poor bolus timing, removes bone, segments the images with a neural network, and extracts dozens of variables associated with the vessels, such as vessel length and brightness. These features are then fed into a second machine learning algorithm, which outputs a classification. Viz LVO required a set of prelabeled CTAs to train it to detect LVOs. Both algorithms require the software developer or a user to set a cut-off on the output that determines whether an LVO is reported. These packages have previously been evaluated independently, but to our knowledge this is the first direct comparison of the two on the same dataset at two comprehensive stroke centers.15 21–28
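To make the hemispheric comparison concrete, the following is a minimal illustrative sketch of the idea described above for Rapid LVO: after vessel segmentation, the summed intensity of vessel voxels is compared between hemispheres, and a marked asymmetry suggests an occlusion. This is not the vendor's implementation; the function names, midline assumption, and 0.7 cut-off are all hypothetical.

```python
import numpy as np

def hemispheric_density_ratio(vessel_mask: np.ndarray, cta: np.ndarray) -> float:
    """Ratio of summed vessel-voxel intensity between the two hemispheres.

    vessel_mask: boolean array marking voxels classified as vessel
    cta: CTA volume (Hounsfield units), same shape as vessel_mask
    Assumes the volume is registered so the midline splits the last axis.
    """
    midline = cta.shape[-1] // 2
    left = (cta * vessel_mask)[..., :midline].sum()
    right = (cta * vessel_mask)[..., midline:].sum()
    # Dimmer hemisphere over brighter hemisphere: a low ratio reflects
    # missing contrast-filled vessels on one side.
    return min(left, right) / max(left, right)

def flag_possible_lvo(vessel_mask, cta, cutoff=0.7):
    # Hypothetical decision rule; real products tune the cut-off on labeled data.
    return hemispheric_density_ratio(vessel_mask, cta) < cutoff
```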
Methods
Stroke data were reviewed from two HCA Houston Healthcare facilities, HCA Houston Kingwood and HCA Houston Clear Lake, from January 2022 to January 2023, for patients who underwent multimodal CT with thin-slice CTA for suspected LVO. Two automated LVO detection software packages, Rapid LVO version 5.2.2 and Viz LVO version 1.16, were used to analyze these images, and an attending diagnostic neuroradiologist or neurointerventional radiologist interpreted the scans. Both packages were running simultaneously due to an overlap in the contracts with the two companies. We manually abstracted the data from the radiology reads of 371 scans and removed 11 samples with missing data, resulting in a cohort of 360 patients.

We calculated summary statistics using the reads from Rapid LVO and Viz LVO, with the neuroradiologist interpretation as the gold standard. We also ran a McNemar test, a widely accepted non-parametric statistical test for determining whether one algorithm makes a different proportion of classification errors than another on the same task.29 We formed a contingency table of the correct and incorrect classifications of Viz LVO and Rapid LVO and performed a McNemar test on the table, which examines its marginal homogeneity.

High-grade atherosclerosis was determined by radiologist interpretation, and we manually reviewed all images for a Modified Woodcock Score of 2 or greater. To calculate the Modified Woodcock Score, the segment of each ICA from the petrous apex to the ICA bifurcation was qualitatively evaluated based on the thickness and continuity of the calcifications. These visual scores have been shown to correlate with quantitative measures of intracranial ICA calcification.30 The ejection fraction (EF) was determined from ultrasounds obtained by certified echocardiography technicians and calculated by board-certified cardiologists using their preferred method. We also looked for coils, clips, or other metal, and for intracranial hemorrhage. We abstracted demographic data and past medical history from the electronic medical record, including age, self-identified sex and race, diabetes, hypertension, atrial fibrillation, hyperlipidemia, congestive heart failure, prior stroke, and smoking. These became the independent variables for a multiple logistic regression to determine which variables predicted incorrect classification by these algorithms.
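As an illustration of the statistical comparison, a minimal sketch of the McNemar test using statsmodels is shown below. The 2×2 table pairs the two algorithms' correct and incorrect classifications on the same scans; the cell counts here are placeholders, since the study's actual contingency table appears in figure 2.

```python
from statsmodels.stats.contingency_tables import mcnemar

# Rows: Viz LVO (correct, incorrect); columns: Rapid LVO (correct, incorrect).
# Placeholder counts for illustration only.
table = [[300, 10],
         [40, 10]]

# The test uses only the discordant cells (10 and 40 here): scans that
# exactly one of the two algorithms classified correctly.
result = mcnemar(table, exact=False, correction=True)
print(f"statistic={result.statistic:.3f}, P={result.pvalue:.3g}")
```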
Results
A total of 360 participants were included, with a mean±SD age of 65±16.5 years, 160 males, and a total of 47 LVOs confirmed by diagnostic or neurointerventional radiologists. Rapid LVO had a specificity of 0.85 and a sensitivity of 0.87, with a positive predictive value (PPV) of 0.46 and a negative predictive value (NPV) of 0.97. Viz LVO had a specificity of 0.96 and a sensitivity of 0.87, with a PPV of 0.75 and an NPV of 0.98. The Rapid LVO and Viz LVO confusion matrices are shown in figure 1. The McNemar test performed on the correct and incorrect classifications of the two software packages showed a statistically significant difference between the two algorithms (P=0.00000031); the contingency table is shown in figure 2. This indicates that Rapid LVO and Viz LVO have different relative proportions of errors. Rapid LVO and Viz LVO each missed six radiologist-confirmed LVOs. The missed occlusions were located in the ICA, M1, and M2 on the left and right sides, and four of them were missed by both algorithms.
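For readers who wish to check the reported measures, the sketch below recomputes them from confusion-matrix counts. The true positive and false negative counts (41 and 6 of the 47 LVOs) follow from the text; the false positive and true negative counts are approximate, back-calculated from the rounded specificities over the 313 non-LVO scans.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Approximate counts implied by the reported measures (agreement up to rounding).
print("Rapid LVO:", diagnostic_metrics(tp=41, fn=6, fp=48, tn=265))
print("Viz LVO:  ", diagnostic_metrics(tp=41, fn=6, fp=14, tn=299))
```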
We took the data manually extracted from the electronic health record relating to demographics, comorbidities, atherosclerosis on imaging, and EF and performed a multiple logistic regression with incorrect classification by each algorithm as the outcome. Viz LVO results are shown in table 1 and Rapid LVO results are shown in table 2. For Viz LVO, low EF (P=0.00125) and a Modified Woodcock Score >1 (P=0.000198) were significant predictors of incorrect classification. For Rapid LVO, lower EF (P=0.0286) and a Modified Woodcock Score >1 (P=0.000000975) were also significant predictors of incorrect classification. The hospital where the scans were taken was not a statistically significant predictor of incorrect classification in either software package. Comorbidities and medical history, including diabetes, hypertension, atrial fibrillation, hyperlipidemia, congestive heart failure, prior stroke, and smoking, were not significant predictors of incorrect classification for either package.
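A minimal sketch of this regression, using the statsmodels formula API on a synthetic stand-in for the abstracted chart data, is shown below. The column names are hypothetical and this is not the authors' code; the real model also included sex, race, diabetes, hypertension, hyperlipidemia, congestive heart failure, prior stroke, smoking, and site.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 360
# Synthetic data with the cohort's rough dimensions; values are random.
df = pd.DataFrame({
    "incorrect": rng.integers(0, 2, n),     # 1 = algorithm disagreed with radiologist
    "age": rng.normal(65, 16.5, n),
    "ef": rng.normal(55, 10, n),            # ejection fraction (%)
    "woodcock_gt1": rng.integers(0, 2, n),  # Modified Woodcock Score >1
    "afib": rng.integers(0, 2, n),
})

# Logistic regression of incorrect classification on candidate predictors.
model = smf.logit("incorrect ~ age + ef + woodcock_gt1 + afib", data=df).fit()
print(model.summary())  # per-variable coefficients and P values
```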
Discussion
Computer automation using techniques such as deep learning and other machine learning methods is an emerging tool in the field of neuroradiology and can be helpful in the detection of LVOs. LVOs are responsible for a large portion of the mortality and severe disability in patients who experience ischemic strokes, and their detection is time-sensitive.1 This study retrospectively evaluated the performance of two automated LVO detection tools for patients at two comprehensive stroke centers who underwent multimodal CT for a suspected acute ischemic stroke. Of these emerging software packages, both Rapid and Viz have shown promise in prior studies. A previous version of Rapid LVO (version 4.9) was evaluated independently against physician reads and showed a sensitivity of 0.92, a specificity of 0.81, a PPV of 0.58, and an NPV of 0.97.26 A previous version of the Viz LVO algorithm had a sensitivity of 0.81, a specificity of 0.96, a PPV of 0.65, and an NPV of 0.99.22 To our knowledge, our study is the first to compare two modern versions of these software systems directly on the same image set. Health systems typically have one software package or the other, but due to a crossover in contracts at two comprehensive stroke centers in our hospital system, we had the opportunity to evaluate both Rapid and Viz.
In our retrospective performance analysis, Rapid LVO had a specificity of 0.85 and a sensitivity of 0.87, with a PPV of 0.46 and an NPV of 0.97. Viz LVO had a specificity of 0.96 and a sensitivity of 0.87, with a PPV of 0.75 and an NPV of 0.98. In other terms, out of every 100 scans, both Rapid LVO and Viz LVO correctly detected roughly 11 LVOs and missed roughly two, but Rapid LVO incorrectly flagged about 13 scans as LVOs while Viz LVO incorrectly flagged about four. The values obtained are consistent with previous performance evaluations of these two algorithms.22 26 Of the 47 scans that were determined to have LVOs, both Rapid LVO and Viz LVO detected 41 (87.23%). Rapid LVO had a larger number of false positives than Viz LVO. Repeated false positive alarms can desensitize users, and many users may shut the alarms off altogether.19 Each of these algorithms requires a cut-off, and there is a trade-off between sensitivity, specificity, NPV, and PPV. Different algorithms will strike a different balance among these values, but these systems select cut-offs that limit the number of false negatives, because a missed LVO can be catastrophic.
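The cut-off trade-off can be made concrete by sweeping a decision threshold over a continuous score, as in the toy sketch below. The score distributions are synthetic and the cut-offs invented; this illustrates the general behavior, not either vendor's tuning.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic continuous algorithm outputs: LVO cases score higher on average.
scores_lvo = rng.normal(0.8, 0.15, 47)      # 47 LVO scans, as in this cohort
scores_no_lvo = rng.normal(0.4, 0.15, 313)  # 313 non-LVO scans

for cutoff in (0.5, 0.6, 0.7):
    sens = (scores_lvo >= cutoff).mean()
    spec = (scores_no_lvo < cutoff).mean()
    # Lowering the cut-off buys fewer missed LVOs at the price of more
    # false positives, and vice versa.
    print(f"cutoff={cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```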
We also performed an analysis of potential predictors of incorrect classification. We found that a lower EF predicted incorrect classification in both software packages. We also found that intracranial atherosclerosis, as defined by a Modified Woodcock Score >1, predicted incorrect classification in both packages. The other variables we evaluated, such as age, race, prior stroke, smoking, and atrial fibrillation, were not significant predictors of incorrect classification, and the site where the scans were performed did not predict incorrect classification.
There are some limitations to this study. PPV increases as prevalence increases, and NPV is inversely related, so institutions with different thresholds for performing CTA and different disease prevalence will see the PPV and NPV vary; this may make it difficult to generalize results to other institutions. There is some inherent variation in CTA reads of steno-occlusion by radiologists, but we believe the reads by diagnostic and interventional neuroradiologists provide an acceptable gold standard. We also did not collect data on which brand of CT scanner was used to obtain the images; previous studies that examined whether sensitivity and specificity changed across CT scanner brands found no significant impact on algorithm performance, but future studies could look more closely at this.13 We did not find intracranial hemorrhage (ICH) or metal (clips, coils, etc) to be a significant predictor of incorrect classification, but the number of patients with ICH or metal was small (n=27), and a step in the Viz LVO algorithm removes these samples; Rapid LVO will likely add such a step in future versions. A future study with larger numbers of ICH and LVO cases may be able to determine whether ICH impacts the detection of LVOs by these algorithms. EF and intracranial atherosclerosis are predictors, but there may be confounding variables that we did not measure in this study. Additionally, we were not able to assess for alarm fatigue during this study and were unable to determine thresholds for alarm fatigue among team members.
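The dependence of PPV and NPV on prevalence follows directly from Bayes' theorem, as the short sketch below shows using Viz LVO's reported sensitivity and specificity; the prevalence values other than this cohort's (47/360≈0.13) are arbitrary.

```python
def ppv_npv(sens, spec, prev):
    """PPV and NPV from sensitivity, specificity, and prevalence (Bayes' theorem)."""
    ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    return ppv, npv

for prev in (0.05, 0.13, 0.30):  # 0.13 approximates this cohort's 47/360
    ppv, npv = ppv_npv(sens=0.87, spec=0.96, prev=prev)
    print(f"prevalence={prev:.2f}: PPV={ppv:.2f}, NPV={npv:.2f}")
```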
Conclusion
Rapid LVO produced a significantly larger number of false positives than Viz LVO. False positives can be a source of alarm desensitization, leading to missed alarms or delayed responses. We hope these data provide insight into the causes and predictors of incorrect classification in these algorithms, such as EF and intracranial atherosclerosis, and help improve future versions of these algorithms.
Data availability statement
Data are available upon reasonable request.
Ethics statements
Patient consent for publication
Ethics approval
The HCA Houston Healthcare Kingwood Institutional Review Board determined this retrospective research activity to be exempt or excluded from Institutional Review Board (IRB) oversight in accordance with current regulations and institutional policy. Our internal reference number for this determination is 2022-1055. There was no direct patient contact in performing this study. In addition, our patients sign a data usage form at registration related to data collection and the use of their data for research. The research was conducted under a protocol submitted for IRB review and overseen by a research committee that holds monthly ethics reviews.
References
Footnotes
X @jordantorres_md
Correction notice Since this article was first published, the following has been added to the competing interests section: 'HCA Healthcare has a minority equity stake in VIZ.ai and also purchases products from the company for use in its facilities.'
Contributors ME, YA, AH, AD and RE all contributed to the design of the study and formation of the data dictionary. ME and YA were involved in data collection. AD performed the statistics. All authors contributed to interpreting these results. AD, CH, EP, and JT were involved in drafting the manuscript. AH, ME and YA performed revisions of this manuscript. All authors were involved in data interpretation and article revision. ME is the primary investigator and coordinated data collection at multiple sites. ME is the guarantor and accepts full responsibility for the finished work and the conduct of the study, had access to the data, and controlled the decision to publish.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Disclaimer This research was supported (in whole or in part) by HCA Healthcare and/or an HCA Healthcare affiliated entity. The views expressed in this publication represent those of the author(s) and do not necessarily represent the official views of HCA Healthcare or any of its affiliated entities.
Competing interests HCA Healthcare has a minority equity stake in VIZ.ai and also purchases products from the company for use in its facilities. AH is a consultant and a speaker for Viz.ai, which develops the Viz LVO algorithm that we tested, and he is also involved in a study sponsored by Viz.ai called the LVO SYNCHRONISE study, which looks at the impact of a Viz LVO implementation on patient timing and outcomes. Our other authors do not have any direct conflicts of interest. In the interest of full disclosure, here is a list of conflicts involving companies in related fields that we do not believe directly impact this study. YA has stock in Sanavention Inc. ME has stock in Galaxy Therapeutics. AH is a consultant/speaker for Medtronic, Microvention, Stryker, Penumbra, Cerenovus, Genentech, GE Healthcare, Scientia, Balt, Viz.ai, Insera Therapeutics, Proximie, NeuroVasc, NovaSignal, Vesalio, Rapid Medical, Imperative Care, and Galaxy Therapeutics. AH is also a principal investigator of the COMPLETE study (Penumbra), LVO SYNCHRONISE (Viz.ai), the Millipede Stroke Trial (Perfuze), and RESCUE-ICAD (Medtronic). AH is also on the steering committee/publication committee of SELECT, DAWN, SELECT 2, EXPEDITE II, EMBOLISE, CLEAR, ENVI, DELPHI, and DISTALS, and on the DSMB of the COMAND trial.
Provenance and peer review Not commissioned; externally peer reviewed.