Background Machine learning algorithms hold the potential to contribute to fast and accurate detection of large vessel occlusion (LVO) in patients with suspected acute ischemic stroke. We assessed the diagnostic performance of an automated LVO detection algorithm on CT angiography (CTA).
Methods Data from the MR CLEAN Registry and PRESTO were used including patients with and without LVO. CTA data were analyzed by the algorithm for detection and localization of LVO (intracranial internal carotid artery (ICA)/ICA terminus (ICA-T), M1, or M2). Assessments done by expert neuroradiologists were used as reference. Diagnostic performance was assessed for detection of LVO and per occlusion location by means of sensitivity, specificity, and area under the curve (AUC).
Results We analyzed CTAs of 1110 patients from the MR CLEAN Registry (median age (IQR) 71 years (60–80); 584 men; 1110 with LVO) and of 646 patients from PRESTO (median age (IQR) 73 years (62–82); 358 men; 141 with and 505 without LVO). For detection of LVO, the algorithm yielded a sensitivity of 89% in the MR CLEAN Registry and a sensitivity of 72%, specificity of 78%, and AUC of 0.75 in PRESTO. Sensitivity per occlusion location was 88% for ICA/ICA-T, 94% for M1, and 72% for M2 occlusion in the MR CLEAN Registry, and 80% for ICA/ICA-T, 95% for M1, and 49% for M2 occlusion in PRESTO.
Conclusion The algorithm provided a high detection rate for proximal LVO, but performance varied significantly by occlusion location. Detection of M2 occlusion needs further improvement.
Data availability statement
No data are available.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
CT angiography (CTA) is currently the most widely used imaging modality for detection of a large vessel occlusion (LVO) in patients presenting with suspected acute ischemic stroke. For acute ischemic stroke due to LVO in the anterior circulation, endovascular treatment (EVT) is considered the most effective therapy.1 However, technical success and, more importantly, individual patient benefit are strongly dependent on the time between symptom onset and initiation of treatment.2 3 Fast and accurate detection of LVO on CTA can therefore contribute to the likelihood of a good clinical outcome.
In general, experienced (neuro)radiologists are well capable of identifying LVOs on CTA, enabling prompt diagnosis of acute ischemic stroke due to LVO.4 5 Yet such expertise is not always readily available, for instance in hospitals with lower caseloads and during off-hours when dedicated neuroradiologists are not on call. This may hamper fast and accurate CTA assessment.6 7 At the same time, the number of CTA examinations for suspected acute ischemic stroke is increasing due to optimization of stroke management and prolonged treatment windows.8 9
To support fast and accurate CTA assessment, diagnostic tools applying artificial intelligence algorithms are being developed. These tools are aimed at screening CTAs for LVOs and, in case of a positive finding, notifying not only local radiologists but also the stroke team at the nearest EVT-capable stroke center.10–14 Determining the performance of such algorithms is needed to estimate their potential clinical utility.
The aim of this study was to assess the diagnostic performance of an automated LVO detection algorithm in patients with and without anterior circulation LVO, and to assess the impact of scan acquisition parameters on performance.
Study design and patient selection
This study was performed in accordance with the STARD guidelines for reporting diagnostic accuracy.15 We used data from the first part of the Multicenter Randomized Clinical Trial of Endovascular Treatment for Acute Ischemic Stroke (MR CLEAN) Registry16 and from the Prehospital triage of patients with suspected stroke (PRESTO) study.17 The MR CLEAN Registry is a multicenter prospective registry including patients (n=1627) with acute ischemic stroke undergoing EVT from March 18, 2014 until June 15, 2016. PRESTO is a multicenter prospective observational cohort study including patients (n=1334) with suspected stroke in the ambulance from August 13, 2018 until September 2, 2019.
All patients who underwent baseline CTA were eligible for inclusion. Imaging parameters required for inclusion were: axial series; slice thickness 0.2–3 mm; slice increment equal to or smaller than slice thickness (ie, no excess z-spacing); matrix size of 512×512 or above; full head coverage. The evaluated algorithm was developed and trained only to detect intracranial internal carotid artery (ICA)/ICA terminus (ICA-T), M1, and M2 occlusions, but not isolated extracranial ICA, A1/A2, M3/M4, and posterior circulation (vertebral artery, basilar artery, or posterior (P1/P2) cerebral artery) occlusions. Scans of patients with the latter occlusion types will nevertheless be processed by the algorithm once it is implemented in a clinical setting. We therefore chose to include patients from our real-world PRESTO cohort with occlusions other than ICA, ICA-T, M1 or M2, but classified them as patients without LVO in order to assess whether such occlusions interfere with real-world diagnostic performance. CTA data that were used for algorithm training were not included in the current assessment of diagnostic performance. A complete overview of patient inclusion and exclusion criteria is outlined per cohort in online supplemental figure 1.
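The imaging criteria and the reference labelling rule above can be written down as a small check. The following is a minimal sketch in Python, not the study's selection code: the class, field, and function names are hypothetical, and only the thresholds and the list of target segments are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CtaSeries:
    # Hypothetical field names, chosen for illustration only.
    orientation: str            # e.g. "axial"
    slice_thickness_mm: float
    slice_increment_mm: float
    matrix_rows: int
    matrix_cols: int
    full_head_coverage: bool


def meets_imaging_criteria(s: CtaSeries) -> bool:
    """Apply the imaging inclusion criteria as listed in the text."""
    return (
        s.orientation.lower() == "axial"
        and 0.2 <= s.slice_thickness_mm <= 3.0
        # No excess z-spacing: increment must not exceed thickness.
        and s.slice_increment_mm <= s.slice_thickness_mm
        and s.matrix_rows >= 512
        and s.matrix_cols >= 512
        and s.full_head_coverage
    )


# Segments the algorithm was trained to detect (from the text).
TARGET_SEGMENTS = {"ICA", "ICA-T", "M1", "M2"}


def reference_has_lvo(most_proximal_occlusion: Optional[str]) -> bool:
    """Occlusions outside the target segments count as 'no LVO'."""
    return most_proximal_occlusion in TARGET_SEGMENTS
```

Under this rule a PRESTO patient with, for example, an isolated basilar artery occlusion enters the analysis as a patient without LVO, so an alert in that patient counts as a false positive.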
Reference LVO definition
CTAs were evaluated for the presence and location of LVO by imaging core labs consisting of 3 neuroradiologists and 10 interventional neuroradiologists (5–20 years of experience) who were blinded to algorithm output and to all clinical data with the exception of the symptomatic side of stroke symptoms. The most proximal occlusion sites scored by core lab observers were defined as follows: extracranial ICA from the cervical segment to the clinoid segment; intracranial ICA from the clinoid segment to the ICA terminus; ICA terminus (ICA-T); M1-middle cerebral artery (MCA) from the ICA bifurcation to the MCA bifurcation; M2-MCA from the MCA bifurcation to where the vessels turn from the insula or exit the Sylvian fissure.18 Proximal occlusion sites used as reference location in this study included the intracranial ICA/ICA-T, M1-MCA, and M2-MCA. In patients with an extracranial ICA occlusion and concomitant intracranial tandem lesion, the most proximal intracranial occlusion site was taken as the reference location.
Automated LVO detection
The commercially available LVO detection algorithm (StrokeViewer v2.1.22, NICO.LAB, Amsterdam, The Netherlands) evaluated here is based on a deep learning convolutional neural network and runs within a web-based application hosted on a cloud platform. All CTA series were uploaded in Digital Imaging and Communications in Medicine (DICOM) format and processed separately. The algorithm indicated whether an occlusion was present via a binary output (ie, LVO detected: ‘Yes’ or ‘No’). In case of a positive LVO finding, an occlusion box was centered around the proximal occlusion site and shown using maximum intensity projection reconstructions in axial, coronal and sagittal views (figure 1). The threshold for detection of LVO was fixed at a single cut-off value by the developers of the algorithm and could not be adjusted.
Algorithm outcome and image quality assessment
All results generated by the algorithm were inspected by an independent observer who was blinded to all core lab imaging assessments and clinical data. In case of a positive LVO finding, the observer noted the hemisphere and the vessel segment (intracranial ICA/ICA-T, M1, or M2) on which the occlusion box was placed. Cases in which the occlusion box was not correctly placed (eg, in brain parenchyma or in the unaffected hemisphere) were classified separately (figure 2). Processing time was recorded as the time between receiving the message that the CTA series had been successfully uploaded and receiving the results.
The CTA scan phase was classified into one of five phases using a previously described method for which interobserver agreement has also been determined (weighted κ 0.87).19 20 For the current study, scans were grouped into arterial (early arterial and peak arterial), equilibrium, or venous (peak venous and late venous) phase. Information on slice thickness, slice overlap, and peak kilovoltage was extracted from DICOM tags.
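The grouping of the five scan phases into three analysis groups amounts to a fixed lookup. A minimal sketch in Python, with hypothetical names, assuming the phase labels are stored as the strings used in the text:

```python
# Five-phase classification mapped to the three groups used for analysis.
PHASE_GROUP = {
    "early arterial": "arterial",
    "peak arterial": "arterial",
    "equilibrium": "equilibrium",
    "peak venous": "venous",
    "late venous": "venous",
}


def group_phase(phase: str) -> str:
    """Return the analysis group for a five-phase label (case-insensitive)."""
    return PHASE_GROUP[phase.strip().lower()]
```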
Diagnostic performance for correct detection of LVO and correct assessment of the exact occlusion location was evaluated within each cohort. Performance was reported by means of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and area under the curve (AUC) as appropriate. In order to assess the impact of image quality on detection of LVO, we pooled data from the MR CLEAN Registry and PRESTO, and reported diagnostic performance stratified by scan acquisition parameters. Performance per occlusion location stratified by scan acquisition parameters could only be reliably assessed in the MR CLEAN Registry due to the large sample of patients with LVO and heterogeneity of scan protocols used in this cohort. To allow comparison of the performance of the current algorithm with that described in prior studies,12–14 we performed a sensitivity analysis in which we excluded patients with M2 occlusions and assessed performance for detection of LVO based on correct identification of the affected hemisphere but not exact occlusion location. Statistical differences in AUC were evaluated using DeLong’s method.21 Results are reported with corresponding 95% confidence intervals (95% CI). Statistical analyses were performed using R statistical software (version 3.6.1).
CTAs of 1110 patients in the MR CLEAN Registry (median age (IQR), 71 years (60–80); 584 men; 1110 with LVO) and of 646 patients in PRESTO (median age (IQR) 73 years (62–82); 358 men; 141 with and 505 without LVO) were successfully processed. Detailed patient and imaging characteristics are summarized per cohort in online supplemental table 1. Mean processing time of the algorithm was 4 min and 59 s (SD±1 min and 12 s).
Assessment of diagnostic performance was based on correct identification of the exact location of an LVO or the absence of LVO. In the MR CLEAN Registry, 992/1110 LVOs were correctly identified by the algorithm resulting in a sensitivity of 89% (95% CI 87% to 91%) (table 1). The algorithm incorrectly indicated absence of LVO in 46 patients, and in 72 patients the algorithm correctly indicated that LVO was present, but the occlusion box was incorrectly placed (online supplemental table 2). In PRESTO the algorithm correctly identified 102/141 patients with LVO and 392/505 patients without LVO. This resulted in a sensitivity of 72% (95% CI 64% to 80%), specificity of 78% (95% CI 74% to 81%), PPV of 47% (95% CI 41% to 54%), NPV of 91% (95% CI 88% to 93%), and AUC of 0.75 (95% CI 0.71 to 0.79) (table 1). The algorithm incorrectly indicated that LVO was absent in 29 patients with LVO and correctly indicated that LVO was present in 10 patients, but with incorrect placement of the occlusion box (online supplemental table 3). A total of 113 false positives were counted in patients without LVO, of which the majority were M2 occlusions (61/113, 54.0%) (online supplemental table 3).
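The PRESTO point estimates follow directly from the reported counts. The sketch below, in Python with illustrative names (the study's own analysis was done in R), recomputes them. Patients with LVO in whom the occlusion box was misplaced are counted as false negatives, as in the text. The AUC line is an assumption: the article does not state how AUC was derived, but for an algorithm with a single fixed threshold the ROC curve has one operating point, and the trapezoidal AUC then equals the mean of sensitivity and specificity, which reproduces the reported 0.75.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard 2x2 diagnostic accuracy measures."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Assumption: single operating point, trapezoidal AUC.
        "auc": (sensitivity + specificity) / 2,
    }


# PRESTO counts from the text: 102 correct detections; 29 missed LVOs plus
# 10 LVOs with a misplaced box (39 false negatives); 392 true negatives;
# 113 false positives.
presto = diagnostic_metrics(tp=102, fn=29 + 10, tn=392, fp=113)

for name, value in presto.items():
    print(f"{name}: {value:.3f}")
```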
In the sensitivity analysis, patients with M2 occlusions were excluded, so that performance reflects detection of ICA/ICA-T and M1 occlusion only, and detection was considered correct when the algorithm identified the affected hemisphere, irrespective of the exact occlusion location. This resulted in a sensitivity of 96% (912/952; 95% CI 94% to 97%) in the MR CLEAN Registry, and a sensitivity of 93% (71/76; 95% CI 85% to 98%) and specificity of 78% (392/505; 95% CI 74% to 81%) in PRESTO.
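The difference between the main analysis and the sensitivity analysis lies in what counts as a correct detection. A minimal sketch in Python of the two scoring rules, under stated assumptions: the type and function names are hypothetical, and treating "exact location" as a match on both hemisphere and vessel segment is an interpretation of the text rather than a documented rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    side: str      # "left" or "right"
    segment: str   # "ICA/ICA-T", "M1", or "M2"


def correct_exact(reference: Finding, output: Optional[Finding]) -> bool:
    """Main analysis: hemisphere and vessel segment must both match."""
    return (
        output is not None
        and output.side == reference.side
        and output.segment == reference.segment
    )


def correct_hemisphere(reference: Finding, output: Optional[Finding]) -> bool:
    """Sensitivity analysis: only the affected hemisphere must match."""
    return output is not None and output.side == reference.side


# Example: reference left M1, algorithm box placed on the left ICA/ICA-T.
ref = Finding("left", "M1")
out = Finding("left", "ICA/ICA-T")
print(correct_exact(ref, out), correct_hemisphere(ref, out))
```

An output of `None` stands for "LVO detected: No". The looser hemisphere rule can only raise sensitivity relative to the exact-location rule, which is in line with the higher figures reported in the sensitivity analysis.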
Sensitivity per occlusion location
For ICA/ICA-T occlusion, the algorithm yielded a sensitivity of 88% (243/276; 95% CI 84% to 92%) in the MR CLEAN Registry and 80% (12/15; 95% CI 52% to 96%) in PRESTO (table 2). The highest detection rate was observed for M1 occlusion with a sensitivity of 94% (636/676; 95% CI 92% to 96%) in the MR CLEAN Registry and 95% (58/61; 95% CI 86% to 99%) in PRESTO. For M2 occlusion, the detection rate was lower than for the other vessel segments and differed between the two study cohorts, with a sensitivity of 72% (113/158; 95% CI 64% to 78%) in the MR CLEAN Registry and 49% (32/65; 95% CI 37% to 62%) in PRESTO. In patients who had an extracranial ICA occlusion with a concomitant intracranial tandem lesion, the algorithm correctly detected 35/40 (87.5%) intracranial lesions.
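The width of these intervals depends strongly on the number of patients per location. The article does not state which interval method was used; the sketch below assumes an exact (Clopper-Pearson) binomial interval, which is consistent with the intervals reported for the small PRESTO subgroups. It is written in Python using only the standard library, with illustrative function names.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p). Intended for small n only."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target: float, increasing: bool) -> float:
    """Solve func(p) = target on [0, 1] for a monotone func."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x: int, n: int, alpha: float = 0.05) -> tuple:
    """Clopper-Pearson interval for x successes out of n."""
    # Lower bound: p at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    # Upper bound: p at which P(X <= x) equals alpha/2.
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


# PRESTO subgroups from the text: ICA/ICA-T 12/15 and M1 58/61.
for detected, total in [(12, 15), (58, 61)]:
    low, high = exact_ci(detected, total)
    print(f"{detected}/{total}: {100 * low:.0f}% to {100 * high:.0f}%")
```

With only 15 ICA/ICA-T occlusions in PRESTO, the interval spans more than 40 percentage points, which is why the per-location estimates in that cohort should be read with more caution than those from the MR CLEAN Registry.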
Impact of scan acquisition parameters on performance
Slice thickness of ≥2 mm had a negative impact on diagnostic performance of the algorithm compared with <1 mm (AUC 0.71 vs 0.83, p<0.01) and 1–2 mm scans (AUC 0.71 vs 0.85, p<0.01) (online supplemental table 4). Lower diagnostic performance was also observed for the venous scan phase compared with equilibrium (AUC 0.75 vs 0.87, p=0.02), but not compared with the arterial scan phase (AUC 0.75 vs 0.82, p=0.14). Sensitivity per occlusion location within different subgroups was only evaluated within the MR CLEAN Registry. This revealed that increasing slice thickness, no slice overlap, and later scan phase resulted in a lower sensitivity for detection of M2 occlusion but not for detection of ICA/ICA-T and M1 occlusion (online supplemental table 5).
This study evaluated the diagnostic performance of an automated LVO detection algorithm based on deep learning in a large cohort of patients with and without LVO, demonstrating an overall high performance for the detection of intracranial LVOs. Differences in detection rate were seen between occlusion sites and based on image acquisition parameters.
Studies on the diagnostic performance of human readers generally show a high detection rate for occlusions in the ICA/ICA-T and M1 segments, with sensitivities ranging from 89% to 97%,5 22 which is comparable to the sensitivity found here. Human diagnostic error for more distal, in particular M2 occlusions, is notably higher with a reported sensitivity of only 65% in one study,7 similar to the sensitivity of local radiologists in PRESTO.23 This indicates a large potential for improvement of detection of M2 occlusion. For the algorithm evaluated here, we found a clear difference in detection of M2 occlusion between both cohorts. This was most likely the result of the selection of the MR CLEAN Registry population, where all occlusions, including M2 occlusions, were already identified by human readers and where patients were referred for EVT. In contrast, PRESTO represents a real-world stroke cohort including patients prior to imaging assessment and reflects the distribution of LVOs as encountered in daily clinical practice. As a consequence a broader spectrum of M2 occlusions is included in PRESTO, even those more difficult to detect for human readers. This makes it a more suitable target population for evaluating the diagnostic performance of the algorithm in a real-world setting.24 The sensitivity of the algorithm for detection of M2 occlusion in PRESTO was lower than that of human readers.
The algorithm also provided a lower specificity than human readers (78% vs 86–97%).5 22 When evaluating the diagnostic performance of LVO detection algorithms, however, it is important to put performance measures into a clinical context and thereby also consider the prior probability of LVO in patients undergoing CTA due to suspected acute ischemic stroke.25 For LVO detection, a false positive result means the radiologist and stroke team wrongfully receive an alert of a potential LVO finding on CTA prompting fast imaging assessment. A false negative result wrongfully indicates no LVO is present, potentially providing false reassurance and delaying further CTA evaluation by a radiologist. While false positives may be a nuisance for clinicians, false negatives may delay initiation of treatment and possibly be harmful for patients. Efforts should therefore be aimed at achieving a high sensitivity for detecting LVOs along with an acceptable specificity. On the other hand, previous studies including PRESTO have shown that the prior probability of anterior circulation LVO on CTA in patients with suspected acute ischemic stroke is relatively low and lies within the range of 16–21%.23 26 This means that, despite the specificity of 78% of the current algorithm, true positives will occur just as frequently as false positives when implementing this algorithm in a real-world setting, as indicated by the PPV of 47% in PRESTO.
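The dependence of PPV on prior probability can be made explicit with Bayes' rule. A minimal sketch in Python, with an illustrative function name, using the PRESTO sensitivity and specificity from the text; the prevalence values passed in are the PRESTO prevalence (141/646) and the two ends of the 16–21% range cited above.

```python
def ppv_from_prevalence(sensitivity: float, specificity: float,
                        prevalence: float) -> float:
    """Positive predictive value via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


sens = 102 / 141   # PRESTO sensitivity
spec = 392 / 505   # PRESTO specificity

for prev in (0.16, 0.21, 141 / 646):
    ppv = ppv_from_prevalence(sens, spec, prev)
    print(f"prevalence {prev:.3f} -> PPV {ppv:.3f}")
```

At the PRESTO prevalence this reproduces the observed PPV of 102/215, or about 47%, which means roughly one true alert for every false alert. At lower prevalence the PPV drops further even though sensitivity and specificity are unchanged.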
An elegant feature of the current algorithm mitigating this issue is placement of a box around the exact location where it detects an occlusion. This direct detection method allows inspection of what triggered the algorithm to come to its decision, providing users with transparency and directing them to (pathological) features that led to the output.27 By doing so, users can quickly distinguish true positive from false positive results. Other algorithms notify users in case of a positive LVO finding and provide more indirect information (eg, brain regions with reduced vessel density) on how the decision was reached.10 13 28 The current algorithm thus has the potential to aid in locating the exact occlusion site. This can be especially useful for less experienced readers and possibly aid in the early detection of LVO, thereby also accelerating diagnosis. It further allows remote access to output both at the primary stroke center and also the nearest EVT-capable intervention center. This may help to expedite decision-making about EVT and enrollment in clinical trials. Such algorithms thus hold the potential to increase patient benefit of EVT as the treatment of LVO is known to be highly time sensitive.3
Recent studies have reported performance metrics of other commercially available LVO detection algorithms. For detection of LVO, a sensitivity of 96% and specificity of 98% have been reported for the RAPID-LVO,12 a sensitivity of 82% and specificity of 90% for Viz LVO,13 and a sensitivity of 84% and specificity of 96% for e-CTA.14 However, direct comparisons of performance of these algorithms with the current algorithm are difficult to make due to discrepancies in study design. Studies evaluating RAPID-LVO and Viz LVO excluded M2 occlusions in their analyses, for which it has been shown that these algorithms yield lower detection rates.13 28 In addition, diagnostic performance was based on either the presence or absence of LVO with12 14 or without13 correct identification of the affected hemisphere, whereas we assessed performance based on correct identification of the exact location of LVO or the absence of LVO. Not including M2 occlusion as LVO and assessment of performance not based on the exact location of the occlusion leads to higher detection rates of LVO as shown in our sensitivity analysis. Other factors contributing to differences in performance are varying inclusion criteria and patient populations. Some studies used curated datasets12 14 and others a real-world stroke population.13 This may lead to differences in the distribution of LVO locations and, because of varying detection rates by occlusion location, overall performance measures. As demonstrated in the current study, the sensitivity of the algorithm for detection of LVO was considerably higher in the MR CLEAN Registry compared with PRESTO, mainly due to the higher proportion and broader spectrum of M2 occlusions in the latter cohort.
However, diagnostic performance of LVO detection algorithms should preferably be assessed in a real-world stroke population such as PRESTO as it provides a more reliable estimation of the potential of the algorithm in a clinical setting. Nevertheless, a benefit of using the MR CLEAN Registry here was that CTAs were acquired with a variety of acquisition protocols. This allowed us to show that image acquisition parameters such as slice thickness and CTA scan phase significantly impact algorithm performance, and that high-quality input data are a prerequisite for adequate diagnostic performance. This was most evident for the detection of M2 occlusions, likely due to the smaller caliber, branching pattern, and tortuosity of these vessels, making vascular segmentation more susceptible to errors. Especially for M2 occlusions, it is possible that other acquisition schemes such as multiphase CTA lead to better detection by automated algorithms,10 as is seen for M2 occlusion detection by human readers.29
The strengths of this study include the large sample size of LVOs, allowing us to assess diagnostic performance both for overall detection of an LVO and per individual occlusion location with sufficient precision. By including CTAs from a variety of hospitals (>50) using different acquisition protocols, we were able to investigate the impact of scan acquisition parameters on performance. Also, the current evaluation was conducted independently of commercial developers and their affiliates. A limitation of this study is that the evaluation was carried out retrospectively and we were therefore not able to assess the impact of the current LVO detection algorithm on decision-making and treatment parameters.30 Also, we were not able to reliably compare performance of the current algorithm to those described by others mainly due to the use of different test sets. If feasible, head-to-head comparisons of different algorithms within the same test set will ultimately allow for more unbiased and reliable comparisons.
The algorithm we evaluated here has a high sensitivity for the detection of proximal anterior circulation LVOs (ICA/ICA-T and M1) on CTA. Its sensitivity for M2 occlusion is lower than that of human assessment in a real-world setting, and future efforts should specifically target improvement of M2 occlusion detection. Given this and the lower specificity of the algorithm compared with human readers, critical CTA evaluation by radiologists remains crucial irrespective of algorithm output.
The Institutional Review Board of the Erasmus MC University Medical Center evaluated the MR CLEAN Registry study protocol and granted permission to carry out the study as a registry (MEC-2014–235), and approved PRESTO (MEC-2018–1012). Necessity of written informed consent was waived for both the MR CLEAN Registry and PRESTO.
The authors thank NICO.LAB (Amsterdam, The Netherlands) and, in particular, Razmara Nizak, Renan Sales Barros, and Merel Boers for providing access to the LVO detection algorithm and their technical assistance throughout the course of this study.
SPRL and LW contributed equally.
Collaborators MR CLEAN Registry and PRESTO Investigators: Robert J van Oostenbrugge; Jelis Boiten; Charles B.L.M. Majoie; Yvo BWEM Roos; Jan Albert Vos; Ivo GH Jansen; Maxim JHL Mulder; Robert-Jan B Goldhoorn; Kars CJ Compagne; Manon Kappelhof; Wouter J Schonewille; Jonathan M Coutinho; Marieke JH Wermer; Marianne AA van Walderveen; Julie Staals; Jasper M Martens; Bart J Emmer; Sebastiaan F de Bruijn; Lukas C van Dijk; Bart van der Worp; Rob H Lo; Ewoud J van Dijk; Hieronymus D Boogaarts; Paul LM de Kort; Julia van Tuijl; Jo JP Peluso; Jan SP van den Berg; Boudewijn AAM van Hasselt; Leo AM Aerden; René J Dallinga; Maarten Uyttenboogaart; Omid Eshghi; Tobien HCML Schreuder; Roel JJ Heijboer; Koos Keizer; Heleen M den Hertog; Emiel JC Sturm; Marieke ES Sprengers; Sjoerd FM Jenniskens; René van den Berg; Albert J Yoo; Ludo FM Beenen; Alida A Postma; Stefan D Roosendaal; Bas FW van der Kallen; Ido R van den Wijngaard; Joost Bot; Pieter-Jan van Doormaal; Zwenneke Flach; Hester F Lingsma; Naziha el Ghannouti; Martin Sterrenberg; Corina Puppels; Wilma Pellikaan; Rita Sprengers; Marjan Elfrink; Joke de Meris; Tamara Vermeulen; Annet Geerlings; Gina van Vemde; Tiny Simons; Cathelijn van Rijswijk; Gert Messchendorp; Hester Bongenaar; Karin Bodde; Sandra Kleijn; Jasmijn Lodico; Hanneke Droste; M Wollaert; D Jeurrissen; Ernas Bos; Yvonne Drabbe; Nicoline Aaldering; Berber Zweedijk; Mostafa Khalilzada; Esmee Venema; Vicky Chalos; Ralph R Geuskens; Tim van Straaten; Saliha Ergezen; Roger RM Harmsma; Daan Muijres; Anouk de Jong; Wouter Hinsenveld; Olvert A Berkhemer; J Huguet; PFC Groot; Marieke A Mens; Katinka R Kranendonk; Kilian M Treurniet; Manon L Tolhuijsen; Heitor Alves; Anouk D Rozeman; Frédérique H. Vermeij; Kees CL Alblas; Laus JMM Mulder; Annemarie D Wijnhoud; Lisette Maasland; Roeland PJ van Eijkelenburg; Marileen Biekart; ML Willeboer; Bianca Buijck; Jeannette Bakker; Jan-Hein Hensen; Aarnout Plaisier; Amber Hoek; Erick Oskam; Mandy MA van der Zon; Egon D Zwets; Jan Willem Kuiper; Bruno JM van Moll; Mirjam Woudenberg; Arnoud M de Leeuw; Anja Noordam-Reijm; Timo Bevelander; Vicky Chalos; Eveline JA Wiegers; Dennis C van Kalkeren; Jochem van den Biggelaar
Contributors SPRL, LW and AvdL were responsible for study concept and design. SPRL, LW, AvdL, MHCD, PJvD, WM, HK, GJLaN, RPHB, LSFY, JH, WHvZ, ACGMvE, DWJD, BR were responsible for or contributed to data acquisition. SPRL, LW and AvdL were responsible for analysis and interpretation of the data and drafting the manuscript. MHCD, PJvD, WM, HK, GJLaN, RPHB, LSFY, JH, WHvZ, ACGMvE, DWJD, BR were responsible for critical revision of the manuscript.
Funding The MR CLEAN Registry was partly funded by TWIN Foundation, Erasmus MC University Medical Center, Maastricht University Medical Center, and Amsterdam University Medical Center. PRESTO was funded by BeterKeten Collaboration and Theia Foundation (Zilveren Kruis).
Competing interests WHvZ reports grants from Stryker and Cerenovus, all paid to the institution. DWJD reports funding from the Dutch Heart Foundation, Brain Foundation Netherlands, The Netherlands Organisation for Health Research and Development, Health Holland Top Sector Life Sciences and Health, and unrestricted grants from Penumbra, Stryker, Medtronic, Thrombolytic Science, LLC, and Cerenovus, all paid to the institution. AvdL reports grants from Penumbra, Stryker, Cerenovus, and Medtronic, all paid to the institution.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.