Original research
Fully automated intracranial aneurysm detection and segmentation from digital subtraction angiography series using an end-to-end spatiotemporal deep neural network
  1. Hailan Jin1,
  2. Jiewen Geng2,3,
  3. Yin Yin1,
  4. Minghui Hu1,
  5. Guangming Yang1,
  6. Sishi Xiang2,3,
  7. Xiaodong Zhai2,3,
  8. Zhe Ji2,3,
  9. Xinxin Fan4,
  10. Peng Hu2,3,
  11. Chuan He2,3,
  12. Lan Qin1,
  13. Hongqi Zhang2,3
  1. 1 Department of R&D, UnionStrong (Beijing) Technology Co. Ltd, Beijing, China
  2. 2 China International Neuroscience Institute (China-INI), Beijing, China
  3. 3 Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
  4. 4 Department of Neurosurgery, Xi'an NO.3 Hospital, the Affiliated Hospital of Northwest University, Xi'an, Shaanxi Province, China
  1. Correspondence to Dr Hongqi Zhang, Department of Neurosurgery, Xuanwu Hospital, Beijing 100176, China; xwzhanghq{at}163.com; Dr Lan Qin; qinlan{at}unionstrongtech.com

Abstract

Background Intracranial aneurysms (IAs) are common in the population and may cause death.

Objective To develop a new fully automated detection and segmentation deep neural network based framework to assist neurologists in evaluating and contouring intracranial aneurysms from 2D+time digital subtraction angiography (DSA) sequences during diagnosis.

Methods The network structure is based on a general U-shaped design for medical image segmentation and detection. The network includes a fully convolutional technique to detect aneurysms in high-resolution DSA frames. In addition, a bidirectional convolutional long short-term memory module is introduced at each level of the network to capture the change in contrast medium flow across the 2D DSA frames. The resulting network incorporates both spatial and temporal information from DSA sequences and can be trained end-to-end. Furthermore, deep supervision was implemented to help the network converge. The proposed network structure was trained with 2269 DSA sequences from 347 patients with IAs. After that, the system was evaluated on a blind test set with 947 DSA sequences from 146 patients.

Results Of the 354 aneurysms, 316 (89.3%) were successfully detected, corresponding to a patient level sensitivity of 97.7% at an average false positive number of 3.77 per sequence. The system runs in less than one second per sequence, with an average Dice coefficient score of 0.533.

Conclusions This deep neural network assists in successfully detecting and segmenting aneurysms from 2D DSA sequences, and can be used in clinical practice.

  • aneurysm
  • angiography
  • technique

Introduction

Intracranial aneurysms (IAs) are abnormal focal dilatations in the vessel wall of cerebral arteries, which result from a weakness in the intima. Rupturing of an IA causes subarachnoid hemorrhage, which can be life-threatening. IAs are relatively common in the population, with an incidence of >3% of the adult population worldwide and of 7% in the Chinese population.1 2 The annual rate of rupture of unruptured intracranial aneurysms (UIAs) is about 0.95% according to the Unruptured Cerebral Aneurysm Study (UCAS).3 According to one study, the annual rate of UIA rupture ranges between 0.4% and 17.8%, based on different patient characteristics.4 So, the detection of IA is considered important.

Digital subtraction angiography (DSA) has long been considered the 'gold standard' for diagnosing IA. DSA is recommended for identification and evaluation of IA when surgery or endovascular treatment is considered.1 Compared with computed tomographic angiography (CTA) and magnetic resonance angiography (MRA), DSA provides higher resolution and detection sensitivity, especially for IAs smaller than 3 mm.1

For patients with subarachnoid hemorrhage, it is necessary to obtain two-dimensional DSA (2D DSA) in anteroposterior (AP) and lateral views of the four intracranial arteries (left and right common carotid artery and left and right vertebral artery). If an IA is found on an artery, three-dimensional DSA (3D DSA) should be performed immediately to reconstruct the entire artery. The reconstructed 3D DSA provides more information to assist the choice of treatment.

A 2D DSA image sequence is recorded in DICOM format to document the entire flow of a contrast agent through an artery, which usually lasts for 3–15 s at a sampling rate of 3–5 frames per second. During imaging, the patient's head should be in a stable position so that a clear imaging sequence can be obtained. Doctors must check the whole 2D DSA sequence to identify IAs. The inspection process may take longer than expected, and errors can occur owing to the location and shape of the IA. Small IAs are sometimes misdiagnosed. Hence, in this paper, a method based on deep neural networks is proposed to solve the problem of IA detection and segmentation in 2D+time DSA sequences. This, in turn, helps doctors to determine the presence of an IA in the DSA sequence more rapidly and in a progressive manner.

Methods

Dataset

This retrospective study obtained approval from the institutional review board, and informed consent was waived in compliance with the Accountability Act. Inclusion criteria were adult patients (aged between 18 and 90 years) with an aneurysm who were scheduled for treatment according to their imaging data obtained in hospital; aneurysms present after treatment were not included in our study. We retrospectively collected 3216 DSA sequences from 493 consecutive patients imaged by GE, Philips, and Siemens DSA devices between April 2016 and April 2018. These patients were divided into a training set, validation set, and test set as shown in figure 1A. Each patient's data contained DSA sequences of AP and lateral views from at least one cerebral vessel. All patient-related information was removed from the DICOM images. If an IA was visible, two experienced neurologists outlined its boundary on the DSA frame where it was most visible and then discussed the contours to reach a unified result. Occasionally, a third neurologist made the decision when the first two did not agree. A total of 1205 IAs were labeled. The average diameter of the contoured IAs was 7.4 mm, with the largest being 40 mm and the smallest 1.3 mm. The average number of frames per DSA sequence was 25. All the frames were rescaled to 512×512 with a pixel spacing of 0.391 mm, giving a field of view of 200×200 mm². Only the 16 frames around the central frame were used, because the frame with the most visible IA is unlikely to be found at the beginning or end of the DSA sequence.

Figure 1

(A) Study inclusion flowchart schema. (B) Design of the proposed end-to-end network for automated IA detection and segmentation. The network inputs an n-frame DSA sequence and outputs a mask with the same frame size, indicating the locations and shapes of the detected and segmented IAs within the input DSA sequences. (C) Implementation of deep supervision. In addition to the final mask output from the top convolution block, the outputs of the two blocks below the top output block are also upsampled to compare with the ground truth mask and contribute to the target loss. This technique provides deeper supervision of the network during training. BiConv LSTM, bidirectional convolutional long short-term memory; IA, intracranial aneurysm.

Before input to the network, the pixel intensities of the DSA sequence were normalized locally by making the local mean 0 and SD 1. Next, 347 patients with 2269 DSA sequences were used to train and validate the proposed network, and the accuracy test was performed on the remaining 146 patients with 947 DSA sequences. When training the network, the input frames and the corresponding ground truth masks were cropped to 256×256. When testing on the test set, we input the original 512×512 frames and the network ran in a fully convolutional mode.
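The preprocessing steps described above (keeping 16 frames around the central frame, normalizing intensities, and cropping to 256×256 for training) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, normalization is applied over the whole sequence because the text does not define the local window, and the crop position is drawn at random because the text does not say how it was chosen.

```python
import numpy as np

def select_central_frames(seq, n_frames=16):
    """Keep n_frames frames around the middle of a (T, H, W) sequence."""
    t = seq.shape[0]
    if t <= n_frames:
        return seq
    start = (t - n_frames) // 2
    return seq[start:start + n_frames]

def normalize(seq, eps=1e-8):
    """Scale intensities to mean 0 and SD 1 (whole-sequence statistics: an assumption)."""
    seq = seq.astype(np.float32)
    return (seq - seq.mean()) / (seq.std() + eps)

def random_crop(seq, mask, size=256, rng=None):
    """Cut the same size x size window out of every frame and out of the mask."""
    rng = np.random.default_rng() if rng is None else rng
    _, h, w = seq.shape
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    return (seq[:, top:top + size, left:left + size],
            mask[top:top + size, left:left + size])

rng = np.random.default_rng(0)
sequence = rng.random((25, 512, 512))        # synthetic stand-in for a 25-frame DSA run
mask = np.zeros((512, 512), dtype=np.uint8)  # synthetic ground truth mask
frames = normalize(select_central_frames(sequence))
crop, crop_mask = random_crop(frames, mask, rng=rng)
print(frames.shape, crop.shape, crop_mask.shape)
```

At test time the cropping step would be skipped and the full 512×512 frames passed to the network.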

Machine learning

The proposed end-to-end network structure is presented in figure 1B. The network architecture was created by Keras-2.2.0 with TensorFlow-1.4.0 backend.5 The network inputs the entire DSA sequence and outputs a 2D mask image with the same frame size, indicating the position and shape of the detected and segmented IAs in the input DSA sequence.

The network includes four key components: (a) a U-shaped fully convolutional structure for image detection and segmentation, (b) convolutional long short-term memory (ConvLSTM) modules to learn both spatial and temporal information across the DSA frames, (c) a bidirectional design of ConvLSTM to incorporate information from different sequence input orders, and (d) a deep supervision design to help the model converge.

U-shaped fully convolutional structure

The U-shaped design of the network structure is similar to the popular U-Net in medical image detection and segmentation.6 On one side of the network, the input 256×256 image is encoded to 8×8×512 features through six convolution blocks and five downsampling layers. On the other side of the network, the encoded features are decoded to a 256×256 binary mask through another six convolution blocks and five upsampling layers. If there is an IA region, then the target value of the binary mask pixel remains 1, otherwise it is 0. The encoder and decoder of our network are symmetric with the same number of convolution layers at each level.
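The encoder sizes quoted above follow from simple arithmetic, shown in the short sketch below. It assumes that each downsampling layer halves the spatial size, which is consistent with 256 being reduced to 8 over five layers; the function name is hypothetical.

```python
def encoder_spatial_sizes(input_size=256, n_downsamplings=5, factor=2):
    """Spatial size of the feature map after each downsampling layer."""
    sizes = [input_size]
    for _ in range(n_downsamplings):
        sizes.append(sizes[-1] // factor)
    return sizes

print(encoder_spatial_sizes())     # [256, 128, 64, 32, 16, 8]
print(encoder_spatial_sizes(512))  # [512, 256, 128, 64, 32, 16]
```

The second call shows what happens with a full 512×512 test frame: the deepest feature map becomes 16×16 rather than 8×8, which a fully convolutional network can handle without modification.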

The idea of a fully convolutional network is to replace fully connected layers with equivalent convolution layers,7 so that the network can work on arbitrarily sized input images. The network we proposed has convolution blocks, each containing two convolution layers with 3×3 convolution kernels. The downsampling and upsampling layers are operations that are independent of the input feature map size. In our network, all convolution layers are followed by group normalization layers and squeeze and excitation layers to further assist model convergence and boost model performance.8 9

Convolutional long short-term memory module

ConvLSTM is a variant of the traditional long short-term memory (LSTM).10 Instead of using fully connected layers to connect the input and hidden states densely, ConvLSTM applies local convolutions. The advantage is that ConvLSTM involves fewer parameters than LSTM, especially when the input is a high-dimensional image. Also, with the replacement of fully connected layers with convolution layers, ConvLSTM works in a fully convolutional way. The information extracted from the frames is stored in the hidden state.
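The parameter saving can be made concrete with a rough count. The sketch below uses the standard four-gate LSTM formulation and compares a fully connected LSTM on a flattened 8×8×512 feature map with a ConvLSTM on the same map. The hidden size of 512 channels and the 3×3 ConvLSTM kernel are assumptions made for illustration, as the text does not state them.

```python
def lstm_params(input_dim, hidden_dim):
    """Four gates, each with input and recurrent weight matrices plus a bias."""
    return 4 * ((input_dim + hidden_dim) * hidden_dim + hidden_dim)

def conv_lstm_params(in_channels, hidden_channels, kernel=3):
    """Four gates, each a kernel x kernel convolution over input and hidden channels."""
    weights = (in_channels + hidden_channels) * hidden_channels * kernel * kernel
    return 4 * (weights + hidden_channels)

flat = 8 * 8 * 512                      # flattened 8x8x512 feature map
dense = lstm_params(flat, flat)         # fully connected LSTM on the flattened map
conv = conv_lstm_params(512, 512, 3)    # ConvLSTM with assumed 3x3 kernels
print(dense, conv, dense // conv)
```

The convolutional count also does not depend on the spatial size of the feature map, which is why the module can run on both 256×256 crops and 512×512 frames.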

Bidirectional design

Note that in ConvLSTM the computed hidden states containing temporal information depend on the order of the input frames, and reading of a DSA sequence may be improved by reading the DSA frames in different orders. We concatenate the last hidden state of the ConvLSTM run in the frame order 0 to T−1 (the forward order hidden state) with the last hidden state of the ConvLSTM run in the frame order T−1 to 0 (the backward order hidden state), where T−1 is the last time step of a DSA sequence. The concatenated hidden state is output to the decoder, so that BiConvLSTM incorporates temporal information in both directions of the DSA frames, mimicking the neurologists’ real examination process of examining DSA sequences in both directions.
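The concatenation of forward and backward hidden states can be illustrated with a toy example. The recurrent step below is a deliberately simplified stand-in, not a real ConvLSTM cell, and all names are hypothetical; the point is only that the final state depends on frame order and that the two final states are stacked along the channel axis.

```python
import numpy as np

def toy_recurrent_step(h, x):
    """Simplified stand-in for a ConvLSTM cell: blend the old state with the new frame."""
    return np.tanh(0.5 * h + x)

def last_hidden_state(frames):
    """Run the recurrence over the frames in the given order and return the final state."""
    h = np.zeros_like(frames[0])
    for x in frames:
        h = toy_recurrent_step(h, x)
    return h

def bidirectional_state(frames):
    forward = last_hidden_state(frames)          # frame order 0 .. T-1
    backward = last_hidden_state(frames[::-1])   # frame order T-1 .. 0
    return np.concatenate([forward, backward], axis=0)  # stack along the channel axis

frames = np.random.default_rng(0).random((16, 1, 8, 8))  # (T, C, H, W), synthetic
state = bidirectional_state(frames)
print(state.shape)  # (2, 8, 8)
```

The concatenated state has twice as many channels as either direction alone, so the decoder layer that receives it must be sized accordingly.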

Deep supervision

Deep supervision is another useful training technique to improve the model performance as reported in liver segmentation.11 Our deep supervision structure during training is shown in figure 1C. Instead of directly optimizing the loss from the final output mask, two additional output masks are generated by upsampling the outputs of lower convolution blocks. The loss to optimize is a combination of the losses from all the three output masks when compared with the ground truth mask. In this experiment, 1.0, 0.5, and 0.2 were used as weighting parameters to combine the losses from the top output convolution block and the blocks one and two levels below, respectively.
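The weighted combination of the three losses can be sketched as below. This is a minimal illustration under stated assumptions: a soft Dice loss is used for each output, the lower-resolution outputs are upsampled by nearest-neighbour repetition (the text does not specify the upsampling method), and the function names are hypothetical. Only the weights 1.0, 0.5, and 0.2 come from the text.

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """1 - Dice coefficient between a probability map and a binary mask."""
    inter = float((pred * target).sum())
    return 1.0 - (2.0 * inter + eps) / (float(pred.sum()) + float(target.sum()) + eps)

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a 2D map (an assumed choice)."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def deep_supervision_loss(outputs, target, weights=(1.0, 0.5, 0.2)):
    """Weighted sum of losses; outputs are ordered from the top block downwards."""
    total = 0.0
    for out, w in zip(outputs, weights):
        factor = target.shape[0] // out.shape[0]
        total += w * soft_dice_loss(upsample_nearest(out, factor), target)
    return total

target = np.zeros((256, 256))
target[100:120, 100:120] = 1.0
perfect = [target.copy(), target[::2, ::2].copy(), target[::4, ::4].copy()]
empty = [np.zeros((256, 256)), np.zeros((128, 128)), np.zeros((64, 64))]
print(deep_supervision_loss(perfect, target), deep_supervision_loss(empty, target))
```

With three perfect outputs the combined loss is close to 0, and with three empty outputs it is close to 1.0 + 0.5 + 0.2 = 1.7.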

Loss function

The network was optimized with Dice coefficient (DSC)-based losses. We multiplied the imbalanced UIA sample weights by a factor of 100 in the DSC loss during training, similar to the step in ASDNet.12 We also used the idea of 'focal loss' with the modulating exponential factor of 2 to deal with difficult samples based on the samples’ output DSC.13

The network was run on an NVIDIA Tesla P100 GPU. The model was trained using the Adam optimizer with an initial learning rate of 1e-4. L2 regularization was applied with a weight decay factor of 1e-4 to all convolution kernel weight matrices. The training set was used to train the parameters of the proposed network, and the network with the best performance on the validation set was used for testing. The blind test set was withheld and only used at the end for evaluation of the model performance.

Quantitative analysis

An accuracy test was performed on the remaining 146 patients with 947 DSA sequences, and the test operators were blinded to previous labeling. In our case, every pixel value of the segmentation mask represents the probability that the pixel belongs to an aneurysm. Given a small threshold (0.01 in our case), the segmentation masks on the blind test set were transformed to binary masks. The regions of interest (ROIs) were extracted from the binary masks. The confidence of each localized ROI was defined as the average probability value of the segmentation mask within the ROI. If an ROI overlapped a manually labeled aneurysm, the ROI was marked as a true positive (TP), and otherwise as a false positive (FP). The free-response receiver operating characteristic (FROC) curve was plotted at the ROI level by calculating the detection sensitivities of all labeled IAs and the corresponding average number of FPs for each DSA view at all thresholds.14 The FROCs at vessel and patient level were also reported. For vessel sensitivity, an artery is a TP if it contains IAs and at least one IA on that artery is detected in one or more 2D DSA views. Similarly, a patient is a TP if that patient contains IAs and at least one IA is detected. The agreement analysis between the predicted masks and the hand-contoured ground truth labels was performed using the DSC and overlap ratio metrics on the detected aneurysms.
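The ROI-level evaluation described above can be sketched as follows: threshold the probability mask at 0.01, extract connected regions, assign each region its mean probability as confidence, and mark it as a TP if it overlaps a labeled aneurysm. This is an illustrative sketch with hypothetical function names; 4-connectivity is an assumption, as the text does not say how ROIs were separated.

```python
import numpy as np

def connected_regions(binary):
    """Label 4-connected regions of a 2D boolean array with a simple flood fill."""
    labels = np.zeros(binary.shape, dtype=int)
    count = 0
    for seed in zip(*np.nonzero(binary)):
        if labels[seed]:
            continue
        count += 1
        labels[seed] = count
        stack = [seed]
        while stack:
            r, c = stack.pop()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < binary.shape[0] and 0 <= nc < binary.shape[1]
                        and binary[nr, nc] and not labels[nr, nc]):
                    labels[nr, nc] = count
                    stack.append((nr, nc))
    return labels, count

def score_rois(prob_mask, truth_mask, threshold=0.01):
    """Return confidence and TP/FP status for every ROI above the threshold."""
    labels, count = connected_regions(prob_mask > threshold)
    rois = []
    for k in range(1, count + 1):
        region = labels == k
        rois.append({
            "confidence": float(prob_mask[region].mean()),
            "true_positive": bool((region & (truth_mask > 0)).any()),
        })
    return rois

def dice(pred, truth):
    """Dice coefficient between two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    return 2.0 * (pred & truth).sum() / (pred.sum() + truth.sum())

prob = np.zeros((32, 32))
prob[5:10, 5:10] = 0.9       # synthetic detection overlapping the label
prob[20:23, 20:23] = 0.3     # synthetic detection with no label underneath
truth = np.zeros((32, 32))
truth[6:11, 6:11] = 1
rois = score_rois(prob, truth)
print(rois, dice(prob > 0.5, truth))
```

Sweeping a cut-off over the ROI confidences and counting TPs and FPs at each value would give the points of an FROC curve.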

Results

Characteristics of dataset

Of the 347 patients in the model development set, 135 were men and 212 were women, with a mean age of 54.9±11.9 years. Overall, 268 patients underwent DSA on a GE machine, 61 on a Philips Medical Systems unit, and 18 on a Siemens Healthcare unit. Of the 851 aneurysms, 588 were located on the sidewall of an artery and 263 were bifurcation aneurysms. Most of these aneurysms were less than 5 mm in diameter. The diameter here refers to the maximum distance from the aneurysmal dome to the neck plane. Another 12 patients had non-saccular intracranial aneurysms, including dissecting aneurysms or fusiform aneurysms, and no distinction was made here.

The test dataset comprised a total of 146 patients (51 male, 95 female), with a mean age of 55.2±12.4 years. Similar to the model development, most of the patients underwent DSA examinations on GE machines. Of the 354 aneurysms, 236 were located on the sidewall and the remaining 118 were located on the bifurcation. Nearly half of the aneurysms were less than 5 mm in diameter, and of the remainder, most ranged between 5 mm and 15 mm. Giant aneurysms and non-saccular intracranial aneurysms accounted for only a small proportion (table 1).

Table 1

Dataset characteristics

Diagnostic accuracy

FROCs at the different levels are shown in figure 2. On the blind test set, the system detected 316 of 354 IAs from 146 patients, an ROI level sensitivity of 89.3% (bootstrapped 95% confidence interval (CI) 83.2% to 95.4%), at an average of 3.77 FPs per DSA view. At the same FP number, the model achieved a vessel level sensitivity of 94.3% (95% CI 89.8% to 98.8%), detecting at least one IA on 165 of the 175 arteries with IAs. The patient level sensitivity was 97.7% (95% CI 94.7% to 100.0%), which means that at least one IA was detected in 125 of 128 patients. Considering only the small IAs (≤3 mm) in the testing cohort, the ROI, vessel, and patient level sensitivities were 74.4% (95% CI 65.9% to 83.0%), 74.1% (95% CI 65.5% to 82.7%), and 76.0% (95% CI 67.5% to 84.4%), respectively. Considering only the large IAs (>3 mm), the ROI, vessel, and patient level sensitivities were 91.1% (95% CI 85.6% to 96.7%), 96.2% (95% CI 92.4% to 99.9%), and 98.3% (95% CI 95.7% to 100.0%) at the same FP number.
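The three headline sensitivities follow directly from the reported counts, as the short check below shows. The helper name is hypothetical; the counts are those given in the text.

```python
def sensitivity(detected, total):
    """Detection sensitivity as a percentage."""
    return 100.0 * detected / total

counts = {"ROI": (316, 354), "vessel": (165, 175), "patient": (125, 128)}
for level, (detected, total) in counts.items():
    print(f"{level}: {sensitivity(detected, total):.1f}%")
# ROI: 89.3%
# vessel: 94.3%
# patient: 97.7%
```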

Figure 2

(A) ROI level FROC of the network detection output on the test set. (B) Vessel level FROC of the network detection output on the test set. (C) Patient level FROC of the network detection output on the test set. FROC, free-response receiver operating characteristic; ROI, region of interest.

Agreement analysis

Qualitatively, the segmented masks for the aneurysms had notable overlap with the manual aneurysm labels. We achieved a mean DSC of detected IAs of 0.533 and a mean overlap ratio of detected IAs of 0.875.

Ablation study

The impact of the bidirectional and deep supervision designs was tested using ablation studies. Three different models with the same number of layers, number of hidden states, and number of kernels in each layer were trained. Model 1 had the vanilla ConvLSTM with forward order hidden states on the U-Net structure; model 2 was a BiConvLSTM model that added backward order hidden states concatenated to the forward order hidden states of model 1; and model 3 implemented deep supervision on top of model 2. All three models were evaluated on the validation set with respect to DSC. Model 1 achieved a DSC of 0.355. Model 2, using BiConvLSTM, improved DSC by 0.045 or 12.7% over model 1. Finally, the use of deep supervision in model 3 further improved DSC by 0.028 or 7.2% over model 2.

Discussion

Neural networks are regarded as a subfield of machine learning and are a rapidly developing technology in the field of artificial intelligence. They are algorithms modeled on the structure and function of the brain. They have been widely used in medical image recognition, and were first used for intelligent diagnosis in lung imaging and bone imaging.15 16 An increasing number of aneurysm detection and segmentation procedures based on deep learning have been reported recently, including abdominal aortic aneurysm detection and segmentation on CTA images using a conventional deep convolutional neural network approach,17 and cerebral aneurysm detection on MRA.18 19 With advances in technology, some neural network algorithms for DSA diagnosis or segmentation of aneurysms have been suggested.20–23 However, to the best of our knowledge, algorithms that combine detection and segmentation, and are validated on a large DSA dataset, have seldom been successful.

In this paper we built a large cerebral aneurysm DSA dataset and tested a new deep learning based framework to detect and segment IAs in 2D+time DSA sequences using this dataset. The proposed framework used a network structure to incorporate spatiotemporal information similar to the procedures in semantic video segmentation, for example, the deep spatiotemporal fully convolutional networks, which leverage both spatial and temporal dependencies and can be trained in an end-to-end manner.24 25 Applications are also found in the field of healthcare, for example, the encoder–decoder architecture presented in the cardiovascular field, which can obtain semantic task-aware representation and preserve fine-grained information.26 In comparison with these procedures, we enhanced the structure with bidirectional and deep supervision designs, and the effect of these designs was evaluated using ablation studies. Furthermore, as an extension of our previous work, the proposed network can easily deal with the problems of detection and segmentation of high-resolution temporal image sequences.27

The proposed network can be trained end-to-end. A set of popular and contemporary optimization techniques to train the model are used to improve the model's performance. The techniques include group normalization, sample balancing, and focal DSC loss. The design of the network is general and can be applied to similar problems in medical imaging detection and segmentation.

The running time of the entire detection and segmentation process is less than one second per sequence. The results are promising for practical use in real-time diagnosis of cerebral hemorrhage and treatment of IAs. Some good examples of detection and segmentation from the final model output are presented in figure 3A. From these examples, the model shows the ability to detect and segment IAs with different shapes, sizes, and locations. Meanwhile, some detection errors are shown in figure 3B. The algorithm mistakes normal blood vessel curling, folding, or overlapping as aneurysms—for example, the internal carotid siphon and the posterior cerebral artery overlapping above the basilar artery tip in figure 3B.

Figure 3

Examples of detection and segmentation. The red curves represent the ground truths labeled by neurologists, and the green curves represent the automated segmentation. The images at the top are the intracranial aneurysm regions from the original DSA sequences. The images at the bottom are the corresponding truth labels and detection results. (A) Examples of good detection and segmentation. (B) Examples of detection errors.

A previous study reported an ROI sensitivity of 89.47% with four FPs.28 Our system has similar ROI sensitivity but fewer FPs. In addition, the performance of our system was evaluated on images from 146 patients obtained by multiple DSA devices, in comparison with 19 patients reported in the literature. Although the algorithm still mistakes bending and folding in some blood vessels for aneurysms, these can be easily ruled out by physicians, without affecting clinical use. There is reasonable agreement between the segmented masks for the aneurysms and the manual aneurysm labels. In some cases, the model mistakes part of the parent artery for an aneurysm, which leads to a low DSC; this effect is more severe for small aneurysms. A typical case can be seen in the middle of figure 3B.

2D DSA is still regarded as an important tool in clinical treatment planning, and is used to identify aneurysms and stenosis when people have initial symptoms. In many cases, 3D DSA is performed when 2D DSA shows the presence of aneurysms. Our technique will benefit neurologists by providing more accurate and faster aneurysm detection in 2D DSA. Furthermore, our technology can be easily extended to automatic aneurysm detection and segmentation in 3D DSA to help clinicians work more conveniently and accurately. MRA also plays an important role in the detection of aneurysms in the population. The technology of automatic detection presented in this article can also be extended to MRA aneurysm detection tasks. The algorithm framework of this study provides support for such research.

However, our system has some limitations. Owing to highly overlapped tissues, the detection and segmentation of small IAs remains difficult, making an average output DSC score relatively low. Increasing the detection sensitivity for small IAs and reducing the number of FPs is our next research target.

Conclusions

We successfully implemented automatic aneurysmal detection and segmentation on 2D DSA through deep neural network methods. This technique is useful in clinical practice, providing physicians with suggestions after the patient undergoes DSA, and could save time and reduce errors.

References

Footnotes

  • HJ and JG contributed equally.

  • Contributors HJ, JG, and HZ conceived and designed the research. HJ, JG, YY, MH, GY, SX, XZ, ZJ, and XF collected and reviewed the data. HJ and JG analyzed the data and performed the statistical analysis. PH, CH, HZ, and LQ handled funding and supervision. HJ and JG drafted the manuscript. All authors made critical revisions of the manuscript and reviewed the final version.

  • Funding This work was supported by the National Key Research Development Program grant number 2016YFC1300800 and National Natural Science Foundation of China grant number 81500988.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval This study was approved by our institutional ethics committee (Xuanwu Hospital, No.2017082).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data are available upon reasonable request. Because of the sensitive nature of the data, it is available upon request to the corresponding author.