J Stroke > Volume 27(3); 2025 > Article
Kim, Kim, Lee, Kim, Jeong, Lee, Kim, Han, Choi, Shin, Kim, Park, Kang, Kim, Lee, Oh, Yu, Lee, Park, Hong, Cho, Choi, Sohn, Hong, Park, Kwon, Kim, Lee, Ryu, and Bae: Deep Learning-Based Automatic Classification of Stroke Size in Patients With Atrial Fibrillation
Dear Sir:
The optimal timing for initiating direct oral anticoagulants (DOACs) in patients with stroke and atrial fibrillation (AF) remains uncertain, largely because the risk of hemorrhagic transformation varies between patients. Previous studies suggest that the risk of hemorrhagic transformation is related to stroke severity as measured by the National Institutes of Health Stroke Scale (NIHSS) score [1,2]. However, the NIHSS score reflects not only the size of the infarct but also its location, potentially leading to discrepancies between the score and the actual infarct volume. Moreover, hemorrhagic transformation after ischemic stroke is closely related to infarct size [3]. Based on these findings, the Early versus Late Initiation of Direct Oral Anticoagulants in Post-ischemic Stroke Patients with Atrial Fibrillation (ELAN) trial stratified stroke size into minor, moderate, and major categories based on imaging and showed that early initiation of DOACs is likely safe and may reduce the risk of recurrent ischemic events [4].
In real-world practice, the scarcity of certified stroke centers and inconsistent access to in-person neurological consultation pose significant challenges [5]. Consequently, many institutions struggle to maintain sufficient expertise to classify stroke size reliably from imaging criteria. Furthermore, imaging-based risk stratification in large-scale clinical trials is often limited by the requirement for precise stroke-volume assessment and the availability of multiple neurologists. To address these needs, we developed a deep learning algorithm that automatically classifies stroke size according to the ELAN imaging criteria, using diffusion-weighted imaging (DWI) data of stroke patients with AF.
The algorithm was trained on 1,091 DWI scans of ischemic stroke attributable to AF, collected from four hospitals between 2011 and 2021. An external validation dataset comprising 1,265 DWI scans was collected from 11 non-overlapping hospitals between 2017 and 2020 (Supplementary Methods and Supplementary Figure 1). The institutional review boards of all centers approved the study, and written informed consent was obtained.
For the training/internal validation dataset, stroke size classification was determined by an experienced vascular neurologist (WSR) using a standardized visual rating scheme from the ELAN trial (Supplementary Methods) [4]. For the external validation dataset, stroke size classification was independently determined by two vascular neurologists (WSR and HK) using the same criteria. In cases of disagreement, a consensus was reached and used as the ground truth for external validation. Stroke locations in the external validation dataset were classified into supratentorial, infratentorial, and mixed lesions (Supplementary Methods).
Infarct lesions on DWIs were automatically segmented using a validated 3D U-Net algorithm (JLK-DWI; JLK Inc., Seoul, Korea) [6]. For the classification, we employed an EfficientNet3D. The model was modified to accept two-channel inputs (DWI and segmentation mask) and output three classes representing stroke size classification. Additional details for model development are provided in Supplementary Methods and Supplementary Figure 2.
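As a rough illustration of the input and output contract described above, the following Python sketch stacks a DWI volume and its segmentation mask into a two-channel, channel-first input and maps three class scores to a stroke size label. It is not the authors' implementation: the network itself is omitted, and the helper names (`stack_channels`, `predict_label`), the softmax step, the toy 2×2×2 volume, and the example logits are all assumptions made for illustration.

```python
import math

LABELS = ("minor", "moderate", "major")  # three output classes named in the text


def stack_channels(dwi, mask):
    """Return a channel-first [2][D][H][W] nested list (hypothetical helper).

    Channel 0 holds DWI signal intensities, channel 1 the binary lesion mask.
    """
    def shape(vol):
        return (len(vol), len(vol[0]), len(vol[0][0]))

    if shape(dwi) != shape(mask):
        raise ValueError("DWI volume and segmentation mask must share one shape")
    return [dwi, mask]


def predict_label(logits):
    """Softmax over three logits, then pick the most probable class.

    Assumption: the classifier ends in a softmax; the source does not say so.
    """
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per class")
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(LABELS)), key=probs.__getitem__)
    return LABELS[best], probs


# Toy 2x2x2 volume with made-up intensities and its binary lesion mask.
dwi = [[[0.1, 0.9], [0.2, 0.8]], [[0.0, 0.7], [0.1, 0.6]]]
mask = [[[0, 1], [0, 1]], [[0, 1], [0, 1]]]
x = stack_channels(dwi, mask)

# Hypothetical network output for this input (not produced by a real model).
label, probs = predict_label([0.2, 1.5, -0.3])
```

In the study itself the inputs were 256×256×64 voxel patches (Supplementary Figure 2); the toy volume here only keeps the example small.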
The algorithm’s performance was compared with expert consensus using percentage agreement, Cohen’s kappa, and area under the receiver operating characteristic curve (AUC). Inter-rater percentage agreement and Cohen’s kappa were also calculated, comparing classifications made by two vascular neurologists. Details for statistical analysis are provided in Supplementary Methods.
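For readers who want to see how the two agreement measures are computed, the Python sketch below derives percentage agreement and Cohen's kappa from a confusion matrix. The counts are the external validation confusion matrix reported in Table 1 (rows, expert consensus; columns, algorithm prediction). The function names are hypothetical, and the use of unweighted kappa is an assumption, since the weighting scheme is not stated in the main text.

```python
def percent_agreement(cm):
    """Share of cases on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    agree = sum(cm[i][i] for i in range(len(cm)))
    return agree / total


def cohen_kappa(cm):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_tot = [sum(cm[i]) for i in range(k)]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_o = sum(cm[i][i] for i in range(k)) / total  # observed agreement
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / total ** 2  # chance agreement
    return (p_o - p_e) / (1 - p_e)


# External validation counts from Table 1 (minor, moderate, major).
external = [
    [277, 59, 0],
    [21, 434, 44],
    [0, 35, 395],
]

agreement_pct = round(100 * percent_agreement(external), 1)  # 87.4
kappa = round(cohen_kappa(external), 2)                      # 0.81
```

Both values match the figures reported for the external validation dataset.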
The mean (SD) ages for the internal and external datasets were 73.6 (10.3) years and 75.2 (10.2) years, respectively, with 54.4% and 53.0% of participants being male (Supplementary Tables 1 and 2). In the external validation dataset, the percentage agreement and Cohen’s kappa between the deep learning algorithm and the consensus of two vascular neurologists were 87.4% (95% confidence interval [CI], 85.4-89.2) and 0.81 (95% CI, 0.78-0.84), respectively, with comparable performance in the training/internal validation dataset. The algorithm also showed high accuracy for each stroke size class, ranging from 0.87 to 0.94 (Table 1). The AUC values for classifying minor, moderate, and major stroke categories in the external validation dataset were 0.988, 0.955, and 0.988, respectively, with similar performance in the training/internal validation dataset (Figure 1). In comparison, the percentage agreement between the two vascular neurologists was 74.6%, with a Cohen’s kappa of 0.62 (Supplementary Table 3).
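A per-class AUC of this kind can be computed by treating one class as positive and the other two as negative. The sketch below does this with the rank-based definition of AUC (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half). The one-vs-rest framing is an assumption about how the reported values were obtained, the function name is hypothetical, and the labels and scores are made-up toy values, not study data.

```python
def auc_one_vs_rest(labels, scores, positive):
    """AUC for one class against all others, via pairwise comparisons."""
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: expert labels and the model's predicted probability of "major".
labels = ["minor", "minor", "moderate", "moderate", "major", "major"]
p_major = [0.05, 0.10, 0.30, 0.60, 0.55, 0.90]

auc_major = auc_one_vs_rest(labels, p_major, "major")
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; a sort-based rank computation would be the usual choice for larger datasets.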
After stratifying by stroke location, the deep learning algorithm showed high agreement with stroke experts for supratentorial and infratentorial lesions, achieving Cohen’s kappa values of 0.82 (95% CI, 0.79-0.85) and 0.85 (95% CI, 0.76-0.93), respectively (Table 2). For mixed lesions, agreement was lower with a kappa of 0.61 (95% CI, 0.49-0.74). The mean infarct volume also varied among stroke size classifications within each lesion location category (Supplementary Figure 3 and Supplementary Table 4).
In patients who underwent DWI within 24 hours of symptom onset, Cohen’s kappa was 0.81 (95% CI, 0.78-0.88). When the window was extended to 48 hours, the model exhibited similar performance, with a Cohen’s kappa of 0.81 (95% CI, 0.78-0.87) (Supplementary Table 5).
Additionally, stroke size predicted by the algorithm was significantly associated with the frequency of symptomatic hemorrhagic transformation (Supplementary Figure 4). The mean processing time from raw image to output in a graphics processing unit (GPU) environment was 5.188 seconds (SD, 0.654) across 100 randomly selected DWI scans.
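The association with symptomatic hemorrhagic transformation was assessed with a Cochran-Armitage test (Supplementary Figure 4). A minimal sketch of that trend test is given below, assuming equally spaced scores (0, 1, 2) for minor, moderate, and major and a two-sided normal approximation; neither choice is stated in the source. The function name is hypothetical, and the counts are invented purely to exercise the code. They are not the study's data.

```python
import math


def cochran_armitage(events, totals, scores=None):
    """Two-sided Cochran-Armitage trend test; returns (z, p_value)."""
    k = len(totals)
    scores = list(range(k)) if scores is None else list(scores)
    n = sum(totals)
    p_bar = sum(events) / n  # pooled event proportion
    if p_bar in (0.0, 1.0):
        raise ValueError("trend test is undefined when all or no cases have the event")
    # Score-weighted deviation of observed events from their expectation.
    t_stat = sum(s * (e - m * p_bar) for s, e, m in zip(scores, events, totals))
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Made-up counts (NOT study data): events per predicted class and class sizes.
toy_events = [0, 3, 9]        # minor, moderate, major
toy_totals = [100, 100, 100]

z, p_value = cochran_armitage(toy_events, toy_totals)
```

A positive `z` indicates that the event rate rises across the ordered categories.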
In this study, we developed and validated a deep learning algorithm to classify stroke size in AF-related stroke using multicenter and multivendor datasets, achieving excellent agreement with stroke experts. To our knowledge, this is the first study to develop a deep learning model that automatically classifies stroke size for severity prediction based on DWI.
Several observational studies have established that the risk of hemorrhagic transformation in AF-related stroke is closely related to infarct size, supporting neuroimaging-based risk stratification to minimize intracranial hemorrhage [7,8]. In addition to the ELAN trial, a recent meta-analysis suggested that early DOAC initiation may reduce recurrent ischemic stroke risk by 36% without increasing intracranial hemorrhage [9]. Our model effectively classified minor, moderate, and major cases separately and showed even higher agreement when categorizing patients into non-major versus major cases. These findings suggest our model could help guide DOAC initiation timing, particularly for physicians with less experience.
Furthermore, the algorithm’s mean processing time from raw DWI input to stroke size classification was approximately 5 seconds (Supplementary Discussion). This rapid processing could facilitate large-scale studies, enabling further research on infarct volume, DOAC initiation, and the risk of intracranial hemorrhage.
In conclusion, this algorithm has the potential to assist less experienced physicians in optimizing DOAC initiation timing and supports the use of large neuroimaging datasets in future research.

Supplementary materials

Supplementary materials related to this article can be found online at https://doi.org/10.5853/jos.2025.00423.
Supplementary Table 1.
Baseline characteristics for training/internal validation and external validation datasets
jos-2025-00423-Supplementary-Table-1,2.pdf
Supplementary Table 2.
Detailed characteristics of MRI vendors and protocols for training/internal validation and external validation datasets
jos-2025-00423-Supplementary-Table-1,2.pdf
Supplementary Table 3.
Agreement of stroke size classification between two vascular neurologists
jos-2025-00423-Supplementary-Table-3-5.pdf
Supplementary Table 4.
Stroke volume of stroke size classification in the datasets for external validation in total, supratentorial, infratentorial, and mixed location
jos-2025-00423-Supplementary-Table-3-5.pdf
Supplementary Table 5.
Subgroup analysis based on symptom onset to imaging time (<24 hr and <48 hr)
jos-2025-00423-Supplementary-Table-3-5.pdf
Supplementary Figure 1.
A flowchart of the patient selection process. MRI, magnetic resonance imaging; DWI, diffusion-weighted imaging; AF, atrial fibrillation; NVAF, nonvalvular atrial fibrillation.
jos-2025-00423-Supplementary-Fig-1.pdf
Supplementary Figure 2.
Deep learning model to classify stroke size. Infarct lesions on diffusion-weighted imaging (DWI) were segmented using a validated 3D U-Net algorithm (JLK-DWI, JLK Inc., Seoul, Korea). The DWI images and segmentation masks were processed into 256×256×64 voxel patches, serving as two-channel inputs (DWI signal intensities and binary segmentation masks) for a 3D adaptation of EfficientNet (EfficientNet3D, efficientnet-b0 configuration). The model was designed to classify stroke size into three categories: minor, moderate, and major.
jos-2025-00423-Supplementary-Fig-2-4.pdf
Supplementary Figure 3.
Log-transformed infarct volume of stroke size classification in the datasets for external validation in supratentorial, infratentorial, and mixed location. Stroke volume tended to increase progressively across minor, moderate, and major stroke size categories within each location. Additionally, stroke volume varied by location, being largest in supratentorial regions. Detailed information and statistical data on stroke volume are provided in Supplementary Table 4.
jos-2025-00423-Supplementary-Fig-2-4.pdf
Supplementary Figure 4.
Cochran-Armitage test between predicted stroke size classification and symptomatic hemorrhagic transformation (sHT). Based on the stroke size classification predicted by the algorithm, cases classified as Major exhibited the highest occurrence rate of actual sHT, while cases classified as Minor showed no sHT. A significant association was observed between the stroke size predicted by the deep learning model and sHT by Cochran-Armitage test.
jos-2025-00423-Supplementary-Fig-2-4.pdf

Notes

Funding statement
None
Conflicts of interest
Hokyu Kim, Hoyoun Lee, and Wi-Sun Ryu are employees of JLK Inc. Dong-Eog Kim reports holding stocks in JLK Inc. Hee-Joon Bae reports holding stocks in JLK Inc., as well as grants from Bayer Korea, Bristol Myers Squibb Korea, Chong Kun Dang Pharmaceutical Corp., Dong-A ST, Korean Drug Co., Ltd., Samjin Pharm, and Takeda Pharmaceuticals Korea Co., Ltd., and personal fees from Amgen Korea, Bayer, Daiichi Sankyo, JW Pharmaceutical, Hanmi Pharmaceutical Co., Ltd., Otsuka Korea, SK Chemicals, and Viatris Korea, outside the submitted work. The other authors report no conflicts of interest.
Author contribution
Conceptualization: WSR, HJB. Study design: WSR, HJB, HK. Methodology: WSR, HJB, HK, HL. Data collection: DYK, HGJ, KJL, BJK, MKH, KHC, DIS, DEK, JMP, KK, JGK, SJL, MSO, KHY, BCL, HKP, KSH, YJC, JCC, SIS, JHH, THP, JHK, WJK, JL, HJB. Investigation: HK, HL. Statistical analysis: HK, WSR, HJB. Writing—original draft: HK. Writing—review & editing: WSR, HJB. Approval of final manuscript: all authors.
Acknowledgments
The authors appreciate the contributions of all members of the Comprehensive Registry Collaboration for Stroke in Korea to this study.

Figure 1.
Receiver operating characteristic (ROC) curves for the classification of stroke size using a deep learning algorithm. (A) Comparisons of the ROC curves for minor, moderate, and major classification in the internal validation dataset. The area under the ROC curve (AUC) values for classifying minor, moderate, and major stroke categories in the internal validation dataset were 0.973, 0.929, and 0.970. (B) Comparisons of the ROC curves for minor, moderate, and major classification in the external validation dataset. The AUC values for classifying minor, moderate, and major stroke categories in the external validation dataset were 0.988, 0.955, and 0.988.
Table 1.
Agreement, confusion matrix, and diagnostic accuracy of stroke size classification between a deep learning algorithm and vascular neurologists

| | Internal validation (n=1,091) | External validation (n=1,265) |
|---|---|---|
| Minor, Moderate, Major | | |
| Percentage agreement (95% CI) | 87.3 (85.3-89.2) | 87.4 (85.4-89.2) |
| Cohen’s kappa (95% CI) | 0.80 (0.77-0.83) | 0.81 (0.78-0.84) |
| Non-major vs. Major | | |
| Percentage agreement (95% CI) | 93.0 (91.4-94.5) | 93.8 (92.2-95.0) |
| Cohen’s kappa (95% CI) | 0.86 (0.82-0.89) | 0.86 (0.83-0.89) |

Confusion matrix and diagnostic accuracy (rows, expert classification; columns, algorithm prediction)

| | Internal: Minor | Internal: Moderate | Internal: Major | Internal: Average | External: Minor | External: Moderate | External: Major | External: Average |
|---|---|---|---|---|---|---|---|---|
| Expert: Minor | 169 | 13 | 0 | | 277 | 59 | 0 | |
| Expert: Moderate | 49 | 360 | 22 | | 21 | 434 | 44 | |
| Expert: Major | 0 | 55 | 423 | | 0 | 35 | 395 | |
| Sensitivity | 0.93 | 0.84 | 0.88 | 0.88 | 0.82 | 0.87 | 0.92 | 0.87 |
| Specificity | 0.95 | 0.90 | 0.96 | 0.93 | 0.98 | 0.88 | 0.95 | 0.93 |
| PPV | 0.78 | 0.84 | 0.95 | 0.86 | 0.93 | 0.82 | 0.90 | 0.88 |
| NPV | 0.99 | 0.89 | 0.91 | 0.93 | 0.94 | 0.91 | 0.96 | 0.94 |
| Accuracy | 0.94 | 0.87 | 0.93 | 0.91 | 0.94 | 0.87 | 0.94 | 0.92 |

CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value.
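The per-class diagnostic accuracy values in Table 1 follow from the confusion matrix by treating each class in turn as positive and the other two as negative. The Python sketch below recomputes them for the minor class in the external validation dataset and also collapses the three classes into non-major versus major. The function names are hypothetical; the counts are those reported in Table 1 (rows, expert; columns, prediction).

```python
def one_vs_rest_metrics(cm, k):
    """Diagnostic accuracy for class k against all other classes."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                 # expert says k, model says otherwise
    fp = sum(row[k] for row in cm) - tp  # model says k, expert says otherwise
    tn = total - tp - fn - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


def collapse_to_major(cm):
    """Merge all classes except the last into one 'non-major' class (2x2 result)."""
    last = len(cm) - 1
    nn = sum(cm[i][j] for i in range(last) for j in range(last))
    nm = sum(cm[i][last] for i in range(last))
    mn = sum(cm[last][j] for j in range(last))
    return [[nn, nm], [mn, cm[last][last]]]


# External validation counts from Table 1 (minor, moderate, major).
external = [
    [277, 59, 0],
    [21, 434, 44],
    [0, 35, 395],
]

minor = {name: round(value, 2) for name, value in one_vs_rest_metrics(external, 0).items()}
# minor: sensitivity 0.82, specificity 0.98, ppv 0.93, npv 0.94, accuracy 0.94

binary = collapse_to_major(external)  # [[791, 44], [35, 395]]
binary_agreement_pct = round(100 * (binary[0][0] + binary[1][1]) / 1265, 1)  # 93.8
```

The recomputed values agree with the external validation columns of Table 1, including the 93.8% agreement for non-major versus major.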
Table 2.
Agreement in stroke size classification between a deep learning algorithm and the consensus of vascular neurologists, categorized by infratentorial, supratentorial, and mixed locations

External validation (n=1,265)

| | Supratentorial (n=1,041) | Infratentorial (n=123) | Mixed (n=101) | P |
|---|---|---|---|---|
| Stroke volume (mL)*, mean±SD | 42.28±75.62 | 6.77±10.07 | 13.74±23.27 | <0.001 |
| Minor, Moderate, Major | | | | |
| Percentage agreement (95% CI) | 88.3 (86.1-90.1) | 91.1 (84.2-95.2) | 74.3 (64.4-82.2) | |
| Cohen’s kappa (95% CI) | 0.82 (0.79-0.85) | 0.85 (0.76-0.93) | 0.61 (0.49-0.74) | |
| Non-major vs. Major | | | | |
| Percentage agreement (95% CI) | 94.2 (92.6-95.5) | 95.1 (89.2-98.0) | 87.1 (78.6-92.7) | |
| Cohen’s kappa (95% CI) | 0.87 (0.84-0.90) | 0.90 (0.82-0.98) | 0.70 (0.55-0.85) | |

SD, standard deviation; CI, confidence interval.
* The Kruskal-Wallis test was used.

References

1. Kleindorfer DO, Towfighi A, Chaturvedi S, Cockroft KM, Gutierrez J, Lombardi-Hill D, et al. 2021 Guideline for the prevention of stroke in patients with stroke and transient ischemic attack: a guideline from the American Heart Association/American Stroke Association. Stroke 2021;52:e364-e467.
2. Kimura S, Toyoda K, Yoshimura S, Minematsu K, Yasaka M, Paciaroni M, et al. Practical “1-2-3-4-day” rule for starting direct oral anticoagulants after ischemic stroke with atrial fibrillation: combined hospital-based cohort study. Stroke 2022;53:1540-1549.
3. Paciaroni M, Agnelli G, Falocci N, Tsivgoulis G, Vadikolias K, Liantinioti C, et al. Early recurrence and major bleeding in patients with acute ischemic stroke and atrial fibrillation treated with non-vitamin-K oral anticoagulants (RAF-NOACs) study. J Am Heart Assoc 2017;6:e007034.
4. Fischer U, Koga M, Strbian D, Branca M, Abend S, Trelle S, et al. Early versus later anticoagulation for stroke with atrial fibrillation. N Engl J Med 2023;388:2411-2421.
5. Zachrison KS, Cash RE, Adeoye O, Boggs KM, Schwamm LH, Mehrotra A, et al. Estimated population access to acute stroke and telestroke centers in the US, 2019. JAMA Netw Open 2022;5:e2145824.
6. Ryu WS, Kang YR, Noh YG, Park JH, Kim D, Kim BC, et al. Acute infarct segmentation on diffusion-weighted imaging using deep learning algorithm and RAPID MRI. J Stroke 2023;25:425-429.
7. Paciaroni M, Agnelli G, Ageno W, Caso V. Timing of anticoagulation therapy in patients with acute ischaemic stroke and atrial fibrillation. Thromb Haemost 2016;116:410-416.
8. Paciaroni M, Bandini F, Agnelli G, Tsivgoulis G, Yaghi S, Furie KL, et al. Hemorrhagic transformation in patients with acute ischemic stroke and atrial fibrillation: time to initiation of oral anticoagulant therapy and outcomes. J Am Heart Assoc 2018;7:e010133.
9. Palaiodimou L, Stefanou MI, Katsanos AH, De Marchis GM, Aguiar De Sousa D, Dawson J, et al. Timing of oral anticoagulants initiation for atrial fibrillation after acute ischemic stroke: a systematic review and meta-analysis. Eur Stroke J 2024;9:885-895.


Copyright © 2025 by Korean Stroke Society.