ViT-BT: Improving MRI Brain Tumor Classification Using Vision Transformer with Transfer Learning
Khawla Hussein Ali

Khawla Hussein Ali, Department of Computer Science, University of Basrah, Iraq.

Manuscript received on 19 August 2024 | Revised Manuscript received on 28 August 2024 | Manuscript Accepted on 15 September 2024 | Manuscript published on 30 September 2024 | PP: 16-26 | Retrieval Number: 100.1/ijsce.D364414040924 | DOI: 10.35940/ijsce.D3644.14040924

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: This paper presents ViT-BT, a Vision Transformer for brain tumor classification, offering a novel methodology to enhance the classification of brain tumor MRI scans through transfer learning. Although traditional Convolutional Neural Networks (CNNs) have demonstrated significant capabilities in medical imaging, they often struggle to capture the global contextual information within images. To address this limitation, we employ Vision Transformers, which excel at capturing long-range dependencies through their self-attention mechanism. In ViT-BT, the Vision Transformer model is pre-trained and then fine-tuned on specific MRI brain tumor datasets, improving its ability to classify various brain tumor types. Experimental results indicate that ViT-BT outperforms CNN-based methods in both accuracy and robustness. Evaluations were performed on the BraTS 2023 dataset, which comprises multi-modal MRI images of brain tumors, including T1-weighted, T2-weighted, T1CE, and FLAIR sequences. The ViT-BT model achieved precision, recall, F1-score, and accuracy of 97%, 99%, 99.41%, and 98.17%, respectively. This advancement is anticipated to significantly enhance diagnostic accuracy in clinical settings, ultimately improving patient outcomes. The research underscores the potential of transfer learning with Vision Transformers in medical imaging as a promising avenue for future exploration across various medical domains.
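The abstract attributes the Vision Transformer's ability to capture long-range dependencies to its self-attention mechanism. The following is a minimal illustrative sketch (not the authors' implementation) of single-head scaled dot-product self-attention over a sequence of patch embeddings, showing how every output token is a weighted mixture of all input tokens; the patch count and embedding dimension are arbitrary example values.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X: (n_patches, d) patch embeddings.
    Each output row mixes information from ALL patches, which is
    what lets a ViT model global (long-range) context in an image.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (n_patches, n_patches)
    return attn @ V, attn

# Toy example: 6 image patches, embedding dimension 8 (illustrative sizes).
rng = np.random.default_rng(0)
n_patches, d = 6, 8
X = rng.normal(size=(n_patches, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

out, attn = self_attention(X, Wq, Wk, Wv)
print(out.shape)         # (6, 8): one contextualized vector per patch
print(attn.sum(axis=1))  # each row of attention weights sums to 1
```

In a full ViT, this operation is repeated across multiple heads and layers, and a classification head on the [CLS] token (or pooled patch features) is what would be fine-tuned on the MRI dataset during transfer learning.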

Keywords: Deep learning, Vision Transformer (ViT), VGG16, EfficientNet-B7, Transfer Learning.
Scope of the Article: Artificial Intelligence