A Convolutional Neural Network and Vision Transformer Based Framework for Effective Detection of Liver Cancer
Keywords:
Liver Cancer Detection, Hepatocellular Carcinoma (HCC), Computed Tomography (CT) Imaging, Vision Transformers (ViTs), EfficientNet-B0, TinyViT, MobileViT v2, Medical Image Analysis, Clinical Decision Support Systems, Early Cancer Diagnosis
Abstract
Liver cancer, particularly hepatocellular carcinoma (HCC), remains one of the most prevalent and lethal malignancies worldwide, underscoring the urgent need for early and reliable diagnostic solutions. Conventional diagnostic methods using computed tomography (CT) imaging are often limited by inter-observer variability and the high cognitive burden on radiologists. To address these challenges, this study proposes a hybrid deep learning framework that leverages Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) for effective liver cancer detection. The research employs the publicly available 3D-IRCADb1 dataset of contrast-enhanced CT scans, with preprocessing and augmentation techniques applied to enhance model generalization. Three state-of-the-art architectures, EfficientNet-B0, TinyViT, and MobileViT v2, were trained and evaluated to assess their diagnostic performance. Among these, MobileViT v2 demonstrated superior performance and efficiency in classification tasks. To enhance clinical trust, Gradient-weighted Class Activation Mapping (Grad-CAM) was integrated to provide visual explanations of model predictions, highlighting regions of interest corresponding to tumor areas. The findings indicate that the proposed framework not only ensures robust diagnostic capability but also introduces interpretability and efficiency, making it suitable for deployment in clinical and resource-constrained environments. This research contributes to advancing AI-driven liver cancer diagnostics by bridging the gap between performance and transparency, ultimately supporting earlier detection and improved patient outcomes.
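The Grad-CAM step mentioned in the abstract can be illustrated with a small, dependency-free sketch. It assumes the feature maps of one convolutional layer and the gradients of the target class score with respect to those maps have already been extracted from a trained model (in practice, via a deep learning framework's hook mechanism). The function name `grad_cam` and the nested-list data layout are illustrative assumptions, not the authors' implementation.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap for one convolutional layer.

    activations: K feature maps, each an H x W list of lists.
    gradients:   d(class score) / d(activation), same shape as activations.
    Both inputs are assumed to come from a trained classifier (hypothetical here).
    """
    n_channels = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weights: average each channel's gradients over all spatial positions.
    weights = [
        sum(sum(row) for row in gradients[k]) / (height * width)
        for k in range(n_channels)
    ]

    # Weighted sum of the feature maps, followed by ReLU so that only
    # evidence in favour of the target class is kept.
    cam = [
        [
            max(0.0, sum(weights[k] * activations[k][i][j] for k in range(n_channels)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] for overlaying on the CT slice; an all-zero map is left as is.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy example: two 2x2 feature maps; the second channel has negative gradients.
toy_activations = [[[1, 0], [0, 0]], [[0, 0], [0, 2]]]
toy_gradients = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
heatmap = grad_cam(toy_activations, toy_gradients)
```

In a real pipeline the resulting low-resolution map would be upsampled to the input size and blended with the CT slice, so that a radiologist can check whether the highlighted region coincides with the suspected tumor area.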
License
This is an open access article published by the Research Center of Computing & Biomedical Informatics (RCBI), Lahore, Pakistan, under the CC BY 4.0 International License.