A Convolutional Neural Network and Vision Transformer Based Framework for Effective Detection of Liver Cancer

Asma Zahoor; Erssa Arif; Naila Nawaz; Muhammad Amjad; Shahrukh Hamayoun; Arslan Baig

Authors

Asma Zahoor, Riphah College of Computing, Riphah International University, Faisalabad, Pakistan.
Erssa Arif, Riphah College of Computing, Riphah International University, Faisalabad, Pakistan.
Naila Nawaz, Riphah College of Computing, Riphah International University, Faisalabad, Pakistan.
Muhammad Amjad, Riphah College of Computing, Riphah International University, Faisalabad, Pakistan.
Shahrukh Hamayoun, Department of Computer Science, National University of Modern Languages, Faisalabad, Pakistan.
Arslan Baig, Riphah College of Computing, Riphah International University, Faisalabad, Pakistan.

Keywords:

Liver Cancer Detection, Hepatocellular Carcinoma (HCC), Computed Tomography (CT) Imaging, Vision Transformers (ViTs), EfficientNet-B0, TinyViT, MobileViTv2, Medical Image Analysis, Clinical Decision Support Systems, Early Cancer Diagnosis

Abstract

Liver cancer, particularly hepatocellular carcinoma (HCC), remains one of the most prevalent and lethal malignancies worldwide, underscoring the urgent need for early and reliable diagnostic solutions. Conventional diagnostic methods using computed tomography (CT) imaging are often limited by inter-observer variability and the high cognitive burden on radiologists. To address these challenges, this study proposes a hybrid deep learning framework that leverages Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) for effective liver cancer detection. The research employs the publicly available 3D-IRCADb1 dataset of contrast-enhanced CT scans, with preprocessing and augmentation techniques applied to enhance model generalization. Three state-of-the-art architectures, EfficientNet-B0, TinyViT, and MobileViT v2, were trained and evaluated to assess their diagnostic performance. Among these, MobileViT v2 demonstrated superior performance and efficiency in classification tasks. To enhance clinical trust, Gradient-weighted Class Activation Mapping (Grad-CAM) was integrated to provide visual explanations of model predictions, highlighting regions of interest corresponding to tumor areas. The findings indicate that the proposed framework not only ensures robust diagnostic capability but also introduces interpretability and efficiency, making it suitable for deployment in clinical and resource-constrained environments. This research contributes to advancing AI-driven liver cancer diagnostics by bridging the gap between performance and transparency, ultimately supporting earlier detection and improved patient outcomes.
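
The Grad-CAM step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the feature maps of one chosen convolutional layer and the gradients of the class score with respect to them have already been extracted from a trained model, and the function name `grad_cam` and the nested-list data layout are hypothetical. The final rescaling to [0, 1] is a common display convenience, not part of the Grad-CAM definition.

```python
from typing import List

Matrix = List[List[float]]


def grad_cam(activations: List[Matrix], gradients: List[Matrix]) -> Matrix:
    """Combine per-channel feature maps into one Grad-CAM heatmap.

    activations[k][i][j]: feature map k of the chosen layer (assumed given).
    gradients[k][i][j]:   d(class score) / d(activations[k][i][j]).
    """
    if not activations or len(activations) != len(gradients):
        raise ValueError("need one gradient map per activation map")

    height, width = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * width for _ in range(height)]

    for fmap, grad in zip(activations, gradients):
        # Channel weight: global average of the gradients over all positions.
        weight = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * fmap[i][j]

    # ReLU keeps only regions that push the class score up.
    heatmap = [[max(0.0, v) for v in row] for row in heatmap]

    # Illustrative extra step: rescale to [0, 1] for overlaying on a CT slice.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap
```

In practice the resulting map is upsampled to the input resolution and overlaid on the CT slice, so that high values can be compared against the annotated tumor region.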
