Introduction: The Urgency of Early Oral Cancer Detection

Oral cancer remains a significant global health burden, particularly in low- and middle-income countries where it ranks among the leading causes of cancer-related morbidity and mortality. In 2022 alone, more than 389,000 new cases and nearly 190,000 deaths were reported worldwide, with disproportionate prevalence observed in regions such as South Asia and sub-Saharan Africa. Early detection plays a critical role in improving survival rates; however, most cases continue to be diagnosed at advanced stages, when prognosis is poor and treatment options are limited.

Conventional visual screening methods, although widely used, often fall short due to high interobserver variability and their inability to reliably detect premalignant lesions. This limitation is particularly acute in rural and resource-constrained settings where access to trained professionals and diagnostic tools is scarce. In response, artificial intelligence (AI)—and specifically deep learning—has emerged as a transformative solution. Convolutional neural networks (CNNs) have demonstrated remarkable capacity for image-based classification across numerous medical domains, offering a scalable, objective, and highly accurate approach to early cancer detection.

This article introduces a novel 19-layer CNN architecture specifically designed for screening oral cancer using clinical images of the lips and tongue. By addressing the shortcomings of traditional screening methods, this deep learning framework aims to support timely and reliable diagnosis, especially in underserved populations. Through the integration of AI-driven image analysis, the model represents a significant step forward in improving oral cancer detection and reducing the global burden of this often overlooked malignancy.

Model Design: Building the 19-Layer CNN Architecture

The 19-layer convolutional neural network developed for oral cancer screening is carefully designed to balance computational efficiency with diagnostic accuracy. Image preprocessing serves as a crucial first step, involving resizing to standardized dimensions, min-max normalization, and histogram-based color enhancement. These measures improve the visibility of features critical for accurate lesion classification.
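These preprocessing steps can be sketched in a few lines of numpy. Note the specifics here are assumptions, not values reported in the article: the 224×224 target size, the 8-bit RGB input, and the choice of per-channel histogram equalization as the form of histogram-based color enhancement.

```python
import numpy as np

def preprocess(image: np.ndarray, size: int = 224) -> np.ndarray:
    """Resize, histogram-equalize, and min-max normalize an 8-bit RGB image.

    `size=224` is an assumed target dimension; the article does not state one.
    """
    h, w, _ = image.shape
    # Nearest-neighbor resize to size x size (a library resize would normally be used).
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = image[rows][:, cols]

    # Per-channel histogram equalization as a simple color enhancement.
    out = np.empty_like(resized, dtype=np.float64)
    for c in range(3):
        channel = resized[:, :, c].astype(np.uint8)
        hist = np.bincount(channel.ravel(), minlength=256)
        cdf = hist.cumsum() / hist.sum()       # normalized cumulative histogram
        out[:, :, c] = cdf[channel] * 255.0    # map intensities through the CDF

    # Min-max normalization to [0, 1].
    lo, hi = out.min(), out.max()
    return (out - lo) / (hi - lo + 1e-8)
```

In practice the resize and equalization would come from an image library such as OpenCV or Pillow; the explicit version above just makes each transformation visible.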

The core of the architecture consists of a sequence of convolutional layers paired with ReLU activation functions and max pooling operations. These layers enable the network to extract and learn hierarchical features from input images, ranging from basic edges and textures to more complex lesion patterns associated with premalignant and malignant changes. To improve generalizability and reduce the risk of overfitting—especially important when training on limited clinical datasets—intermediate dropout layers are strategically incorporated.

The model architecture concludes with fully connected dense layers and a final sigmoid or softmax classifier, depending on whether the task is binary or multiclass. The selection of 19 layers reflects a deliberate balance: deep enough to capture complex visual features, yet constrained to avoid the overfitting and computational overhead associated with excessively deep networks.
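The interplay between depth and feature-map size can be illustrated by tracing spatial dimensions through a convolution/pooling stack. The layer sequence below is hypothetical, chosen only to show the mechanics; it is not the published model's exact configuration.

```python
# Trace feature-map spatial size through a hypothetical conv/pool stack.
# 'same'-padded 3x3 convolutions preserve the spatial size, while each
# 2x2 max-pooling layer halves it; dropout and dense layers do not
# change spatial dimensions and are omitted here.

def trace(input_size: int, layers: list[str]) -> list[tuple[str, int]]:
    """Return (layer, output spatial size) pairs for a conv/pool sequence."""
    size, out = input_size, []
    for layer in layers:
        if layer == "pool":
            size //= 2  # 2x2 max pooling with stride 2
        out.append((layer, size))
    return out

stack = ["conv", "conv", "pool"] * 4 + ["conv"]  # illustrative backbone
print(trace(224, stack)[-1])  # final feature map is 14x14 for this stack
```

This kind of bookkeeping is what constrains the depth choice: each pooling stage shrinks the feature map, so a 224-pixel input supports only a handful of pooling stages before spatial detail is exhausted.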

Hyperparameters such as learning rate, batch size, and optimization method were carefully tuned. A learning rate of around 0.001, a batch size of 32, and the Adam optimizer ensured rapid convergence during training. Compared with prominent transfer learning models such as VGG19 and ResNet50, the 19-layer CNN consistently achieved higher accuracy, sensitivity, and specificity on a publicly available dataset of oral cancer lip and tongue images. These results highlight the model's suitability for clinical deployment, particularly for non-invasive screening in primary care or telemedicine settings.
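For reference, a single Adam update with the stated learning rate can be written out in numpy. This is the standard Adam rule with its usual default moment coefficients, not code from the paper.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (standard defaults; lr=1e-3 matches the text)."""
    m = b1 * m + (1 - b1) * grad        # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2   # second-moment (uncentered variance) estimate
    m_hat = m / (1 - b1 ** t)           # bias correction for step t (1-indexed)
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

The per-parameter scaling by the second-moment estimate is what gives Adam its fast, largely tuning-free convergence on networks of this size.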

Results: Model Performance and Interpretability

The proposed 19-layer CNN demonstrated exceptional diagnostic performance in detecting oral cancer from lip and tongue images. The model achieved an accuracy of 99.54%, sensitivity of 95.73%, specificity of 96.21%, and an F1-score of 96.03%, surpassing baseline models including Support Vector Machines (SVM), ResNet50, and VGG19 under identical experimental conditions.
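All four reported metrics follow directly from a confusion matrix. The helper below shows the standard definitions; the counts in the usage example are illustrative, not the study's.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Accuracy, sensitivity (recall), specificity, and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true-positive rate: cancers correctly flagged
    specificity = tn / (tn + fp)   # true-negative rate: healthy correctly cleared
    precision = tp / (tp + fp)     # flagged cases that are truly positive
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Illustrative counts only (not from the study):
print(screening_metrics(tp=90, fp=5, tn=95, fn=10))
```

For a screening tool, sensitivity is usually the metric to protect: a false negative (a missed lesion) is far more costly than a false positive that triggers a follow-up examination.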

Critically, the model exhibited strong performance in differentiating between early carcinoma and premalignant lesions such as leukoplakia and erythroplakia, even under suboptimal image quality conditions including poor lighting or partial occlusions. This robustness suggests high utility in real-world screening scenarios, where image variability is common.

Model interpretability—a key requirement for clinical AI tools—was addressed using Gradient-weighted Class Activation Mapping (Grad-CAM) and saliency maps. These techniques visualized the areas of the input image that the model relied upon for its predictions, confirming that the network focused on clinically relevant regions. Such visual explanations enhance transparency and support clinician confidence in AI-assisted decision-making.
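Once a convolutional layer's activations and the gradients of the target class score with respect to them have been extracted from the network, Grad-CAM itself reduces to a few numpy operations. The sketch below assumes those two arrays are already available; how they are obtained depends on the framework.

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM heatmap from a conv layer's activations and the gradients
    of the class score w.r.t. them, both shaped (H, W, channels)."""
    weights = gradients.mean(axis=(0, 1))  # global-average-pool gradients per channel
    cam = np.tensordot(activations, weights, axes=([2], [0]))  # weighted sum of maps
    cam = np.maximum(cam, 0)               # ReLU: keep only positive evidence
    return cam / (cam.max() + 1e-8)        # normalize to [0, 1] for overlay
```

The normalized heatmap is then upsampled to the input resolution and overlaid on the clinical photograph, so a reviewer can check that high-weight regions coincide with the lesion rather than with background artifacts.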

External validation using independent datasets from different clinical settings further confirmed the model’s generalizability and stability, an essential criterion for real-world implementation. These findings, in line with recent literature on AI in cancer diagnostics, suggest that the tailored 19-layer CNN not only delivers state-of-the-art accuracy but also meets the demands of interpretability, accessibility, and scalability.

By enabling non-invasive, rapid, and accurate oral cancer screening, particularly in underserved areas, this model paves the way for broader use of AI in frontline oncology diagnostics—helping to bridge the gap in early cancer detection and improve patient outcomes globally.

References:

  1. Wu J, Chen H, Liu Y, Yang R, An N. The global, regional, and national burden of oral cancer, 1990-2021: a systematic analysis for the Global Burden of Disease Study 2021. J Cancer Res Clin Oncol. 2025 Jan 28;151(2):53. doi: 10.1007/s00432-025-06098-w. PMID: 39875744; PMCID: PMC11775039.

  2. Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2024;74(3):229-263. doi: 10.3322/caac.21834.

  3. Tranby EP, Heaton LJ, Tomar SL, Kelly AL, Fager GL, Backley M, Frantsve-Hawley J. Oral Cancer Prevalence, Mortality, and Costs in Medicaid and Commercial Insurance Claims Data. Cancer Epidemiol Biomarkers Prev. 2022 Sep 2;31(9):1849-1857. doi: 10.1158/1055-9965.EPI-22-0114. PMID: 35732291; PMCID: PMC9437560.

  4. Sun R, Dou W, Liu W, et al. Global, regional, and national burden of oral cancer and its attributable risk factors from 1990 to 2019. Cancer Med. 2023;12:13811-13820. doi: 10.1002/cam4.6025.

  5. Liu P, Bagi K. A tailored deep learning approach for early detection of oral cancer using a 19-layer CNN on clinical lip and tongue images. Sci Rep. 2025 Jul 4;15(1):23851. doi: 10.1038/s41598-025-07957-9. PMID: 40615563; PMCID: PMC12227544.

  6. Wei X, Chanjuan L, Ke J, et al. Convolutional neural network for oral cancer detection combined with improved tunicate swarm algorithm to detect oral cancer. Sci Rep. 2024;14:28675. doi: 10.1038/s41598-024-79250-0.

  7. Li XL, Zhou G. Deep Learning in the Diagnosis and Prognosis of Oral Potentially Malignant Disorders. Cancer Screen Prev. 2024;3(4):203-213. doi: 10.14218/CSP.2024.00025.

  8. Lepper TW, Amaral LND, Espinosa ALF, Guedes IC, Rönnau MM, Daroit NB, Haas AN, Visioli F, Oliveira Neto MM, Rados PV. Cytopathological quantification of NORs using artificial intelligence to oral cancer screening. Braz Oral Res. 2025 May 12;39:e056. doi: 10.1590/1807-3107bor-2025.vol39.056. PMID: 40367024; PMCID: PMC12074076.

  9. Ramani RS, Tan I, Bussau L, O'Reilly LA, Silke J, Angel C, Celentano A, Whitehead L, McCullough M, Yap T. Convolutional neural networks for accurate real-time diagnosis of oral epithelial dysplasia and oral squamous cell carcinoma using high-resolution in vivo confocal microscopy. Sci Rep. 2025 Jan 20;15(1):2555. doi: 10.1038/s41598-025-86400-5. PMID: 39833362; PMCID: PMC11746977.

  10. Mirfendereski P, Li GY, Pearson AT, Kerr AR. Artificial intelligence and the diagnosis of oral cavity cancer and oral potentially malignant disorders from clinical photographs: a narrative review. Front Oral Health. 2025 Mar 10;6:1569567. doi: 10.3389/froh.2025.1569567. PMID: 40130020; PMCID: PMC11931071.