Background: Genomic signatures are crucial for informing thyroid cancer management. Concurrently, digitisation of pathological images has enabled quantitative machine-based analysis. Histomics involves extracting quantitative features from histological images, providing insights into tissue morphology, cellular organisation, and disease pathology. Machine learning (ML) algorithms that take histomic signatures as input could potentially be trained to recognise genetic mutations.
Methods: De-identified glass slides of thyroid cancer specimens from the anatomical pathology archives at Royal North Shore Hospital were analysed using Clustering-Constrained Attention Multiple-Instance Learning (CLAM) and Attention-based Multiple-Instance Learning (AMIL), two state-of-the-art ML techniques that combine multiple-instance learning with attention mechanisms. The models were trained and tested to differentiate BRAF wild-type (BRAFwt) papillary thyroid carcinoma (PTC) specimens from BRAFV600E-mutant PTC specimens.
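To illustrate the attention mechanism that both methods build on, the following is a minimal sketch of attention-based pooling over a bag of patch features: each patch receives a score, the scores are normalised with a softmax, and the slide-level embedding is the attention-weighted sum of the patch features. It is not the implementation used in this study. The function name `attention_mil_pool`, the projection matrix `V`, the scoring vector `w`, and the toy feature values are all hypothetical, and a real model would learn `V` and `w` from data and operate on high-dimensional patch embeddings.

```python
import math


def attention_mil_pool(instances, V, w):
    """Attention pooling over a bag of instance feature vectors.

    instances: list of K feature vectors (each a list of D floats)
    V: projection matrix given as a list of L rows, each of length D
    w: scoring vector of length L
    Returns (bag_embedding, attention_weights).
    """
    scores = []
    for h in instances:
        # Hidden representation tanh(V h), then a scalar score w . tanh(V h)
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wl * ul for wl, ul in zip(w, hidden)))

    # Numerically stable softmax over the instance scores
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Bag embedding: attention-weighted sum of the instance features
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(attn, instances)) for d in range(dim)]
    return bag, attn


# Toy bag of three patches with 2-D features (hypothetical values)
patches = [[0.1, 0.2], [2.0, 1.5], [0.0, -0.3]]
V = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical weights, identity for readability
w = [1.0, 1.0]                # hypothetical scoring vector
bag, attn = attention_mil_pool(patches, V, w)
```

In this toy bag the second patch receives the largest attention weight, so it dominates the slide-level embedding. In a trained model, the same weights can be inspected to see which tissue regions drove a prediction.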
Results: We compared 211 BRAFwt PTC specimens with 304 BRAFV600E PTC specimens. In the BRAFwt cohort, mean tumour size was 22.41 ± 17.47 mm, mean age was 45.13 ± 16.79 y, and 50 (24%) were male. In the BRAFV600E cohort, mean tumour size was 22.41 ± 17.47 mm, mean age was 50.32 ± 15.92 y, and 82 (27%) were male. Eighty percent of cases were used for training/validation and 20% for testing. In identifying BRAFV600E cases, the CLAM model achieved an area under the receiver operating characteristic curve (AUC) of 0.79 (95% CI: 0.77–0.82), while AMIL yielded an AUC of 0.78 (95% CI: 0.75–0.80).
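As an illustration of how an AUC and its confidence interval can be computed, the sketch below calculates the AUC as the proportion of mutant/wild-type pairs in which the mutant case receives the higher score, and derives a percentile bootstrap interval by resampling cases with replacement. The abstract does not state how the reported intervals were obtained, so the bootstrap is an assumption. The function names `auc_score` and `bootstrap_auc_ci` and the toy labels and scores are hypothetical.

```python
import random


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class (AUC undefined)
        if 0 < sum(ys) < n:
            stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy test set: 1 = BRAFV600E, 0 = BRAFwt (hypothetical labels and scores)
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.45, 0.70, 0.90, 0.60]
auc = auc_score(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
```

With only eight toy cases the interval is very wide. On a held-out set of about 100 slides, as implied by a 20% split of 515 cases, the same procedure would give a considerably tighter interval.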
Discussion: CLAM and AMIL have not previously been applied to thyroid cancer, and their use here offers a new approach to genotype identification in this disease. Our study demonstrates that these ML techniques can identify the BRAFV600E genotype in PTC from histological images. Further work is required to validate these ML tools in prospective, longitudinal settings and to determine whether the BRAF-like gestalt identified by these algorithms contains additional prognostic information.