With rapid advances in the use of machine learning in the past several years, there have been exciting developments in the field of dermatology. In recent studies, a deep learning model called the convolutional neural network (CNN) has shown impressive accuracy in the automated classification of certain types of cutaneous lesions.
In a 2018 study published in the Journal of Investigative Dermatology, Han et al tested the performance of a CNN on several datasets in the detection of basal cell carcinoma, squamous cell carcinoma, intraepithelial carcinoma, and melanoma.1 On two of the datasets (Asan and Edinburgh), the CNN performed comparably to 16 dermatologists, and the authors wrote that the accuracy of these networks could be further improved with the addition of images representing a broader range of ages and ethnicities.
A 2017 study by researchers at Stanford University showed similar results with a CNN trained with 129,450 clinical images representing 2032 diseases.2 They compared the performance of this model to that of 21 board-certified dermatologists in differentiating keratinocyte carcinomas vs benign seborrheic keratoses and malignant melanomas vs benign nevi. “The CNN achieves performance on par with all tested experts across both tasks, demonstrating an artificial intelligence capable of classifying skin cancer with a level of competence comparable to dermatologists,” the authors reported.
They noted the implications for the use of such networks on mobile devices: “It is projected that 6.3 billion smartphone subscriptions will exist by the year 2021 and can therefore potentially provide low-cost universal access to vital diagnostic care.”2 In addition to improving early detection rates, automated skin cancer screening would likely result in increased referrals to dermatologists and could lead to greater efficiency of in-office visits.3
However, many challenges remain to be resolved before artificial intelligence (AI) can be widely implemented for this purpose. One major limitation is that the efficacy of an algorithm varies with the set of images used to train it. “Each model may have different sensitivities and specificities and may be subject to a unique set of biases and shortcomings in prediction introduced by the training set of images,” Zakhem et al wrote in a 2018 paper published in JAMA Dermatology.3
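To make the point about differing sensitivities and specificities concrete, the following is a minimal sketch of how the two measures are computed for a binary classifier. The labels and the two sets of predictions are invented for illustration only; they are not taken from any of the studies cited here, and `sensitivity_specificity` is a hypothetical helper name.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = malignant."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: share of malignant lesions the model flags.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of benign lesions the model correctly clears.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented ground truth: 4 malignant lesions, 6 benign lesions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Invented predictions from two hypothetical models scored on the same images.
model_a = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
model_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

for name, preds in [("model A", model_a), ("model B", model_b)]:
    sens, spec = sensitivity_specificity(y_true, preds)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this invented example, model A catches more of the malignant lesions but raises more false alarms, whereas model B raises none but misses half of the malignancies. This is the kind of trade-off that could differ from one trained model to the next.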
Additionally, although the model described in the study by Han et al has “already been optimized for mobile use and can be accessed without a subscription or login (http://modelderm.com/) … [it is unclear] … whether the site is designed for patients, nonspecialist clinicians, or dermatologists.”3 The design of the user interface should be based on the intended user. For example, sites intended for patients should provide information regarding diagnostic follow-up, if needed, as well as educational materials and explanations of the model’s predictions.