Next Steps Toward Machine Learning in Dermatology

Close-up of a seborrheic keratosis.
Improved accuracy in the diagnosis of dermatologic conditions may be achieved through machine learning technology.

In recent years, attention has increasingly turned to the capacity of machine learning technology (MLT) to transform processes and outcomes across medicine. Emerging evidence highlights the potential benefits of MLT applied to dermatology.

In a paper published in Nature in February 2017, a group of researchers at Stanford University reported on the use of a convolutional neural network (CNN) to classify keratinocyte carcinomas vs benign seborrheic keratoses and malignant melanomas vs benign nevi.1 The CNN was trained on a dataset of 129,450 clinical images representing 2032 diseases, and its accuracy was comparable to that of 21 board-certified dermatologists.

In this study, the MLT was “supervised,” as it was “trained by the analysis of a large data set of photographs and corresponding correct diagnoses, thereafter capable of correctly diagnosing images unknown to the model,” explained the authors of a recent commentary published online in the Journal of the American Academy of Dermatology.2 “In contrast, unsupervised learning draws inferences from data with unknown outcomes to find intrinsic meaning.” The creative and effective use of such “big data” has the potential to significantly improve outcomes and care delivery, and could generate an estimated $300 billion in value annually in the US healthcare sector.3
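The distinction between supervised and unsupervised learning can be made concrete with a toy sketch. The example below is illustrative only and is not the method used in the Stanford study, which trained a CNN on photographs. Here each "lesion" is reduced to two made-up numeric features, and the function names (`train_nearest_centroid`, `predict`, `two_means`) and all data values are hypothetical. The supervised part learns from features paired with known diagnoses and then labels an unseen case; the unsupervised part receives the same features without diagnoses and simply groups similar cases.

```python
# Toy illustration only: every number below is invented, and the two features
# (a hypothetical asymmetry score and a hypothetical border-irregularity score)
# stand in for what a real system would learn from images.
from statistics import mean


def centroid(points):
    """Return the mean of a list of 2-D points."""
    return (mean(p[0] for p in points), mean(p[1] for p in points))


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


# Supervised: features arrive together with known diagnoses.
def train_nearest_centroid(features, labels):
    """Learn one centroid per diagnosis label."""
    by_label = {}
    for x, y in zip(features, labels):
        by_label.setdefault(y, []).append(x)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict(model, x):
    """Assign the label whose centroid is closest to x."""
    return min(model, key=lambda label: sq_dist(model[label], x))


# Unsupervised: the same features, but no diagnoses are provided.
def two_means(features, iterations=10):
    """Minimal k-means with k=2, seeded with the first and last points."""
    centers = [features[0], features[-1]]
    assignment = []
    for _ in range(iterations):
        # Assign each point to its nearest center.
        assignment = [
            min((0, 1), key=lambda k: sq_dist(centers[k], x)) for x in features
        ]
        # Move each center to the mean of its assigned points.
        for k in (0, 1):
            members = [x for x, a in zip(features, assignment) if a == k]
            if members:
                centers[k] = centroid(members)
    return assignment


features = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
            (0.8, 0.9), (0.9, 0.8), (0.85, 0.75)]
labels = ["benign", "benign", "benign",
          "malignant", "malignant", "malignant"]

model = train_nearest_centroid(features, labels)
prediction = predict(model, (0.7, 0.8))   # a case the model has not seen
clusters = two_means(features)            # groups found without any labels

print(prediction)
print(clusters)
```

In this sketch the supervised model can name a diagnosis for the unseen case because it was given diagnoses during training. The unsupervised routine can only report that the cases fall into two groups, and interpreting what those groups mean is left to the analyst.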

These gains will require “smart data collection and analysis… in the form of automated data collection tools paired with [MLT], algorithms that can mine unimaginably large data sets to recognize patterns and predict outcomes,” wrote the authors.2 Data quantity and quality represent immediate barriers to integrating and applying MLT in dermatology, as large validated data sets are required for optimal performance. There is a need for standardized data collection across practices, although developing such a system may initially involve significant changes to workflows.

“Clinical care has always been influenced by data, but there has been an explosion of data that is now increasingly integrated into care decisions,” said Robert A. Swerlick, MD, the Alicia Leizman Stonecipher Chair of Dermatology, and professor and chairman of the department of dermatology at Emory University School of Medicine, who coauthored the commentary.2 “Those who manage this massive expansion most adeptly and translate more and better information into improved clinical outcomes will garner more resources to devote to care,” he told Dermatology Advisor. This will become increasingly relevant as data collection sources begin to extend beyond strictly clinical settings – for example, to the roughly 6 billion smartphones expected to be in use by the year 2021.1

Dr Swerlick draws a parallel between medicine’s commitment to science and the clear value of data in advancing the field. “Placing medicine upon a scientific foundation more than 100 years ago propelled physicians into their preeminent position in the healthcare universe,” he stated. “Maintaining that status will require placing a similar ‘bet’ upon the power of data to drive the next revolution in care.”

He and his coauthors propose that the American Academy of Dermatology’s (AAD’s) DataDerm™ could eventually serve as an MLT data hub for the field of dermatology, using data from electronic health record systems “to improve patient care, advance our understanding of skin diseases, increase the power of studies, and enhance the standing of dermatology among specialties.”2 Considering the success of the American Academy of Ophthalmology’s data collection service, the IRIS® (Intelligent Research in Sight) Registry, DataDerm could ultimately enable real-time performance tracking for dermatologists, as well as the ability to run queries about specific skin diseases in specific patient populations.4

Steady progress toward the optimal use of MLT and DataDerm will require collective effort on the following action steps.

Input consistent data points within and across institutions. There must be standardized procedures for data entry by providers, and the data must be unambiguous. “These efforts are and should be driven by the AAD, dermatology specialty organizations, and other stakeholders,” according to Dr Swerlick and colleagues.2

Integrate diverse data sources. Due to the wide variety of skin diseases, it is unlikely that providers will be able to consistently capture disease outcomes in routine clinical settings. Clinicians should consider greater use of patient-reported outcome measures, which have been shown to correlate closely with disease severity.5 The increased use of such data, along with the incorporation of mobile applications, may improve data collection and reduce provider burden.

Aggregate skin disease registries. With aggregated data from existing skin disease registries (of which there are at least 48), unsupervised MLT “can facilitate the concurrent analysis of multiple diseases alongside other metrics, such as cost, quality, and other conditions,” as described in the commentary.2

Use DataDerm for research. While data accumulate to the volume needed for reliable MLT interpretations, the large body of de-identified clinical data could prove useful to researchers, both in the interim and afterward.

Spread the word. High-quality data facilitate optimal performance and improved patient outcomes. While DataDerm has been successfully integrated in some practice settings, scaling up to larger health systems will require clarity regarding issues such as security and data ownership. 

Dr Swerlick reiterates that the creation of data standards should be the next step in facilitating the use of registries such as DataDerm to integrate databases and apply big data tools such as MLT. “The momentum may be slow at first, but improvements and wins will accumulate in an iterative fashion,” he said. “The combination of standards and registries will allow for addressing fundamental questions in research, clinical care, and performance improvement previously not attainable.”


  1. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118.
  2. Park AJ, Ko JM, Swerlick RA. Crowdsourcing dermatology: DataDerm, big data analytics, and machine learning technology [published online October 14, 2017]. J Am Acad Dermatol. doi:10.1016/j.jaad.2017.08.053
  3. Manyika J, Chui M, Brown B, et al. Big data: the next frontier for innovation, competition, and productivity. McKinsey Global Institute; 2011. Accessed December 26, 2017.
  4. Parke DW II, Lum F, Rich WL. The IRIS® Registry: purpose and perspectives. Ophthalmologe. 2017;114(Suppl 1):1-6.
  5. Resneck JS Jr, VanBeek M. What dermatology still needs to create meaningful patient outcome measurements. JAMA Dermatol. 2015;151(4):371-372.