An analysis of freely accessible skin image data sets that are available for training machine learning algorithms to detect skin cancer has shown that darker skin types are significantly underrepresented in the databases, British researchers report.
Of 106,950 skin lesion images documented in 21 open access databases and 17 open access atlases identified by David Wen, BMBCh, University of Oxford, UK, and colleagues, 2,436 images contained Fitzpatrick skin type information. Of these, “only ten images were from people with Fitzpatrick skin type V and only one image was from a person with Fitzpatrick skin type VI,” the researchers said. “The ethnicity of these people was either Brazilian or unknown.”
In two datasets that contained 1,585 images with ethnic data, “there were no images of people with an African, Afro-Caribbean, or South Asian background,” noted Wen and colleagues. “In connection with the geographical origin of the data sets, there was a massive underrepresentation of skin lesion images from population groups with darker skin.”
The results of their systematic review were presented at the National Cancer Research Institute (NCRI) Festival and published in The Lancet Digital Health on November 9. To the best of their knowledge, they write, this is “the first systematic review of publicly available skin lesion images, which include predominantly dermatoscopic and macroscopic images available through open access datasets and atlases.”
Among the datasets with information about the country of origin, 11 of 14 (79%) were from North America, Europe, or Oceania, the researchers said. In 19 of 21 datasets (91%), either dermatoscopic images or macroscopic photographs were the only available image types. There were some differences in the clinical information available: 81,662 images (76.4%) contained information about age, 82,848 images (77.5%) contained information about gender, and 79,561 images (74.4%) contained information about body location.
The researchers stated that these datasets could be of limited use in a real-world setting if the images are not representative of the population. Artificial intelligence (AI) programs that are trained, for example, on images of patients with one skin type may misdiagnose patients with another skin type, they said.
“AI programs hold great potential for diagnosing skin cancer because they can examine images and quickly and inexpensively assess all areas of concern on the skin,” Wen said in a press release from the NCRI Festival. “However, it is important to know about the images and patients that are used to develop programs, as these will affect which groups of people the programs will be most effective for in practice. Research has shown that programs trained on images of people with lighter skin types may not be as accurate for people with darker skin, and vice versa.”
There is also “limited information about who, how and why the pictures were taken,” Wen said in the press release. “This has an impact on the programs developed from these images due to the uncertainty about how they can work with different groups of people, especially those who are not well represented in data sets, such as those with darker skin color. This can potentially lead to the exclusion of these groups from AI technologies, or even to harm.”
Although there are currently no guidelines for developing skin image datasets, quality standards are needed, according to the researchers.
“Ensuring equitable digital health involves building unbiased, representative data sets to ensure that the algorithms created benefit people of all backgrounds and skin types,” they conclude in the study.
Neil Steven, MBBS, MA, PhD, FRCP, a member of the NCRI Skin Group who was not involved in the research, stated in the press release that the results of the study by Wen and colleagues “raise concern about the ability of AI to assist in skin cancer diagnosis, especially in a global context.”
“I hope that this work will continue and will help ensure that the advances we are making in the use of AI in medicine benefit all patients, as human skin color is highly diverse,” said Steven, Honorary Consultant in Medical Oncology at University Hospitals Birmingham NHS Foundation Trust, United Kingdom.
Lancet Digit Health. Published online November 9, 2021.
This study was funded by the NHSX and the Health Foundation. Three authors stated that they were paid employees of Databiology at the time of the study. The other authors reported no relevant financial relationships.