IBM hopes 1 million faces will help fight bias in facial recognition

Key Points
  • IBM released a trove of data containing 1 million images of faces taken from a Flickr dataset with 100 million photos and videos.
  • The images are annotated with tags related to features including craniofacial measurements, facial symmetry, age and gender.
  • Researchers at IBM hope this will help developers train their AI-powered facial recognition systems to identify faces more fairly and accurately.
An annotated image from IBM's Diversity in Faces dataset for facial recognition systems. (Image: IBM)

IBM thinks the data being used to train facial recognition systems isn't diverse enough.

The tech giant released a trove of data containing 1 million images of faces taken from a Flickr dataset with 100 million photos and videos.

The images are annotated with tags related to features including craniofacial measurements, facial symmetry, age and gender.

Researchers at the company hope that these specific details will help developers train their artificial intelligence-powered facial recognition systems to identify faces more fairly and accurately.

"Facial recognition technology should be fair and accurate," John Smith, a fellow and lead scientist at IBM, told CNBC by email. "In order for the technology to advance it needs to be built on diverse training data."

Smith stressed the importance of variety in datasets for facial recognition systems to reflect real-world diversity and reduce the rate of error in matching a face to a person.

"Many prominent datasets used in the field are too narrow and fall short in coverage and balance," he said. "The data does not reflect the faces we see in the world."

Experts have warned of the potential for artificial intelligence to be biased. Research has shown that facial recognition technology is much better at recognizing the faces of white males than those of minorities.

IBM itself has been the target of criticism over its facial recognition system. A paper by MIT researcher Joy Buolamwini, published last year, found that IBM Watson's visual recognition platform had an almost 35 percent error rate when it came to identifying darker-skinned females, and a less than 1 percent error rate for identifying lighter-skinned males.

IBM said that Watson's updated visual recognition service "uses broader training datasets and more robust recognition capabilities" than the one evaluated by the MIT researcher in her initial study. In a recently updated version of Buolamwini's research, the academic noted improvement in IBM's facial recognition system when it came to identifying darker-skinned females. However, the error rate for that category still stood at almost 17 percent.

Studies like Buolamwini's have heightened concerns over the use of facial recognition in areas like law enforcement, and the potential for AI-powered racial profiling. The U.K.'s Metropolitan Police is testing facial recognition, while Chinese AI firm SenseTime assists local authorities in identifying crime suspects with the use of its facial recognition technology.

A 2016 report by the Center on Privacy and Technology at Georgetown University's law school said that African Americans would be disproportionately affected by police face recognition systems because they are disproportionately targeted for arrests.
