Cook, Cynthia M.; Howard, John J.; Sirotin, Yevgeniy B.; Tipton, Jerry L.; and Vemury, Arun R.
IEEE Transactions on Biometrics, Behavior, and Identity Science (IEEE T-BIOM) Volume: 1 Issue: 1, February 2019, DOI: 10.1109/TBIOM.2019.2897801
Publication year: 2019

We examined the effect of demographic factors on the performance of the eleven commercial face biometric systems tested as part of the 2018 United States Department of Homeland Security, Science and Technology Directorate (DHS S&T) Biometric Technology Rally. Each system that participated in this evaluation was tasked with acquiring face images from a diverse population of 363 subjects in a controlled environment. Biometric performance was assessed using a systematic, repeatable test process measuring both efficiency (transaction times) and accuracy (mated similarity scores using a leading commercial algorithm). Prior work has documented differences in biometric algorithm performance across demographic categories and proposed that skin phenotypes offer a superior explanation for these differences. To test this concept, we developed an automatic method for measuring relative facial skin reflectance using subjects’ enrollment images and quantified the effect of this metric and other demographic covariates on performance using linear modeling. Both the efficiency and accuracy of the tested acquisition systems were significantly affected by multiple demographic covariates including skin reflectance, gender, age, eyewear, and height. Skin reflectance had the strongest net linear effect on performance. Linear modeling showed that lower (darker) skin reflectance was associated with lower efficiency (higher transaction times) and lower accuracy (lower mated similarity scores). Skin reflectance was also a statistically better predictor of these effects than self-identified race labels. Unlike other covariates, the degree to which skin reflectance altered accuracy varied between systems. We show that the size of this skin reflectance effect was inversely related to the overall accuracy of the system, such that the effect was almost negligible for the system with the highest overall accuracy.
These results suggest that, in evaluations of biometric accuracy, the magnitude of measured demographic effects depends on image acquisition.