Can an Online Service Predict Gender? On the State-of-the-Art in Gender Identification from Texts
Publisher
IEEE Press
Abstract
Gender equality initiatives often face a problem: to determine whether an initiative is successful, the gender of the individuals in the target group must be known. Because self-identification depends on individuals responding, its results may be biased and incomplete, so the temptation to use automated gender identification methods is evident. In the scientific literature, multiple sources are used for automated gender identification with varying success, ranging from an individual's name and social media choices to biological features (e.g., brain scans or fingerprints) and texts attributed to the individual. In this paper, we systematically inspect scientific publications on gender prediction from textual data, published between January 2017 and January 2019, to determine whether such approaches can supply a viable means of reliably determining an author's gender. We find that the best approach in the current state of the art achieves an accuracy of only 93.4%. Moreover, we discuss the possible harm that gender identification systems might cause, both through their inaccuracy and through their assumption of a binary gender model. We conclude that gender identification based on textual data is currently not a reliable substitute for self-identification.


