Recent research has revealed potential biases in AI-driven tech interview assessments. The study, conducted by Celeste De Nadai as part of her thesis at the Royal Institute of Technology in Sweden, explored whether modern language models exhibit bias when evaluating job candidates based on the cultural implications of their names.
The research found that AI tools tend to rate men with Anglo-Saxon names less favorably. The finding comes from mock interviews for software engineering roles, in which AI models scored the same answers differently depending on the name attached to the candidate.
De Nadai’s motivation stemmed from previous reports of bias in AI models. She set out to investigate whether newer language models still show such bias, particularly in recruitment contexts. Notably, the bias she found ran against male candidates with Anglo-Saxon names, deviating from the typical expectation that models favor Western names.
The study, which used models including Google’s Gemini 1.5 and OpenAI’s GPT-4o mini, showed that inherent biases persist despite improvements in AI. The research used 24 interview questions, pairing the same answers with names from different cultural backgrounds to record any shift in the assessments. The models were tested extensively, with variables such as the temperature setting, which controls how predictable the output is, varied between runs.
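To make the method concrete, the sketch below shows what such a name-swap probe could look like. It is a minimal illustration rather than the study's actual test harness: it assumes the OpenAI Python SDK, and the question, answer, names, and model identifier are placeholders chosen for the example.

```python
# Minimal sketch of a name-swap bias probe (illustrative, not the study's code).
# Assumes the OpenAI Python SDK; question, answer, names, and model are placeholders.
from openai import OpenAI

client = OpenAI()

QUESTION = "Describe a time you debugged a difficult production issue."
ANSWER = "I traced the fault to a race condition by adding structured logging ..."

# The same answer is attributed to names from different cultural backgrounds.
CANDIDATE_NAMES = ["James Smith", "Wei Zhang", "Amara Okafor", "Fatima Al-Sayed"]

def score_answer(name: str, temperature: float) -> str:
    """Ask the model to grade an identical answer, attributed to a given candidate name."""
    prompt = (
        f"Candidate {name} was asked: {QUESTION}\n"
        f"Their answer: {ANSWER}\n"
        "Rate the answer from 1 to 10 and briefly justify the score."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,  # varied to see how output randomness affects scores
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Repeat each (name, temperature) pair many times and compare score distributions.
    for temp in (0.0, 0.7, 1.0):
        for name in CANDIDATE_NAMES:
            print(temp, name, score_answer(name, temp))
```

Any systematic difference between the score distributions for different names, with the answer held constant, would point to the kind of bias the thesis reports.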
According to De Nadai, some AI recruiting firms claim their evaluations are free of bias because names are excluded from the data the models see. The study suggests, however, that other language markers still let the systems infer a candidate’s cultural background, and it advocates masking names and gender information to reduce bias.
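A masking step of the kind the study advocates could be as simple as the following sketch. This is only an illustration of the idea, not code from the thesis, and the placeholder mapping is deliberately naive (it does not distinguish possessive from object pronouns, for instance).

```python
# Illustrative pre-processing step (not from the study): replace the candidate's name
# and gendered pronouns with neutral placeholders before the text reaches the model.
import re

# Naive pronoun mapping for illustration only; real masking would need more care.
NEUTRAL_PRONOUNS = {"he": "they", "she": "they", "him": "them",
                    "her": "them", "his": "their", "hers": "theirs"}

def mask_candidate(text: str, candidate_name: str) -> str:
    """Strip the candidate's name and gendered pronouns from an interview transcript."""
    masked = re.sub(re.escape(candidate_name), "[CANDIDATE]", text, flags=re.IGNORECASE)
    masked = re.sub(
        r"\b(" + "|".join(NEUTRAL_PRONOUNS) + r")\b",
        lambda m: NEUTRAL_PRONOUNS[m.group(1).lower()],
        masked,
        flags=re.IGNORECASE,
    )
    return masked

print(mask_candidate("James Smith said he fixed the bug.", "James Smith"))
# -> "[CANDIDATE] said they fixed the bug."
```

Even so, the study's point stands: other stylistic and linguistic cues can survive such masking and still signal a candidate's background to the model.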
The study concludes that addressing model bias requires more than adjusting settings such as temperature; it recommends including strict evaluation criteria in prompts to steer the models toward unbiased assessments. The research highlights the continuing challenge of achieving fair AI assessments in recruitment, and industry giants such as Google and OpenAI have yet to comment on the findings.
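As a hypothetical illustration of what such strict criteria might look like in practice (the thesis's actual rubric is not reproduced here), an evaluation prompt could pin the model to a fixed scoring scheme:

```python
# Hypothetical rubric-constrained evaluation prompt; the criteria below are
# illustrative and are not the ones used in the thesis.
SYSTEM_PROMPT = """You are grading a software engineering interview answer.
Score ONLY against these criteria, 0-2 points each:
1. Correctness of the technical explanation.
2. Clarity and structure of the answer.
3. Evidence of hands-on experience (specific tools, concrete steps).
4. Handling of trade-offs or edge cases.
5. Communication of results or impact.
Ignore the candidate's name, background, and writing style beyond clarity.
Return the total score (0-10) with a one-sentence justification per criterion."""
```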