In a study conducted as an undergraduate thesis at Sweden's Royal Institute of Technology, researchers examined bias in modern AI models that evaluate tech interview responses. The models reportedly rated men, particularly those with Anglo-Saxon names, less favorably. Lead researcher Celeste De Nadai set out to investigate how current-generation LLMs behave when given gender information and names from which cultural background can be inferred. Despite claims from AI recruiting startups that language models are free of bias, De Nadai argues otherwise.

Using models such as GPT-4o mini and Google's Gemini 1.5, the study analyzed AI ratings of responses to 24 interview questions while varying parameters such as the temperature setting. The findings reveal an unexpected bias against men with Anglo-Saxon names, contrary to the anticipated favoritism toward them, suggesting overcorrection in the models' outputs. To mitigate such biases, the research recommends robust, criteria-driven evaluation and advises masking identity traits, such as names and gender, during assessments. The study emphasizes that AI fairness remains an ongoing challenge and urges further work to keep automated decision-making in recruitment in check.
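To illustrate the kind of identity masking the thesis recommends, the minimal sketch below (not the study's actual code; the function name, pronoun map, and example answer are hypothetical) redacts a candidate's name and gendered pronouns from an interview answer before it would be passed to an LLM grader, so the model scores content rather than identity cues.

```python
import re

# Illustrative sketch only: crude redaction of identity cues from an
# interview answer prior to LLM-based evaluation. Real pipelines would
# need more careful handling of grammar, nicknames, and edge cases.

GENDERED_TERMS = {
    r"\bhe\b": "they", r"\bshe\b": "they",
    r"\bhis\b": "their", r"\bher\b": "their",
    r"\bhim\b": "them", r"\bhers\b": "theirs",
}

def mask_identity(text: str, candidate_name: str) -> str:
    """Replace the candidate's name and gendered pronouns with neutral tokens."""
    masked = re.sub(re.escape(candidate_name), "[CANDIDATE]", text, flags=re.IGNORECASE)
    for pattern, neutral in GENDERED_TERMS.items():
        masked = re.sub(pattern, neutral, masked, flags=re.IGNORECASE)
    return masked

if __name__ == "__main__":
    answer = "Maria Svensson said she would profile the service before optimising it."
    print(mask_identity(answer, "Maria Svensson"))
    # -> "[CANDIDATE] said they would profile the service before optimising it."
```

The same masked text could then be fed to whichever evaluator is in use; the point is simply that the grader never sees the name or gender signals the study found to skew ratings.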