Caste and Occupational Identity in Large Language Models
A large body of scholarship has documented racial and gender biases in large language models (LLMs). In this work, we examine three types of LLM bias in the context of caste and occupational identity in India through five studies. Our studies cover a comprehensive set of occupations in India and test for bias across all of India's districts. Our results provide four key insights. First, we find representation bias: individuals from marginalized caste groups are significantly under-represented in LLM output relative to their share of India's working population, potentially reflecting India's digital divide. Second, corrective measures that increase representation introduce other sources of error and can produce association bias, in which marginalized castes are linked to occupations that require less education and pay less. Third, the models exhibit selection bias, shortlisting resumes with names from dominant caste groups at higher rates. Finally, we propose a training approach that reduces selection bias in LLM shortlisting. Our work is timely, as generative AI is increasingly adopted in recruitment and hiring as a cost-saving measure.