LLM Stereotypes Mirror Lived Disability Experiences

Category: User-Centred Design · Effect: Strong · Year: 2023

Large language models often perpetuate subtle, harmful stereotypes about disability that mirror real-world biases; the harm rarely takes the form of overtly offensive content.

Design Takeaway

Prioritize inclusive data sourcing and user-centered evaluation methods that capture nuanced biases, not just overt offensiveness, when designing AI systems.

Why It Matters

Designers developing AI-powered tools must recognize that 'non-offensive' doesn't equate to 'unbiased.' Understanding how LLMs can subtly reinforce negative stereotypes is crucial for creating inclusive and equitable user experiences.

Key Finding

People with disabilities who interacted with an AI dialog model found that it reflected the subtle, harmful stereotypes they encounter daily, rather than overtly offensive content, pointing to a need for better training data and evaluation.

Research Evidence

Aim: To identify, from the perspective of the disability community, the categories of harm perpetuated by Large Language Models (LLMs).

Method: Qualitative research using focus groups and annotation.

Procedure: Researchers conducted 19 focus groups with 56 participants with disabilities. Participants interacted with a dialog model, discussing and annotating its responses related to disability.

Sample Size: 56 participants

Context: Artificial Intelligence (AI) development, specifically Large Language Models (LLMs) and their societal impact.

Design Principle

AI systems should be evaluated not only for overt harmfulness but also for their subtle reinforcement of societal stereotypes, especially concerning marginalized groups.
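
A minimal sketch of what such an evaluation record could look like in practice is shown below, assuming a Python annotation pipeline; the type names, field names, and harm categories are illustrative placeholders, not the taxonomy developed in the paper.

```python
from dataclasses import dataclass, field
from enum import Enum


class HarmCategory(Enum):
    """Illustrative harm categories only; the study derives its
    categories from participants' own annotations."""
    PITY_OR_INSPIRATION = "pity_or_inspiration"
    INFANTILIZATION = "infantilization"
    ERASURE = "erasure"
    MISINFORMATION = "misinformation"
    OVERT_OFFENSE = "overt_offense"


@dataclass
class ResponseAnnotation:
    """One participant's judgement of one model response.

    Records graded, multi-dimensional harm instead of a single
    offensive / not-offensive flag.
    """
    response_id: str
    annotator_id: str
    harm_categories: list[HarmCategory] = field(default_factory=list)
    severity: int = 0                       # 0 (none) to 4 (severe); an assumed scale
    mirrors_lived_experience: bool = False  # did this echo the annotator's daily experience?
    comment: str = ""
```

Capturing annotations at this granularity lets a team see which responses participants experienced as stereotyping even when no overt-offensiveness filter would have caught them.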

How to Apply

When developing or evaluating AI systems, actively recruit individuals from diverse and marginalized communities to test for subtle biases and stereotypes in AI outputs, using their lived experiences as a benchmark.
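
Continuing the sketch above (and reusing its hypothetical ResponseAnnotation and HarmCategory types), one simple way to surface subtle bias during such testing is to flag responses that several participants marked as harmful without labelling them overtly offensive; the two-annotator threshold below is an arbitrary assumption.

```python
from collections import defaultdict


def flag_subtle_harms(annotations, min_annotators=2):
    """Return responses that multiple participants judged harmful
    even though none of their labels was 'overt offense'.

    `annotations` is an iterable of ResponseAnnotation objects from
    the earlier sketch.
    """
    by_response = defaultdict(list)
    for ann in annotations:
        by_response[ann.response_id].append(ann)

    flagged = {}
    for response_id, anns in by_response.items():
        subtle = [
            a for a in anns
            if a.harm_categories
            and HarmCategory.OVERT_OFFENSE not in a.harm_categories
        ]
        if len(subtle) >= min_annotators:
            flagged[response_id] = {
                "annotators": sorted(a.annotator_id for a in subtle),
                "categories": sorted({c.value for a in subtle
                                      for c in a.harm_categories}),
                "max_severity": max(a.severity for a in subtle),
            }
    return flagged
```

Reviewing the flagged responses together with the participants who annotated them, rather than relying on aggregate counts alone, keeps lived experience at the centre of the evaluation.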

Limitations

The study focused on a specific dialog model and the disability community; findings may not generalize to all LLMs or other marginalized groups without further research. The definition of 'harm' was participant-defined and nuanced.

Student Guide (IB Design Technology)

Simple Explanation: AI chatbots can sometimes say things that aren't obviously rude but still make people with disabilities feel misunderstood or stereotyped, just like in real life.

Why This Matters: This research highlights that even AI tools designed to be helpful can unintentionally cause harm by reflecting societal biases, which is important for creating responsible and inclusive designs.

Critical Thinking: How can designers proactively identify and mitigate 'subtle biases' in AI systems before they are deployed, beyond simply checking for overtly offensive content?

IA-Ready Paragraph: This research underscores the critical need for user-centered evaluation of AI systems, particularly concerning subtle biases. As demonstrated by Gadiraju et al. (2023), AI models can inadvertently perpetuate harmful stereotypes about disability that mirror lived experiences, even when not overtly offensive. This highlights the importance of involving diverse user groups in the design and testing phases to ensure AI tools are equitable and inclusive.

Examiner Tips

Independent Variable: The dialog model's responses to disability-related prompts (the stimuli participants evaluated).

Dependent Variable: Participants' perceptions and annotations of those outputs (e.g., perceived stereotyping, type and severity of harm).

Controlled Variables: The same dialog model and focus-group protocol across sessions; all participants self-identified as having a disability.

Source

"I wouldn't say offensive but...": Disability-Centered Perspectives on Large Language Models · 2023 · 10.1145/3593013.3593989