Lesson 1 — Identifying Potential for Bias
1. Bias in Training Data
Generative AI models learn from the data they are trained on.
If that data is biased, the model will often reproduce those biases.
Example – “CEO” image:
If you ask an image model to “generate a picture of a CEO,” it will often show a middle-aged man in a suit, unless you explicitly say “woman CEO” or specify another gender, race, or culture.
This doesn’t mean the model “believes” CEOs are men; it simply reflects the patterns in its training data.
Example – Recruitment tools:
Some recruitment and CV-screening systems have shown biased behaviour.
For instance, when trained mainly on past CVs of successful male candidates, a model can start to:
- Prefer CVs that look similar (e.g., same universities, same hobbies, same gender patterns).
- Down-rank women or candidates from under-represented groups for technical or leadership roles.
This kind of bias matters because it can silently exclude qualified people without anyone explicitly deciding to discriminate.
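To see how this can happen mechanically, here is a minimal sketch in Python with entirely invented data (it is not based on any real screening product). The screener never looks at gender, yet it still favours candidates who resemble past hires, because university and hobby act as proxies for the historical skew:

```python
# Sketch of a naive CV screener shaped only by past "successful" hires.
# All data here is invented. Gender is never used in the score, yet the
# ranking still tracks the historical pattern.

past_hires = [
    {"university": "Tech U", "hobby": "football", "gender": "male"},
    {"university": "Tech U", "hobby": "football", "gender": "male"},
    {"university": "Tech U", "hobby": "chess",    "gender": "male"},
    {"university": "City U", "hobby": "football", "gender": "male"},
    {"university": "City U", "hobby": "painting", "gender": "female"},
]

def similarity_score(cv: dict) -> int:
    """Count how many past hires share the candidate's university or hobby."""
    return sum(
        (cv["university"] == hire["university"]) + (cv["hobby"] == hire["hobby"])
        for hire in past_hires
    )

candidates = [
    {"name": "Candidate A", "university": "Tech U",  "hobby": "football"},
    {"name": "Candidate B", "university": "State U", "hobby": "painting"},
]

for cv in sorted(candidates, key=similarity_score, reverse=True):
    print(cv["name"], similarity_score(cv))

# Candidate A outranks Candidate B simply for resembling past hires,
# even though nothing in the score measures actual job performance.
```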
Why sometimes we specify attributes in prompts:
To counter these defaults, we sometimes deliberately specify characteristics in the prompt, such as:
- “Generate images of CEOs of different genders and ethnicities.”
- “Write an example profile of a software engineer who uses a wheelchair.”
The goal is not to stereotype, but to force diversity when the model’s default output is narrow.
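As a small illustration, such prompt variations can even be generated programmatically before being sent to whichever image model you use. The base prompt and attribute list below are only examples, not a fixed template:

```python
# Sketch: expand a base prompt with explicit diversity attributes so the
# model is asked for variety rather than left to its narrow defaults.
base_prompt = "A portrait photo of a CEO in their office"

# Illustrative attributes only; adapt them to the context and audience.
attributes = [
    "a woman in her 50s",
    "a man who uses a wheelchair",
    "a young Black woman",
    "an older South Asian man in traditional dress",
]

prompts = [f"{base_prompt}, {attribute}" for attribute in attributes]

for prompt in prompts:
    print(prompt)  # each prompt would then go to the image model of your choice
```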
Key idea:
If we can control the data (how it is collected and labelled) and how we prompt the model, we can reduce bias – but we can’t assume it disappears on its own.
2. Bias Across Models
Not all models behave the same way. Bias can depend on:
a) Cultural and role stereotypes
- Prompt: “Show a doctor and a nurse working together.” – many models will show the doctor as a man and the nurse as a woman.
- Prompt: “Leaders meeting to discuss strategy.” – you might get only men, all in Western business suits, even though real leaders can be women and may wear traditional Arab dress, Indian clothing, casual attire, and so on.
This happens because “doctor,” “leader,” and “manager” in the training data are often linked to specific visual and cultural patterns.
b) Language bias
If a model is trained mostly on native-level English:
- It may struggle with non-native English, dialects, or code-switching.
- It may not represent other languages or cultures fairly.
Mitigation options:
- Choose models that are trained on multilingual and diverse datasets.
- Write prompts that explicitly request diversity (e.g., “examples from different regions and cultures”).
- For some use cases, avoid models that clearly perform poorly outside a narrow language group.
3. Guardrails and Their Impact on Bias
What are guardrails?
Guardrails are safety and policy mechanisms that:
- Filter out explicit, violent, sexual, hateful, or otherwise harmful content.
- Block certain types of requests (e.g., instructions for illegal activities).
- Help keep AI systems within legal, ethical, and practical boundaries.
For example, many hosted AI services (such as Azure OpenAI) apply content filters and safety systems based on their responsible AI principles. These often aim to prevent:
- Abusive or hateful content
- Harassment
- Certain sensitive topics being answered in unsafe ways
How guardrails can introduce new bias:
Guardrails are designed by humans with particular values and risk assessments, which means they can also introduce bias. For example:
- A system might heavily down-rank or block any discussion of a controversial topic (e.g., surveillance risks of smart home cameras).
- If that content is filtered too aggressively, users may only see the “positive” marketing narrative and never the privacy or safety concerns.
So guardrails protect users, but they can also:
- Hide important information.
- Reflect the priorities of the company or team that designed them.
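The deliberately crude sketch below shows the failure mode. Real guardrails use trained classifiers rather than keyword lists, but an over-broad rule of any kind can suppress legitimate risk discussion along with genuinely harmful content (the blocked terms here are a hypothetical policy choice):

```python
# Hypothetical, over-aggressive keyword filter; for illustration only.
BLOCKED_TERMS = {"surveillance", "spying"}

def passes_filter(text: str) -> bool:
    """Return True if the text contains none of the blocked terms."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

answers = [
    "Smart home cameras make it easy to check on your home remotely.",
    "Smart home cameras also raise surveillance and data-sharing concerns.",
]

for answer in answers:
    print("shown" if passes_filter(answer) else "hidden", "->", answer)

# Only the positive, marketing-style answer gets through, so the user never
# sees the privacy concern; the filter itself has introduced a bias.
```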
What users should do:
- Be aware that some answers may be missing or softened due to guardrails.
- If you need balanced information (including risks), ask for “benefits and risks,” “advantages and disadvantages,” or seek multiple sources, not just one AI system.
- Always stay within the platform’s policies and the law; the goal is not to “break” safety systems but to understand their limits.
4. Bias Introduced Through Prompts
Even if the model and its data are relatively balanced, your prompt can introduce bias.
Example:
- Prompt A: “Show me all the benefits of using the cloud to store and process smart home device data.” – this asks only for positives and will likely produce a one-sided, optimistic answer.
- Prompt B: “Show me the pros and cons of using the cloud to store and process smart home device data.” – this invites a more balanced answer, including risks and trade-offs.
If you only ever ask for the “benefits” or only for the “problems,” you are guiding the system into a biased response.
Good practice:
- Before hitting Enter, quickly check if your prompt is one-sided.
- If you want a balanced view, explicitly ask for pros and cons, benefits and risks, or multiple perspectives.
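For a quick self-check, even a rough heuristic like the one below can flag a prompt that asks for only one side. The word lists are illustrative and will miss many phrasings:

```python
# Rough heuristic: does the prompt mention only positive or only negative
# framing words? The word lists are examples, not a complete vocabulary.
POSITIVE_WORDS = {"benefits", "advantages", "pros", "upsides"}
NEGATIVE_WORDS = {"risks", "disadvantages", "cons", "problems", "downsides"}

def is_one_sided(prompt: str) -> bool:
    words = set(prompt.lower().replace(",", " ").split())
    has_positive = bool(words & POSITIVE_WORDS)
    has_negative = bool(words & NEGATIVE_WORDS)
    return has_positive != has_negative  # exactly one side was requested

print(is_one_sided("Show me all the benefits of using the cloud"))   # True
print(is_one_sided("Show me the pros and cons of using the cloud"))  # False
```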
5. Common Types of Bias
Bias in AI can mirror the same types of discrimination we see in society. Some common categories:
- Gender – assuming leaders are male, caregivers are female.
- Race / ethnicity – under-representing or misrepresenting certain groups.
- Disability – ignoring people with disabilities or portraying them only as “patients” or “problems to solve.”
- Age – favouring younger candidates in hiring scenarios, or portraying older people as less tech-savvy.
- Religion – stereotyping certain religions as linked to violence or extremism.
- Language and accent – treating non-native speakers as less competent.
- Nationality and culture – centering Western or Global North perspectives by default.
- Economic status – always showing “success” as luxury lifestyles, ignoring other realities.
- Older adults / seniors – portraying them mainly as dependent or ill instead of active and independent.
When designing prompts, datasets, or applications, we should intentionally look for these patterns and challenge them.
6. How Bias Propagates
Bias can spread and grow in several stages:
- Training data – if the data already contains biased views, stereotypes, or under-representation, the model will learn them.
- Model training and fine-tuning – if we keep adding unbalanced data, the model may become more biased over time.
- User interaction and feedback – if the system is updated based on user reactions (likes, upvotes, clicks), content that reflects existing social bias can get reinforced (see the toy simulation below).
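To see how the feedback stage can amplify an existing skew on its own, here is a toy simulation with made-up click rates (no real system or dataset is modelled here):

```python
# Toy feedback loop: two kinds of answers compete for exposure, and users
# click the stereotypical one slightly more often. If the system shows
# answers in proportion to past clicks, the small initial skew grows.
STEREOTYPICAL_CLICK_RATE = 0.55  # hypothetical values, for illustration only
COUNTER_STEREOTYPICAL_CLICK_RATE = 0.45

share_stereotypical = 0.5  # start with balanced exposure

for round_number in range(1, 6):
    clicks_stereo = share_stereotypical * STEREOTYPICAL_CLICK_RATE
    clicks_counter = (1 - share_stereotypical) * COUNTER_STEREOTYPICAL_CLICK_RATE
    # Next round, exposure follows clicks.
    share_stereotypical = clicks_stereo / (clicks_stereo + clicks_counter)
    print(f"round {round_number}: stereotypical share = {share_stereotypical:.2f}")

# The stereotypical answers' share climbs every round even though nobody
# chose to prefer them; the feedback loop amplified a small initial skew.
```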
Important point:
The model itself is not “morally biased.” It does not have beliefs.
Bias comes from:
- The data humans feed into the model.
- The design choices in training and guardrails.
- The prompts and feedback humans provide.
Responsibility:
We should make every reasonable effort to:
- Recognize bias in AI outputs.
- Avoid building or deploying models that reinforce harmful patterns.
- Use AI in ways that respect fairness, inclusion, and human rights.