Ethical and Legal Considerations

SL and HL. Booklet section: Ethical and legal considerations. This is the cross-theme ethics strand of Theme A, and it is one of the two SL+HL challenges. It is most likely to be examined with Discuss, Evaluate, or To what extent, all of which reward arguing more than one side.

Ethics is not an add-on to this case study; it is half of the marks-bearing analysis. Visionary Studios is a business making images for clients, so its choices have legal and ethical consequences. The booklet names three considerations, and the skill being tested is judgement: you are expected to argue more than one side and reach a defended position, not to list dangers.

1. Dataset curation and intellectual property
2. Bias and fairness
3. Transparency and AI disclosure
Arguing more than one side
A wider lens (research extension)
Bringing evidence
Quick check
Practice exercises
1. Core
2. Extension
3. Challenge
Connections

1. Dataset curation and intellectual property

Generative models are trained on large image datasets, and where those images come from is a legal question. Dataset curation is the careful selection, cleaning, and organising of training data. For Visionary Studios the central issue is intellectual property (IP): training data must avoid copyrighted material the studio has no right to use, so the company complies with the law and does not reproduce work that belongs to someone else.

This is a live, unsettled debate in the real world. Many image generators were trained on images scraped from the internet, and there is a genuine argument about who owns that content and whether using it to train a commercial product is fair. Artists have raised the concern that a model trained on their work can imitate their style without permission or payment. The honest exam position is that this is contested: there is a case that public images are fair to learn from, and a case that creators deserve consent and compensation. A studio acting responsibly leans toward licensed or properly cleared datasets rather than scraped ones.

Scenario. Visionary Studios curates a dataset of licensed and original images so its generator does not reproduce a competitor’s protected artwork or a named artist’s signature style.
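To make the curation step concrete, here is a minimal Python sketch of filtering a catalogue by licence status. It is an illustration only: the ImageRecord type, the licence labels, and the file paths are hypothetical, not anything the booklet specifies, and a real clearance process involves legal review rather than a single text field.

```python
from dataclasses import dataclass

# Hypothetical labels for images the studio has the right to train on.
CLEARED_LICENCES = {"original", "licensed"}


@dataclass(frozen=True)
class ImageRecord:
    path: str
    licence: str  # assumed values: "original", "licensed", "scraped", "unknown"


def curate(records):
    """Split records into (kept, rejected) according to licence status."""
    kept = [r for r in records if r.licence in CLEARED_LICENCES]
    rejected = [r for r in records if r.licence not in CLEARED_LICENCES]
    return kept, rejected


# Made-up catalogue entries for illustration.
catalogue = [
    ImageRecord("studio/shoot_001.png", "original"),
    ImageRecord("stock/city_042.png", "licensed"),
    ImageRecord("web/artist_portfolio_7.png", "scraped"),
    ImageRecord("web/untitled.png", "unknown"),
]

kept, rejected = curate(catalogue)
print("kept:", [r.path for r in kept])
print("rejected:", [r.path for r in rejected])
```

The design choice worth noting in an answer is that "unknown" is rejected along with "scraped": when provenance cannot be shown, the cautious default is to leave the image out.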

Research this. Real copyright disputes over training data are exactly the kind of independent research the markband rewards. Find a current example, describe what is being argued, and bring it in as evidence. Cite it; do not invent the detail.

2. Bias and fairness

A model learns the patterns in its training data, including the unfair ones. Bias and fairness is the requirement to assess datasets for biases that would cause exclusionary or inaccurate representations in the generated images. If a dataset over-represents one group, style, or context, the model’s output will skew the same way, and the people or subjects left out are represented poorly or not at all.

It helps to separate two ideas that are easy to confuse. Technical bias is a measurable skew in the data or the model (one category appears far more often than another). The ethical problem is the harm that skew causes: real people misrepresented, excluded, or stereotyped. A studio reducing bias has to do both: audit the dataset for skew (bias mitigation) and judge the real-world fairness of what the model produces.
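The "measurable skew" half of that distinction can be sketched in a few lines of Python. This is a simplified illustration, not a method from the booklet: the audit_skew function, the tolerance value, and the sample labels are all assumptions made for the example.

```python
from collections import Counter


def audit_skew(labels, tolerance=0.5):
    """Report each category's share of the dataset and flag any category
    whose share is below tolerance * (an even share across categories)."""
    counts = Counter(labels)
    if not counts:
        return {}
    total = sum(counts.values())
    even_share = 1 / len(counts)
    report = {}
    for category, n in counts.items():
        share = n / total
        report[category] = (share, share < tolerance * even_share)
    return report


# Made-up setting labels for 100 training images.
labels = ["office"] * 80 + ["market"] * 15 + ["home"] * 5

report = audit_skew(labels)
for category, (share, under) in report.items():
    status = "under-represented" if under else "ok"
    print(f"{category}: {share:.0%} ({status})")
```

A count like this only covers the technical side. It cannot see a group that is missing from the labels entirely, and it says nothing about whether the generated images are fair to the people they depict; that still needs human judgement.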

Scenario. Visionary Studios checks that its training images represent a genuine range of people and settings, so that when a brief asks for “a customer” the generator does not default to a single narrow stereotype.

3. Transparency and AI disclosure

Transparency and AI disclosure is about being open that content is AI-generated. The booklet’s framing is practical: a company should set clear guidelines for disclosing AI-generated work, in order to keep the trust of clients and audiences. If an audience later discovers that an “authentic” campaign was machine-generated and this was hidden, the loss of trust is the real cost.

This connects to wider issues worth researching: deepfakes and AI-generated misinformation make disclosure a public concern, not just a courtesy, and approaches such as content-provenance or content-credential labelling are emerging to mark how an image was made. A point examiners reward is that disclosure guidelines cannot be static: as the technology and its misuses change, the guidelines have to be revisited.

Scenario. Visionary Studios adopts a policy of labelling AI-generated campaign images for clients, and agrees with each client how that is communicated to the public.
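A disclosure policy like this can also be enforced as a simple pre-publication check. The Python sketch below is hypothetical: the CampaignImage fields and the ready_to_publish rule are invented for illustration, and it does not implement any real content-provenance or content-credential standard.

```python
from dataclasses import dataclass


@dataclass
class CampaignImage:
    title: str
    ai_generated: bool
    disclosure_label: str = ""  # wording agreed with the client, if any


def ready_to_publish(image):
    """An AI-generated image needs a non-empty disclosure label;
    other images pass this particular check unchanged."""
    if not image.ai_generated:
        return True
    return bool(image.disclosure_label.strip())


# Made-up campaign images for illustration.
labelled = CampaignImage("Spring poster", True, "Created with generative AI")
unlabelled = CampaignImage("Summer banner", True)
photo = CampaignImage("Team photo", False)

for image in (labelled, unlabelled, photo):
    print(image.title, "->", ready_to_publish(image))
```

Keeping the rule in one small function reflects the point that guidelines are not static: if the studio revises its disclosure policy, there is one place to update.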

flowchart LR
    DATA["Training data"] --> MODEL["Generative model"] --> OUT["Generated images"] --> USE["Shared with audience"]
    IP["Dataset curation and IP"] -.-> DATA
    BIAS["Bias and fairness"] -.-> DATA
    DISC["Transparency and disclosure"] -.-> USE

Where each consideration applies along the pipeline: intellectual property and bias are dataset issues; disclosure is about how the output is shared. (Original diagram.)

Arguing more than one side

Every one of these considerations has a real tension, and that is what makes them good exam questions. Generative AI gives Visionary Studios genuine creative power, speed, and cost savings; it also carries IP risk, bias risk, and trust risk. A top-band answer does not pretend one side does not exist. It states the benefit, states the cost, weighs them for this specific studio, and concludes.

Consideration	The pull one way	The pull the other way
Dataset curation / IP	Large, scraped datasets are cheap and powerful	They risk infringing creators’ rights
Bias and fairness	More data generally improves a model	Unaudited data encodes and amplifies bias
Transparency / disclosure	Not disclosing can look slicker and simpler	Hidden AI use destroys trust when found out

A wider lens (research extension)

The booklet names three considerations, and those are what the questions are built on. Beyond them, a strong researcher can bring in related issues as clearly labelled extensions, not as booklet content. The most defensible is environmental impact: training and running large models consumes significant energy, which ties back to the computational cost of diffusion models. If you raise it, frame it as a wider ethical angle you researched, and connect it to a point the booklet does make (compute cost), rather than presenting it as one of the booklet’s named considerations.

Bringing evidence

The markband separates a generic answer from a researched one. To clear the research bar, point to concrete, real material and use it accurately:

Real tools: DALL-E, Stable Diffusion, Midjourney, and how they handle training data and disclosure.
Real disputes: documented copyright cases over training images, and documented examples of biased image generators.
Real responses: licensing of training data, dataset audits, and content-provenance or labelling standards.

Do not fabricate. Made-up statistics, fake citations, and invented “facts” are worse than no example. If you are not sure of a detail, describe the issue accurately and say it is an area you would verify. Examiners reward genuine, accurate research, not confident-sounding invention.

Quick check

Q1. Which of these is not one of the booklet's three ethical and legal considerations?

Q2. Why is the source of a model's training images a legal concern for Visionary Studios?

Q3. A generator trained on an unbalanced dataset keeps producing the same narrow stereotype for "a person." This is an example of:

Q4. Why does the booklet treat disclosing AI-generated content as an ethical issue?

Q5. What does a top-band answer to an ethics question in this case study need?

Practice exercises

The ethics challenge is examined with synthesis command terms. Several of these are deliberately prose-only Discuss prompts, which is how this material is assessed.

Core

Outline (4 marks) - Outline two ethical or legal considerations Visionary Studios must address when building its training datasets.
Describe (3 marks) - Describe what transparency and AI disclosure means for the studio’s relationship with its clients.

Extension

Explain (4 marks) - Explain how bias can enter a generative model through its training data, and one way the studio could reduce it. Write in prose, with no diagram.
Discuss (6 marks) - Discuss whether Visionary Studios should train its generator on images scraped from the internet. Argue both sides and reach a conclusion.

Challenge

To what extent (8 marks) - To what extent do the ethical and legal risks of generative AI outweigh its benefits for a commercial design studio? Cover intellectual property, bias, and disclosure, bring in real research, and reach a calibrated conclusion.

Connections

Previous: Evaluating generative AI models - the technical factors that sit alongside ethics.
Next: Glossary trainer - lock in the terminology examiners look for.
Course link: Ethics of Machine Learning - the wider A4 ethics, including the Discuss command term in depth.
