Chapter 8 of 10

Robustness, Security, Adversarial Testing, and AI-Specific Threats

Teach practical testing for AI-specific robustness and security threats, including poisoning, evasion, extraction, prompt injection, and confidentiality attacks.

45 min guide5 reference questions folded into the guide material

Guided briefing

Robustness, Security, Adversarial Testing, and AI-Specific Threats video briefing

A focused explanation of chapter 8, turning the AI testing theory into concrete validation checks.

Briefing focus

Module opening

This is a structured lesson briefing. Real video/audio can be added later as a media source.

Estimated time

9 min

1Module opening
2Learning objectives
3Mind map
4Scenario evidence breakdown

Transcript brief

Teach practical testing for AI-specific robustness and security threats, including poisoning, evasion, extraction, prompt injection, and confidentiality attacks. The briefing explains why the topic matters, walks through a failure scenario, and identifies the artefacts a tester should produce for evidence and auditability.

Key takeaways

Connect the AI risk to a measurable test or monitor.
Document the evidence needed for reproducibility and audit.
Use the lab or scenario to practise the validation workflow.

Module opening

Teach practical testing for AI-specific robustness and security threats, including poisoning, evasion, extraction, prompt injection, and confidentiality attacks.

Audience. QA, security-minded testers, and test leads responsible for resilient AI systems.

Why this matters. AI systems introduce new attack surfaces and fragility. Testers need to think about accidental variation and deliberate abuse.

ISTQB CT-AI mapping. CT-AI 7.6, 9.1, 10.1

Trainer note

Start with the scenario before the theory. Ask learners what evidence would make them confident, then use the module to build that evidence step by step.

Learning objectives

Explain the core quality risk in robustness, security, adversarial testing, and ai-specific threats.
Select practical test evidence that supports an AI release decision.
Apply the module concepts to a realistic QA scenario.
Produce a portfolio artifact that can be reused in a professional AI testing context.

Mind map

Real-life scenario · Logistics automation

The image classifier fooled by a small sticker

Situation. A vision model classifies parcel labels and handling requirements. A small visual perturbation caused fragile predictions for hazardous handling labels.

Lesson. AI testing is strongest when risks, examples, evidence, and release decisions are connected.

Scenario evidence breakdown

Scenario element	Detail
Product/System	Parcel sorting system
AI feature	A vision model classifies parcel labels and handling requirements.
Failure or risk	A small visual perturbation caused fragile predictions for hazardous handling labels.
Testing challenge	Standard clean-image accuracy did not reveal robustness under realistic noise, damage, camera angle, or adversarial input.
Tester response	The tester built a threat model, perturbation suite, attack success metric, fallback workflow, and monitoring for abnormal confidence patterns.
Evidence required	Threat model, robustness test report, adversarial examples, mitigation backlog, and incident playbook.
Business decision	Approve only for lanes where fallback scanning and manual review controls reduce harm.

Visual flow

Robustness, Security, Adversarial Testing, and AI-Specific Threats scenario flow

Learning path

Start Here
5 min
Outcome, CT-AI exam relevance, and the parcel classifier scenario.
Learn
24 min
Robustness, threat modelling, evasion, poisoning, prompt injection, and resilience controls.
See It
10 min
Attack success and fallback evidence for hazardous labels.
Try It
18 min
Build a threat model and robustness report.
Recall and Apply
10 min
Exam traps, active recall, and the portfolio artifact.

Robustness is release evidence

Robustness testing checks whether AI behaviour remains acceptable under realistic variation and plausible abuse, not just clean lab inputs.

Example

A small sticker or damaged label caused fragile predictions for hazardous parcel handling.

Mistake

Reporting clean-image accuracy without perturbation, abuse-case, fallback, or residual-risk evidence.

Evidence

Threat model, perturbation suite, attack success rate, fallback workflow, control matrix, monitoring alerts, and incident playbook.

Worked example: Limiting release after robustness failure

Scenario. A parcel classifier works on clean images but misclassifies hazardous labels after realistic smudges, stickers, and camera-angle changes.

Reasoning. The risk is high-impact and operational. Release can only be considered where fallback scanning, confidence thresholds, and manual review reduce harm.

Model answer. Approve only for constrained lanes with tested fallback controls; block broader release until robustness thresholds and mitigation evidence pass.

Try it: Build the threat model and robustness report

Prompt. Use the parcel classifier scenario to define threats, perturbations, controls, and release conditions.

Learner action. Name attacker or variation source, target asset, access path, test cases, success metric, mitigation, monitoring, owner, and residual risk.

Expected output. `ai-threat-model-and-robustness-report.md` with abuse cases, robustness results, controls, incident playbook, and release recommendation.

Exam trap

Objective

CT-AI 7.6, 9.1, 10.1

Common trap

Treating robustness and security as separate from model quality or data lifecycle.

Wording clue

Look for answers that link attack path, test evidence, mitigation, residual risk, and release action.

Portfolio checkpoint

Create the module portfolio deliverable and use it to support your release decision.

Artifact structure

ai-threat-model-and-robustness-report.md

ContextThreatsPerturbationsAttack resultsControlsMonitoringResidual riskRecommendation

Recall check

What is an evasion attack?: Manipulating runtime inputs to cause an incorrect model output.
Why test realistic perturbations?: Clean lab inputs can hide failures under noise, damage, spelling errors, lighting, or angle changes.
What makes a robustness finding release-relevant?: It has severity, attack success or degradation evidence, mitigation, owner, and release action.
What portfolio artifact does this module produce?: ai-threat-model-and-robustness-report.md, a threat and robustness evidence report.

Topic-by-topic teaching guide

1. Robustness

Robustness is stable behaviour under expected variation such as noise, missing fields, spelling errors, or lighting changes.

Teaching lens	Practical detail
Real QA example	A support classifier should still understand common typos and formatting differences.
What can go wrong	Testing only perfect lab inputs.
How a tester should think	Perturb realistic inputs and measure stability.
Evidence to collect	Robustness suite and degradation thresholds.

2. Threat Modelling

AI threat modelling names the attacker, goal, access, target, and control points.

Teaching lens	Practical detail
Real QA example	A competitor may query an API to infer model behaviour, while a user may attempt prompt injection.
What can go wrong	Listing threats without capability or testable scenario.
How a tester should think	Turn threats into executable tests and controls.
Evidence to collect	Threat model and abuse case catalogue.

3. Evasion and Poisoning

Evasion manipulates inputs at inference time; poisoning corrupts training data or feedback loops.

Teaching lens	Practical detail
Real QA example	A malicious review campaign can shift a recommendation system if feedback is trusted blindly.
What can go wrong	Treating security as unrelated to data and model lifecycle.
How a tester should think	Test both runtime and training-time attack paths.
Evidence to collect	Attack simulation report and data control checks.

4. Prompt Injection and LLM Threats

LLM applications can be attacked through user text, retrieved documents, tool outputs, or hidden instructions.

Teaching lens	Practical detail
Real QA example	A retrieved page tells the assistant to ignore policy and reveal private data.
What can go wrong	Only testing friendly prompts.
How a tester should think	Red-team instructions, retrieval content, and tool boundaries.
Evidence to collect	Prompt injection tests and tool permission evidence.

5. Resilience Controls

Controls include validation, rate limits, human fallback, monitoring, isolation, and rollback.

Teaching lens	Practical detail
Real QA example	Low-confidence hazardous parcel predictions go to manual review.
What can go wrong	Finding vulnerabilities without defining release action.
How a tester should think	Link each risk to a control and residual risk decision.
Evidence to collect	Control matrix, monitoring alerts, and incident playbook.

Practical QA workflow

Start from the user or business decision affected by the AI system.
Name the AI asset under test: data, feature pipeline, model, prompt, retrieval index, tool, or full workflow.
Convert the main risk into observable quality signals and release gates.
Choose the right oracle: deterministic assertion, metric threshold, metamorphic relation, reviewer rubric, comparison, or production monitor.
Test important slices, edge cases, misuse cases, and change scenarios.
Record versions, data sources, thresholds, reviewer notes, and decision rationale.

Test design checklist

What harm could happen if this AI behaviour is wrong?
Which users, groups, products, regions, or workflows need separate evidence?
Which metric or observation would reveal the failure early?
What is the minimum evidence needed for release, shadow mode, rollback, or rejection?
Who owns the evidence after the model, prompt, or data changes?

Worked QA example

A tester receives a release request for the module scenario. Instead of asking only whether tests pass, the tester writes three release questions: what changed, who could be harmed, and what evidence proves the change is controlled. The answer becomes a small evidence pack: one risk table, one set of representative examples, one automated or reviewable check, and one release recommendation.

Common mistakes

Treating AI output as a normal deterministic response when the real risk is behavioural.
Reporting one impressive metric without slices, uncertainty, or business context.
Forgetting that data, prompts, model versions, and monitoring are part of the test surface.
Writing governance language that cannot be checked by a tester.

Guided exercise

Use the scenario above and create a one-page evidence plan. Include the decision being influenced, the main risk, the test oracle, the data or examples required, the release gate, and the owner.

Discussion prompt

Which is more likely in your domain: accidental messy input, malicious input, poisoned feedback, or prompt injection?

Hands-on lab mapping

Lab: CourseMaterials/AI-Testing/labs/05_adversarial_attacks_art.ipynb
Task: Run a simple adversarial robustness experiment and document attack success and mitigation options.
Why this lab matters: it turns the module theory into visible evidence that a release approver can inspect.

Decision simulation

Robustness drops sharply under realistic noisy inputs. Decide whether to release with fallback controls or block for model improvement.

Key terms

Evasion attack: Manipulating inputs against a deployed model.
Data poisoning: Corrupting training data or feedback to influence future behaviour.
Prompt injection: Text designed to override or bypass intended LLM instructions.
Attack success rate: Proportion of attack attempts that achieve the attacker goal.

Revision prompts

Explain the module scenario in two minutes to a product owner.
Name three pieces of evidence you would require before release.
Identify one automated check and one human-review check.
Describe how this topic changes after deployment.