Robustness, Security, Adversarial Testing, and AI-Specific Threats
Teach practical testing for AI-specific robustness and security threats, including poisoning, evasion, extraction, prompt injection, and confidentiality attacks.
Robustness, Security, Adversarial Testing, and AI-Specific Threats video briefing
A focused explanation of chapter 8, turning the AI testing theory into concrete validation checks.
Briefing focus
Module opening
This is a structured lesson briefing. Real video/audio can be added later as a media source.
Estimated time
9 min
- 1Module opening
- 2Learning objectives
- 3Mind map
- 4Scenario evidence breakdown
Transcript brief
Teach practical testing for AI-specific robustness and security threats, including poisoning, evasion, extraction, prompt injection, and confidentiality attacks. The briefing explains why the topic matters, walks through a failure scenario, and identifies the artefacts a tester should produce for evidence and auditability.
Key takeaways
- Connect the AI risk to a measurable test or monitor.
- Document the evidence needed for reproducibility and audit.
- Use the lab or scenario to practise the validation workflow.
Module opening
Teach practical testing for AI-specific robustness and security threats, including poisoning, evasion, extraction, prompt injection, and confidentiality attacks.
Audience. QA, security-minded testers, and test leads responsible for resilient AI systems.
Why this matters. AI systems introduce new attack surfaces and fragility. Testers need to think about accidental variation and deliberate abuse.
ISTQB CT-AI mapping. CT-AI 7.6, 9.1, 10.1
Trainer note
Start with the scenario before the theory. Ask learners what evidence would make them confident, then use the module to build that evidence step by step.
Learning objectives
- Explain the core quality risk in robustness, security, adversarial testing, and ai-specific threats.
- Select practical test evidence that supports an AI release decision.
- Apply the module concepts to a realistic QA scenario.
- Produce a portfolio artifact that can be reused in a professional AI testing context.
Mind map
Real-life scenario · Logistics automation
The image classifier fooled by a small sticker
Situation. A vision model classifies parcel labels and handling requirements. A small visual perturbation caused fragile predictions for hazardous handling labels.
Lesson. AI testing is strongest when risks, examples, evidence, and release decisions are connected.
Scenario evidence breakdown
| Scenario element | Detail |
|---|---|
| Product/System | Parcel sorting system |
| AI feature | A vision model classifies parcel labels and handling requirements. |
| Failure or risk | A small visual perturbation caused fragile predictions for hazardous handling labels. |
| Testing challenge | Standard clean-image accuracy did not reveal robustness under realistic noise, damage, camera angle, or adversarial input. |
| Tester response | The tester built a threat model, perturbation suite, attack success metric, fallback workflow, and monitoring for abnormal confidence patterns. |
| Evidence required | Threat model, robustness test report, adversarial examples, mitigation backlog, and incident playbook. |
| Business decision | Approve only for lanes where fallback scanning and manual review controls reduce harm. |
Visual flow
Learning path
Start Here
5 minOutcome, CT-AI exam relevance, and the parcel classifier scenario.
Learn
24 minRobustness, threat modelling, evasion, poisoning, prompt injection, and resilience controls.
See It
10 minAttack success and fallback evidence for hazardous labels.
Try It
18 minBuild a threat model and robustness report.
Recall and Apply
10 minExam traps, active recall, and the portfolio artifact.
Robustness is release evidence
Robustness testing checks whether AI behaviour remains acceptable under realistic variation and plausible abuse, not just clean lab inputs.
Example
A small sticker or damaged label caused fragile predictions for hazardous parcel handling.
Mistake
Reporting clean-image accuracy without perturbation, abuse-case, fallback, or residual-risk evidence.
Evidence
Threat model, perturbation suite, attack success rate, fallback workflow, control matrix, monitoring alerts, and incident playbook.
Worked example: Limiting release after robustness failure
Scenario. A parcel classifier works on clean images but misclassifies hazardous labels after realistic smudges, stickers, and camera-angle changes.
Reasoning. The risk is high-impact and operational. Release can only be considered where fallback scanning, confidence thresholds, and manual review reduce harm.
Model answer. Approve only for constrained lanes with tested fallback controls; block broader release until robustness thresholds and mitigation evidence pass.
Try it: Build the threat model and robustness report
Prompt. Use the parcel classifier scenario to define threats, perturbations, controls, and release conditions.
Learner action. Name attacker or variation source, target asset, access path, test cases, success metric, mitigation, monitoring, owner, and residual risk.
Expected output. `ai-threat-model-and-robustness-report.md` with abuse cases, robustness results, controls, incident playbook, and release recommendation.
Exam trap
Objective
CT-AI 7.6, 9.1, 10.1
Common trap
Treating robustness and security as separate from model quality or data lifecycle.
Wording clue
Look for answers that link attack path, test evidence, mitigation, residual risk, and release action.
Portfolio checkpoint
Create the module portfolio deliverable and use it to support your release decision.
Artifact structure
ai-threat-model-and-robustness-report.md
Recall check
- What is an evasion attack?
- Manipulating runtime inputs to cause an incorrect model output.
- Why test realistic perturbations?
- Clean lab inputs can hide failures under noise, damage, spelling errors, lighting, or angle changes.
- What makes a robustness finding release-relevant?
- It has severity, attack success or degradation evidence, mitigation, owner, and release action.
- What portfolio artifact does this module produce?
- ai-threat-model-and-robustness-report.md, a threat and robustness evidence report.
Topic-by-topic teaching guide
1. Robustness
Robustness is stable behaviour under expected variation such as noise, missing fields, spelling errors, or lighting changes.
| Teaching lens | Practical detail |
|---|---|
| Real QA example | A support classifier should still understand common typos and formatting differences. |
| What can go wrong | Testing only perfect lab inputs. |
| How a tester should think | Perturb realistic inputs and measure stability. |
| Evidence to collect | Robustness suite and degradation thresholds. |
2. Threat Modelling
AI threat modelling names the attacker, goal, access, target, and control points.
| Teaching lens | Practical detail |
|---|---|
| Real QA example | A competitor may query an API to infer model behaviour, while a user may attempt prompt injection. |
| What can go wrong | Listing threats without capability or testable scenario. |
| How a tester should think | Turn threats into executable tests and controls. |
| Evidence to collect | Threat model and abuse case catalogue. |
3. Evasion and Poisoning
Evasion manipulates inputs at inference time; poisoning corrupts training data or feedback loops.
| Teaching lens | Practical detail |
|---|---|
| Real QA example | A malicious review campaign can shift a recommendation system if feedback is trusted blindly. |
| What can go wrong | Treating security as unrelated to data and model lifecycle. |
| How a tester should think | Test both runtime and training-time attack paths. |
| Evidence to collect | Attack simulation report and data control checks. |
4. Prompt Injection and LLM Threats
LLM applications can be attacked through user text, retrieved documents, tool outputs, or hidden instructions.
| Teaching lens | Practical detail |
|---|---|
| Real QA example | A retrieved page tells the assistant to ignore policy and reveal private data. |
| What can go wrong | Only testing friendly prompts. |
| How a tester should think | Red-team instructions, retrieval content, and tool boundaries. |
| Evidence to collect | Prompt injection tests and tool permission evidence. |
5. Resilience Controls
Controls include validation, rate limits, human fallback, monitoring, isolation, and rollback.
| Teaching lens | Practical detail |
|---|---|
| Real QA example | Low-confidence hazardous parcel predictions go to manual review. |
| What can go wrong | Finding vulnerabilities without defining release action. |
| How a tester should think | Link each risk to a control and residual risk decision. |
| Evidence to collect | Control matrix, monitoring alerts, and incident playbook. |
Practical QA workflow
- Start from the user or business decision affected by the AI system.
- Name the AI asset under test: data, feature pipeline, model, prompt, retrieval index, tool, or full workflow.
- Convert the main risk into observable quality signals and release gates.
- Choose the right oracle: deterministic assertion, metric threshold, metamorphic relation, reviewer rubric, comparison, or production monitor.
- Test important slices, edge cases, misuse cases, and change scenarios.
- Record versions, data sources, thresholds, reviewer notes, and decision rationale.
Test design checklist
- What harm could happen if this AI behaviour is wrong?
- Which users, groups, products, regions, or workflows need separate evidence?
- Which metric or observation would reveal the failure early?
- What is the minimum evidence needed for release, shadow mode, rollback, or rejection?
- Who owns the evidence after the model, prompt, or data changes?
Worked QA example
A tester receives a release request for the module scenario. Instead of asking only whether tests pass, the tester writes three release questions: what changed, who could be harmed, and what evidence proves the change is controlled. The answer becomes a small evidence pack: one risk table, one set of representative examples, one automated or reviewable check, and one release recommendation.
Common mistakes
- Treating AI output as a normal deterministic response when the real risk is behavioural.
- Reporting one impressive metric without slices, uncertainty, or business context.
- Forgetting that data, prompts, model versions, and monitoring are part of the test surface.
- Writing governance language that cannot be checked by a tester.
Guided exercise
Use the scenario above and create a one-page evidence plan. Include the decision being influenced, the main risk, the test oracle, the data or examples required, the release gate, and the owner.
Discussion prompt
Which is more likely in your domain: accidental messy input, malicious input, poisoned feedback, or prompt injection?
Hands-on lab mapping
- Lab: CourseMaterials/AI-Testing/labs/05_adversarial_attacks_art.ipynb
- Task: Run a simple adversarial robustness experiment and document attack success and mitigation options.
- Why this lab matters: it turns the module theory into visible evidence that a release approver can inspect.
Decision simulation
Robustness drops sharply under realistic noisy inputs. Decide whether to release with fallback controls or block for model improvement.
Key terms
- Evasion attack: Manipulating inputs against a deployed model.
- Data poisoning: Corrupting training data or feedback to influence future behaviour.
- Prompt injection: Text designed to override or bypass intended LLM instructions.
- Attack success rate: Proportion of attack attempts that achieve the attacker goal.
Revision prompts
- Explain the module scenario in two minutes to a product owner.
- Name three pieces of evidence you would require before release.
- Identify one automated check and one human-review check.
- Describe how this topic changes after deployment.