---
name: tai-ch135-training-data-poisoning-and-backdoors
description: 'Apply chapter 135 of Testing AI, Training Data Poisoning and Backdoors, as a workflow for evaluating AI and non-deterministic systems. Use for test planning, eval design, quality review, release evidence, examples, or coaching related to training data poisoning and backdoors.'
---

# Training Data Poisoning and Backdoors

Skill name: `tai-ch135-training-data-poisoning-and-backdoors`

Based on **Testing AI: Engineering Confidence in AI Systems** by **Jason Arbon**.

## Purpose

Bad data can teach a model behavior that only appears when the trigger is right.

## Use This Workflow

- Identify the AI behavior or release decision being evaluated.
- Define realistic cases, slices, unacceptable outcomes, and evidence needed for confidence.
- Choose measurements that match the risk: rubric scores, samples, intervals, traces, human review, deterministic checks, or production monitors.
- Report uncertainty, severe failures, and decision impact instead of only a pass/fail result.

## Key Guidance

Training data poisoning happens when bad examples enter the data pipeline and influence model
behavior. Backdoors are hidden behaviors that activate under specific conditions. In AI systems,
triggers may be words, phrases, file names, domains, images, code patterns, or user identities.

## Apply The Approach

Create representative cases, score them with explicit criteria, review severe failures separately, report uncertainty, and connect the evidence to a concrete decision.

## Expert Notes

At expert level, use data provenance, anomaly detection, trigger sweeps, canary tokens, source
reputation, fine-tune review, RAG document quarantine, and adversarial evals. Also test deletion
and recovery: can the poisoned source be found, removed, and proven inactive?
