---
name: tai-ch027-the-new-ai-quality-skillset
description: 'Apply chapter 27 of Testing AI, The New AI Quality Skillset, as a workflow for evaluating AI and non-deterministic systems. Use for test planning, eval design, quality review, release evidence, examples, or coaching related to the new AI quality skillset.'
---

# The New AI Quality Skillset

Skill name: `tai-ch027-the-new-ai-quality-skillset`

Based on **Testing AI: Engineering Confidence in AI Systems** by **Jason Arbon**.

## Purpose

The future AI builder is a rubric designer, sampling strategist, AI judge operator, risk
analyst, and statistical storyteller.

## Use This Workflow

- Identify the AI behavior or release decision being evaluated.
- Define realistic cases, slices, unacceptable outcomes, and evidence needed for confidence.
- Choose measurements that match the risk: rubric scores, samples, intervals, traces, human review, deterministic checks, or production monitors.
- Report uncertainty, severe failures, and decision impact instead of only a pass/fail result.
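
One way to report uncertainty instead of a bare pass/fail number is to attach an interval to the observed pass rate. The sketch below uses a Wilson score interval, which is a common choice for small samples; the function name `wilson_interval` and the default `z = 1.96` (roughly 95% confidence) are assumptions made for illustration, not something the chapter prescribes.

```python
import math


def wilson_interval(passes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Return an approximate Wilson score interval for a pass rate.

    Hypothetical helper: z=1.96 corresponds to roughly 95% confidence.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= passes <= total:
        raise ValueError("passes must be between 0 and total")

    p = passes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clamp to [0, 1] to guard against floating-point drift at the edges.
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    low, high = wilson_interval(passes=90, total=100)
    print(f"pass rate 0.90, interval {low:.3f} to {high:.3f}")
```

With 90 passes out of 100 sampled cases, the interval runs from roughly 0.83 to 0.94, which tells a reader far more about release risk than "90% passed" alone.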

## Key Guidance

The new AI quality skillset combines product judgment, evaluation design, sampling strategy,
AI-assisted review, risk analysis, and statistical storytelling. The work becomes more strategic
because the systems are less predictable. For example, a developer may design a rubric in the
morning, calibrate an LLM judge at noon, analyze confidence intervals in the afternoon, and
explain a release recommendation to leadership by the end of the day.
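
Calibrating an LLM judge usually means checking how often its labels agree with human labels on the same cases. The sketch below computes raw agreement corrected for chance (Cohen's kappa) in plain Python. The label values and the function name `cohens_kappa` are illustrative assumptions; the chapter does not name a specific agreement metric.

```python
from collections import Counter


def cohens_kappa(human_labels: list[str], judge_labels: list[str]) -> float:
    """Chance-corrected agreement between human and LLM-judge labels.

    Hypothetical helper: both lists must label the same cases in the same order.
    """
    if not human_labels or len(human_labels) != len(judge_labels):
        raise ValueError("label lists must be non-empty and the same length")

    n = len(human_labels)
    observed = sum(h == j for h, j in zip(human_labels, judge_labels)) / n

    # Expected agreement if both raters labeled independently at their own base rates.
    human_counts = Counter(human_labels)
    judge_counts = Counter(judge_labels)
    expected = sum(human_counts[label] * judge_counts[label] for label in human_counts) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    human = ["pass", "pass", "fail", "fail"]
    judge = ["pass", "pass", "fail", "pass"]
    print(cohens_kappa(human, judge))
```

A low kappa on a human-labeled calibration set suggests the judge prompt or rubric needs revision before its scores are used as release evidence.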

## Apply The Approach

Create representative cases, score them with explicit criteria, review severe failures separately, report uncertainty, and connect the evidence to a concrete decision.
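
The approach above can be sketched as a small summary step that keeps severe failures separate from the aggregate score and ties the result to a decision. Everything named here is hypothetical: the `CaseResult` fields, the 1 to 5 rubric scale, the `pass_score` and `min_pass_rate` thresholds, and the "ship" / "hold" / "investigate" labels are placeholders to adapt to your own rubric and release policy.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    case_id: str
    slice_name: str
    rubric_score: int  # assumed 1-5 rubric scale
    severe: bool       # True if the output hit an unacceptable outcome


def summarize(results: list[CaseResult], pass_score: int = 3, min_pass_rate: float = 0.9) -> dict:
    """Summarize scored cases and map the evidence to a release decision."""
    if not results:
        raise ValueError("results must not be empty")

    pass_rate = sum(r.rubric_score >= pass_score for r in results) / len(results)

    # Severe failures are listed on their own so a high average cannot hide them.
    severe_ids = [r.case_id for r in results if r.severe]

    # Per-slice pass rates expose weak segments that the overall rate smooths over.
    counts: dict[str, tuple[int, int]] = {}
    for r in results:
        hits, total = counts.get(r.slice_name, (0, 0))
        counts[r.slice_name] = (hits + (r.rubric_score >= pass_score), total + 1)
    slice_rates = {name: hits / total for name, (hits, total) in counts.items()}

    if severe_ids:
        decision = "hold"
    elif pass_rate >= min_pass_rate:
        decision = "ship"
    else:
        decision = "investigate"

    return {
        "pass_rate": pass_rate,
        "slice_rates": slice_rates,
        "severe_ids": severe_ids,
        "decision": decision,
    }
```

In this sketch a single severe failure forces "hold" regardless of the pass rate, which is one possible policy for reviewing severe failures separately rather than averaging them away.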

## Expert Notes

At expert level, the strongest AI builders become evaluation architects. They design systems
that continuously measure quality, generate useful failure evidence, improve test assets from
production learning, and make uncertainty understandable to non-statisticians.
