Model Safety Evaluation · Binder

AI Safety Evaluation Portfolio

Marc Warfield — AI Safety Practitioner

I can run a model safety evaluation end to end: threat-model a deployment, author the policy, red-team it responsibly, measure harm and over-refusal, reason about alignment and interpretability limits, manage a risk register against a recognized framework, and make a defensible go/no-go recommendation. An artifact backs every one of those verbs.

9 Artifacts
3 Layers
12 Weeks
60 Days

The Evaluation, As One Chain

One deployment, walked from worry to sign-off — the continuous story behind the binder. Each link is backed by an artifact below.

Part A — Applied Safety & Evaluations
Part B — Alignment Research Literacy
Part C — Governance, Policy & Systemic Safety