Week 1 of 12 · Part A — Applied Safety

Synthesis + the Reflection Ritual

Locking in Week 1 — and the habit you'll carry for the next 55 days

Day 5 of 60 · ~50 minutes · Review

What you now hold

One week in, you have the field's scaffolding: safety is engineering, capability isn't safety, risks come in three types (misuse / accident / systemic), every objective gets gamed, and the way to make priorities explicit is a ranked threat model. That's the lens for everything that follows.
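If it helps to see "ranked threat model" as something concrete, here is a minimal sketch. The three risk types come from this week; everything else is an assumption made for illustration: the `Threat` fields, the 1 to 5 scales, the example threats and their scores, and the likelihood times severity rule, which is borrowed from ordinary risk matrices rather than taken from this track.

```python
from dataclasses import dataclass

# The three risk types named in Week 1.
RISK_TYPES = {"misuse", "accident", "systemic"}


@dataclass(frozen=True)
class Threat:
    name: str
    risk_type: str
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (expected)
    severity: int    # hypothetical scale: 1 (minor) to 5 (severe)

    def __post_init__(self):
        if self.risk_type not in RISK_TYPES:
            raise ValueError(f"unknown risk type: {self.risk_type}")

    @property
    def score(self) -> int:
        # Assumed ranking rule: likelihood times severity.
        return self.likelihood * self.severity


def rank(threats):
    """Return threats ordered from highest to lowest score."""
    return sorted(threats, key=lambda t: t.score, reverse=True)


# Illustrative entries only; the numbers are made up.
threats = [
    Threat("reward hacking during fine-tuning", "accident", 3, 3),
    Threat("prompt injection leaks user data", "misuse", 4, 4),
    Threat("over-reliance on automated review", "systemic", 2, 5),
]

for t in rank(threats):
    print(f"{t.score:>2}  {t.risk_type:<9} {t.name}")
```

The point of the sketch is only that a ranking forces you to write your priorities down where someone else can disagree with them.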

The through-line of Week 1

Safety is the disciplined practice of finding, ranking, and reducing a system's failures — across robustness, monitoring, alignment, and systemic context. You don't worry about AI; you threat-model it.

The Reflection Ritual

From here on, most days end with a real judgment call — is this output a violation? is this eval honest? is this risk acceptable? You'll be tempted to guess and move on. Don't. Adopt this ritual now and run it every time a call is genuinely ambiguous:

The Ritual

1 · Check the written standard

Is there a policy, taxonomy, or spec that already decides this? If so, the call isn't yours to improvise — apply it.

2 · Check precedent

How were similar cases ruled before? Consistency across cases is itself a safety property; it's what makes a system's behavior predictable.

3 · If still unclear — don't guess silently

Document the ambiguity precisely, make a provisional call, and escalate it for adjudication. Edge cases are where safety policy is actually written. A flagged ambiguity is a gift to the standard; a silent guess is a hole nobody can find later.
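The three steps above can be sketched as a small decision function. This is an illustration under stated assumptions, not a prescribed tool: `Ruling`, `run_ritual`, the lookup callables, and the example cases are all hypothetical names invented here, and real standards and precedents are rarely simple dictionary lookups.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A lookup returns a decision if it covers the case, otherwise None.
Lookup = Callable[[str], Optional[str]]


@dataclass
class Ruling:
    decision: str
    basis: str              # "standard", "precedent", or "provisional"
    escalate: bool = False  # True when the call needs adjudication
    note: str = ""          # the documented ambiguity, if any


def run_ritual(case: str, check_standard: Lookup, check_precedent: Lookup,
               provisional_call: str, ambiguity_note: str) -> Ruling:
    # Step 1: if a written standard decides the case, apply it.
    decision = check_standard(case)
    if decision is not None:
        return Ruling(decision, basis="standard")

    # Step 2: otherwise follow how similar cases were ruled before.
    decision = check_precedent(case)
    if decision is not None:
        return Ruling(decision, basis="precedent")

    # Step 3: still unclear. Refuse to guess silently: require a written
    # note, make a provisional call, and flag it for adjudication.
    if not ambiguity_note.strip():
        raise ValueError("document the ambiguity before making a provisional call")
    return Ruling(provisional_call, basis="provisional",
                  escalate=True, note=ambiguity_note)


# Hypothetical standard and precedent tables for illustration.
standard = {"shares a home address": "violation"}
precedents = {"shares a workplace name": "allowed"}

ruling = run_ritual(
    "shares a neighborhood name",
    standard.get,
    precedents.get,
    provisional_call="allowed",
    ambiguity_note="Policy covers addresses but is silent on neighborhoods.",
)
print(ruling)
```

Note the design choice in step 3: the function raises an error when the note is empty, so a silent guess is impossible by construction.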

Why this is the whole job in miniature

Safety at scale is mostly the management of ambiguous calls, plus the discipline of turning each one into an improvement to the written standard. Run this ritual and your disagreements become a roadmap instead of noise. Most days after Week 1 will give you a reason to use it.

Self-quiz — can you do these without notes?

Prove the Week

~50 minutes

  1. Define misuse, accident, and systemic risk — each in one sentence with an example.
  2. Explain specification gaming and why it worsens with capability.
  3. For each of the four pillars (robustness, monitoring, alignment, systemic), say which weeks of this track cover it.
  4. From memory, list the four parts of a threat model.
  5. Write your Week 1 summary in your own words (a paragraph), and the one question you most want answered by Week 12.

The expert move

A practitioner treats ambiguity as a threat to be quietly resolved. An expert treats it as signal: every hard call is a precise pointer to a gap in the written standard, and closing those gaps is the highest-leverage work there is. Owning the loop that turns ambiguity into better policy is what scales one person's judgment into a whole system's reliability.

Say this in an interview: "I don't quietly guess on edge cases — I document the ambiguity, make a provisional call, and feed it back into the policy. Disagreement is data, and the hardest cases are exactly where the real standard gets written."

Week 1 Takeaways