Week 11 of 12 · Part C — Governance

International Coordination & Institutes

How nations are building shared scientific consensus and public evaluation capacity for frontier models

Day 52 of 60 · ~60 minutes · Concept

Regulation needs a shared picture of the evidence

A law can only be as good as the understanding of risk underneath it. Yesterday's tiers presume someone can say what's actually dangerous. But advanced-AI risk is contested, fast-moving, and spread across labs that don't share everything. So a second layer of governance has emerged alongside the laws: international scientific coordination and national institutes whose job is to build a shared, evidence-based picture and the public capacity to test it.

The thesis

Governance isn't only rules — it's capacity. The ability to evaluate a frontier model, and a consensus on what the evidence says, are public goods that no single company can supply credibly. International reports and national institutes are how states are building that capacity outside the labs being assessed.

The International AI Safety Report

The clearest artifact of this coordination is the International AI Safety Report — a multi-country, expert-authored effort (chaired by Yoshua Bengio) to write down the international scientific consensus on the capabilities and risks of advanced AI, for policymakers. Think of it as an IPCC-style "state of the evidence" document for AI: not advocacy, but a synthesis of what the research community can and cannot currently say.

Core Theory

Why a consensus report matters

It gives regulators a neutral reference point. Instead of each government commissioning its own contested assessment, or trusting a lab's self-report, they can point to a shared scientific baseline. That makes coordinated action possible, and it makes that action harder to dismiss as any one actor's agenda.

What it deliberately is and isn't

It reports the state of evidence, including disagreement and uncertainty. It does not prescribe specific policy. Reading the executive summary teaches you to hold risk claims at the right confidence level — which capabilities are demonstrated, which are speculative, and where experts genuinely disagree.

National safety / security institutes

Alongside the report, countries have stood up dedicated institutes to do the hands-on evaluation work. The UK AI Security Institute (formerly the AI Safety Institute) is the leading example: a government body that runs frontier-model evaluations and publishes its methods. Other governments, including the United States, Japan, and Singapore, have set up counterpart bodies, and these institutes increasingly coordinate by sharing evaluation techniques and, in some cases, testing the same models.

This is Part A, scaled to the state

The evaluation craft you built in Weeks 3–5 — red-teaming, safe-refusal scorecards, robustness testing — is exactly what these institutes do, but with government standing and access. When a national institute publishes an evaluation method, it's the same discipline you've been practicing, now operating as public infrastructure.

That connection matters for incentives: when an independent, government-grade evaluator can test your model, "trust us, it's safe" stops being sufficient. Coordination changes what a lab can get away with, because the assessment no longer comes only from inside.

Your work today

Read the Consensus, Browse the Institute

~60 minutes

Read the executive summary of the International AI Safety Report. Note three claims and, for each, how confident the report is.
Browse the UK AI Security Institute research pages. Pick one published evaluation or method and connect it to a technique you practiced in Part A.
Write a few sentences: how does the existence of an independent, government-grade evaluator change a lab's incentives around safety claims?
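
If you prefer to keep your notes for the first task in a structured form, here is one minimal sketch in Python. The names `EvidenceStatus`, `ClaimNote`, and `summarize` are hypothetical, and the three status labels simply mirror the distinction this lesson draws (demonstrated, speculative, genuinely contested); they are not the report's own vocabulary or confidence scale. The entries are placeholders to replace with what you actually read.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class EvidenceStatus(Enum):
    """Assumed labels, taken from this lesson's wording, not from the report."""
    DEMONSTRATED = "demonstrated"   # shown in practice or in published evaluations
    SPECULATIVE = "speculative"     # plausible, but not yet demonstrated
    CONTESTED = "contested"         # experts genuinely disagree


@dataclass(frozen=True)
class ClaimNote:
    claim: str                # the claim, in your own words
    status: EvidenceStatus    # how confident the report appears to be
    where: str                # where in the executive summary you found it


def summarize(notes):
    """Count notes per status, keeping statuses that have zero notes."""
    counts = Counter(note.status for note in notes)
    return {status.value: counts.get(status, 0) for status in EvidenceStatus}


# Placeholder entries: replace each with a claim from your own reading.
notes = [
    ClaimNote("<claim 1>", EvidenceStatus.DEMONSTRATED, "<section>"),
    ClaimNote("<claim 2>", EvidenceStatus.SPECULATIVE, "<section>"),
    ClaimNote("<claim 3>", EvidenceStatus.CONTESTED, "<section>"),
]

for note in notes:
    print(f"[{note.status.value}] {note.claim} ({note.where})")

print(summarize(notes))
```

The point of the `status` field is to force a decision for every claim: if you cannot say which label applies, that is a sign to reread how the report hedges it.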

The expert move

A novice treats "AI risk" as a single settled claim. An expert knows the real artifact is a consensus document that tracks confidence and disagreement, and that the credible evaluations increasingly come from independent national institutes, not just the labs. The altitude jump is seeing governance as the build-out of public capacity — shared evidence and shared evaluators — not merely the writing of rules.

Say this in an interview: "I track the International AI Safety Report as the closest thing we have to a scientific consensus on advanced-AI risk, and I follow the national institutes like the UK's because they're turning evaluation into public infrastructure. Independent, government-grade evals change the incentive: a lab can no longer be the only one grading its own homework."

Today's Takeaways

Governance is capacity, not just rules — shared evidence and the ability to evaluate frontier models.
The International AI Safety Report is an IPCC-style consensus on advanced-AI risk, written for policymakers.
National institutes (e.g., the UK AI Security Institute) run government-grade evaluations and publish methods.
Independent evaluation changes lab incentives: "trust us" stops being enough.