AI Model Safety Rules: The 2026 Framework Guide

A practical guide to the AI model safety rules that matter in 2026 — what each framework requires, how they compare, and what to ask any AI vendor.

AI model safety rules used to be the domain of research labs. In 2026, they are core to enterprise procurement. Three frameworks now drive most programs: NIST AI RMF, ISO/IEC 42001, and the EU AI Act.

Frontier AI labs also publish their own safety policies. Anthropic's Responsible Scaling Policy, OpenAI's Preparedness Framework, and Google DeepMind's Frontier Safety Framework all define what tests a model must pass before release.

This guide compares each framework, lists the controls they share, and gives you a vendor due-diligence checklist you can use today.


What are AI model safety rules?

AI model safety rules are the standards, laws, and voluntary commitments that govern how AI models are tested before release. They cover risks like cyberattack capability, biological and chemical risk, autonomy, and large-scale misuse.

Some rules are binding. The EU AI Act is law in the EU. California SB 53 is law in California. Most others are voluntary — but enterprise buyers increasingly require them.

  • NIST AI RMF — Voluntary US baseline. Risk-based, four core functions: Govern, Map, Measure, Manage.
  • ISO/IEC 42001 — Voluntary international standard. Certifiable AI management system, like ISO 27001.
  • EU AI Act — Binding in the EU. GPAI rules, high-risk system rules, transparency rules.
  • California SB 53 — Binding in California. Frontier-model safety disclosures above 10^26 FLOPs.
  • Federal AI executive order (June 2, 2026) — Voluntary US frontier-model pre-release review.
  • Frontier-lab policies — Voluntary self-commitments by Anthropic, OpenAI, and Google DeepMind.

NIST AI Risk Management Framework (AI RMF 1.0)

The NIST AI Risk Management Framework is the US federal baseline for AI safety. Version 1.0 was published in January 2023. NIST has since added a Generative AI Profile (NIST AI 600-1) with 12 risk categories specific to generative AI.

NIST AI RMF is voluntary. But Colorado SB 24-205 gives it explicit safe-harbor credit, and most enterprise buyers ask for it by name.

  • Four core functions — Govern, Map, Measure, Manage.
  • Generative AI Profile (NIST AI 600-1) — 12 risk categories including CBRN, confabulation, dangerous content, IP, and information security.
  • Sector-specific NIST profiles for healthcare, finance, and government use cases.
  • Maps cleanly to ISO 42001 and the EU AI Act risk-based approach.
  • Safe-harbor credit under Colorado SB 24-205, as noted above. Check each other state's AI law for whether it offers similar credit.
  • Free to adopt — the framework is published openly by NIST.

ISO/IEC 42001 — AI management systems

ISO/IEC 42001 is the international standard for AI management systems. It was published in December 2023 and is the AI equivalent of ISO 27001 (information security).

ISO 42001 is certifiable. That means an accredited auditor can review your program and issue a certificate. Enterprise buyers in the EU and Asia increasingly ask for it.

  • Certifiable management system — third-party audit and certification available.
  • Risk and impact assessment built into the standard.
  • Annex A control set, similar in structure to ISO 27001.
  • Designed to coexist with ISO 27001, ISO 9001, and GDPR programs.
  • Most useful for organizations that already run ISO 27001 — overlapping audit work.
  • Maps to NIST AI RMF and the EU AI Act risk-based requirements.

The EU AI Act and GPAI obligations

The EU AI Act is the world's first broad, binding AI law. It applies to anyone placing an AI system on the EU market — including US companies that sell into Europe.

General-purpose AI (GPAI) model obligations took effect August 2, 2025. The European Commission begins enforcement on August 2, 2026.

  • GPAI obligations apply to providers of large general-purpose AI models.
  • Systemic-risk tier — Models trained above 10^25 FLOPs face additional safety, eval, and cyber obligations.
  • Transparency — Public documentation summary and training-data summary required for GPAI providers.
  • High-risk system rules — Apply to AI in employment, education, lending, law enforcement, and critical infrastructure.
  • Enforcement — Up to 7% of global annual turnover for the most serious violations.
  • Codes of Practice — Industry-co-developed guidance, signed by major AI labs in 2025.

The EU AI Act systemic-risk threshold is 10^25 FLOPs; California SB 53's frontier threshold is 10^26 FLOPs. Frontier labs are designing for both at once.
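
The two compute thresholds can be compared in a few lines of Python. This is a minimal sketch under stated assumptions, not a legal test: the function names are hypothetical, and the "6 x parameters x tokens" estimate of training compute is a common rule of thumb for dense transformer training that neither law prescribes.

```python
# Thresholds as stated in this guide (total training compute).
EU_SYSTEMIC_RISK_FLOPS = 1e25   # EU AI Act GPAI systemic-risk tier
CA_SB53_FRONTIER_FLOPS = 1e26   # California SB 53 frontier threshold


def estimate_training_flops(parameters: float, tokens: float) -> float:
    """Rough estimate: about 6 FLOPs per parameter per training token.

    Assumption for illustration only; real compute accounting is more involved.
    """
    return 6.0 * parameters * tokens


def compute_tiers(training_flops: float) -> list[str]:
    """Return the compute thresholds that a training run exceeds."""
    tiers = []
    if training_flops > EU_SYSTEMIC_RISK_FLOPS:
        tiers.append("EU AI Act systemic-risk GPAI")
    if training_flops > CA_SB53_FRONTIER_FLOPS:
        tiers.append("California SB 53 frontier model")
    return tiers


if __name__ == "__main__":
    # Hypothetical model: 400 billion parameters, 15 trillion training tokens.
    flops = estimate_training_flops(400e9, 15e12)
    print(f"{flops:.1e} FLOPs:", compute_tiers(flops) or "below both thresholds")
```

The hypothetical run comes to roughly 3.6 x 10^25 FLOPs, which is above the EU threshold but below the SB 53 threshold. That gap is why a model can fall under one regime and not the other.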

Frontier-lab safety policies (RSP, Preparedness, FSF)

The largest AI labs publish their own safety policies. These are not law. But they set the de facto bar for what a "responsible" frontier model release looks like.

Most enterprise buyers now ask for the vendor's safety policy — alongside SOC 2 and ISO 27001 — as part of procurement.

  • Anthropic Responsible Scaling Policy (RSP) — AI Safety Levels (ASL) with pre-deployment evals and security controls.
  • OpenAI Preparedness Framework — Tracked capability categories (biological and chemical, cybersecurity, and AI self-improvement as of the April 2025 update) with red lines.
  • Google DeepMind Frontier Safety Framework — Critical Capability Levels (CCLs) with deployment thresholds.
  • Common ground — Pre-deployment evals, internal red-teaming, deployment thresholds, and incident response.
  • Voluntary White House commitments (2023) and Seoul AI Summit commitments (2024) underpin most of these policies.
  • These policies are starting to be reflected in California SB 53 and the new federal EO.

The 10 controls that show up in every AI safety framework

Across NIST AI RMF, ISO 42001, the EU AI Act, and frontier-lab policies, the same controls keep recurring, even though no single framework requires all ten. Treat the list as the minimum bar to expect from any AI vendor.

  • Model card or system documentation describing intended use, limits, and training data.
  • Pre-deployment evaluations — including capability evals and safety evals.
  • Independent red-teaming for high-impact models.
  • Documented deployment thresholds — what test results trigger a delay or scope change.
  • Algorithmic-discrimination testing for AI used in consequential decisions.
  • Incident-response plan for safety-relevant issues post-release.
  • Vendor SOC 2 Type II for security baseline.
  • Whistleblower protections for safety-relevant disclosures.
  • Annual impact assessment for high-risk uses.
  • Updates on substantial modification — not just at first release.

AI vendor due-diligence checklist

Below is the checklist Layer3 uses with clients before adopting any AI vendor. It maps directly to the controls above.

  • Do you publish a model card or system documentation?
  • Which AI safety framework do you follow — NIST AI RMF, ISO 42001, your own RSP/Preparedness/FSF?
  • Have you done independent red-teaming? Can you share a summary?
  • How do you handle pre-deployment safety evals — what are your thresholds?
  • Are you covered by California SB 53 or the EU AI Act GPAI rules? If so, share your disclosures.
  • Do you participate in the federal voluntary pre-release review under the June 2, 2026 EO?
  • What is your incident-response plan for safety-relevant issues?
  • Do you have SOC 2 Type II? ISO 27001? ISO 42001 in progress?
  • What is your training-data documentation under California AB 2013?
  • Will you sign a DPA, model risk addendum, and AI-specific contract language?
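
One way to make the checklist repeatable is to record each vendor's answers as data and flag the gaps. The sketch below is illustrative only: the item keys, the abbreviated wording, and the split between required and optional items are hypothetical choices, not part of any framework, and should be adjusted to your own risk appetite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistItem:
    key: str        # hypothetical short identifier
    question: str   # abbreviated from the checklist above
    required: bool  # hypothetical weighting, not taken from any framework


CHECKLIST = [
    ChecklistItem("model_card", "Publishes a model card or system documentation", True),
    ChecklistItem("framework", "Names the AI safety framework it follows", True),
    ChecklistItem("red_team", "Has done independent red-teaming and shares a summary", True),
    ChecklistItem("eval_thresholds", "Documents pre-deployment eval thresholds", True),
    ChecklistItem("regulatory_disclosures", "Shares SB 53 or EU AI Act GPAI disclosures", False),
    ChecklistItem("federal_review", "Participates in voluntary federal pre-release review", False),
    ChecklistItem("incident_response", "Has a safety incident-response plan", True),
    ChecklistItem("security_certs", "Holds SOC 2 Type II or ISO 27001", True),
    ChecklistItem("training_data_docs", "Provides training-data documentation (AB 2013)", False),
    ChecklistItem("contract_terms", "Will sign a DPA and AI-specific contract language", True),
]


def review_vendor(answers: dict[str, bool]) -> dict:
    """Summarize a vendor's answers; unanswered items count as gaps."""
    gaps = [item for item in CHECKLIST if not answers.get(item.key, False)]
    blocking = [item.key for item in gaps if item.required]
    return {
        "score": f"{len(CHECKLIST) - len(gaps)}/{len(CHECKLIST)}",
        "blocking_gaps": blocking,
        "other_gaps": [item.key for item in gaps if not item.required],
        "proceed": not blocking,
    }


if __name__ == "__main__":
    # Hypothetical vendor that has everything except independent red-teaming
    # and the three optional disclosures.
    answers = {
        "model_card": True, "framework": True, "red_team": False,
        "eval_thresholds": True, "incident_response": True,
        "security_certs": True, "contract_terms": True,
    }
    print(review_vendor(answers))
```

Counting unanswered questions as gaps is a deliberate choice here: a vendor that declines to answer is treated the same as one that answers "no".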

Frequently Asked Questions

  • Which rules apply depends on where you operate and what you build. Most US companies should follow NIST AI RMF as a baseline. Companies serving the EU must comply with the EU AI Act. Companies in California must follow SB 53 (frontier developers) and AB 2013 (any generative AI provider). Companies in Colorado must follow SB 24-205.
  • NIST AI RMF is a voluntary US risk-based framework — free to adopt, no certification. ISO/IEC 42001 is an international AI management system standard that is certifiable by an accredited auditor. Most US companies start with NIST AI RMF; companies serving EU or Asian buyers add ISO 42001 over time.
  • A frontier AI model is one of the most capable AI models, usually trained with very large compute budgets. California SB 53 defines its threshold as 10^26 FLOPs of training compute. The EU AI Act systemic-risk threshold for GPAI is 10^25 FLOPs. The June 2, 2026 federal EO asks federal agencies to define "covered frontier models" based on cyber capability benchmarks.
  • Frontier-lab safety policies do not bind buyers directly. They are voluntary self-commitments by labs like Anthropic, OpenAI, and Google DeepMind. But if you buy a frontier model, you should ask your vendor which policy they follow and how it affected the release of the model you are using. It is now a standard procurement question.
  • The June 2, 2026 AI executive order asks frontier AI developers to share their models with the government 30 days before public release, on a voluntary basis. It also tells federal agencies to build new AI cyber-capability benchmarks. It does not create a mandatory licensing or permitting requirement.
  • When vetting an AI vendor, ask for its model card, the AI safety framework it follows (NIST AI RMF, ISO 42001, RSP/Preparedness/FSF), whether it has done independent red-teaming, its pre-deployment evaluation thresholds, its SOC 2 status, and whether it is subject to California SB 53 or the EU AI Act. Layer3's vendor due-diligence checklist covers all of these in one document.
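
The jurisdiction rules of thumb in this FAQ can be written down as a first-pass triage. This is a simplified sketch that encodes only the statements made in this guide; the `CompanyProfile` fields and the `applicable_rules` function are hypothetical names, and real applicability turns on each law's own definitions and thresholds.

```python
from dataclasses import dataclass


@dataclass
class CompanyProfile:
    # All fields are hypothetical simplifications of a company's situation.
    operates_in_us: bool = False
    serves_eu: bool = False
    in_california: bool = False
    in_colorado: bool = False
    frontier_developer: bool = False
    generative_ai_provider: bool = False


def applicable_rules(profile: CompanyProfile) -> list[str]:
    """Return the rules this guide says to look at for a given profile."""
    rules = []
    if profile.operates_in_us:
        rules.append("NIST AI RMF (voluntary baseline)")
    if profile.serves_eu:
        rules.append("EU AI Act")
    if profile.in_california and profile.frontier_developer:
        rules.append("California SB 53")
    if profile.in_california and profile.generative_ai_provider:
        rules.append("California AB 2013")
    if profile.in_colorado:
        rules.append("Colorado SB 24-205")
    return rules


if __name__ == "__main__":
    # Hypothetical California generative AI provider that also sells into the EU.
    profile = CompanyProfile(
        operates_in_us=True,
        serves_eu=True,
        in_california=True,
        generative_ai_provider=True,
    )
    for rule in applicable_rules(profile):
        print(rule)
```

The output is a starting list for counsel to review, not a compliance determination.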

Get a vendor due-diligence pack for any AI tool

Layer3 Labs builds vendor due-diligence packs that cover NIST AI RMF, ISO 42001, the EU AI Act, and state AI laws — so you can buy AI tools faster without taking on hidden risk.
