US Agency Expands Pre-Release AI Safety Testing to Include Major Tech Firms
The United States government is taking a more hands-on approach to artificial intelligence safety. The Center for AI Standards and Innovation (CAISI), an arm of the Department of Commerce, has recently inked agreements with Google DeepMind, Microsoft, and xAI. These pacts grant the agency the authority to evaluate frontier AI models from these organizations—and potentially others—before they are released to the public.
New Agreements Bring Frontier AI Under Federal Scrutiny
According to a statement from CAISI—which operates under the National Institute of Standards and Technology (NIST)—the center will conduct pre-deployment evaluations and targeted research to better assess the capabilities of advanced AI systems and to advance the state of AI security. The three new signatories join Anthropic and OpenAI, which entered similar arrangements nearly two years ago during the Biden administration, when CAISI was known as the US Artificial Intelligence Safety Institute.

Back in August 2024, press releases about those earlier agreements indicated that the institute intended to provide feedback to both companies on “potential safety improvements to their models”, working closely with its partners at the UK AI Safety Institute (AISI). The new agreements extend this framework to a broader set of industry leaders.
Pre-Deployment Evaluations: Goals and Methods
The core objective is to catch potential risks before they reach the wider public. Microsoft, in a blog post on Tuesday, described the collaboration as essential for building trust and confidence in advanced AI systems. The company noted that as AI capabilities grow, so too must the rigor of the testing and safeguards that underpin them. CAISI’s work will therefore involve:
- Pre-release testing of frontier models to identify vulnerabilities and unsafe behaviors.
- Ongoing evaluation after deployment to monitor for emerging risks.
- Cross-sector collaboration with industry, academia, and international partners.
This approach aims to create a feedback loop: evaluations inform model improvements, which in turn are tested again before wider release.
Industry Experts Weigh In on Proactive Security
Fritz Jean-Louis, principal cybersecurity advisor at Info-Tech Research Group, welcomed the move. He said the CAISI agreements signal a shift toward proactive security for agentic AI—systems that can act autonomously. By enabling government-led testing of advanced models before and after deployment, the initiative should “help strengthen visibility into autonomous behaviors while accelerating the development of standards to mitigate risks.”

Jean-Louis noted that combining early access, continuous evaluation, and cross-sector collaboration pushes the industry toward security-by-design for increasingly autonomous AI systems. However, he also pointed out potential hurdles, such as how intellectual property would be protected under this framework. Still, he called it “a positive step for the industry.”
Executive Order May Formalize AI Vetting Process
Beyond the CAISI agreements, reports emerged on Wednesday that the White House is preparing an executive order to create a formal vetting system for all new AI models. According to Bloomberg, the directive is taking shape weeks after Anthropic revealed that its breakthrough Mythos model was adept at finding network vulnerabilities and could pose a global cybersecurity risk. The order would establish a mandatory review process for new models, key among them Anthropic’s Mythos.
If enacted, this executive order would represent a significant escalation in federal oversight, moving from voluntary agreements to a regulated framework.
A Shift in Policy Direction
Carmi Levy, an independent technology analyst, tied the announcement establishing CAISI as the testing ground for frontier AI models directly to the broader policy shift, saying “it is patently obvious that this week’s announcement … is directly linked to” the need for robust, independent oversight. The combination of voluntary agreements and potential executive action suggests a two-pronged strategy: immediate cooperation with industry leaders while preparing a stronger regulatory baseline.
In summary, the US government is moving rapidly to ensure that the most advanced AI systems are thoroughly vetted before they can affect millions of users. Through agreements with major tech firms, targeted research, and possible new executive orders, the aim is to build a safety net that evolves with the technology itself.