Preparing for future AI risks in biology
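OpenAI News
Advanced AI models can rapidly accelerate scientific discovery, one of the many ways frontier AI models benefit humanity. In biology, these models are already helping scientists identify which new drugs are most likely to succeed in human clinical trials¹. Soon they may also speed up drug development, design better vaccines, create enzymes for sustainable fuels, and uncover new treatments for rare diseases, opening up more possibilities across medicine, public health, and environmental science.
At the same time, these models raise important "dual-use" considerations: how to maintain a sufficient barrier to harmful information while advancing science. The underlying capabilities that drive progress (such as reasoning over biological data, predicting chemical reactions, and guiding laboratory work) could also be misused to help inexperienced people recreate biological threats, or to assist highly skilled actors in creating bioweapons. Actually building a bioweapon still requires a physical laboratory and sensitive materials, but these barriers are not insurmountable.
We expect future AI models to reach the "High" capability level in biology as defined in our Preparedness Framework², so we are taking a multi-pronged approach to putting mitigations in place. This post covers: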
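- Building a responsible strategy for advancing biological capabilities
- Collaborating with external domain experts, including government agencies and national laboratories
- Training models to handle dual-use biology requests safely
- Building detection, monitoring, and enforcement systems
- Adversarial red-teaming with experts
- Deploying security controls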
——
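1. Our approach
Acting responsibly under uncertainty is essential. We are therefore advancing positive applications of AI, such as biomedical research and biodefense, while limiting access to harmful capabilities. We focus on prevention and firmly believe that sufficient safeguards must be in place before a real biological threat event occurs.
The future will require deeper collaboration with experts and governments to strengthen the whole ecosystem and surface problems that a single organization would struggle to catch. We sought input from external experts at every stage. Early on, we worked with leading scholars and researchers in biosecurity, bioweapons, and bioterrorism to shape our biological threat model, capability assessments, and model usage policies. When designing mitigations, we brought in biologists with master's degrees and PhDs to help create and validate evaluation data. We are now working with domain-expert red teamers to test how well our safeguards hold up in high-fidelity scenarios.
Even as we continue to invest in further research (such as "wet lab" tests of novices' success rates on harmless tasks), we have already begun preparing and deploying a number of mitigations. We are also working closely with government bodies such as the US CAISI³ and UK AISI⁴, and with Los Alamos National Laboratory⁵, to study the use of AI in laboratory settings and to support external teams advancing biosecurity tools and evaluations.
Our capability assessments (detailed in our system cards) are informed by expert input and are designed to estimate when a model crosses the "High" capability threshold. We acknowledge that these assessments rest on many hard-to-test assumptions about bioweaponization pathways and cannot precisely predict real-world misuse risk. Given the stakes, however, we intend to take appropriate preparedness measures proactively.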
——
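2. Strengthening biological defenses
Over the past two years, we have continuously tracked how our models' capabilities evolve, reduced risks before launch in line with the Preparedness Framework, and published our progress through system cards. As part of this, we have integrated Preparedness evaluations into frontier model training to take regular snapshots of model capabilities.
The following are the main defenses we have deployed or are deploying (sensitive details omitted):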
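- Training models to refuse or safely respond to harmful requests
  - For requests with clearly dangerous or weaponization-related purposes, models are trained to refuse outright.
  - For dual-use requests (such as virology experiments, immunology, and genetic engineering), models follow the principles of the Model Spec⁶: they provide only high-level insights and do not give directly actionable steps or laboratory troubleshooting guidance.
- Always-on detection systems
  - Risk detectors are deployed across all product surfaces. When suspicious biology-related activity is detected, the response is blocked immediately and automated and human review is initiated.
- Monitoring and enforcement
  - Using our products to cause harm is explicitly prohibited. Combining automated AI detection with human review, we respond to violations with measures such as banning or suspending accounts; in cases of serious misuse, we may notify law enforcement.
- End-to-end red teaming
  - We engage multiple expert red teams to simulate, from start to finish, a well-resourced actor's attempt to bypass our safeguards, pairing biorisk domain experts with AI safety experts to test the system's coverage and robustness.
- Security controls
  - We apply multiple layers of protection at the model-weight level (access control, infrastructure hardening, egress traffic inspection, internal monitoring, threat intelligence, and an insider-risk program) to prevent exfiltration of high-risk weights.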
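Our Safety and Security Committee⁷ has reviewed this approach, and initial versions of these measures are already in place in current models (such as o3), which remain below the "High" threshold in the Preparedness Framework. We will keep improving our technical and human processes through ongoing deployment and real-world testing.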
——
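3. What's ahead
Although we are committed to the safety of our own models, other organizations may not adopt equivalent safeguards, and society as a whole will soon face the systemic challenge posed by widely available AI biological capabilities combined with increasingly accessible life-sciences synthesis tools.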
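- We will host a biodefense summit this July, inviting government research institutions and NGOs to examine dual-use risks, share progress, and explore how frontier models can accelerate the development of countermeasures and new therapies, and to deepen cooperation with the US, the UK, and like-minded governments.
- We are also developing policy and content-level protocols to open our most useful models to vetted institutions, supporting the development of diagnostics, countermeasures, and new testing methods.
- We believe the public and private sectors should work together to strengthen society's overall biological defenses, including stronger nucleic acid synthesis screening⁸, better early-detection systems for pathogens, protection of critical infrastructure, and investment in biosecurity innovation.
- At the same time, complementary advances in AI and biosecurity research will give rise to more startups that bring entrepreneurial energy to these challenges. Safety and security are not only properties of AI models or products; they will also become indispensable services and industries in the market. We will take an active part in advancing this field.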
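We look forward to working closely with governments, researchers, and entrepreneurs around the world, both to prepare the biosecurity ecosystem and to create the conditions for the astonishing scientific breakthroughs to come.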
Advanced AI models have the power to rapidly accelerate scientific discovery, one of the many ways frontier AI models will benefit humanity. In biology, these models are already helping scientists identify which new drugs are most likely to succeed in human trials. Soon, they could also accelerate drug discovery, design better vaccines, create enzymes for sustainable fuels, and uncover new treatments for rare diseases to open up new possibilities across medicine, public health, and environmental science.
At the same time, these models raise important dual-use considerations: enabling scientific advancement while maintaining the barrier to harmful information. The same underlying capabilities driving progress, such as reasoning over biological data, predicting chemical reactions, or guiding lab experiments, could also be misused to help people with minimal expertise recreate biological threats or to assist highly skilled actors in creating bioweapons. Physical access to labs and sensitive materials remains a barrier; however, those barriers are not absolute.
We expect that upcoming AI models will reach ‘High’ levels of capability in biology, as measured by our Preparedness Framework*, and we’re taking a multi-pronged approach to put mitigations in place. In this post, we cover:
- Developing a responsible approach to advancing biological capabilities
- Collaborating with external domain experts including government entities and national labs
- Training models to safely handle dual-use biological requests
- Building detection, monitoring, and enforcement systems
- Adversarial red-teaming our mitigations with experts
- Deploying security controls
- What’s ahead
Our approach
We need to act responsibly amid this uncertainty. That’s why we’re leaning in on advancing AI integration for positive use cases like biomedical research and biodefense, while at the same time focusing on limiting access to harmful capabilities. Our approach is focused on prevention—we don’t think it’s acceptable to wait and see whether a bio threat event occurs before deciding on a sufficient level of safeguards.
The future will require deeper expert and government collaboration to strengthen the broader ecosystem and help surface issues that no single organization could catch alone. We’ve consulted with external experts at every stage of this work. Early on, we worked with leading experts on biosecurity, bioweapons, and bioterrorism, as well as academic researchers, to shape our biosecurity threat model, capability assessments, and model and usage policies. As we designed mitigations, human trainers with master’s degrees and PhDs in biology helped create and validate our evaluation data. And now, we’re actively engaging with domain-expert red teamers to test how well our safeguards hold up in practice under high-fidelity scenarios.
Even as we invest in further research, such as wet lab uplift studies to assess novices’ success on harmless proxy tasks, we are preparing and implementing mitigations now. We’re also continuing to partner closely with government entities, including the US CAISI and UK AISI. We’ve worked with Los Alamos National Lab to study AI’s role in wet lab settings and support external researchers advancing biosecurity tools and evaluations.
Our capability assessments, including those detailed in our system cards, are informed by expert input and designed to estimate when a model crosses into High thresholds. We recognize these assessments are based on hard-to-test assumptions about the bioweaponization pathways and can’t definitively predict real-world misuse. But given the stakes, we want to be proactive in taking relevant readiness measures.
Strengthening defenses in biology
Over the past two years, we’ve tracked what our models can do as they develop, worked to reduce risks before launch per the Preparedness Framework, and shared our findings openly through system cards so others can follow our progress. As part of this, we’ve built Preparedness evaluations that run during frontier model training to give early and regular snapshots of a model’s capabilities.
We’re sharing how we’re preparing, both what’s already in place and what’s ahead, while holding back sensitive details that could help bad actors get around our safeguards.
- Training the model to refuse or safely respond to harmful requests: Historically, we’ve trained models to refuse dangerous requests. We will continue to do this for requests that are explicitly harmful or enable bioweaponization. For dual-use requests (such as those involving virology experiments, immunology, or genetic engineering), we follow the principles outlined in our Model Spec, including avoiding responses that provide actionable steps. We believe that detailed step-by-step instructions and wet lab troubleshooting guidance can be risky in the wrong hands. Our default behavior for the general public will intentionally err on the side of caution: we train models to provide high-level insights that support expert understanding while withholding enough detail to prevent novice misuse.
- Always-on detection systems: We’ve deployed robust system-wide monitors across all product surfaces with frontier models to detect risky or suspicious bio-related activity. If it looks unsafe based on our filters, the model response is blocked. This also triggers automated review systems, and human review is initiated when needed.
- Monitoring and enforcement checks: We prohibit use of our products to cause harm, and we enforce our policies when we see misuse. We use the same advanced AI reasoning capabilities to detect biological misuse, combining our automated systems with human reviewers to monitor and enforce our policies. Misuse can result in suspension of accounts. We take misuse related to biological risk seriously and may conduct additional investigation into the user and, in egregious cases, we may notify relevant law enforcement. You can read more about our moderation practices here.
- End-to-end red teaming: We are working with multiple teams of expert red teamers, people who try to break our safety mitigations. Their job is to try to bypass all of our defenses by working end-to-end, just like a determined and well-resourced adversary might. This helps us identify gaps early and strengthen the full system. Red teaming in the biology domain comes with its own challenges. Most expert red teamers lack biorisk domain expertise and may not be able to judge the harmfulness of model output. Most domain experts in biology are not experienced in exploiting model vulnerabilities. We are engaging with both groups to test different aspects of our system, from risk coverage to robustness, and pairing them up in teams for the most sophisticated red teaming.
- Security controls: We take a defense-in-depth approach to protecting our model weights, relying on a combination of access control, infrastructure hardening, egress controls, and monitoring. We leverage purpose-built detections and controls to mitigate the risk of exfiltration of high-risk model weights. We complement these measures with always-on Detection & Response, dedicated Threat Intelligence, and an Insider-Risk program ensuring emerging threats are identified and blocked quickly.
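The always-on detection flow in the list above (block a response that looks unsafe, trigger automated review, and start human review when needed) can be illustrated with a minimal Python sketch. This is not a description of any deployed system: the risk score, the two thresholds, and the names `Decision` and `route_response` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical thresholds, assumed purely for illustration.
BLOCK_THRESHOLD = 0.5          # at or above this score, the response is blocked
HUMAN_REVIEW_THRESHOLD = 0.9   # at or above this score, a human also reviews


@dataclass(frozen=True)
class Decision:
    blocked: bool
    automated_review: bool
    human_review: bool


def route_response(risk_score: float) -> Decision:
    """Route a model response given a risk score in [0, 1].

    The score is assumed to come from some upstream monitor; how such a
    score would be produced is outside the scope of this sketch.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be in [0, 1]")
    if risk_score < BLOCK_THRESHOLD:
        # Below the blocking threshold: deliver the response, no review.
        return Decision(blocked=False, automated_review=False, human_review=False)
    # A blocked response always triggers automated review; human review is
    # added only for the highest-risk cases.
    return Decision(
        blocked=True,
        automated_review=True,
        human_review=risk_score >= HUMAN_REVIEW_THRESHOLD,
    )
```

In this sketch, blocking and review are decided in one place so that every blocked response is guaranteed to reach automated review; a real system would likely separate these stages and use far richer signals than a single score.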
Our Board’s Safety and Security Committee has reviewed our approach, and we’ve already rolled out initial versions of this end-to-end mitigation plan in many current models, like o3, which remain below the High capability threshold in our Preparedness Framework. Through this process, we have used the learnings we gained through our deployments to significantly improve the performance of our technical systems and work out the kinks in our human review workflows. We will continue to make changes as we learn more.
What’s ahead
While we’re focused on securing our own models, we recognize that not all organizations will take the same precautions, and the world may soon face the broader challenge of widely accessible AI bio capabilities coupled with increasingly available life-sciences synthesis tools.
We’re hosting a biodefense summit this July, bringing together a select group of government researchers and NGOs to examine dual-use risks, share progress, and explore how our frontier models can accelerate research. Our goal is to deepen our partnerships with the U.S. and aligned governments, to better understand how advanced AI can support cutting-edge biodefense work, from countermeasures to novel therapies, and to strengthen collaboration across the ecosystem.
While our safety work aims to limit broad misuse, we’re also developing policy and content-level protocols to grant vetted institutions access to maximally helpful models so they can advance biological sciences. That includes partnerships to develop diagnostics, countermeasures, and novel testing methods.
Building on our safety work with governments, we believe the public and private sectors should work together to strengthen our society’s biological defenses outside of AI models. This could include strengthened nucleic acid synthesis screening (building on the recent Executive Order), more robust early detection systems for novel pathogens, hardening infrastructure against biothreats, and investing in biosecurity innovations to help ensure long-term resilience against biological threats.
We also believe that complementary advances in AI and biosecurity research will increasingly provide fertile ground for founders to build new mission-driven startups that can harness the entrepreneurial spirit to help solve these challenges. Safety and security are not just aspects of AI models and products—they are increasingly indispensable services and sectors that will pencil out for investors. We will be actively involved in accelerating this.
We look forward to more collaboration with governments, researchers, and entrepreneurs around the world—not only to ensure that the biosecurity ecosystem is prepared, but to take advantage of the astonishing breakthroughs that are still to come.