Introducing Lockdown Mode and Elevated Risk labels in ChatGPT
As AI systems take on more complex tasks—especially those that involve the web and connected apps—the security stakes change.
One emerging risk has become especially important: prompt injection. In these attacks, a third party attempts to mislead a conversational AI system into following malicious instructions or revealing sensitive information.
Today, we’re introducing two new protections designed to help users and organizations mitigate prompt injection attacks, with clearer visibility into risk and stronger controls:
- Lockdown Mode in ChatGPT, an advanced, optional security setting for higher-risk users
- “Elevated Risk” labels for certain capabilities in ChatGPT, ChatGPT Atlas, and Codex that may introduce additional risk
These additions build on our existing protections across the model, product, and system levels. This includes sandboxing, protections against URL-based data exfiltration, monitoring and enforcement, and enterprise controls like role-based access and audit logs.
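To make one of those layers concrete: URL-based exfiltration defenses often come down to refusing to render links the system does not trust, since an attacker-crafted URL can smuggle conversation data out in its path or query string. The following is a minimal sketch of that idea in Python; the allowlist contents and the names `ALLOWED_HOSTS` and `safe_to_render` are illustrative assumptions, not OpenAI's actual implementation.

```python
from urllib.parse import urlparse

# Hypothetical trusted-host set; a real deployment would manage this centrally.
ALLOWED_HOSTS = {"openai.com", "help.openai.com"}

def safe_to_render(url: str) -> bool:
    """Return True only for http(s) links whose host is explicitly trusted.

    Untrusted links are refused because an attacker-crafted URL can carry
    conversation data out in its path or query string.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False  # reject data:, javascript:, and other schemes outright
    host = (parsed.hostname or "").lower()
    return host in ALLOWED_HOSTS or any(
        host.endswith("." + allowed) for allowed in ALLOWED_HOSTS
    )

# A link that tries to carry chat data to an attacker-controlled host is refused:
assert not safe_to_render("https://evil.example/log?data=SECRET_FROM_CHAT")
assert safe_to_render("https://help.openai.com/en/")
```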
Helping organizations protect employees most at risk of cyberattacks
Lockdown Mode is an optional, advanced security setting designed for a small set of highly security-conscious users—such as executives or security teams at prominent organizations—who require increased protection against advanced threats. It is not necessary for most users. Lockdown Mode tightly constrains how ChatGPT can interact with external systems to reduce the risk of prompt injection–based data exfiltration.
Lockdown Mode deterministically disables certain tools and capabilities in ChatGPT that an adversary could attempt to exploit to exfiltrate sensitive data from users’ conversations or connected apps via attacks such as prompt injections.
For example, web browsing in Lockdown Mode is limited to cached content, so no live network requests leave OpenAI’s controlled network. This restriction is designed to prevent sensitive data from being exfiltrated to an attacker through browsing. Some features are disabled entirely when we can’t provide strong deterministic guarantees of data safety.
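To illustrate what "deterministically disables" means in practice: the gate is a fixed policy lookup applied before any tool runs, so the same request always gets the same answer regardless of what the model, or a prompt injection, asks for. A minimal sketch under that assumption, with hypothetical tool names and policy shape rather than ChatGPT's real internals:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class LockdownPolicy:
    """Fixed policy checked before any tool call; the model never overrides it."""
    # Tools turned off entirely when data safety can't be guaranteed.
    disabled_tools: frozenset = frozenset({"live_browsing", "connector_write"})
    # Tools that remain available only in a restricted form.
    restricted_tools: dict = field(
        default_factory=lambda: {"web_search": "cached_content_only"}
    )

def resolve_tool(policy: LockdownPolicy, tool: str) -> str:
    """Deterministic lookup: the same tool name always yields the same decision."""
    if tool in policy.disabled_tools:
        return "denied"
    return policy.restricted_tools.get(tool, "allowed")

policy = LockdownPolicy()
assert resolve_tool(policy, "live_browsing") == "denied"
assert resolve_tool(policy, "web_search") == "cached_content_only"
```

Because the decision is a plain table lookup rather than a model judgment, an injected instruction cannot talk the system into re-enabling a blocked tool.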
ChatGPT business plans already provide enterprise-grade data security. Lockdown Mode builds on those protections and is available for ChatGPT Enterprise, ChatGPT Edu, ChatGPT for Healthcare, and ChatGPT for Teachers. Admins can enable it in Workspace Settings by creating a new role. When enabled, Lockdown Mode layers additional restrictions on top of existing admin settings.
Learn more about Lockdown Mode in our Help Center.
Because some critical workflows rely on apps, Workspace Admins retain more granular controls. They can choose exactly which apps—and which specific actions within those apps—are available to users in Lockdown Mode. Additionally, and separate from Lockdown Mode, the Compliance API Logs Platform provides detailed visibility into app usage, shared data, and connected sources, helping admins maintain oversight.
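As a rough sketch of how such per-app, per-action controls might be expressed (the app and action names here are hypothetical, not the real Workspace Settings schema):

```python
# Hypothetical per-app action allowlist for Lockdown Mode.
LOCKDOWN_APP_ALLOWLIST: dict[str, set[str]] = {
    "calendar": {"read_events"},        # read-only access kept for scheduling
    "drive": {"search", "read_file"},   # write and share actions withheld
}

def action_permitted(app: str, action: str) -> bool:
    """Permit an action only if both the app and that specific action are allowlisted."""
    return action in LOCKDOWN_APP_ALLOWLIST.get(app, set())

assert action_permitted("drive", "read_file")
assert not action_permitted("drive", "share_file")  # sharing could exfiltrate data
assert not action_permitted("email", "send")        # app not allowlisted at all
```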
We plan to make Lockdown Mode available to consumers in the coming months.
Helping users make informed choices about risk
AI products can be more helpful when connected to your apps and the web, and we’ve invested heavily in keeping connected data secure. At the same time, some network-related capabilities introduce new risks that aren’t yet fully addressed by the industry’s safety and security mitigations. Some users may be comfortable taking on these risks, and we believe it’s important for users to have the ability to decide whether and how to use them, especially while working with their private data.
Our approach has been to provide in-product guidance for features that may introduce additional risk. To make this clearer and more consistent, we’re standardizing how we label a short list of existing capabilities. These features will now use a consistent “Elevated Risk” label across ChatGPT, ChatGPT Atlas, and Codex, so users receive the same guidance wherever they encounter them.
For example, in Codex, our coding assistant, developers can grant Codex network access so it can take actions on the web like looking up documentation. The relevant settings screen includes the “Elevated Risk” label, along with a clear explanation of what changes, what risks may be introduced, and when that access is appropriate.

[Screenshot: the Codex settings screen where users can configure what network access Codex has.]
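The pattern behind the label is simple: an elevated-risk capability stays off by default and is enabled only after the user has seen the warning and explicitly opted in. A minimal sketch under that assumption, with a hypothetical feature registry rather than Codex's actual settings model:

```python
# Hypothetical registry of elevated-risk features and their warnings.
ELEVATED_RISK_FEATURES = {
    "network_access": (
        "Elevated Risk: network access lets the agent fetch arbitrary web "
        "content, which can carry prompt-injection payloads."
    ),
}

def enable_feature(feature: str, user_acknowledged_risk: bool = False) -> bool:
    """Keep elevated-risk features off until the user explicitly opts in."""
    warning = ELEVATED_RISK_FEATURES.get(feature)
    if warning is not None and not user_acknowledged_risk:
        print(warning)  # surface the label and explanation, leave the feature off
        return False
    return True

assert not enable_feature("network_access")
assert enable_feature("network_access", user_acknowledged_risk=True)
```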
What’s next
We continue to invest in strengthening our safety and security safeguards, especially for novel, emerging, or growing risks. As those safeguards mature, we will remove the “Elevated Risk” label from a feature once we determine that security advances have sufficiently mitigated its risks for general use. We will also continue to update which features carry this label over time to best communicate risk to users.