Introducing the OpenAI Safety Bug Bounty program
Today, OpenAI is launching a public Safety Bug Bounty program focused on identifying AI abuse and safety risks across our products. As AI technology rapidly evolves, so do the potential ways it can be misused. Our goal is to ensure our systems remain safe and secure against misuse or abuse that could lead to tangible harm.
This new program will complement OpenAI’s Security Bug Bounty by accepting issues that pose meaningful abuse and safety risks, even if they don’t meet the criteria for a conventional security vulnerability. Through this program, we look forward to continuing to partner with safety and security researchers to identify and address these risks. Submissions will be triaged by OpenAI’s Safety and Security Bug Bounty teams, and may be rerouted between the two programs depending on scope and ownership.
Program overview
The new Safety Bug Bounty program focuses on AI-specific safety scenarios listed below:
Agentic Risks including MCP
- Third-party prompt injection and data exfiltration: attacker-controlled text reliably hijacks a victim’s agent (including Browser, ChatGPT Agent, and similar agentic products), tricking it into performing a harmful action or leaking the user’s sensitive information. The behavior must be reproducible at least 50% of the time.
- An agentic OpenAI product performs a disallowed action on OpenAI’s website at scale.
- An agentic OpenAI product performs some potentially harmful action not listed above. Valid reports here must indicate plausible and material harm.
- Any testing for MCP risk must comply with the terms of service of any third parties.
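The 50% reproducibility bar for prompt-injection reports implies measuring a success rate over repeated trials. The sketch below is a hypothetical harness for doing so; `run_trial` is a placeholder (not an OpenAI API) that a researcher would replace with code driving the actual agent and checking whether the harmful action or leak occurred. Here it is stubbed with a fixed outcome pattern purely for illustration.

```python
from itertools import cycle

# Simulated trial outcomes (75% success) standing in for real agent runs.
_outcomes = cycle([True, True, False, True])

def run_trial(payload: str) -> bool:
    """Placeholder: send attacker-controlled text to the agent under test
    and return True if the harmful action or data leak was observed."""
    return next(_outcomes)

def reproducibility_rate(payload: str, trials: int = 20) -> float:
    """Fraction of trials in which the injection payload succeeded."""
    return sum(run_trial(payload) for _ in range(trials)) / trials

rate = reproducibility_rate("<attacker-controlled text>")
print(f"success rate: {rate:.0%} (report threshold: at least 50%)")
```

Running enough trials (e.g. 20) before filing a report both demonstrates the threshold is met and gives triagers a reproduction baseline.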
OpenAI Proprietary Information
- Model generations that return proprietary information related to reasoning.
- Vulnerabilities that expose other OpenAI proprietary information.
Account and Platform Integrity
- Vulnerabilities in account integrity and platform integrity signals, such as bypassing anti-automation controls, manipulating account trust signals, evading account restrictions/suspensions/bans, and similar issues.
- Issues that allow users to access features, data, or functionalities beyond authorized permissions should be reported to the Security Bug Bounty.
While jailbreaks are out of scope for this program, we periodically run private bug bounty campaigns focused on certain harm types, such as Biorisk content issues in ChatGPT Agent and GPT‑5. We invite interested researchers to apply to these programs when they arise.
Outside of the categories listed above, if researchers identify flaws that create a direct path to user harm and come with actionable, discrete remediation steps, these may be considered in scope for rewards on a case-by-case basis. General content-policy bypasses without demonstrable safety or abuse impact are out of scope for this program. For example, “jailbreaks” that merely result in the model using rude language, or returning information easily findable via search engines, are out of scope.
How to participate
Researchers interested in participating can apply through our Safety Bug Bounty program. We look forward to working alongside researchers, ethical hackers, and the safety and security community in the pursuit of a secure AI ecosystem.