Shipping smarter agents with every new model
SafetyKit builds multimodal AI agents to help marketplaces, payment platforms, and fintechs detect and act on fraud and prohibited activity across text, images, financial transactions, product listings, and more. Recent breakthroughs in model reasoning and multimodal understanding now make this more effective, setting a new bar for risk, compliance, and safety operations.
SafetyKit’s agents leverage GPT‑5, GPT‑4.1, deep research, and Computer Using Agent (CUA) to review 100% of customer content with over 95% accuracy based on SafetyKit’s evals. They can help platforms protect users, prevent fraud, avoid regulatory fines, and enforce complex policies that legacy systems may miss, such as region-specific rules, phone numbers embedded in scam images, or explicit content. Automation can also protect human moderators from exposure to offensive material and free them to handle nuanced policy decisions.
“OpenAI gives us access to the most advanced reasoning and multimodal models on the market. It lets us adapt quickly, ship new agents faster, and handle content types other solutions can’t even parse,” says David Graunke, Founder and CEO of SafetyKit.
Design agents for what the task demands, then choose the right model
SafetyKit’s agents are each built to handle a specific risk category, from scams to illegal products. Every piece of content is routed to the agent best suited for that violation, using the optimal OpenAI model:
- GPT‑5 applies multimodal reasoning across text, images, and UI to surface hidden risks and support layered, precise decision-making
- GPT‑4.1 reliably follows detailed content-policy instructions and efficiently manages high-volume moderation workflows
- Reinforcement fine-tuning (RFT) boosts recall and precision beyond default models, achieving frontier performance with complex safety policies
- Deep research integrates real-time online investigation into merchant reviews and verifications
- Computer Using Agent (CUA) automates complex policy tasks, reducing reliance on costly manual reviews
This model-matching approach lets SafetyKit scale content review across modalities with more nuance and accuracy than legacy solutions can.
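The model-matching approach above can be pictured as a routing table from risk category to purpose-built agent. This is a minimal, hypothetical sketch: the agent names, category keys, and model identifiers are illustrative assumptions, not SafetyKit's actual implementation.

```python
# Hypothetical sketch of category-based agent routing: each risk
# category maps to the agent built for it, backed by the OpenAI model
# best suited to that task. All names here are illustrative only.
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    model: str  # model the agent would call under the hood

ROUTING_TABLE = {
    "scam": Agent("scam-detection", "gpt-5"),                   # multimodal reasoning
    "policy_disclosure": Agent("policy-disclosure", "gpt-4.1"), # instruction following
    "merchant_review": Agent("merchant-review", "deep-research"),
    "ui_task": Agent("ui-automation", "computer-use"),
}

def route(content_category: str) -> Agent:
    """Send each piece of content to the agent suited to its risk category."""
    agent = ROUTING_TABLE.get(content_category)
    if agent is None:
        raise ValueError(f"no agent registered for category {content_category!r}")
    return agent

print(route("scam").model)  # -> gpt-5
```

Keeping routing explicit like this is what lets a newly released model be swapped in for one category without touching the others.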
The Scam Detection agent, for example, goes beyond just scanning text. It analyzes visuals like QR codes or phone numbers embedded in product images. GPT‑4.1 helps it parse the image, understand the layout, and decide whether it is a policy violation.
The Policy Disclosure agent checks listings or landing pages for required language, such as legal disclaimers or region-specific compliance warnings. GPT‑4.1 extracts relevant sections, GPT‑5 evaluates compliance, and the agent flags violations.
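The two-stage flow described above (a fast model extracts candidate sections, a reasoning model judges compliance) can be sketched as follows. The model calls are deliberately stubbed with simple heuristics so the shape of the pipeline is clear; in a real system each stage would be an API call to the corresponding model, and the function names are assumptions.

```python
# Sketch of a two-stage extract-then-evaluate pipeline. Stage 1 stands
# in for GPT-4.1 extraction; stage 2 stands in for GPT-5 compliance
# evaluation. Both are stubbed locally for illustration.

def extract_sections(listing_text: str) -> list[str]:
    """Stage 1: pull out passages that look like disclaimers or
    compliance language. Stubbed here as a keyword scan."""
    keywords = ("disclaimer", "not intended to", "consult a")
    return [line for line in listing_text.splitlines()
            if any(k in line.lower() for k in keywords)]

def evaluate_compliance(sections: list[str], required_phrase: str) -> bool:
    """Stage 2: decide whether the required language is actually
    present. Stubbed here as a substring check."""
    return any(required_phrase.lower() in s.lower() for s in sections)

listing = """Great herbal tea!
Disclaimer: this product is not intended to diagnose, treat, or cure any disease."""
sections = extract_sections(listing)
compliant = evaluate_compliance(sections, "not intended to diagnose")
print(compliant)  # -> True
```

Splitting extraction from judgment keeps the cheap, high-volume step separate from the expensive reasoning step, which is the design choice the article attributes to pairing GPT‑4.1 with GPT‑5.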
“We think of our agents as purpose-built workflows,” says Graunke. “Some tasks require deep reasoning, others need multimodal context. OpenAI is the only stack that delivers reliable performance across both.”
Leverage GPT‑5 to navigate the gray areas and high-stakes decisions
Policy decisions often hinge on subtle distinctions. Take a marketplace requiring sellers to include a disclaimer for wellness products, with requirements varying based on product claims and regional rules. Legacy providers use keyword triggers or rigid rulesets, which can miss the deeper judgment calls these decisions may require, leading to missed or incorrect enforcement.
SafetyKit’s Policy Disclosure agent first references policies from SafetyKit’s internal library, then GPT‑5 evaluates the content: does it mention treatment or prevention? Is it being sold in a region where disclosure is mandatory? And if so, is the required language actually included in the listing? If anything falls short, GPT‑5 returns a structured output the agent uses to flag the issue.
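A structured verdict like the one described might take the following shape. The field names are assumptions for illustration; the point is that the agent's flagging decision reduces to a deterministic check over fields the model fills in, rather than free-text parsing.

```python
# Illustrative shape of a structured verdict for the disclosure checks
# described above. Field names are hypothetical; the agent flags the
# listing whenever a mandatory disclosure is missing.
from dataclasses import dataclass

@dataclass
class DisclosureVerdict:
    mentions_treatment: bool       # does the listing claim treatment or prevention?
    disclosure_required: bool      # is it sold where disclosure is mandatory?
    required_language_present: bool

    def should_flag(self) -> bool:
        """Flag only when a required disclosure is actually absent."""
        return (self.mentions_treatment
                and self.disclosure_required
                and not self.required_language_present)

verdict = DisclosureVerdict(True, True, False)
print(verdict.should_flag())  # -> True
```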
“The power of GPT‑5 is in how precisely it can reason when grounded in real policy,” notes Graunke. “It lets us make accurate, defensible decisions even in the edge cases where other systems fail.”
Turn every model release into a product win
SafetyKit benchmarks each new OpenAI model against its hardest cases, often deploying top performers the same day. Rigorous internal evaluations allow the team to quickly identify how new models can improve performance and seamlessly integrate into their core infrastructure.
When OpenAI o3 launched, SafetyKit used it to boost edge case performance across key policy areas. GPT‑5 followed, and within days, it was deployed across their most demanding agents, improving benchmark scores by more than 10 points on their toughest vision tasks.
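The benchmark-then-deploy loop described here can be sketched as a simple eval gate: score each candidate model on the hardest internal cases and promote it only if it beats the incumbent. This is a hypothetical sketch of the general technique, not SafetyKit's evaluation harness.

```python
# Hypothetical eval gate: score models on hard labeled cases and
# promote a new model only if it outperforms the current one.

def score(model_answers: dict[str, str], gold: dict[str, str]) -> float:
    """Fraction of hard cases the model labels correctly."""
    correct = sum(model_answers.get(case) == label for case, label in gold.items())
    return correct / len(gold)

def promote(incumbent_score: float, candidate_score: float, margin: float = 0.0) -> bool:
    """Deploy the candidate only if it beats the incumbent by at least `margin`."""
    return candidate_score > incumbent_score + margin

gold = {"case1": "violation", "case2": "ok", "case3": "violation"}
old = {"case1": "violation", "case2": "violation", "case3": "violation"}
new = {"case1": "violation", "case2": "ok", "case3": "violation"}
print(promote(score(old, gold), score(new, gold)))  # -> True
```

Gating on the hardest cases, rather than average performance, is what makes same-day deployment of a new model a low-risk decision.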
“OpenAI moves fast, and we’ve designed our system to keep up. Every new release gives us an operational edge, unlocking new capabilities and domains we couldn’t support before, and increasing the coverage and accuracy we deliver to customers,” says Graunke.
SafetyKit also feeds improvements back into the ecosystem, sharing eval results, edge case failures, and policy-specific insights directly with OpenAI to help shape future model performance for safety-critical workloads.
Scale customer and volume growth with the best OpenAI stack
SafetyKit’s architecture enforces policy at scale, delivering speed, precision, and comprehensive risk coverage. Behind the scenes, it now handles over 16 billion tokens daily, up from 200 million six months ago, analyzing more content without sacrificing accuracy.
In that same time, SafetyKit has expanded into payments risk, fraud, anti-child-exploitation, and anti-money laundering, and has added new customers with hundreds of millions of end users under SafetyKit’s protection. This foundation empowers customers to respond swiftly and confidently to emerging risks.
“We’ve created a loop where every OpenAI release directly strengthens our capabilities,” says Graunke. “That’s why the system continually improves, always staying ahead of evolving risks.”
Interested in learning more about ChatGPT for business?
Talk with our team: https://openai.com/contact-sales/