Doppel’s AI defense system stops attacks before they spread
OpenAI News
A single impersonation site can launch, target thousands of users, and vanish in under an hour. That’s more than enough time for an attacker to do real damage. And with generative tools, they can spin up hundreds more just like it.
Doppel was built to defend organizations from deepfakes and online impersonation, but its team quickly realized that AI meant threats could scale infinitely. Attackers no longer needed to handcraft scams; they could generate endless variants of phishing kits, spoofed domains, and impersonation accounts in seconds.
“Damage from phishing attacks can happen within minutes as they spread across social media and messaging channels. The ability to generate infinite persuasion at almost no cost changed everything.”
Rahul Madduluri, Co-founder and CTO, Doppel
To stay ahead, Doppel developed a new kind of social engineering defense system built on OpenAI GPT‑5 and o4-mini models. Doppel’s platform detects, classifies, and takes down threats autonomously, cutting analyst workloads by 80%, tripling threat-handling capacity, and reducing response times from hours to minutes.
Staying ahead of infinitely faster threats
Traditional digital risk protection relied on humans to manually review impersonation sites, phishing domains, and social media profiles and posts. Doppel saw that model breaking down as attackers began to automate, launching threats faster, and across more surface areas, than humans could evaluate them.
“Our system processes a constant flood of signals to identify the real threats amongst the noise. Once a threat is detected, there is a very narrow window to act before the damage is done,” said Madduluri. “Using AI to automate decision-making is one of the greatest unlocks for the company, allowing us to combat attacks at internet scale and speed.”
That speed is critical for Doppel’s customers, organizations that can’t afford to wait hours to confirm a threat. Doppel’s system classifies most threats automatically, using OpenAI models for reasoning and a structured feedback loop known as reinforcement fine-tuning (RFT) to improve the model over time. In RFT, human feedback is used as graded examples, helping models learn to make consistent, explainable decisions on their own.
Orchestrating LLM-driven threat detection
Doppel’s LLM-driven pipeline sits at the center of its detection stack. After signals are sourced and filtered, the system performs a series of targeted reasoning tasks: evaluating potential threats, confirming intent, and driving classification decisions. Each stage is designed to balance speed, accuracy, and consistency, while keeping analysts focused on the edge cases that need human judgment.
Here’s how it works:
- Signal filtering and feature extraction: Doppel’s systems ingest millions of domains, URLs, and accounts daily. A combination of heuristics and OpenAI o4-mini filters out noise and extracts structured features to guide downstream model evaluations.
- Parallel threat confirmation: Each signal is passed through multiple GPT‑5 prompts purpose-built for different types of threat analysis. These prompts assess factors like impersonation risk, brand misuse, or social engineering patterns.
- Threat classification: The RFT version of o4-mini synthesizes the earlier confirmations to assign a structured label—malicious, benign, or ambiguous—with production-grade consistency.
- Final verification: A second GPT‑5 pass validates the model’s decision and generates a natural-language justification. If confidence exceeds a set threshold, the system automatically initiates enforcement.
- Human review: Low-confidence or conflicting results are routed to human analysts. Their decisions are logged and fed back into the RFT loop to continuously improve model consistency.
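The five stages above can be sketched as a small orchestration loop. This is only an illustration of the control flow, not Doppel's implementation: every function name, heuristic, and the 0.9 threshold is a hypothetical stand-in, and plain Python functions take the place of the o4-mini and GPT‑5 calls, whose prompts and thresholds the article does not publish.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Assumed value; the article only says "a threshold" exists.
AUTO_ENFORCE_THRESHOLD = 0.9


@dataclass
class Decision:
    label: str          # "malicious", "benign", or "ambiguous"
    confidence: float
    justification: str
    action: str         # "enforce", "dismiss", or "human_review"


def extract_features(url: str) -> Dict[str, object]:
    """Stage 1 stand-in: cheap heuristics in place of the o4-mini feature pass."""
    host = url.split("//")[-1].split("/")[0].lower()
    return {
        "host": host,
        "has_login_path": "login" in url.lower(),
        "lookalike": "examp1e" in host,  # toy typo-squat check for a brand "example"
    }


# Stage 2 stand-ins: each checker plays the role of one purpose-built prompt.
Checker = Callable[[Dict[str, object]], float]
CHECKERS: Dict[str, Checker] = {
    "impersonation": lambda f: 0.95 if f["lookalike"] else 0.05,
    "brand_misuse": lambda f: 0.9 if f["lookalike"] else 0.1,
    "social_engineering": lambda f: 0.9 if f["has_login_path"] else 0.1,
}


def confirm_in_parallel(features: Dict[str, object]) -> Dict[str, float]:
    """Run every checker concurrently and collect one risk score per threat type."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, features) for name, fn in CHECKERS.items()}
        return {name: fut.result() for name, fut in futures.items()}


def classify(scores: Dict[str, float]) -> Tuple[str, float]:
    """Stage 3 stand-in for the fine-tuned classifier: returns (label, confidence)."""
    mean = sum(scores.values()) / len(scores)
    if mean >= 0.7:
        return "malicious", mean
    if mean <= 0.3:
        return "benign", 1.0 - mean
    return "ambiguous", 0.5


def verify_and_route(url: str, label: str, confidence: float,
                     scores: Dict[str, float]) -> Decision:
    """Stages 4 and 5: attach a justification, then enforce or escalate."""
    top = max(scores, key=scores.get)
    justification = (
        f"{url}: labeled {label}; strongest signal was {top} ({scores[top]:.2f})."
    )
    if label == "ambiguous" or confidence <= AUTO_ENFORCE_THRESHOLD:
        action = "human_review"   # low-confidence or conflicting results
    elif label == "malicious":
        action = "enforce"
    else:
        action = "dismiss"
    return Decision(label, confidence, justification, action)


def run_pipeline(url: str) -> Decision:
    features = extract_features(url)
    scores = confirm_in_parallel(features)
    label, confidence = classify(scores)
    return verify_and_route(url, label, confidence, scores)
```

In this sketch, a signal whose checkers disagree (for example a lookalike host with no login page) lands in the "ambiguous" band and is routed to `human_review`, mirroring the hand-off to analysts described in the last step.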
Training models through reinforcement fine-tuning (RFT)
Doppel had already seen meaningful gains from its original LLM-enhanced detection pipeline, but when it came to cases where the same threat might be judged differently depending on the analyst, consistency became the limiting factor.
“We were seeing differences in answers across edge cases,” said software engineer Kiran Arimilli. “One real benefit that came out of RFT is you’re making that model’s decisions more consistent.”
To build that consistency, Doppel applied RFT using its own analyst data as the feedback source. Each decision to classify a domain as malicious, benign, or unclear became a graded example. Those labeled examples trained the model to replicate expert judgment, even on ambiguous edge cases.
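As a rough illustration of turning analyst verdicts into graded examples, the sketch below serializes each decision as one JSON line. The record fields (`input`, `reference_label`, `reference_note`) and the `AnalystDecision` type are assumptions made for this example; the article does not describe Doppel's actual data schema or the file format used for training.

```python
import json
from dataclasses import dataclass
from typing import Iterable, List

# The three verdicts named in the article.
VALID_LABELS = {"malicious", "benign", "unclear"}


@dataclass(frozen=True)
class AnalystDecision:
    domain: str
    label: str
    analyst_note: str


def to_graded_examples(decisions: Iterable[AnalystDecision]) -> List[str]:
    """Serialize analyst verdicts as one JSON object per line (JSONL)."""
    lines = []
    for d in decisions:
        if d.label not in VALID_LABELS:
            raise ValueError(f"unknown label {d.label!r} for {d.domain}")
        record = {
            "input": f"Classify the domain: {d.domain}",  # hypothetical prompt text
            "reference_label": d.label,                    # what the grader compares against
            "reference_note": d.analyst_note,              # analyst reasoning, if recorded
        }
        lines.append(json.dumps(record, sort_keys=True))
    return lines
```

Rejecting unknown labels at this step keeps inconsistent verdicts out of the training set, which matters when consistency is the property being trained for.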
Working closely with OpenAI’s applied engineering team, Doppel designed grader functions that evaluated not only accuracy but explanatory quality, rewarding models that reasoned clearly, not just correctly. By turning analyst feedback into structured training data, Doppel helped show how RFT could make automated detection more consistent and reliable.
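A grader that rewards both correctness and explanation quality might look something like the following. This is a minimal sketch under stated assumptions: the 0.7/0.3 weighting, the function names, and the keyword-overlap proxy for "explanatory quality" are all invented for illustration, and the article does not say how Doppel's graders actually measure reasoning quality.

```python
from typing import Dict, Sequence

# Assumed weights: accuracy dominates, explanation quality breaks ties.
ACCURACY_WEIGHT = 0.7
EXPLANATION_WEIGHT = 0.3


def explanation_score(explanation: str, evidence_terms: Sequence[str]) -> float:
    """Fraction of expected evidence terms the explanation mentions (crude proxy)."""
    if not evidence_terms:
        return 0.0
    text = explanation.lower()
    hits = sum(1 for term in evidence_terms if term.lower() in text)
    return hits / len(evidence_terms)


def grade(output: Dict[str, str], reference_label: str,
          evidence_terms: Sequence[str]) -> float:
    """Return a reward in [0, 1] combining label accuracy and explanation coverage."""
    accuracy = 1.0 if output.get("label") == reference_label else 0.0
    clarity = explanation_score(output.get("explanation", ""), evidence_terms)
    return ACCURACY_WEIGHT * accuracy + EXPLANATION_WEIGHT * clarity
```

With this shape, a correct label backed by the expected evidence scores higher than a correct label with no explanation, which in turn scores higher than a well-argued but wrong answer. A production grader would likely need a stronger measure of reasoning quality than keyword overlap.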
Operationalizing trust through transparency
Hyperparameter tuning and iterative evals brought the model closer to human-level consistency. But for Doppel, completing the final mile of automation also meant making decisions immediately understandable.
Each automated takedown now includes an AI-generated justification explaining why the threat was removed. Customers get that insight immediately, where it once required analyst intervention.

That visibility enhances trust, which is a critical factor for Doppel’s users. Seeing not just what action was taken, but why, gives teams the confidence to respond quickly and the context to explain those decisions internally or to stakeholders.
Expanding automation to new threat surfaces
Having reached near-complete automation for phishing and impersonation domains, Doppel is now applying the same model-driven framework to other high-variance channels.
“Domains are probably the hardest channel we handle,” said Madduluri. “The signals are messy, content changes constantly, and threats evolve fast across several surfaces at once. If we can automate that end to end, we can do it for anything: social media, paid ads, you name it.”
The next milestones include scaling its RFT dataset by an order of magnitude, experimenting with new grading strategies, and using GPT‑5 for upstream feature extraction. These changes will allow Doppel to consolidate pipeline stages and reason over more complex threat indicators earlier in the process.
With each iteration, Doppel is building toward a system that defends what’s real across every surface where trust is under attack.