Advancing independent research on AI alignment

OpenAI News


As AI systems become more capable and more autonomous, alignment research needs both to keep pace with capability progress and to grow more diverse in its approaches. At OpenAI, we invest heavily in frontier alignment and safety research because it is critical to our mission. We also believe that ensuring AGI is safe and beneficial to everyone cannot be achieved by any single organization, and we want to support independent research and conceptual approaches that can be pursued outside of frontier labs. We believe the future of AI won’t unfold exactly as anyone predicts, and that many more people should have a stake in shaping the outcome.


Today, we’re announcing a $7.5 million grant to The Alignment Project, a global fund for independent alignment research created by the UK AI Security Institute (UK AISI). Renaissance Philanthropy is supporting the grant’s administration. This contribution helps make The Alignment Project one of the largest dedicated funding efforts for independent alignment research to date and strengthens the broader independent ecosystem.


Frontier labs like OpenAI are in a unique position to pursue alignment research that depends on access to frontier models and significant compute—work that is often difficult for independent researchers to explore. We devote much of our internal alignment effort to developing scalable methods so that alignment progress keeps pace with capability progress. We believe iterative deployment—gradually increasing capabilities while strengthening safeguards—helps surface problems early and gives us concrete evidence about what works in practice, and that responsible development requires significant alignment and safety work that is tightly integrated with model building and deployment.


In parallel, the field benefits from sustained investment in independent, exploratory research—which can expand the space of ideas and uncover new directions. Independent research remains essential; in many kinds of useful inquiry, labs do not retain a comparative advantage. A healthy alignment ecosystem depends on independent teams testing diverse assumptions, developing alternative frameworks, and exploring conceptual, theoretical, and blue-sky ideas that may not align neatly with any one organization’s roadmap.


And because progress toward AGI may ultimately depend on fundamental breakthroughs that change the shape of the alignment problem, and with it which approaches are most useful, it’s important to support research that would matter even if today’s dominant methods turn out not to scale in the way we expect. In those worlds, it becomes especially important to have a strong external ecosystem doing foundational, conceptual, and uncorrelated work. The problem of AI alignment and safety is of unprecedented importance, and we need all hands on deck, as we do not yet know which approaches will prove most durable as capabilities continue to advance.


Our grant—approximately £5.6 million at current exchange rates—will co-fund The Alignment Project alongside other public, philanthropic, and industry backers. The total fund exceeds £27 million and is designed to support a broad portfolio of alignment research projects worldwide, spanning topics as diverse as computational complexity theory, economic theory and game theory, cognitive science, and information theory and cryptography. Individual projects are typically funded at £50,000 to £1 million, and may also receive optional access to compute resources and expert support.


Our funding does not create a new program or selection process, nor does it influence the existing process; it increases the number of already-vetted, high-quality projects that can be funded in the current round.


UK AISI is well positioned to direct alignment funding at this scale and range. It brings an established cross-sector coalition spanning government, academia, philanthropy, and industry, along with a grantmaking pipeline already in motion and a large pool of proposals that have undergone expert review. As a UK government research organization within the Department for Science, Innovation and Technology (DSIT), it also has a mandate focused on serious AI risks and experience running research funding programs.


Because the future of AI won’t unfold exactly as anyone predicts—and may advance very quickly—we believe democratization, “AI resilience,” and iterative deployment are essential. While we continue advancing frontier alignment and safety research at OpenAI, progress will also benefit from a robust, diverse, and independent ecosystem pursuing complementary approaches as capabilities grow. This grant is one step toward that goal. We look forward to continuing to collaborate with the broader research community as the field advances.


