Just in: Gemini 3 Deep Think upgraded with stronger reasoning and deeper science, available to Ultra subscribers

机智流 2026-02-13 11:02

AI INSIGHT · Official blog post translation

Source: Google Blog · 2026.02.12 | With supplementary notes from @GoogleDeepMind on Twitter

Gemini 3 Deep Think: Advancing science, research and engineering

Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.

February 12, 2026 · The Deep Think Team

Today, we're releasing a major upgrade to Gemini 3 Deep Think, our specialized reasoning mode, built to push the frontier of intelligence and solve modern challenges across science, research, and engineering.

We updated Gemini 3 Deep Think in close partnership with scientists and researchers to tackle tough research challenges, where problems often lack clear guardrails or a single correct solution and data is often messy or incomplete. By blending deep scientific knowledge with everyday engineering utility, Deep Think moves beyond abstract theory to drive practical applications.

The new Deep Think is now available in the Gemini app for Google AI Ultra subscribers and, for the first time, we're also making Deep Think available via the Gemini API to select researchers, engineers and enterprises.

Here is how our early testers are already using the latest Deep Think

Mathematics: Lisa Carbone, a mathematician at Rutgers University, works on the mathematical structures required by the high-energy physics community to bridge the gap between Einstein's theory of gravity and quantum mechanics. In a field with very little existing training data, she used Deep Think to review a highly technical mathematics paper. Deep Think successfully identified a subtle logical flaw that had previously passed through human peer review unnoticed.

Rutgers University mathematician Lisa Carbone uses Deep Think to review a mathematics paper

Semiconductor materials: At Duke University, the Wang Lab used Deep Think to optimize fabrication methods for complex crystal growth for the potential discovery of semiconductor materials. Deep Think successfully designed a recipe for growing thin films larger than 100 μm, meeting a precise target that previous methods had struggled to hit.

Duke University's Wang Lab uses Deep Think to optimize semiconductor material fabrication

Physical component design: Anupam Pathak, an R&D lead in Google's Platforms and Devices division and former CEO of Liftware, tested the new Deep Think to accelerate the design of physical components.

Anupam Pathak uses Deep Think to accelerate the design of physical components

Elevating reasoning with mathematical and algorithmic rigor

Last year, we showed that specialized versions of Deep Think could successfully navigate some of the toughest challenges in reasoning, achieving gold-medal standards at math and programming world championships. More recently, Deep Think has enabled specialized agents to conduct research-level mathematics exploration.

The updated Deep Think mode continues to push the frontiers of intelligence, reaching new heights across the most rigorous academic benchmarks, including:

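         🏆 Humanity's Last Exam: 48.4% (without tools), a new record
         🧠 ARC-AGI-2: 84.6%, verified by the ARC Prize Foundation, an unprecedented score
         💻 Codeforces: Elo 3455, top-tier competitive programming level
         🥇 International Mathematical Olympiad 2025: gold-medal-level performance
         🔬 International Physics/Chemistry Olympiad 2025: gold-medal level on the written sections
         ⚛️ CMT-Benchmark: 50.5% on advanced theoretical physics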
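
To give the Codeforces figure some intuition, here is a minimal sketch of the textbook Elo expected-score formula. It assumes the standard logistic Elo model with a 400-point scale; Codeforces uses its own Elo-like rating scheme, so treat the output as a rough illustration only. The opponent rating is a made-up value chosen for the example and does not come from the post.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


DEEP_THINK_ELO = 3455          # figure reported in the post
HYPOTHETICAL_OPPONENT = 3055   # assumption: illustrative rating, 400 points lower

if __name__ == "__main__":
    p = elo_expected_score(DEEP_THINK_ELO, HYPOTHETICAL_OPPONENT)
    print(f"Expected score against a {HYPOTHETICAL_OPPONENT}-rated opponent: {p:.3f}")
```

Under this model a 400-point gap corresponds to an expected score of about 0.909, which is why a rating in the mid-3000s is regarded as elite.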

Gemini 3 Deep Think benchmark results (animated)

Gemini 3 Deep Think evaluation results comparison table

Twitter: @GoogleDeepMind official thread highlights

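         📌 @GoogleDeepMind main thread (4,205 likes / 1.92M views):
         "We've upgraded our specialized reasoning mode, Gemini 3 Deep Think, to help solve modern science, research and engineering challenges, pushing the frontier of intelligence. 🧠 See how the Wang Lab at Duke University is using it to design new semiconductor materials."

         📌 Benchmark details (1,515 likes):
         "The latest Deep Think moves beyond abstract theory to drive practical applications. It reaches a new state of the art on ARC-AGI-2; sets a new standard on Humanity's Last Exam, tackling the hardest problems across mathematics, science and engineering; scores an Elo of 3455 on Codeforces, showing its ability to solve complex real-world programming tasks; and delivers gold-medal-level performance on IMO 2025."

         📌 Research paper preview (February 11, 1,652 likes):
         "Together with @GoogleResearch, we are releasing two new papers showing how Gemini Deep Think uses agentic workflows to help solve research-level problems in mathematics, physics and computer science."

         📌 API access (548 likes):
         "The upgraded Deep Think mode is rolling out in the @GeminiApp for Google AI Ultra subscribers. Researchers and developers can apply for the Vertex AI early access program."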

Humanity's Last Exam benchmark results (image from the @GoogleDeepMind tweet)

ARC-AGI-2 benchmark results

Codeforces competitive programming rating

IMO and IPhO/IChO olympiad results

Navigating complex scientific domains

Beyond mathematics and competitive coding, Gemini 3 Deep Think now also excels across broad scientific domains such as chemistry and physics. Our updated Deep Think mode demonstrates gold medal-level results on the written sections of the 2025 International Physics Olympiad and Chemistry Olympiad. It also demonstrates proficiency in advanced theoretical physics, achieving a score of 50.5% on CMT-Benchmark.

Accelerating real-world engineering

In addition to its state-of-the-art performance, Deep Think is built to drive practical applications, enabling researchers to interpret complex data and engineers to model physical systems through code. Most importantly, we are working to bring Deep Think to researchers and practitioners where they need it most, beginning with surfaces such as the Gemini API.

With the updated Deep Think, you can turn a sketch into a 3D-printable reality. Deep Think analyzes the drawing, models the complex shape and generates a file to create the physical object with 3D printing.
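
The post does not say which file format Deep Think produces, so the following is only a minimal sketch of what a "3D-printable file" can look like, assuming the widely used ASCII STL format. The tetrahedron geometry, the `to_ascii_stl` and `facet_normal` helpers, and the solid name are all hypothetical and chosen for illustration; they are not taken from the post or from any Google tooling.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def facet_normal(tri: Triangle) -> Vec3:
    """Unit normal of a triangle via the right-hand rule (counter-clockwise vertices)."""
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = tri
    ux, uy, uz = bx - ax, by - ay, bz - az
    vx, vy, vz = cx - ax, cy - ay, cz - az
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = (nx * nx + ny * ny + nz * nz) ** 0.5
    if length == 0.0:
        return (0.0, 0.0, 0.0)  # degenerate triangle
    return (nx / length, ny / length, nz / length)


def to_ascii_stl(name: str, triangles: List[Triangle]) -> str:
    """Serialize a triangle mesh as ASCII STL text."""
    lines = [f"solid {name}"]
    for tri in triangles:
        n = facet_normal(tri)
        lines.append(f"  facet normal {n[0]:.6e} {n[1]:.6e} {n[2]:.6e}")
        lines.append("    outer loop")
        for v in tri:
            lines.append(f"      vertex {v[0]:.6e} {v[1]:.6e} {v[2]:.6e}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


# Hypothetical shape: a tetrahedron with 10-unit edges along the axes.
O, A, B, C = (0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)
TETRAHEDRON: List[Triangle] = [
    (O, B, A),  # bottom face, normal points to -z
    (O, A, C),  # side face, normal points to -y
    (O, C, B),  # side face, normal points to -x
    (A, B, C),  # slanted face, normal points outward
]

if __name__ == "__main__":
    with open("tetra.stl", "w", encoding="ascii") as fh:
        fh.write(to_ascii_stl("tetra", TETRAHEDRON))
```

STL itself carries no units; slicing software commonly interprets the coordinates as millimetres, so this example would print as a small 10 mm corner piece.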

Available to Google AI Ultra Subscribers and the Gemini API

Google AI Ultra subscribers will be able to access the updated Deep Think mode starting today in the Gemini app. Scientists, engineers and enterprises can also now express interest in our early access program to test Deep Think via the Gemini API.

We can't wait to see what you discover.

📎 Source links

Original Google Blog post

@GoogleDeepMind tweet thread
