Project Glasswing and Claude Mythos Preview, Meta Muse Spark, Gemini CLI v0.37.0

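The week of 5 to 9 April 2026 is dominated by two major announcements. Anthropic unveils Project Glasswing, an initiative that brings eleven major organizations together around Claude Mythos Preview to detect zero-day vulnerabilities at scale. Meta returns with Muse Spark, its first model in a year and its first not released with open weights. Google, OpenAI, GitHub, and Perplexity also ship notable updates for their developers and users.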

Project Glasswing and Claude Mythos Preview

7 April 2026: Anthropic announces Project Glasswing, a software security initiative that brings together eleven organizations: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, The Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. The initiative is built on a new frontier model available under restricted access: Claude Mythos Preview.

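What Mythos Preview does

The model has demonstrated the ability to identify thousands of zero-day vulnerabilities in mainstream operating systems and web browsers, some of which had lain dormant for decades. Three concrete examples illustrate the range of its findings:

A 27-year-old vulnerability in OpenBSD that lets an attacker remotely crash any networked machine running it
A 16-year-old vulnerability in FFmpeg, hidden in a line of code that tests had exercised more than five million times without the flaw being detected
Several vulnerabilities in the Linux kernel that allow privilege escalation up to full control of the machine

These examples show that Mythos Preview goes beyond surface-level detection: it finds logic errors buried deep in critical codebases that thousands of researchers have audited continuously for years.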

Benchmarks

Mythos Preview clearly outperforms Opus 4.6 on benchmarks:

Benchmark	Mythos Preview	Opus 4.6
SWE-bench Verified	93.9%	80.8%
SWE-bench Pro	77.8%	53.4%
Terminal-Bench 2.0	82.0%	65.4%
SWE-bench Multilingual	87.3%	77.8%
CyberGym (cybersecurity)	83.1%	66.6%
GPQA Diamond	94.6%	91.3%
Humanity’s Last Exam (no tools)	56.8%	40.0%

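The jump from 80.8% to 93.9% on SWE-bench Verified is especially notable: it is the most widely used benchmark for measuring a model’s ability to fix real bugs in real open-source repositories. On CyberGym, which focuses on cybersecurity, the gain of more than 16 percentage points (66.6% to 83.1%) puts Mythos Preview in a tier of its own on offensive and defensive security tasks.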
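The percentage-point gaps quoted here can be recomputed from the table. The snippet below is a small sketch that only restates the published scores; the dictionary layout and the function name `point_gains` are illustrative choices, not part of any benchmark tooling.

```python
# Scores in percent, copied from the benchmark table:
# (Mythos Preview, Opus 4.6).
SCORES = {
    "SWE-bench Verified": (93.9, 80.8),
    "SWE-bench Pro": (77.8, 53.4),
    "Terminal-Bench 2.0": (82.0, 65.4),
    "SWE-bench Multilingual": (87.3, 77.8),
    "CyberGym": (83.1, 66.6),
    "GPQA Diamond": (94.6, 91.3),
    "Humanity's Last Exam (no tools)": (56.8, 40.0),
}


def point_gains(scores):
    """Return the gain in percentage points (Mythos Preview minus Opus 4.6)."""
    return {
        name: round(mythos - opus, 1)
        for name, (mythos, opus) in scores.items()
    }


gains = point_gains(SCORES)

# Print the benchmarks from largest to smallest gain.
for name, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:32s} +{gain:.1f} pts")
```

Sorted this way, SWE-bench Pro shows the largest gap (24.4 points), while CyberGym (16.5) and SWE-bench Verified (13.1) match the figures discussed in the text.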

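Financial commitment and governance

Anthropic commits up to $100 million in usage credits for the project’s partners, plus $4 million in direct donations to open-source security organizations:

$2.5 million to Alpha-Omega and OpenSSF, via the Linux Foundation
$1.5 million to the Apache Software Foundation

Financial involvement at this level suggests that Anthropic sees Glasswing as a long-term initiative rather than a purely communications-driven partnership. A report on the vulnerabilities that have been fixed is due within 90 days.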

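Availability and pricing

Mythos Preview is not being made generally available for now. Once the initial credits are used up, participants will be able to use the model at $25 / $125 per million tokens (input / output) through the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.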
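To give a sense of scale for these rates, here is a back-of-the-envelope cost estimate. It is only a sketch: the workload size is hypothetical, and it assumes plain list prices with no caching, batching, or other discounts, none of which are described in the announcement.

```python
# List prices quoted in the text, in US dollars per million tokens.
INPUT_USD_PER_MTOK = 25.0
OUTPUT_USD_PER_MTOK = 125.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one workload at the quoted list prices."""
    total = (input_tokens * INPUT_USD_PER_MTOK
             + output_tokens * OUTPUT_USD_PER_MTOK)
    return total / 1_000_000


# Hypothetical audit run: 2M tokens of source code read,
# 200k tokens of findings written.
cost = estimate_cost_usd(2_000_000, 200_000)
print(f"${cost:.2f}")
```

In this example the output tokens account for a third of the bill despite being a tenth of the input volume, because output is priced five times higher than input.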

A detailed technical report on the vulnerabilities found and how they can be exploited is available on Anthropic’s Red Team blog, and the full system card is published on anthropic.com.

“This project represents a watershed moment for AI-assisted cybersecurity — not because of what Claude can do today, but because of what it will be able to do as capabilities continue to scale.”

Source: official Anthropic announcement

🔗 Project Glasswing · Mythos Preview System Card · Red Team report

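Meta Muse Spark: Meta returns with a closed model

8 April 2026: Meta announces Muse Spark, the first model in the new “Muse” family developed by Meta Superintelligence Labs (MSL), a new internal unit focused on advanced AI research. It is Meta’s first model release since Llama 4 in April 2025, ending a year of silence, and above all Meta’s first model not offered with open weights.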

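Capabilities and positioning

Muse Spark is positioned as a step toward “personal superintelligence”. The model covers several domains with advanced capabilities:

Domain	Description
Multimodal	Advanced visual perception and understanding, integrating information across visual inputs
Reasoning	Step-by-step reasoning that thinks before answering (test-time reasoning)
Health	Medical image analysis, personalized dietary advice
Agentic	Agent capabilities for complex tasks
Contemplating mode	Orchestration of multiple parallel reasoning agents (rolling out gradually)

The model incorporates “thought compression” to make more efficient use of reasoning tokens, and supports test-time scaling through multiple parallel agents. Contemplating mode, still rolling out gradually, is one of the most anticipated features: it assigns several agents to the same problem in parallel, and each agent reasons independently before the results are synthesized.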
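The general pattern described here (independent agents in parallel, then a synthesis step) can be sketched as follows. This is not Meta’s implementation: the `Agent` type, the `contemplate` function, and the majority-vote synthesis are all assumptions made for illustration, and the agents are stand-ins that return fixed answers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

# Hypothetical agent interface: takes a question, returns an answer.
Agent = Callable[[str], str]


def contemplate(question: str, agents: Sequence[Agent]) -> str:
    """Run every agent on the same question in parallel, then synthesize.

    Synthesis here is a plain majority vote, chosen only for illustration.
    """
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        answers = list(pool.map(lambda agent: agent(question), agents))
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


# Stand-in agents; real ones would each call a model.
agents = [lambda q: "42", lambda q: "41", lambda q: "42"]
print(contemplate("What is 6 * 7?", agents))
```

Because the agents run concurrently, wall-clock time is roughly that of the slowest agent rather than the sum of all of them, which is the intuition behind spending more compute at test time without a matching increase in latency.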

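Performance

According to Artificial Analysis (8 April 2026), Muse Spark scores 52 on the Artificial Analysis Intelligence Index, which places it in the global top four, behind Gemini 3.1 Pro, GPT-5.4, and Claude Opus 4.6. For Meta, this is a striking comeback that puts it straight back into contention among the top frontier models.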

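Architecture

Meta describes three scaling axes for Muse Spark:

Pretraining: a complete overhaul of the entire stack over nine months, with improvements to architecture and data
Reinforcement learning: scaling up capability gains after pretraining
Test-time reasoning: longer reasoning without added latency, through agent parallelization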

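Safety and availability

Meta says it carried out an in-depth evaluation under its Advanced AI Scaling Framework v2. Apollo Research conducted a third-party evaluation of a pre-release checkpoint and observed robust refusal behavior in high-risk domains (biology and others).

Muse Spark has been available on meta.ai and in the Meta AI app since 8 April. API access is currently limited to a private preview for selected partners, with no general availability for now.

🔗 Meta AI blog: Muse Spark · Announcement tweet · Artificial Analysis benchmark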

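Anthropic: infrastructure and agents

Google + Broadcom partnership: multiple gigawatts of TPUs from 2027

6 April 2026: Anthropic announces an agreement with Google and Broadcom for multiple gigawatts of next-generation TPU compute capacity, scheduled to come online from 2027. It is the largest infrastructure commitment in Anthropic’s history.

The growth context is significant:

Run-rate revenue now exceeds $30 billion, up from about $9 billion at the end of 2025
More than 1,000 enterprise customers each spend over $1 million a year, up from just over 500 in February 2026, a doubling in under two months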

“This groundbreaking partnership with Google and Broadcom is a continuation of our disciplined approach to scaling infrastructure: we are building the capacity necessary to serve the exponential growth we have seen in our customer base while also enabling Claude to define the frontier of AI development.”

Krishna Rao, Chief Financial Officer, Anthropic

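Most of the new compute will be deployed in the United States, extending the commitment made in November 2025 to invest $50 billion in US infrastructure. Claude remains the only frontier model available on all three major cloud platforms: Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.

🔗 Partnership announcement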

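Managed Agents: decoupled architecture, latency cut by up to 90%

8 April 2026: Anthropic’s Engineering Blog publishes a technical article detailing the architecture of Managed Agents, the hosted service for running long-running agents on the Claude platform.

The core idea is to decouple the brain (Claude and its harness) from the hands (execution sandboxes and tools) and from the session (the event log). Each component becomes an independent, replaceable interface that can be scaled on its own.

Measured results after decoupling:

Metric	Change
p50 TTFT (time to first token)	-60%
p95 TTFT	-90%

The decoupling also addresses two security and reliability concerns: credential isolation (OAuth tokens are never accessible from the code-execution sandbox) and resilience (if the harness fails, a new instance can be restarted from the last session event without losing context).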
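The resilience property can be illustrated with a toy model of the three components. This is a sketch under stated assumptions, not Anthropic’s design: the class names `Session`, `Sandbox`, and `Harness` and their methods are invented for the example, and the sandbox only echoes the command it is given.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Append-only event log; the only durable state in this sketch."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> None:
        self.events.append(event)


class Sandbox:
    """The 'hands': runs commands and is never handed any credentials."""

    def run(self, command: str) -> str:
        return f"ran {command}"


class Harness:
    """The 'brain' loop: keeps no state of its own beyond the session."""

    def __init__(self, session: Session, sandbox: Sandbox):
        self.session = session
        self.sandbox = sandbox

    def step(self, command: str) -> None:
        result = self.sandbox.run(command)
        self.session.append({"command": command, "result": result})

    def completed_steps(self) -> int:
        return len(self.session.events)


session = Session()
first = Harness(session, Sandbox())
first.step("ls")
del first  # simulate a harness crash

# A replacement harness resumes from the events already recorded.
second = Harness(session, Sandbox())
second.step("pytest")
print(second.completed_steps())
```

Because the harness holds nothing that is not also in the session log, replacing it loses no context. Keeping credentials out of `Sandbox` entirely mirrors the isolation idea, although a real system would need an explicit mechanism for authenticated tool calls.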

🔗 Scaling Managed Agents

Google Gemini

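Gemini CLI v0.37.0: dynamic sandbox, Chapters, and persistent browser

8 April 2026: Version 0.37.0 of Gemini CLI brings three improvements to developer workflows:

Feature	Description
Dynamic Sandbox Expansion	Dynamic sandbox expansion, plus worktree support on Linux and Windows
Chapters (Narrative Flow)	Groups tool use by topic into “chapters” for better session structure
Advanced Browser Capabilities	Persistent browser sessions and dynamic tool discovery in the browser agent

Chapters bring narrative continuity to long sessions: each group of actions forms a “chapter” with its own logic, which makes complex sessions easier to follow and resume. The browser agent also becomes more persistent: sessions stay alive between invocations, and available tools are discovered dynamically.

🔗 Gemini CLI changelog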

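Interactive simulations and 3D models in the Gemini App

9 April 2026: The Gemini App can now turn complex concepts into interactive visualizations directly in the chat. Users can adjust physical parameters (speed, gravity, mass) in real time and see the effect on a running simulation, such as lunar orbits, molecular rotation, or dynamical systems. Available worldwide by selecting the Pro model in the prompt bar. Not yet available for Education and Workspace accounts.

🔗 3D simulations in the Gemini App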

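Gemini Notebooks: synced with NotebookLM

8 April 2026: Google introduces Notebooks in the Gemini App: persistent workspaces for complex projects that sync between the Gemini App and NotebookLM. A notebook organizes conversations, custom instructions, and files (documents, PDFs). Sources added in the Gemini App automatically appear in NotebookLM, and vice versa. Rolling out this week on the web to Google AI Ultra, Pro, and Plus subscribers. Mobile and free access are coming soon.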

🔗 Gemini Notebooks + NotebookLM

OpenAI

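New Pro tier at $100/month: 5x the Codex usage

9 April 2026: OpenAI launches a new Pro tier at $100/month, positioned between the Plus tier (about $20/month) and the existing $200/month Pro tier. It offers five times the Codex usage of the Plus tier and is designed for long, intensive sessions.

Tier	Price	Codex usage
Plus	~$20/month	Standard
Pro (new)	$100/month	5× Plus
Pro (existing)	$200/month	Maximum

Alongside this, OpenAI extends the 2x Codex usage promotion for existing $200/month subscribers until 31 May 2026 and resets their rate limits.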

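“The next phase of enterprise AI”: Denise Dresser’s note

8 April 2026: Chief Revenue Officer Denise Dresser publishes a review after 90 days in the role. Key points: enterprise now accounts for more than 40% of revenue (and is on track to reach parity with consumer by the end of 2026), Codex has more than 3 million weekly active users (a 5x increase since the start of 2026), and ChatGPT has reached 900 million weekly active users. OpenAI sets out two strategic directions: OpenAI Frontier (agents that work across enterprise systems) and a unified AI superapp for teams.

🔗 The next phase of enterprise AI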

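Safety Fellowship and Child Safety Blueprint

6 to 8 April 2026: OpenAI announces two safety initiatives. The OpenAI Safety Fellowship (6 April) is a program for external researchers, open for applications until 3 May 2026, that provides funding in the form of compute resources for research on evaluation, robustness, and agent oversight. It runs from 14 September 2026 to 5 February 2027. The Child Safety Blueprint (8 April) proposes a framework for using AI to combat child exploitation, developed with NCMEC, Thorn, and the Attorney General Alliance, around three axes: modernizing laws to cover AI-generated CSAM, improving reporting mechanisms, and building in safety from the design stage (safety-by-design).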

🔗 Safety Fellowship · Child Safety Blueprint

GitHub Copilot

OWASP Top 10 security scanning from the terminal

9 April 2026: GitHub Copilot CLI integrates an automated security workflow directly from the terminal. In just a few commands, developers can run a full scan of their repository, map the results to OWASP Top 10 categories, and automatically open a GitHub issue for each vulnerability detected, all without leaving the CLI. A direct complement to Project Glasswing for teams that already use GitHub tooling.
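The mapping step in such a workflow can be pictured with a short sketch. It does not reproduce Copilot CLI’s behavior or output format: the finding records, the `group_by_owasp` and `issue_title` helpers, and the issue-title format are assumptions for illustration. The CWE-to-category table is a small excerpt based on the 2021 edition of the OWASP Top 10.

```python
from collections import defaultdict

# Partial CWE -> OWASP Top 10 (2021) mapping, for illustration only.
CWE_TO_OWASP = {
    "CWE-22": "A01:2021 Broken Access Control",
    "CWE-327": "A02:2021 Cryptographic Failures",
    "CWE-79": "A03:2021 Injection",
    "CWE-89": "A03:2021 Injection",
    "CWE-798": "A07:2021 Identification and Authentication Failures",
    "CWE-918": "A10:2021 Server-Side Request Forgery (SSRF)",
}


def group_by_owasp(findings):
    """Group findings by OWASP category; unknown CWEs go to 'Unmapped'."""
    grouped = defaultdict(list)
    for finding in findings:
        category = CWE_TO_OWASP.get(finding["cwe"], "Unmapped")
        grouped[category].append(finding)
    return dict(grouped)


def issue_title(category, finding):
    """Build the title of the issue that would be opened for one finding."""
    return f"[{category.split()[0]}] {finding['cwe']} in {finding['file']}"


# Hypothetical scanner output.
findings = [
    {"cwe": "CWE-89", "file": "db/query.py"},
    {"cwe": "CWE-79", "file": "web/render.py"},
    {"cwe": "CWE-798", "file": "config/settings.py"},
]

grouped = group_by_owasp(findings)
for category, items in grouped.items():
    for item in items:
        print(issue_title(category, item))
```

Grouping before filing makes it easy to see which categories dominate a repository, and the category code in each title keeps the resulting issues searchable.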

“Rubber Duck” agent — automated review

8 April 2026 — The GitHub Research team releases a “Rubber Duck” agent for the Copilot CLI. Inspired by the classic debugging technique (explaining your code out loud to find problems), the agent automatically analyzes submitted code and produces a structured review directly in the terminal. Experimental but officially reposted by @github.

🔗 Tweet Copilot CLI OWASP · Tweet Rubber Duck

Perplexity

Plaid integration — personal finances in Perplexity Computer

9 April 2026 — Perplexity launches an integration with Plaid, allowing users to link their bank accounts, credit cards, and loans directly in Perplexity Computer. Access is read-only — the data does not pass through Perplexity’s servers. The Plaid network covers more than 12,000 institutions (Chase, Fidelity, Vanguard, Robinhood, etc.). Use cases: net worth calculation, budget tracking, debt repayment planner, retirement projection. Available on desktop in the United States and Canada.

Tier	Features
Standard	Link portfolio, basic Portfolio access
Pro / Max	Advanced analytics, interactive dashboards

Billion Dollar Build — startup competition

8 April 2026: Perplexity launches the “Billion Dollar Build,” an 8-week competition where teams use Perplexity Computer to build a company on a path toward a $1 billion valuation. Finalist rewards: up to $1 million in investment from the Perplexity Fund, plus up to $1 million in Computer credits.

🔗 Plaid Integration Blog · Tweet Billion Dollar Build

Agents and tools

Manus integrates into Slack — three modes

6 April 2026: Manus (now affiliated with Meta) launches a complete suite of Slack integrations built around three modes: a DM agent with persistent memory for personal tasks, an in-channel @manus mention for team tasks (no persistent memory; each thread is a new task), and an MCP connector to automate reports and summaries on your behalf from manus.im. Available on paid Slack plans.

🔗 Manus for Slack Blog

Genspark AI Workspace 4.0 — Claw Desktop, Office plugins

8 April 2026 — Genspark launches version 4.0 of its AI workspace with four components: Claw for Desktop (Computer Use and Browser Use to control the computer), Microsoft Office plugins for PowerPoint, Excel, and Word, Speakly (real-time translation and meeting note-taking), and Advanced Workflows on a new OpenCode engine.

🔗 Genspark Blog

Generative media and hardware

Stability AI Brand Studio — creative platform for brands

8 April 2026 — Stability AI launches Brand Studio, a complete creative production platform designed for enterprise marketing teams. At the heart of the system is the Brand Central Hub: Brand ID models trained on a brand’s visual elements (photographic style, palette, patterns, logo placement). Producer Mode turns a description into a structured production plan and executes it automatically step by step. Curated Model Routing intelligently selects the most suitable model among Stability AI and third-party offerings (including Seedream and Nano Banana). Enterprise features: SSO, role-based access controls, approval workflows. Launch partner: the creative agency Huge. Availability: Core plan (free trial) + Enterprise plan.

🔗 Brand Studio by Stability AI

NVIDIA — National Robotics Week

9 April 2026 — For National Robotics Week, NVIDIA publishes a resource article on its Physical AI technologies: NVIDIA Cosmos (world foundation models), Isaac Sim (simulation), Jetson lineup (edge AI), Nemotron and NemoClaw (open source). No new hardware announcement — an educational overview of NVIDIA’s robotics ecosystem for developers.

🔗 NVIDIA Robotics Week

Claude Code — v2.1.94 / v2.1.96 / v2.1.97 updates

Three new releases published during the week.

Version	Date	Key points
v2.1.94	5-6 Apr.	Bedrock support powered by Mantle, default `high` effort for API-key/Bedrock/Enterprise, compact display of Slack MCP links
v2.1.96	7 Apr.	Bedrock regression fix: `403 "Authorization header is missing"` error with `AWS_BEARER_TOKEN_BEDROCK`
v2.1.97	8-9 Apr.	`Ctrl+O` focus view toggle in NO_FLICKER mode, `refreshInterval` parameter for status line, `● N running` indicator in `/agents`, Cedar syntax highlighting

v2.1.94 also introduces a notable behavior change: skills plugins declared via "skills": ["./"] now use the name field from the frontmatter rather than the directory name. v2.1.97 fixes several Bash permission issues (environment variable prefixes, network redirects) and a bug where permission rules whose name matched a JavaScript prototype property (toString, etc.) were silently ignored in settings.json.
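The naming change can be pictured with a small resolver. This is a sketch, not Claude Code’s implementation: the helper names are hypothetical, the frontmatter parser handles only simple `key: value` lines, and the fallback to the directory name when no `name` field exists is an assumption rather than documented behavior.

```python
from pathlib import PurePosixPath


def frontmatter_name(skill_md: str):
    """Return the `name` field from a leading '---' frontmatter block, if any."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")
        if sep and key.strip() == "name":
            return value.strip() or None
    return None


def skill_name(skill_dir: str, skill_md: str) -> str:
    """Prefer the frontmatter name; fall back to the directory name (assumed)."""
    return frontmatter_name(skill_md) or PurePosixPath(skill_dir).name


doc = "---\nname: release-notes\ndescription: Draft release notes\n---\nBody"
print(skill_name("plugins/my-plugin", doc))
```

With this ordering, renaming the plugin directory no longer changes the skill’s visible name as long as the frontmatter is left alone.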

🔗 Claude Code CHANGELOG

What this means

The week of 5 to 9 April 2026 marks an acceleration in two intersecting directions. On one side, Anthropic is moving beyond a consumer-product logic and into a critical infrastructure logic: Project Glasswing and the Google/Broadcom partnership signal that Anthropic is positioning itself as a provider of AI capabilities at the level of the global tech ecosystem, and not just as a competitor in the benchmark race. The $100 million commitment in credits and $30 billion in annualized revenue reinforce that reading.

On the other side, Meta makes its return with Muse Spark by breaking with its open-weights policy. This is a significant strategic shift: Meta chooses to compete in the closed frontier segment rather than maintain its open source positioning. The creation of Meta Superintelligence Labs and the first non-Llama model signal a deep reorientation of the group’s AI strategy.

For developers, the week is dense but coherent: Gemini CLI gains persistence and structure, GitHub Copilot expands its security scope, Perplexity pushes toward personal data, and Claude Code continues its rapid update cycle.

Sources

This document was translated from the fr version into zh using the gpt-5.4-mini model. For more information on the translation process, see https://gitlab.com/jls42/ai-powered-markdown-translator