Introduction
Last week the open-weights story hinted that the model itself is becoming a commodity. This week proved it by omission: almost nothing that mattered was a model release. The contest has quietly moved to every layer wrapped around the model — the silicon underneath, the capital behind the silicon, the orchestration bolted on top, the security boundary nobody was guarding, the regulators circling the cashier, and the human still sitting in the chair.
If you only watch the leaderboard, this was a dull week. If you watch where the leverage actually sits, it was one of the busier ones I have tracked.
I will take them in the order they landed.
In this issue:
- Qualcomm Circles Tenstorrent — RISC-V Takes a Run at Nvidia
- China Drafts a $295bn AI Build-Out — Without Nvidia
- OpenRouter Fusion — A Panel of Cheap Models Beats the Frontier
- Agentjacking — Your Coding Agent Is the Attack Surface
- 42 State Attorneys General Subpoena OpenAI
- Anthropic: 400k Sessions Say Domain Expertise Beats Coding Skill
Silicon and Compute: The Nvidia Monoculture Under Pressure
1. Qualcomm Circles Tenstorrent — RISC-V Takes a Run at Nvidia
Qualcomm is in advanced talks to acquire Tenstorrent, the AI-accelerator startup run by Jim Keller, for somewhere between $8bn and $10bn, first reported by The Information and confirmed by Reuters on 15 June. For the uninitiated, Keller is the architect behind AMD’s K8 and Zen, Apple’s A4 and A5, and Tesla’s self-driving computer — the man tends to show up shortly before an incumbent’s lunch gets eaten. Tenstorrent’s pitch is the part worth registering: AI accelerators and data-centre CPU cores built on the open RISC-V instruction set, with its own open compiler stack, rather than on Nvidia’s proprietary CUDA. The hardware undercuts the incumbents on cost as deliberately as the software does on lock-in — its Blackhole parts skip expensive High Bandwidth Memory (HBM) in favour of commodity GDDR6 paired with a large pool of on-chip SRAM, and scale out over standard Ethernet rather than a proprietary interconnect. Cheaper memory plus an open toolchain is a coherent bet, not two unrelated ones. By 24 June, TechTimes was framing Qualcomm’s wider commitment to the RISC-V and open-compiler bet at around $14bn, with Intel reportedly having circled the same target earlier in the year.
The strategic logic is not subtle. Nvidia’s moat has never been only the silicon; it is CUDA, the software layer that everyone’s training and inference pipelines are quietly welded to. Buying a RISC-V team with its own compiler is an attempt to attack that moat where it is softest — the toolchain — rather than trying to out-fab the fab. Think of it as challenging a railway not by laying faster track but by changing the gauge.
Why this matters
I have lost count of the times “Nvidia alternative” has meant “a chip that benchmarks well and a software stack you will spend six months fighting.” The reason this one is worth a second look is that Qualcomm is reportedly paying for the compiler, not just the cores — and the compiler is where these deals usually go to die. For developers, the near-term effect is precisely nothing: you are not porting your CUDA pipeline to RISC-V this quarter, and pretending otherwise would be daft. The medium-term effect is the one I care about. Every credible second source of training and inference silicon weakens the allocation queue that currently dictates your capacity and your inference bill. I would not buy the hype that the monoculture ends in 2026. I would note that this is the first time in a while that the challenge is aimed at the lock-in rather than the transistor count, and that is the more dangerous place to aim.
Capital and Geopolitics: Sovereign Silicon
2. China Drafts a $295bn AI Build-Out — Without Nvidia
China's $295 billion plan for a national data-centre grid on domestic silicon — TechRadar, June 2026
China is preparing to spend roughly 2 trillion yuan — about $295bn — over five years building a national network of interconnected computing hubs, with the National Development and Reform Commission drafting the blueprint and state firms such as China Mobile and China Telecom operating the bulk of the capacity, linked into a single computing grid by 2028. The detail that turned this from an infrastructure headline into a strategic one circulated over the weekend of 20–22 June: the plan reportedly mandates that at least 80% of the technology, AI chips very much included, come from domestic suppliers such as Huawei — squeezing Nvidia and AMD out of the largest single buildout on the board. Fold in power-grid integration and the headline figure climbs toward $740bn.
For scale, $295bn over five years is roughly $59bn a year — modest next to the $725bn US hyperscalers are reportedly setting aside for AI this year alone. The difference is not the headline number but the kind of money it is: state capital with a 2028 interconnection deadline and a tolerance for losses that no listed company could defend to its shareholders.
Why this matters
The phrase I keep coming back to is “sovereign stack” — and unlike most slogans, this one is being poured in concrete. The signal for the rest of us is not that China is spending; it is that a top-three compute market has decided its AI infrastructure must run on domestic silicon as a matter of policy, not price. That accelerates a bifurcation that was already under way: two parallel hardware ecosystems, two toolchains, two sets of optimisation targets. If you build or ship models internationally, “which silicon is this customer legally allowed to run on?” is migrating from a procurement footnote to an architectural constraint. Paired with the Qualcomm move above, the through-line is hard to miss: 2026 is the year the industry stopped assuming one company’s chips are the substrate everything else is built on. Healthy for competition, genuinely awkward for anyone maintaining a single deployment target.
Developer Tooling: Orchestration Over Models
3. OpenRouter Fusion — A Panel of Cheap Models Beats the Frontier
Surpassing frontier performance with Fusion — OpenRouter Blog, 2026
OpenRouter’s Fusion — a feature that fans a prompt out across a panel of models, then uses a judge model to reconcile agreements and disagreements into one answer — moved into broader access this month, and the timing did the marketing for it. The reconciliation is more than a majority vote: the judge produces a structured pass over the panel’s responses — mapping where they agree, where they contradict each other, where coverage is only partial, and which insights are unique to a single model — before it drafts the final answer. That explicit cross-examination is what lets the panel surface a correct minority view instead of drowning it. With Anthropic’s Fable 5 still switched off under last week’s export-control directive, teams went looking for a workaround, and Fusion’s numbers gave them one. On 100 deep-research tasks from the DRACO benchmark, a panel of Fable 5 and GPT-5.5 scored 69.0%, beating Fable 5 alone at 65.3%. More to the point for anyone watching costs: a budget panel of Gemini 3 Flash, Kimi K2.6 and DeepSeek V4 Pro hit 64.7% — within a point of solo Fable 5, and ahead of both GPT-5.5 and Opus 4.8 individually, at roughly half the price.
The mechanism is older than it looks: this is ensemble methods, the trick that quietly won Kaggle competitions for a decade, dressed up for the LLM era. Three opinionated middleweights and a referee, it turns out, can collectively out-argue a single heavyweight — and you do not need access to the heavyweight at all.
Why this matters
I find this the most quietly subversive item of the week. The industry has spent two years training us to ask “which is the best model?” as if it were a single fact with a single answer. Fusion’s results suggest that for a meaningful class of tasks the better question is “which panel?”, and that the panel can be assembled entirely from models you can actually get hold of. That reframes the Fable 5 outage from a crisis into an inconvenience, which is exactly the kind of resilience I was banging on about last week. The honest caveats are real: synthesis adds latency and token cost, the judge model becomes a new single point of failure, and a panel that agrees confidently and wrongly is its own hazard. But as an architectural pattern — route, diversify, reconcile — this is the direction serious production systems are already drifting, and Fusion just made the case in benchmarks rather than blog posts.
Security: The Agentic Attack Surface
4. Agentjacking — Your Coding Agent Is the Attack Surface
Agentjacking attack tricks AI coding agents into running malicious code — The Hacker News, June 2026
One fake bug report hijacked a $250B company's AI agent — Tenet Security, June 2026
Research note: Agentjacking — MCP Sentry injection — Cloud Security Alliance, 12 June 2026
Agentjacking is the first attack class I have seen that is purpose-built for the way we now write code. Researchers at Tenet Security showed that an attacker armed with nothing more than a public Sentry DSN — a write-only credential you can scrape from browser JavaScript or a GitHub search — can POST a synthetic error event whose body is laced with markdown: a plausible-looking heading such as ## Resolution, followed by a code block of “fix” commands. The mechanics matter, because this is where the trust boundary fails. When an AI coding agent such as Claude Code, Cursor or Codex pulls that error in over MCP to help you debug, current implementations hand the markdown straight back as structured tool output, so the agent parses the attacker’s ## Resolution block as authoritative diagnostic guidance rather than untrusted telemetry — and runs the commands, with your system privileges, against your environment variables, Git credentials and private repository URLs. Controlled testing reported an 85% success rate, and Tenet identified at least 2,388 organisations with exposed DSNs. Sentry acknowledged the disclosure on 3 June but declined a platform-level fix, calling it “technically not defensible” at their layer — which is a polite way of saying this one is yours to handle.
The exploit is almost elegant in how it weaponises a habit. We have trained ourselves to trust the agent: when Claude Code says “run this,” we run it. Agentjacking does not break the model — it borrows your trust in it.
Why this matters
This is the item to read twice. The classic prompt-injection demos always felt slightly academic — a clever party trick on a chatbot. This is not that. This is remote code execution on developer machines, achieved through a data source — your own error tracker — that every instinct tells you to trust, with an 85% hit rate and a vendor who has openly declined to fix it. The uncomfortable truth it exposes is that the agentic coding stack treats “data the agent reads” and “instructions the agent follows” as the same channel, and until that boundary is enforced, every tool you wire into an autonomous agent is a potential injection point. The practical mitigation is unglamorous and immediate: treat error-tracker output, and frankly any tool output, as untrusted input, and keep a human review step between an agent reading something and an agent acting on it. If you have given a coding agent broad permissions and not thought about this, today is the day to. I expect Agentjacking to be the first of a genre, not a one-off.
Governance and Oversight
5. 42 State Attorneys General Subpoena OpenAI
OpenAI faces investigation from state attorneys general — TechCrunch, 13 June 2026
A coalition of 42 state attorneys general has opened a formal investigation into OpenAI, with New York’s Letitia James serving a subpoena on the group’s behalf on 12 June. The demands are broad: advertising claims, user engagement and retention design, consumer and health-data handling, treatment of minors and seniors, internal policies, and — the one that should make every model builder sit up — the “behavioural properties” of the models themselves, naming sycophancy explicitly. Sycophancy, the tendency of a model to tell you what you want to hear rather than what is true, is a documented side-effect of reinforcement learning from human feedback. It has just become a subject of legal discovery. The timing is not coincidental: OpenAI filed a confidential S-1 on 8 June ahead of a public listing analysts peg near or above a trillion dollars, and the subpoena landed four days later.
Why this matters
I will keep this one grounded, because it is easy to over-read a subpoena. The development I find genuinely novel is not the antitrust-adjacent scrutiny — that was always coming for a company this size — but that a model’s training-induced behaviour is now named in a legal instrument. For those of us who build on these APIs, that is a small but real shift: characteristics we have treated as quirks to prompt around (sycophancy, over-confident agreement, the model’s eagerness to please) are being reframed as product-safety properties an attorney general can demand records about. If that framing holds, “alignment” stops being a research word and starts being a compliance one, and the pressure flows downstream to anyone shipping a product on top. None of this resolves quickly. But if you have ever shrugged off your model agreeing with a user a little too readily, a coalition of 42 states has just decided that habit is worth a closer look.
Research and Practice: The Human in the Loop
6. Anthropic: 400k Sessions Say Domain Expertise Beats Coding Skill
Agentic coding and persistent returns to expertise — Anthropic, June 2026
Anthropic analysed roughly 400,000 Claude Code sessions from about 235,000 people between October 2025 and April 2026, and the headline finding is a useful corrective to the prevailing panic. Success was driven by domain expertise, not coding background: across occupations, non-engineers who deeply understood their problem succeeded at nearly the same rate as software engineers. In a typical session people made most of the planning decisions — what to do — while Claude made most of the execution decisions — how to do it. The numbers that make the gap concrete:
- Verified success rose with expertise, climbing from about 15% for novices to 28–33% for intermediate and expert users.
- Experts extracted far more per prompt — roughly 12 agent actions and 3,200 words of output per instruction, against 5 actions and 600 words for novices.
- The work itself shifted — over the seven months the share of time spent debugging fell by nearly half, as usage moved from fixing code toward end-to-end agentic operation.
It is the most reassuring thing I read all week, and also the most quietly demanding. The bottleneck moved, but it did not disappear — it moved into your head.
Why this matters
I have watched a lot of people conclude that agentic coding makes expertise obsolete, and this is the cleanest evidence yet that they have it backwards. The model will happily generate plausible code all day; what it cannot supply is knowing whether the thing you asked for is the thing you actually needed. That judgement — the domain knowledge, the awareness of the constraints that never make it into the prompt — is precisely what the data says still separates a successful session from a confident failure. For practitioners, the practical takeaway is concrete and slightly counter-intuitive: if your team’s Claude Code adoption is underwhelming, the fix is probably not a better model or a cleverer prompt, it is putting someone who genuinely understands the domain in the review chair. The person who knows what “correct” looks like has become more valuable, not less. Which, after a week spent watching the industry contest silicon, capital and security below the model, is a rather neat place to land: the scarcest component in the stack is still the one sitting in the chair.
Closing Thoughts
Qualcomm’s pursuit of a RISC-V toolchain and China’s $295bn domestic-silicon mandate both took aim at the assumption that one company’s chips are the substrate everything sits on; OpenRouter’s Fusion suggested the best answer is increasingly a panel rather than a single model; Agentjacking demonstrated that the agentic stack still confuses data with instructions in a way attackers can exploit at scale; the 42-state subpoena dragged a model’s trained behaviour into legal discovery; and Anthropic’s 400,000 sessions reminded us that the human who understands the problem remains the deciding factor. Read together, the week says the action has moved decisively below and around the model — into the silicon, the capital, the orchestration, the trust boundary and the person — which is exactly where it goes once a technology stops being magic and starts being infrastructure. Let me know what you think.
References
- Qualcomm mulls taking over Jim Keller’s Tenstorrent — Tom’s Hardware
- Qualcomm bets $14 billion on cracking Nvidia’s AI monopoly with RISC-V and an open compiler — TechTimes
- China drafts $295 billion plan to build a national AI data-centre grid running on 80% domestic chips — Tom’s Hardware
- China’s $295 billion plan for a national data-centre grid on domestic silicon — TechRadar
- Surpassing frontier performance with Fusion — OpenRouter Blog
- OpenRouter’s Fusion promises Claude Fable-level AI for cheap — Decrypt
- Agentjacking attack tricks AI coding agents into running malicious code — The Hacker News
- One fake bug report hijacked a $250B company’s AI agent — Tenet Security
- Research note: Agentjacking — MCP Sentry injection — Cloud Security Alliance
- A public Sentry key is all it takes to hijack Claude Code, Cursor, and Codex — The New Stack
- OpenAI faces investigation from state attorneys general — TechCrunch
- OpenAI hit with sweeping probe from a coalition of 42 US state attorneys general — Tom’s Hardware
- Agentic coding and persistent returns to expertise — Anthropic
Enjoyed this? Get more like it.
Weekly notes on AI tools, Python, and what I'm actually building — plus a free copy of Fantastic AI: The 2026 Toolkit.