Introduction
This week brought three AI developments worth your attention.
TL;DR
- Agents can now use UIs reliably enough for real work.
- Security gets a detect → patch → PR loop, not just linting.
- 6 GW of GPUs means cheaper, faster AI—if power & cooling keep up.
First, agents learned to operate software interfaces visually—no API required. Second, security got an automated teammate that hunts vulnerabilities and proposes fixes. Third, OpenAI locked in massive compute capacity that will make advanced AI cheaper and more accessible.
I’ll explain what happened, why it matters, and what you can do with it. No fluff. Just the useful bits.
1. Google launches Gemini 2.5 “Computer Use”
Released: Oct 7, 2025 (preview) [1]
Google released a Gemini 2.5 capability that actually uses computers the way you and I do. It sees the screen, clicks buttons, fills forms, scrolls pages, and completes multi-step tasks with safety rails. Google reports state-of-the-art results on browser/mobile UI control and is making it available via the Gemini API. [1]
Is this truly new?
- Concept: not new—OpenAI showed a “computer-using agent”/Operator earlier in Jan 2025. [2, 3]
- What’s new now: Google’s public preview focused on browser control, with benchmarks and an API path. [1]
Scope differences (this week): Google’s preview targets browser actions (no broad OS/file access), whereas OpenAI has showcased agents with a broader virtual computer concept. [1, 2, 3, 4]
New API access + better reported benchmark scores make this practical for teams who struggled with brittle RPA/DOM scripts. [1]
RPA = Robotic Process Automation.
In plain English: it’s software that mimics what a person does on a computer—clicking buttons, filling forms, copying data between apps—to automate repetitive, rule-based tasks. No physical robots; just “screen robots” (scripts/bots).
What the brittle code looks like:

```python
# Clicks the first button in the third column... until the layout changes.
page.click("//div[3]//button[1]")
```

Less brittle:

```python
# Stable, semantic hooks: data attributes / ARIA roles.
page.click("[data-action='checkout']")  # your app adds this attribute
# or
page.get_by_role("button", name="Checkout")
```
Most automation breaks when the website changes. Vision-based agents adapt like humans do. That’s the difference between brittle scripts and robust helpers.
Implementation Strategy: Visual Agents
| Builder Action | Implementation Detail |
|---|---|
| Add Stable UX Hooks | Inject data-action="pay" or data-role="primary-cta" onto key buttons to provide reliable semantic selection for the vision model. |
| Keep Agents On-Rails | Strictly enforce allow-list domains and hard step caps (e.g., maximum of 12 steps per execution loop) to prevent runaway processes. |
| Enforce Idempotency | Log all state-mutating actions with idempotency keys at the API level to absolutely prevent double-purchases or duplicate records if the agent retries. |
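The three guardrails in the table above (semantic hooks aside) can be sketched in a few lines. This is an illustrative sketch only: `run_agent`, `perform`, and the action-dict shape are hypothetical stand-ins, not part of the Gemini API.

```python
import uuid
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"shop.example.com"}  # allow-list of permitted domains
MAX_STEPS = 12                          # hard step cap per execution loop

executed = []  # stand-in for the real executor's audit log

def perform(action):
    """Hypothetical executor: record the action as performed."""
    executed.append(action["name"])

def run_agent(actions):
    """Run agent-proposed actions with allow-list, step-cap, and idempotency checks."""
    seen_keys = set()
    for step, action in enumerate(actions, start=1):
        if step > MAX_STEPS:
            raise RuntimeError("step cap exceeded; aborting run")
        domain = urlparse(action["url"]).netloc
        if domain not in ALLOWED_DOMAINS:
            raise PermissionError(f"blocked domain: {domain}")
        # State-mutating actions carry an idempotency key; retries are skipped.
        if action.get("mutates_state"):
            key = action.setdefault("idempotency_key", str(uuid.uuid4()))
            if key in seen_keys:
                continue
            seen_keys.add(key)
        perform(action)

run_agent([
    {"name": "open_cart", "url": "https://shop.example.com/cart"},
    {"name": "checkout", "url": "https://shop.example.com/pay",
     "mutates_state": True, "idempotency_key": "order-42"},
    {"name": "checkout", "url": "https://shop.example.com/pay",
     "mutates_state": True, "idempotency_key": "order-42"},  # agent retry: skipped
])
print(executed)  # ['open_cart', 'checkout']
```

The idempotency key is the piece teams most often skip: enforce it at the API layer, not in the agent, so a retried "checkout" is a no-op rather than a double charge.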
2. DeepMind unveils CodeMender
Published: Oct 2025 (blog & early results) [5]
What happened. DeepMind introduced CodeMender, an AI agent that hunts for bugs and fixes them automatically. It combines fuzzing, static analysis, differential testing, and LLM reasoning to spot vulnerabilities and propose patches. In early trials it submitted dozens of fixes to real OSS projects (with human review). [5]
This goes beyond “AI code suggestions.” It’s continuous security maintenance: detect risky patterns → propose fixes → open PRs → harden codebases over time.
Example.
Unsafe buffer handling in an image library is flagged; the agent proposes a safe rewrite, runs tests, then opens a PR with a clear diff and rationale.
Security debt compounds silently. An agent that finds and fixes vulnerabilities continuously? That’s not just helpful—it’s necessary.
Implementation Strategy: Automated Security
| Builder Action | Implementation Detail |
|---|---|
| Baseline Core Libraries | Begin deployment strictly on your top 3 internal libraries. Baseline your MTTR (mean time to repair) prior to implementation to measure hard ROI. |
| Enforce Human Gates | Strictly require human code-review and passing smoke tests on all auto-patch Pull Requests before merging. Do not allow autonomous merging to main. |
| Track Security Metrics | Quantify the agent’s value by tracking “vulns prevented / 1k LOC changed” on a monthly rolling basis. |
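The metric in the last row is cheap to compute once you log vulnerabilities prevented and lines changed per month. A minimal sketch, with an assumed record shape:

```python
def vulns_per_kloc(vulns_prevented: int, loc_changed: int) -> float:
    """Vulnerabilities prevented per 1,000 lines of code changed."""
    if loc_changed <= 0:
        raise ValueError("loc_changed must be positive")
    return vulns_prevented / (loc_changed / 1000)

# Monthly rolling window of (vulns_prevented, loc_changed) tuples (illustrative data).
monthly = [(4, 12_500), (7, 9_800), (3, 4_200)]
rates = [round(vulns_per_kloc(v, loc), 2) for v, loc in monthly]
print(rates)  # [0.32, 0.71, 0.71]
```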
3. OpenAI and AMD: 6 gigawatts of AI compute
Announced: Oct 2025 (multi-year partnership; first 1 GW planned for 2H 2026 with MI450) [6]
What happened. OpenAI and AMD signed a deal for up to 6 GW of AMD Instinct GPUs. It’s one of the largest AI compute build-outs announced to date, with milestone-linked warrants. [6]
Compute capacity is oxygen for AI. More capacity → longer training runs, better multimodal models, and cheaper inference—if power and cooling keep pace.
What this means for you.
- Expect faster rollouts of long-context, tool-using agents with planning and memory.
- Fewer waitlists and downward pressure on API prices as capacity comes online.
- But timelines will depend on siting, power, and networking readiness.
Raw GPU supply isn’t the only bottleneck anymore—**infrastructure** is. The best AI in the world is useless if you can’t power it.
The Weekly AI Delta
| Category | Previous State | New Development |
|---|---|---|
| Computer Use | The capability existed primarily in closed concepts (OpenAI Operator). | Google launched a broad public preview via Vertex with published benchmarks and an API path. |
| Security Agents | Static linters and passive LLM autocomplete suggestions. | CodeMender introduces an integrated detect → patch → PR loop validated against real OSS repositories. [5] |
| Compute Capacity | Ambiguous hyperscale build-outs. | OpenAI committed to up to 6 GW of AMD capacity mapped to a concrete MI450 timeline. [6] |
Quick comparison: Google vs. OpenAI (computer-using agents)
| Capability | Google Gemini 2.5 Computer Use | OpenAI Operator (concept) |
|---|---|---|
| Primary scope | Browser UI actions | Virtual computer + broader flows |
| Input signal | Visual/DOM + prompts | Visual/DOM + OS sandbox |
| Access model | API/Vertex preview | Limited demos/announcements |
| Guardrails focus | Step caps, allow-lists | Sandboxed VM + human reviews |
| Best fit (today) | Web workflows with flaky DOM | End-to-end app simulations |
| Maturity (this week) | New public preview | Earlier concept, evolving |
Risk Mitigation
| Domain | Limitation / Gotcha | Strategic Mitigation |
|---|---|---|
| Visual Agents | Cookie banners, captchas, MFA logic, and legal consent flows break autonomous agents. | These edge cases require manual product-level design, hardcoded overrides, or API bypasses. |
| CodeMender | Auto-generated security patches can inadvertently regress application performance. | You must run rigorous performance benchmarks in CI/CD alongside your standard security checks. |
| Compute Scale | Pledged 6 GW capacity does not immediately equal API availability. | Assume grid constraints and facility cooling delays will heavily dictate when token prices actually drop. |
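The CodeMender mitigation above (performance benchmarks alongside security checks) can be a small CI gate. A hedged sketch, assuming your CI can call a Python check; the budget and function names are illustrative:

```python
import time

REGRESSION_BUDGET = 1.10  # fail the auto-patch PR if >10% slower than baseline

def bench(fn, *args, repeats=5):
    """Best-of-N wall-clock time for fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best

def assert_no_regression(baseline_fn, patched_fn, *args):
    """Raise if the patched function exceeds the slowdown budget."""
    base = bench(baseline_fn, *args)
    patched = bench(patched_fn, *args)
    if patched > base * REGRESSION_BUDGET:
        raise AssertionError(f"perf regression: {patched / base:.2f}x baseline")
```

Best-of-N timing reduces scheduler noise; for hot paths, swap in a proper benchmark harness rather than wall-clock timing.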
Conclusion
So what changed this week? Agents got hands. Security got smarter. Compute got bigger.
Google’s computer-use model means automation can work wherever humans work—legacy systems, government portals, clunky interfaces—without waiting for APIs.[1]
DeepMind’s CodeMender shifts security from reactive firefighting to proactive maintenance. [5]
AMD’s 6-gigawatt deal with OpenAI signals more capacity and lower costs—if the infrastructure keeps pace. [6]
What to do now: Pilot visual agents in a safe sandbox, try security automation on your riskiest code, and design your stack for multi-provider LLM backends.
The tools are coming. Be ready to use them and have fun :)
Did you like this post? Please let me know if you have any comments or suggestions.
References
- [1] Google — Introducing the Gemini 2.5 Computer Use model
- [2] OpenAI — Computer-Using Agent (announcement page)
- [3] OpenAI — Introducing Operator
- [4] VentureBeat — Google’s AI can now surf the web, click buttons, and fill out forms
- [5] Google DeepMind — Introducing CodeMender: an AI agent for code security
- [6] AMD Investor Relations — AMD and OpenAI announce strategic partnership to deploy 6 GW of AMD GPUs