Introduction
This week brought three AI developments worth your attention.
TL;DR
- Agents can now use UIs reliably enough for real work.
- Security gets a detect → patch → PR loop, not just linting.
- 6 GW of GPUs means cheaper, faster AI—if power & cooling keep up.
First, agents learned to operate software interfaces visually—no API required. Second, security got an automated teammate that hunts vulnerabilities and proposes fixes. Third, OpenAI locked in massive compute capacity that will make advanced AI cheaper and more accessible.
I’ll explain what happened, why it matters, and what you can do with it. No fluff. Just the useful bits.
1. Google launches Gemini 2.5 “Computer Use”
Released: Oct 7, 2025 (preview) [1]
Google released a Gemini 2.5 capability that actually uses computers the way you and I do. It sees the screen, clicks buttons, fills forms, scrolls pages, and completes multi-step tasks with safety rails. Google reports state-of-the-art results on browser/mobile UI control and is making it available via the Gemini API. [1]
Is this truly new?
- Concept: not new. OpenAI showed a “computer-using agent” (Operator) back in Jan 2025. [2, 3]
- What’s new now: Google’s public preview focused on browser control, with benchmarks and an API path. [1]
Scope differences (this week): Google’s preview targets browser actions (no broad OS/file access), whereas OpenAI has showcased agents with a broader virtual-computer concept. [1, 2, 3, 4]
New API access + better reported benchmark scores make this practical for teams who struggled with brittle RPA/DOM scripts. [1]
RPA = Robotic Process Automation.
In plain English: it’s software that mimics what a person does on a computer—clicking buttons, filling forms, copying data between apps—to automate repetitive, rule-based tasks. No physical robots; just “screen robots” (scripts/bots).
What brittle code looks like:
# clicks the first button in the third column... until layout changes
page.click("//div[3]//button[1]")
Less brittle:
# stable, semantic hooks: data attributes / ARIA roles
page.click("[data-action='checkout']")  # your app adds this attribute
# or
page.get_by_role("button", name="Checkout").click()
Most automation breaks when the website changes. Vision-based agents adapt like humans do. That’s the difference between brittle scripts and robust helpers.
Action for builders
- Add stable UX hooks: data-action="pay" / data-role="primary-cta" on key buttons for reliable selection.
- Keep agents on rails: allow-list domains and cap steps (e.g., 12 steps).
- Log actions with idempotency keys to prevent double purchases (a minimal guardrail sketch follows).
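Here’s a minimal sketch of those three rails, assuming an agent loop you control. Every name below (ALLOWED_DOMAINS, guarded_step, the step cap) is illustrative, not part of the Gemini API:

```python
# Minimal guardrail sketch: allow-listed domains, a hard step cap,
# and idempotency keys to dedupe side effects. Illustrative only.
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"shop.example.com"}  # assumption: your approved domains
MAX_STEPS = 12                          # step cap, per the list above
_seen_keys: set[str] = set()            # in-memory dedupe; use a real store in production

def guarded_step(step_num: int, url: str, idempotency_key: str) -> bool:
    """Return True only if the agent may execute this action."""
    if step_num > MAX_STEPS:
        return False  # hard stop: protects against runaway loops
    if urlparse(url).hostname not in ALLOWED_DOMAINS:
        return False  # navigation outside the allow-list is blocked
    if idempotency_key in _seen_keys:
        return False  # duplicate side effect, e.g. a second "pay" click
    _seen_keys.add(idempotency_key)
    return True

# Example: the agent asks permission before each UI action it plans.
assert guarded_step(1, "https://shop.example.com/cart", "order-421-pay")
assert not guarded_step(2, "https://shop.example.com/cart", "order-421-pay")  # replay blocked
```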
2. DeepMind unveils CodeMender
Published: Oct 2025 (blog & early results) [5]
What happened. DeepMind introduced CodeMender, an AI agent that hunts for bugs and fixes them automatically. It combines fuzzing, static analysis, differential testing, and LLM reasoning to spot vulnerabilities and propose patches. In early trials it submitted dozens of fixes to real OSS projects (with human review). [5]
This goes beyond “AI code suggestions.” It’s continuous security maintenance: detect risky patterns → propose fixes → open PRs → harden codebases over time.
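As a rough sketch of that loop (every function below is a hypothetical stand-in; DeepMind hasn’t published CodeMender’s interfaces):

```python
# Illustrative detect → patch → PR loop. All functions are stand-ins,
# not CodeMender's actual components.
from dataclasses import dataclass

@dataclass
class Finding:
    file: str
    line: int
    issue: str

def detect(repo_path: str) -> list[Finding]:
    # Stand-in for fuzzing + static analysis + differential testing.
    return [Finding("lib/image.c", 120, "unchecked buffer length")]

def propose_patch(finding: Finding) -> str:
    # Stand-in for the LLM-reasoning step; returns a candidate fix.
    return f"bounded copy at {finding.file}:{finding.line}"

def tests_pass(patch: str) -> bool:
    return True  # stand-in for running the project's test suite

def open_pr(patch: str) -> None:
    print("PR opened for human review:", patch)

for finding in detect("."):
    patch = propose_patch(finding)
    if tests_pass(patch):
        open_pr(patch)  # a human still approves the merge
```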
Example.
Unsafe buffer handling in an image library is flagged; the agent proposes a safe rewrite, runs tests, then opens a PR with a clear diff and rationale.
Security debt compounds silently. An agent that finds and fixes vulnerabilities continuously? That’s not just helpful—it’s necessary.
Action for builders
- Start with your top 3 internal libraries; baseline MTTR (mean time to repair) and measure improvements.
- Require human review + smoke tests on all auto-patch PRs.
- Track “vulns prevented / 1k LOC changed” monthly (a small metrics sketch follows this list).
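A small sketch for computing those two numbers; the field names are assumptions, so wire it to whatever your issue tracker actually exports:

```python
# Hypothetical helpers for baseline MTTR and the vulns-per-kLOC rate above.
from datetime import datetime
from statistics import mean

def mttr_hours(tickets: list[dict]) -> float:
    """Mean time to repair, in hours, from detected_at/fixed_at pairs."""
    return mean(
        (t["fixed_at"] - t["detected_at"]).total_seconds() / 3600
        for t in tickets
    )

def vulns_per_kloc(vulns_prevented: int, loc_changed: int) -> float:
    """Vulnerabilities prevented per 1,000 lines of code changed."""
    return vulns_prevented / (loc_changed / 1000)

tickets = [
    {"detected_at": datetime(2025, 10, 1, 9), "fixed_at": datetime(2025, 10, 2, 15)},  # 30 h
    {"detected_at": datetime(2025, 10, 3, 10), "fixed_at": datetime(2025, 10, 3, 18)},  # 8 h
]
print(f"MTTR: {mttr_hours(tickets):.1f} h")            # MTTR: 19.0 h
print(f"Rate: {vulns_per_kloc(4, 12500):.2f} / kLOC")  # Rate: 0.32 / kLOC
```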
3. OpenAI and AMD: 6 gigawatts of AI compute
Announced: Oct 2025 (multi-year partnership; first 1 GW planned for 2H 2026 with MI450) [6]
What happened. OpenAI and AMD signed a deal for up to 6 GW of AMD Instinct GPUs. It’s one of the largest AI compute build-outs announced to date, with milestone-linked warrants. [6]
Compute capacity is oxygen for AI. More capacity → longer training runs, better multimodal models, and cheaper inference—if power and cooling keep pace.
What this means for you.
- Expect faster rollouts of long-context, tool-using agents with planning and memory.
- Fewer waitlists and downward pressure on API prices as capacity comes online.
- But timelines will depend on siting, power, and networking readiness.
Computing power isn’t the bottleneck anymore—**infrastructure** is. The best AI in the world is useless if you can’t power it.
What changed (this week vs. before)
- Computer Use: The capability existed (OpenAI Operator, Jan 2025). New: Google’s broader public preview + benchmarks + API path. [1–4]
- Security agents: Linters and LLM suggestions existed. New: an integrated detect → patch → PR loop validated on real OSS. [5]
- Compute: Hyperscale build-outs are ongoing. New: the size (6 GW) and explicit MI450 timeline. [6]
Quick comparison: Google vs. OpenAI (computer-using agents)
| Capability | Google Gemini 2.5 Computer Use | OpenAI Operator (concept) |
|---|---|---|
| Primary scope | Browser UI actions | Virtual computer + broader flows |
| Input signal | Visual/DOM + prompts | Visual/DOM + OS sandbox |
| Access model | API/Vertex preview | Limited demos/announcements |
| Guardrails focus | Step caps, allow-lists | Sandboxed VM + human reviews |
| Best fit (today) | Web workflows with flaky DOM | End-to-end app simulations |
| Maturity (this week) | New public preview | Earlier concept, evolving |
Limits & gotchas
- Agents: cookie banners, captchas, MFA, and legal consent flows still need product-level design and explicit handling.
- CodeMender: patches can regress performance; keep perf benchmarks in CI alongside security checks (see the sketch after this list).
- Compute: capacity ≠ availability; grid constraints and cooling determine how fast tokens actually get cheaper.
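For the performance point, a toy CI gate might look like this; the threshold and names are illustrative, not any specific CI product’s API:

```python
# Toy perf-regression gate: fail the build if a patched function runs
# noticeably slower than its recorded baseline. Illustrative only.
import time

def avg_runtime(fn, runs: int = 1000) -> float:
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs

def assert_no_regression(fn, baseline_s: float, tolerance: float = 0.20) -> None:
    observed = avg_runtime(fn)
    if observed > baseline_s * (1 + tolerance):
        raise AssertionError(
            f"perf regression: {observed:.2e}s vs baseline {baseline_s:.2e}s"
        )

# Example: guard a hot path touched by an auto-patch.
assert_no_regression(lambda: sorted(range(1000)), baseline_s=5e-5)
```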
Conclusion
So what changed this week? Agents got hands. Security got smarter. Compute got bigger.
Google’s computer-use model means automation can work wherever humans work—legacy systems, government portals, clunky interfaces—without waiting for APIs. [1]
DeepMind’s CodeMender shifts security from reactive firefighting to proactive maintenance. [5]
AMD’s 6-gigawatt deal with OpenAI signals more capacity and lower costs—if the infrastructure keeps pace. [6]
What to do now: Pilot visual agents in a safe sandbox, try security automation on your riskiest code, and design your stack for multi-provider LLM backends.
The tools are coming. Be ready to use them and have fun :)
Did you like this post? Please let me know if you have any comments or suggestions.
References
1. Google — Introducing the Gemini 2.5 Computer Use model
2. OpenAI — Computer-Using Agent (announcement page)
3. OpenAI — Introducing Operator
4. VentureBeat — Google’s AI can now surf the web, click buttons, and fill out forms
5. Google DeepMind — Introducing CodeMender: an AI agent for code security
6. AMD Investor Relations — AMD and OpenAI announce strategic partnership to deploy 6 GW of AMD GPUs