Elena's AI Blog

Elena Daehnhardt

Image credit: Illustration created with Midjourney, prompt by the author.
Image prompt: “An illustration representing cloud computing”

GEO & AI-Search Optimisation: Project Spec for edaehn.github.io

Goal: increase AI-search traffic and get cited as a reputable source by AI assistants (ChatGPT, Perplexity, Gemini, Google AI Overviews) while preserving human readability, existing SEO, the author’s voice, and all affiliate links.

This site is already well-optimised. This project fills specific gaps; it does not rebuild what exists. Work is split into three tracks because site-wide and per-post changes have different blast radius, validation, and cadence.


0. Folder access & global guardrails

Whole repo is accessible. Regardless of track, always:

  • Preserve voice. Do not rewrite sentences that already read well or are GEO-compliant.
  • Never change title, date, layout, image, thumb_image, published, or any affiliate frontmatter/links. Do not remove images or external URLs.
  • Commit hygiene. One reviewable commit per logical change; never stage unrelated files.
  • Git lock quirk: this repo’s .git blocks unlink. Before add/commit, clear locks with rename: for L in index.lock HEAD.lock; do [ -e ".git/$L" ] && mv ".git/$L" ".git/$L.stale.$$"; done. Commits succeed despite an “unable to unlink HEAD.lock” warning. Filter noise with grep -v tmp_obj.
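
For scripted runs, the same rename-instead-of-delete workaround can be sketched in Python. This is a minimal sketch, not part of the existing tooling: the function name clear_stale_locks is hypothetical, and it assumes the only locks that matter are the two named in the shell loop above.

```python
import os
from pathlib import Path


def clear_stale_locks(git_dir, names=("index.lock", "HEAD.lock")):
    """Rename leftover git lock files instead of deleting them.

    Mirrors the shell loop: mv ".git/$L" ".git/$L.stale.$$".
    Returns the new paths so the caller can log what was moved.
    """
    moved = []
    for name in names:
        lock = Path(git_dir) / name
        if lock.exists():
            # Rename rather than unlink, because unlink is what this repo blocks.
            target = lock.with_name(f"{name}.stale.{os.getpid()}")
            lock.rename(target)
            moved.append(target)
    return moved


if __name__ == "__main__":
    for path in clear_stale_locks(".git"):
        print(f"moved stale lock to {path}")
```

Call it right before git add or git commit; it does nothing when no lock is present.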

1. Current state — what already exists (do NOT rebuild)

Verified in the repo as of this spec:

  • Schema: _includes/meta_property.html emits per-post BlogPosting JSON-LD with author Person (sameAs: Twitter, GitHub, LinkedIn), publisher, image, datePublished/dateModified, keywords. _includes/faq_schema.html auto-emits FAQPage from the faq: frontmatter.
  • TL;DR: rendered via _includes/tldr.html (already on the page).
  • Series: /series/<slug>/ hub pages + prev/next nav via series.html, series-nav.html, series_list.html, related_from_series.html, linking_series_post.html; _data/series_upcoming.yml.
  • Author entity: _data/authors.yml (Elena + guests), author_bio.html, author_contact.html.
  • Internal linking: _scripts/deep_linking/ + _data/deep_linking.yml + _data/tag_taxonomy.yml.
  • Crawl/SEO files: Jekyll-generated sitemap.xml (respects noindex: true), robots.txt, feed.xml.
  • Per-post GEO pass: the existing scheduled task geo-optimize-posts (SKILL.md) handles anchors→IDs, heading rewrites, entity definitions, anti-patterns, and faq: blocks. ~40 oldest posts are done (ai_optimised: true).

Genuine gaps (the actual scope of this project)

  • No HowTo schema (high value for step-by-step posts: git, Flask, Docker, QR-code, OpenCV).
  • No BreadcrumbList schema.
  • No llms.txt.
  • robots.txt does not explicitly address AI crawlers; has Crawl-delay: 10 (may throttle them).
  • Author Person in schema is hard-coded in meta_property.html rather than driven by authors.yml, and lacks ORCID/Google-Scholar sameAs; no ProfilePage schema on /about.
  • Per-post: prose comparisons not yet tabularised; entity-name inconsistency (chatGPT/typos); image alt gaps; aged facts in older posts (GPT-3.5, Bard, deprecated APIs).
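
To see what the current robots.txt means for each AI crawler before any decision is made, a quick check with Python's standard urllib.robotparser may help. This is a sketch only: SAMPLE_ROBOTS is a hypothetical stand-in for the real file (only the Crawl-delay: 10 line comes from this spec), and crawler_report is an illustrative name.

```python
from urllib.robotparser import RobotFileParser

# Crawlers named in this spec.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"]

# Hypothetical robots.txt content; replace with the real file's text.
SAMPLE_ROBOTS = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def crawler_report(robots_txt, url="https://edaehn.github.io/"):
    """Return, per AI crawler, whether `url` may be fetched and the crawl delay that applies."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        agent: {
            "allowed": parser.can_fetch(agent, url),
            "crawl_delay": parser.crawl_delay(agent),
        }
        for agent in AI_CRAWLERS
    }


if __name__ == "__main__":
    for agent, info in crawler_report(SAMPLE_ROBOTS).items():
        print(agent, info)
```

With no crawler-specific group in the file, every agent falls back to the wildcard group, so the delay of 10 applies to all of them.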

2. Track A — Site-wide infrastructure (MANUAL · reviewed · build-tested)

May edit: _includes/, _layouts/, _data/, and root files (robots.txt, llms.txt, new pages). Cadence: a handful of discrete, individually reviewed commits up front. Mandatory validation for every change: run a local Jekyll build (bundle exec jekyll build), confirm no Liquid/YAML errors, and spot-check one rendered post before committing. Never batch.

Tasks (each = its own commit + review):

  1. HowTo schema include. Create _includes/howto_schema.html mirroring the faq_schema.html pattern: emit HowTo JSON-LD from a howto: frontmatter list (name + ordered steps). Wire it into meta_property.html behind a conditional, so it is emitted only on posts that define howto:. (This unblocks Track B item B5.)
  2. BreadcrumbList schema include (Home → Series → Post), emitted from existing series data.
  3. llms.txt at site root: short, curated map of canonical hub/pillar pages and series for AI crawlers. Keep it hand-maintained, not auto-dumped.
  4. robots.txt decision (needs Elena’s call): explicitly handle GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot. For GEO you generally want to allow citation crawlers; reconsider Crawl-delay: 10. Document the decision; do not change unilaterally.
  5. Author entity enrichment: drive the schema Person from _data/authors.yml; add ORCID / Google-Scholar to sameAs; add ProfilePage + Person JSON-LD to /about.
  6. (Optional) TechArticle type for code tutorials instead of generic BlogPosting.
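
The JSON-LD that tasks 1 and 2 should emit can be prototyped in Python before writing the Liquid includes. This is a sketch under assumptions: the howto: shape used here (a name plus a steps list of name/text pairs) is one possible frontmatter convention, not a decided one, and the URLs and titles in the usage section are hypothetical. The schema.org types and properties (HowTo, HowToStep, BreadcrumbList, ListItem, position, item) are the standard ones.

```python
import json


def build_howto_jsonld(howto):
    """Build a schema.org HowTo object from an assumed `howto:` frontmatter dict."""
    steps = [
        {"@type": "HowToStep", "position": i, "name": s["name"], "text": s["text"]}
        for i, s in enumerate(howto["steps"], start=1)
    ]
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": howto["name"],
        "step": steps,
    }


def build_breadcrumb_jsonld(trail):
    """Build a schema.org BreadcrumbList from (name, url) pairs ordered Home -> Series -> Post."""
    items = [
        {"@type": "ListItem", "position": i, "name": name, "item": url}
        for i, (name, url) in enumerate(trail, start=1)
    ]
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": items,
    }


if __name__ == "__main__":
    # Hypothetical post data, for illustration only.
    howto = {
        "name": "Example step-by-step post",
        "steps": [
            {"name": "First step", "text": "Do the first thing."},
            {"name": "Second step", "text": "Do the second thing."},
        ],
    }
    trail = [
        ("Home", "https://edaehn.github.io/"),
        ("Example series", "https://edaehn.github.io/series/example-series/"),
        ("Example post", "https://edaehn.github.io/example-post/"),
    ]
    print(json.dumps(build_howto_jsonld(howto), indent=2))
    print(json.dumps(build_breadcrumb_jsonld(trail), indent=2))
```

The printed objects are what the includes would place inside a script tag of type application/ld+json, which makes them a convenient reference when checking the rendered post.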

3. Track B — Per-post optimisation (AUTOMATABLE · batch · low-risk)

May edit: ONLY files in _posts/ (frontmatter + body). Never _includes/_layouts/_scripts/_data. Cadence: ongoing batches (≤ ~10 posts/run), like the current task. Validation: YAML parses; 0 raw <a name> anchors; no duplicate heading IDs; correct faq: counts; never touch protected frontmatter fields.
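
Three of these checks (raw anchors, duplicate heading IDs, untouched protected fields) can be sketched with the standard library alone. This is a rough sketch, not the existing SKILL's code: function names are illustrative, the frontmatter reader is deliberately naive (it sees only single-line top-level keys and is no substitute for a real YAML parse), and affiliate fields are omitted because their key names are not listed in this spec.

```python
import re
from collections import Counter

RAW_ANCHOR = re.compile(r"<a\s+name\s*=", re.IGNORECASE)
# Headings carrying an explicit kramdown-style ID, e.g. "## Install {#install}".
HEADING_ID = re.compile(r"^#{1,6}\s.*\{#([A-Za-z0-9_-]+)\}\s*$", re.MULTILINE)
PROTECTED = ("title", "date", "layout", "image", "thumb_image", "published")


def check_post_body(markdown):
    """Count raw <a name> anchors and list heading IDs that occur more than once."""
    ids = HEADING_ID.findall(markdown)
    duplicates = sorted(i for i, n in Counter(ids).items() if n > 1)
    return {"raw_anchors": len(RAW_ANCHOR.findall(markdown)), "duplicate_ids": duplicates}


def top_level_fields(post_text):
    """Naive reader: top-level 'key: value' lines between the opening and closing '---'."""
    lines = post_text.splitlines()
    fields = {}
    if not lines or lines[0].strip() != "---":
        return fields
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if line and not line[0].isspace() and ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


def changed_protected_fields(before, after):
    """Return the protected frontmatter keys whose values differ between two versions."""
    old, new = top_level_fields(before), top_level_fields(after)
    return sorted(k for k in PROTECTED if old.get(k) != new.get(k))
```

A batch would pass when raw_anchors is 0, duplicate_ids is empty, and changed_protected_fields returns an empty list for every edited post.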

This track = the existing geo-optimize-posts SKILL, kept as-is, plus these additions:

  • B1. Keep current rules (anchors→{#id}, query-aligned headings, entity definitions, anti-pattern removal, judgment-based faq:, ai_optimised: true).
  • B2. Comparison tables: convert prose/list comparisons to Markdown tables (e.g. chatbot alternatives, tool round-ups, X-vs-Y posts).
  • B3. Entity-name consistency: normalise chatGPT → ChatGPT, fix typos such as Midjouney → Midjourney, etc.
  • B4. Image alt audit: ensure every in-body <img> has descriptive alt.
  • B5. howto: frontmatter on step-by-step posts — depends on Track A item 1 shipping first.
  • B6. (Guided, NOT fully automated) freshness/accuracy: flag aged facts for review; only correct when unambiguous, because these edits can change meaning.

Keep B6 out of unattended runs. Everything else here is safe to automate.
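
B3 and B4 are the most mechanical items, so a rough Python sketch shows what an automated pass could look like. The assumptions: the two rules below are examples taken from this spec, not a complete list; the function names are illustrative; the normaliser is meant to run on the post body only, so protected frontmatter such as title stays untouched; and the alt audit can only detect missing or empty alt text, not judge whether it is descriptive.

```python
import re
from html.parser import HTMLParser

FENCE = "`" * 3  # Markdown code-fence marker

# Lookarounds keep URLs, e-mail addresses and longer identifiers untouched.
_GUARD_BEFORE = r"(?<![\w/.@-])"
_GUARD_AFTER = r"(?![\w-]|\.\w)"
RULES = [
    (re.compile(_GUARD_BEFORE + r"chat[ -]?gpt" + _GUARD_AFTER, re.IGNORECASE), "ChatGPT"),
    (re.compile(_GUARD_BEFORE + r"midjouney" + _GUARD_AFTER, re.IGNORECASE), "Midjourney"),
]


def normalise_entities(body):
    """Apply the canonical spellings outside fenced code blocks (inline code is not skipped)."""
    out, in_fence = [], False
    for line in body.split("\n"):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        elif not in_fence:
            for pattern, name in RULES:
                line = pattern.sub(name, line)
        out.append(line)
    return "\n".join(out)


class _AltAudit(HTMLParser):
    """Collects the src of every <img> whose alt is missing or blank."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr = dict(attrs)
            if not (attr.get("alt") or "").strip():
                self.missing.append(attr.get("src") or "")


def images_missing_alt(body):
    """Return the src values of in-body <img> tags that lack usable alt text."""
    audit = _AltAudit()
    audit.feed(body)
    return audit.missing
```

Flagged images still need a human-written or reviewed description, which keeps the voice rule intact.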


4. Track C — Measurement & maintenance (PERIODIC)

  • GA4: add a segment/report for AI-assistant referrals (chatgpt.com, perplexity.ai, gemini, etc.); leverage the existing _scripts/dashboard/analytics_dashboard/.
  • Citation watch: periodically query target questions in the major AI tools; log whether the blog is cited. (Could become a scheduled task.)
  • Search Console: check which queries map to which posts; feed gaps back into Track B headings/FAQ.
  • Re-run Track B on newly published / not-yet-ai_optimised posts.
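
For the AI-referral report, the core step is mapping a referrer URL to an assistant. Here is a minimal sketch, assuming referrer URLs are available from an export. The domain table is a starting assumption to extend (chatgpt.com and perplexity.ai come from this spec; gemini.google.com is added here), and the function names are illustrative rather than part of the existing dashboard.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed starting list; extend as new assistants show up in the referral data.
AI_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
}


def classify_referrer(referrer):
    """Return the assistant name for a full referrer URL, or None if it is not an AI referrer."""
    host = (urlparse(referrer).hostname or "").lower()
    for domain, assistant in AI_REFERRERS.items():
        # Match the domain itself and any subdomain such as www.
        if host == domain or host.endswith("." + domain):
            return assistant
    return None


def tally_ai_referrals(referrers):
    """Count sessions per assistant from an iterable of referrer URLs."""
    return Counter(a for a in map(classify_referrer, referrers) if a is not None)
```

Referrers without a scheme (for example a bare chatgpt.com) are not recognised by this sketch, so normalise them to full URLs first.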

5. Sequencing & dependencies

  1. Track A #1 (HowTo include) → unblocks B5.
  2. Track A is mostly independent of Track B; B1–B4 can run in parallel with A.
  3. Do Track A changes before large Track B batches if a batch would otherwise rely on new frontmatter conventions.
  4. Track C starts once any optimised content is live, and runs on a cadence.

6. Definition of done

  • Track A change: Jekyll build clean, schema validates (Rich Results Test), one post visually checked, single-purpose commit.
  • Track B batch: all target posts ai_optimised: true, YAML valid, 0 raw anchors, no dup IDs, FAQ only on procedural/technical posts, voice preserved, committed.
  • Project: all published posts ai_optimised; HowTo/Breadcrumb/llms.txt live; AI-referral tracking in place.

7. How to describe this when opening the project (paste as project instructions)

Optimise edaehn.github.io for AI search (GEO) and SEO while preserving voice, human readability, and affiliate links. Work in three tracks: (A) site-wide infrastructure (schema includes, llms.txt, robots/author entity), edited manually in _includes/_layouts/_data/root, one reviewed + Jekyll-build-tested commit at a time, never batched; (B) per-post optimisation (the existing geo-optimize-posts rules plus tables, entity-name consistency, image alt, and howto: frontmatter), editing ONLY _posts/, batchable/automatable; (C) measurement: AI-referral analytics and citation monitoring. Never change title/date/layout/image/thumb_image/published or affiliate fields. Respect the repo’s git-lock workaround (use mv, not rm). See GEO_PROJECT.md for the full inventory, gaps, and definition of done.


8. Checklist (status as of this spec)

Already done ✅

  • BlogPosting + author Person schema (per post)
  • Auto FAQPage schema from faq: frontmatter
  • Visible TL;DR on every post (tldr.html)
  • Series hub pages /series/<slug>/ + prev/next nav
  • Author entity data (_data/authors.yml) + bio includes
  • Internal deep-linking system (_scripts/deep_linking + data)
  • sitemap.xml (noindex-aware), robots.txt, feed.xml
  • Canonical, OG, Twitter cards, AI-friendly robots meta
  • Per-post GEO pass on 95 / 230 published posts (ai_optimised: true)

Track A — site-wide (todo)

  • HowTo schema include (_includes/howto_schema.html) + wire into meta_property.html ← do first, unblocks B5
  • BreadcrumbList schema include
  • llms.txt at site root
  • robots.txt AI-crawler decision (GPTBot/ClaudeBot/PerplexityBot/Google-Extended; revisit Crawl-delay) — needs Elena’s call
  • Drive Person schema from authors.yml; add ORCID/Scholar sameAs; ProfilePage on /about
  • (Optional) TechArticle type for code tutorials

Track B — per-post (todo)

  • Continue GEO pass on remaining 135 published posts (next: 2024-05 onward)
  • B2. Convert prose comparisons → Markdown tables
  • B3. Entity-name consistency (chatGPT → ChatGPT, typos)
  • B4. Image alt-text audit
  • B5. Add howto: frontmatter to step-by-step posts (after Track A #1)
  • B6. Guided freshness/accuracy fixes on aged posts (not automated)

Track C — measurement (todo)

  • GA4 AI-referral segment/report
  • AI citation monitoring (optionally a scheduled task)
  • Search Console query→post gap review