Semantic HTML Is the New SEO: Why Agencies Are Rebuilding the Web

6 min read

Open the source of most marketing sites and you'll find a forest of div and span tags. It renders fine. It also tells crawlers and AI agents very little about what's actually on the page. That's the bug semantic HTML fixes, and it's why a small wave of agencies is rebuilding sites around it.

What semantic HTML actually means

It's the difference between a styled div and an h1. Between a card div and an article. Between a wall of divs and a clear outline: header, nav, main, section, article, aside, footer.

Those tags are not decoration. They're the API your site exposes to every non-human reader: screen readers, Google, ChatGPT, Perplexity. When the tags match the meaning, machines understand the page. When they don't, machines guess — and guesses don't get cited.
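To make the "API" idea concrete, here is a minimal sketch of what a non-human reader can pull out of a page without rendering it. It uses Python's standard-library html.parser. The LANDMARKS set, the LandmarkCollector class, and the two sample strings are illustrative assumptions, not how any particular crawler or agent works.

```python
from html.parser import HTMLParser

# Elements treated as landmarks in this sketch (an assumption, mirroring
# the outline tags named above).
LANDMARKS = {"header", "nav", "main", "section", "article", "aside", "footer"}


class LandmarkCollector(HTMLParser):
    """Records landmark tags in document order, without rendering anything."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if tag in LANDMARKS:
            self.found.append(tag)


def landmarks(html):
    collector = LandmarkCollector()
    collector.feed(html)
    collector.close()
    return collector.found


# Hypothetical markup: the same layout built two ways.
DIV_SOUP = (
    '<div class="top"><div class="menu"></div></div>'
    '<div class="content"><div class="card">Post</div></div>'
)
SEMANTIC = (
    "<header><nav></nav></header>"
    "<main><article>Post</article></main>"
    "<footer></footer>"
)

print(landmarks(DIV_SOUP))  # []
print(landmarks(SEMANTIC))  # ['header', 'nav', 'main', 'article', 'footer']
```

The div version gives the parser nothing to hold on to. The semantic version hands over the page's shape in a single pass.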

Why this is back

For a decade, semantic HTML was a nice-to-have because Google was good enough at inferring structure from rendered layout and class names. Many AI agents are not. They often work from raw HTML and look for landmarks. A page built on h1 → h2 → p is a clean signal. A page built on styled divs is noise.

The sites that get cited by ChatGPT, Perplexity, and Claude share a common trait: a parser can extract the structure without rendering. Semantic HTML is what makes that possible.
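As a rough illustration of "extracting the structure without rendering", the sketch below builds a heading outline from raw markup using only Python's html.parser. OutlineParser, extract_outline, and the PAGE sample are hypothetical names invented for this example. Real agents may use very different pipelines.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineParser(HTMLParser):
    """Collects (level, text) pairs for every heading, in document order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside an open heading.
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def extract_outline(html):
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical page fragment.
PAGE = """
<main>
  <h1>Semantic HTML</h1>
  <p>Intro.</p>
  <h2>Why it matters</h2>
  <p>Body.</p>
  <h2>Checklist</h2>
</main>
"""

# Print the outline, indented by heading level.
for level, text in extract_outline(PAGE):
    print("  " * (level - 1) + text)
```

Run the same function on a page whose title is a styled div and the outline comes back empty, which is the "noise" case described above.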

The semantic checklist

  • One h1 per page, matching the page topic
  • Headings nest correctly — no skipping from h1 to h4
  • main wraps the unique page content
  • Lists are ul or ol, not divs with bullets
  • Articles use article with a clear h2 headline
  • Forms have labels tied to inputs by for/id
  • Images have meaningful alt text — or empty alt if decorative
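Part of this checklist can be checked mechanically. The sketch below is one possible audit, again built on Python's html.parser. ChecklistAudit and audit are hypothetical names, and the rules are a simplified reading of the list above. It only recognises labels tied by for/id, and it cannot judge whether alt text is meaningful or whether a div is really a list, so those items still need a human.

```python
from html.parser import HTMLParser


class ChecklistAudit(HTMLParser):
    """Gathers the raw facts needed for a few of the checklist items."""

    def __init__(self):
        super().__init__()
        self.heading_levels = []
        self.main_count = 0
        self.images_missing_alt = 0
        self.label_targets = set()
        self.control_ids = []  # id of each visible form control, or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))
        elif tag == "main":
            self.main_count += 1
        elif tag == "img" and "alt" not in attrs:
            # alt="" is allowed: it marks the image as decorative.
            self.images_missing_alt += 1
        elif tag == "label" and attrs.get("for"):
            self.label_targets.add(attrs["for"])
        elif tag in ("input", "select", "textarea"):
            if attrs.get("type") != "hidden":
                self.control_ids.append(attrs.get("id"))


def audit(html):
    """Returns a list of human-readable problems; empty means all checks passed."""
    a = ChecklistAudit()
    a.feed(html)
    a.close()
    problems = []
    h1_count = a.heading_levels.count(1)
    if h1_count != 1:
        problems.append(f"expected one h1, found {h1_count}")
    for prev, cur in zip(a.heading_levels, a.heading_levels[1:]):
        if cur > prev + 1:
            problems.append(f"heading skips from h{prev} to h{cur}")
    if a.main_count != 1:
        problems.append(f"expected one main, found {a.main_count}")
    if a.images_missing_alt:
        problems.append(f"{a.images_missing_alt} img without alt attribute")
    unlabeled = [i for i in a.control_ids if i not in a.label_targets]
    if unlabeled:
        problems.append(f"{len(unlabeled)} form control(s) without a matching label")
    return problems


# Hypothetical fragments: one that passes, one that fails every check.
GOOD = (
    '<main><h1>Title</h1><h2>Part</h2><img src="a.png" alt="">'
    '<label for="email">Email</label><input id="email" type="email"></main>'
)
BAD = (
    '<div><h1>A</h1><h4>B</h4><h1>C</h1>'
    '<img src="a.png"><input type="text"></div>'
)

print(audit(GOOD))
for problem in audit(BAD):
    print("-", problem)
```

A check like this fits naturally into a build step, so a skipped heading or an unlabeled input is caught before the page ships rather than after.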

What an agency does differently

A semantic-first agency builds the document outline before the visual design. The wireframe is an HTML tree, not a Figma frame. Tailwind classes go on top of the right element, not in place of it. The result reads as well in a browser's reader mode as it does in production.

This discipline is what separates an "agent-native" build from a pretty template. Both look fine. Only one gets cited.

The compounding return

Semantic HTML pays off three times: accessibility scores rise, organic search picks up, and AI agents start naming you in answers. None of those are line items. All of them are leverage.