Strategy · 2 min read

AI-First Publishing: Why We Build for Machines as Well as Humans

Our approach to dual-interface publishing — beautiful for humans, perfectly structured for AI crawlers, LLMs, and search engines.

The Dual-Interface Thesis

Every page we publish serves two audiences:

  1. Humans: Readers who want beautiful, readable, fast-loading pages
  2. Machines: Search engines, LLMs, AI assistants, and automated systems that need structured data

Most publishers optimize for humans and ignore machines. We optimize for both.

What We Provide for AI

llms.txt

A machine-readable summary of each website, following the emerging llms.txt standard. It contains:

  • Site purpose and ownership
  • Key content areas and URLs
  • Author/publisher relationships
  • Available data formats
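The llms.txt proposal specifies a markdown file served at the site root: an H1 title, a blockquote summary, and H2 sections of annotated links. A minimal sketch of such a file (the names and URLs below are placeholders, not the actual site content):

```markdown
# Example Publisher

> A hypothetical one-line summary of the site's purpose and ownership.

## Catalog
- [Book catalog](https://example.com/api/catalog.json): Complete catalog with metadata

## Authors
- [Author pages](https://example.com/authors/): Biography and bibliography per author
```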

ai.txt

Permissions and guidance for AI systems:

  • What content can be used for training (with attribution)
  • How to cite our content
  • Available structured data endpoints
  • Contact information for commercial licensing
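Unlike llms.txt, there is no single settled ai.txt format yet, so the file is best kept as simple, self-describing plain text. An illustrative sketch covering the points above (directive names and addresses are placeholders, not a standard):

```text
# ai.txt -- guidance for AI systems (illustrative sketch)
Training: allowed with attribution
Citation: cite as "Example Publisher, <title>, <url>"
Structured data: /api/catalog.json, /api/statistics.json
Licensing contact: licensing@example.com
```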

JSON API Endpoints

Static JSON files available at predictable URLs:

  • `/api/catalog.json` — Complete book catalog with metadata
  • `/api/statistics.json` — Aggregate statistics
  • `/api/featured.json` — Curated featured books
  • `/data/text-analysis.json` — Linguistic analysis of all books
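Because these are static files at predictable URLs, a consumer needs nothing more than an HTTP GET and a JSON parser. A minimal Python sketch, assuming a hypothetical shape for `/api/catalog.json` (the real schema may differ):

```python
import json

# Hypothetical payload of /api/catalog.json -- field names are assumptions,
# standing in for the response of a plain HTTP GET against the static file.
catalog_json = """
{
  "books": [
    {"title": "Example Book", "isbn": "978-0-00-000000-0",
     "author": "Example Author", "published": "2024"}
  ]
}
"""

# No API keys, no pagination, no SDK: parse the static file directly.
catalog = json.loads(catalog_json)
for book in catalog["books"]:
    print(f'{book["title"]} ({book["published"]}) by {book["author"]}')
```

The same property — a stable, anonymous, cache-friendly endpoint — is what makes these files usable by crawlers and LLM toolchains without any integration work.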

Schema Markup

Every page carries Schema.org JSON-LD:

  • Person schema on author pages (with sameAs, worksFor, affiliation)
  • Organization schema on publisher and company pages
  • Book and Chapter schema on reader pages
  • Article schema on blog posts
  • WebSite schema with SearchAction
  • BreadcrumbList on all sub-pages
  • FAQPage where applicable
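As an illustration, a Person block with sameAs and worksFor might be embedded in a page's JSON-LD like this (names, identifiers, and URLs are placeholders, not the site's actual markup):

```json
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Example Author",
  "url": "https://example.com/authors/example-author",
  "sameAs": [
    "https://www.wikidata.org/wiki/Q0",
    "https://orcid.org/0000-0000-0000-0000"
  ],
  "worksFor": {
    "@type": "Organization",
    "name": "Example Publisher"
  }
}
```

The sameAs array is what lets search engines reconcile the entity across platforms, as discussed under Authority below.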

Export Formats

  • BibTeX (`.bib`) for academic citation
  • RIS for reference managers
  • CSV for spreadsheet analysis
  • OPDS for library cataloging
  • Markdown for LLM consumption
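For instance, a BibTeX export for a single book might look like the following (every field value is a placeholder):

```bibtex
@book{example2024,
  author    = {Example Author},
  title     = {Example Book},
  publisher = {Example Publisher},
  year      = {2024},
  isbn      = {978-0-00-000000-0}
}
```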

Why This Matters

Discoverability

AI systems are becoming the primary discovery layer for content. When someone asks an LLM "What books has Atharva Inamdar written?", the answer should come from our structured data — not from a scraped, possibly inaccurate third-party source.

Authority

Structured data establishes canonical authority. When Google encounters Person schema with sameAs linking to five platforms, it understands that this is the authoritative source. When it encounters Book schema with ISBNs and publisher information, it treats the data as reliable.

Longevity

HTML pages change. Structured data persists. The JSON-LD, the JSON APIs, the export formats — these are stable interfaces that other systems can build on. They are the foundation of a publishing operation that outlasts any single platform.

The Principle

Build for the reader you can see. Build for the machine you can't.

— BogaDoga Digital Strategy

BogaDoga Ltd

Publishing & Digital Innovation, London
