AI-First Publishing: Why We Build for Machines as Well as Humans
Our approach to dual-interface publishing: readable for humans, consistently structured for AI crawlers, LLMs, and search engines.
The Dual-Interface Thesis
Every page we publish serves two audiences:
- Humans: Readers who want beautiful, readable, fast-loading pages
- Machines: Search engines, LLMs, AI assistants, and automated systems that need structured data
Most publishers optimize for humans and ignore machines. We optimize for both.
What We Provide for AI
llms.txt
A machine-readable summary of each website, following the proposed llms.txt convention (a Markdown file served from the site root). Contains:
- Site purpose and ownership
- Key content areas and URLs
- Author/publisher relationships
- Available data formats
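As a rough illustration of how such a file could be generated, here is a minimal sketch that renders the layout described in the llms.txt proposal: a title, a one-line summary in a blockquote, then sections of Markdown links. The `SiteSummary` and `Link` types, the function name, and the example.com URLs are hypothetical and are not taken from our production tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    """One entry in an llms.txt section (hypothetical helper type)."""
    title: str
    url: str
    note: str = ""


@dataclass
class SiteSummary:
    """The pieces of a site summary that get rendered into llms.txt."""
    name: str
    summary: str
    sections: dict[str, list[Link]] = field(default_factory=dict)


def render_llms_txt(site: SiteSummary) -> str:
    """Render the summary as Markdown: H1 title, blockquote, H2 link lists."""
    lines = [f"# {site.name}", "", f"> {site.summary}"]
    for heading, links in site.sections.items():
        lines += ["", f"## {heading}", ""]
        for link in links:
            # The note after the colon is optional.
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = SiteSummary(
        name="Example Press",  # placeholder site, not a real property
        summary="Hypothetical publisher used only for illustration.",
        sections={
            "Data formats": [
                Link("Catalog", "https://example.com/api/catalog.json",
                     "book metadata"),
            ],
        },
    )
    print(render_llms_txt(example), end="")
```

Keeping the renderer a pure function of the site summary means the file can be regenerated on every build rather than edited by hand.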
ai.txt
Permissions and guidance for AI systems:
- What content can be used for training (with attribution)
- How to cite our content
- Available structured data endpoints
- Contact information for commercial licensing
JSON API Endpoints
Static JSON files available at predictable URLs:
- `/api/catalog.json` — Complete book catalog with metadata
- `/api/statistics.json` — Aggregate statistics
- `/api/featured.json` — Curated featured books
- `/data/text-analysis.json` — Linguistic analysis of all books
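One way to produce static files like these is to derive every payload from a single list of book records at build time. The sketch below is an assumption about how that might look, not a description of the actual pipeline: the record fields (`title`, `author`, `featured`) and the payload shapes are invented for illustration, and `/data/text-analysis.json` is left out because its contents are not described here.

```python
import json


def render_static_api(books: list[dict]) -> dict[str, str]:
    """Map each URL path to the serialized JSON that would be served there.

    The payload shapes are hypothetical; only the URL paths come from the
    list above.
    """
    payloads = {
        "/api/catalog.json": {"count": len(books), "books": books},
        "/api/statistics.json": {
            "total_books": len(books),
            "total_authors": len({book["author"] for book in books}),
        },
        "/api/featured.json": {
            "books": [book for book in books if book.get("featured")],
        },
    }
    # sort_keys keeps the output byte-stable between builds, so a file
    # only changes when its data changes.
    return {
        path: json.dumps(payload, indent=2, ensure_ascii=False, sort_keys=True)
        for path, payload in payloads.items()
    }


if __name__ == "__main__":
    sample = [
        {"title": "First Book", "author": "Jane Example", "featured": True},
        {"title": "Second Book", "author": "Jane Example"},
    ]
    for path, body in render_static_api(sample).items():
        print(path, len(body), "bytes")
```

A build step would then write each string to the matching path under the site's output directory.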
Schema Markup
Every page carries Schema.org JSON-LD:
- Person schema on author pages (with sameAs, worksFor, affiliation)
- Organization schema on publisher and company pages
- Book and Chapter schema on reader pages
- Article schema on blog posts
- WebSite schema with SearchAction
- BreadcrumbList on all sub-pages
- FAQPage where applicable
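To make the first item concrete, here is a minimal sketch of building a Person object and wrapping it in a JSON-LD script tag. `Person`, `sameAs`, `worksFor`, and `affiliation` are Schema.org terms; the function names, the author, and the URLs are hypothetical placeholders.

```python
import json


def person_jsonld(name, url, same_as, works_for=None, affiliation=None):
    """Build a Schema.org Person object as a plain dict."""
    data = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "url": url,
        "sameAs": list(same_as),  # profile URLs on other platforms
    }
    if works_for:
        data["worksFor"] = {"@type": "Organization", "name": works_for}
    if affiliation:
        data["affiliation"] = {"@type": "Organization", "name": affiliation}
    return data


def jsonld_script_tag(data):
    """Serialize the dict into a script element for the page head."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # "\/" is a valid JSON escape for "/", so this keeps the JSON intact
    # while preventing a string value from closing the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    person = person_jsonld(
        name="Jane Example",  # placeholder author
        url="https://example.com/authors/jane-example",
        same_as=["https://example.org/profiles/jane-example"],
        works_for="Example Press",
    )
    print(jsonld_script_tag(person))
```

The other schema types in the list would follow the same pattern with a different `@type` and property set.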
Export Formats
- BibTeX (`.bib`) for academic citation
- RIS for reference managers
- CSV for spreadsheet analysis
- OPDS for library cataloging
- Markdown for LLM consumption
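As an example of what one of these exports involves, the sketch below turns a single book record into a BibTeX `@book` entry. The record fields and the citation-key scheme (surname, year, first word of the title) are assumptions made for illustration, and the code does not escape characters that are special in BibTeX.

```python
import re


def to_bibtex(book: dict) -> str:
    """Render one book record as a BibTeX @book entry.

    Expects hypothetical keys: author, title, year, and optionally
    publisher and isbn. Empty fields are omitted from the entry.
    """
    surname = book["author"].split()[-1]
    first_word = book["title"].split()[0]
    # Citation key: lowercase surname + year + first title word,
    # with anything outside a-z and 0-9 removed.
    raw_key = f"{surname}{book['year']}{first_word}".lower()
    key = re.sub(r"[^a-z0-9]", "", raw_key)
    fields = {
        "author": book["author"],
        "title": book["title"],
        "publisher": book.get("publisher", ""),
        "year": book["year"],
        "isbn": book.get("isbn", ""),
    }
    body = ",\n".join(
        f"  {name} = {{{value}}}" for name, value in fields.items() if value
    )
    return f"@book{{{key},\n{body}\n}}\n"


if __name__ == "__main__":
    print(to_bibtex({
        "author": "Jane Example",  # placeholder record
        "title": "A Hypothetical Book",
        "publisher": "Example Press",
        "year": 2024,
    }), end="")
```

RIS and CSV exports can be generated from the same record, which keeps every format consistent with the catalog.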
Why This Matters
Discoverability
AI systems are becoming the primary discovery layer for content. When someone asks an LLM "What books has Atharva Inamdar written?", the answer should come from our structured data — not from a scraped, possibly inaccurate third-party source.
Authority
Structured data helps establish canonical authority. When a search engine encounters Person schema with sameAs links to an author's profiles on other platforms, it has explicit signals for connecting those profiles to a single identity. When it encounters Book schema with ISBNs and publisher information, it can check that data against other sources instead of inferring it from page text.
Longevity
HTML pages change with every redesign. Structured data can stay stable across them. The JSON-LD, the JSON APIs, and the export formats are interfaces that other systems can build on, and they form the foundation of a publishing operation that outlasts any single platform.
The Principle
Build for the reader you can see. Build for the machine you can't.
BogaDoga Digital Strategy
BogaDoga Ltd
Publishing & Digital Innovation, London