Format & Architecture for the Mesh
A digest of four reference sources on HTML artifacts, LLM wikis, and format decision-making — synthesized to surface the key considerations for how the Mesh should be built.
Sources
Four references were reviewed for this round of research, all focused on the question of format and presentation for agent-maintained knowledge bases.
Source Digests
Thariq (@trq212) — HTML as the agent output format
The core argument: markdown made sense as a human-edited communication layer, but as agents become the primary editors and humans become the primary readers, HTML is strictly better. It carries more information per file — tables, SVG illustrations, interactive elements, spatial layouts, code snippets — in ways markdown simply can't express.
The practical use cases are concrete: specs with embedded mockups, code review diffs with inline annotations, design prototypes with tunable sliders, research reports synthesised from multiple sources, and purpose-built throwaway editors that export back to prompts.
The honest downsides acknowledged: HTML takes 2–4× longer to generate, diffs are noisy and hard to review in version control, and it's a publication format rather than an iteration format.
Claude Code's ability to ingest context from the file system, MCPs (Slack, Linear), git history, and browser makes it uniquely capable of generating HTML artifacts from real project data — not synthetic examples. The Mesh's context is exactly this kind of real data.
@omarsar0 — The two-layer model: LLM Wikis + Artifacts
The most architecturally relevant source for the Mesh. The model is explicit: LLM Wikis capture the important information that lets agents do meaningful work. HTML artifacts sit on top and present that information in ways that enable action.
The wiki is the source layer. Artifacts are the interface layer. They're not the same thing and shouldn't be conflated. The important detail: artifacts are dynamic and can be extended as needs arise, and they're bidirectional — artifacts can talk to agents and agents can talk back to artifacts. This unlocks a class of workflows beyond static documents.
The Mesh is the LLM Wiki in this model. The HTML artifacts generated from it are separate, derived, dynamic. This distinction matters for architecture: the Mesh itself doesn't need to be HTML — it needs to be a well-structured knowledge base that artifacts can be generated from.
@the_smart_ape — The 3-question decision framework
The most rigorous treatment of the MD vs HTML question. Rather than advocating for a format, it provides a decision function: ask three questions about every document — who reads it (humans, Claude, or both), how many times it gets edited (once or many), and how long it lives (hours or years). The answers vote, and the format with the most votes wins.
The hard numbers that matter:
- roughly 3× the token cost of markdown when HTML is loaded as agent context
- 15–25% worse embedding relevance from markup noise
- 2–4× longer generation time for HTML
- markup drift setting in over 5–10 edit cycles
Two important failure modes identified: markup drift (HTML documents that get edited over 5–10 cycles end up with mismatched spacing systems and color schemes — invisible until you diff version 1 against version 8) and the grep failure (long-lived HTML is not reliably findable with standard tooling because content is embedded in markup).
The recommended pattern for documents that need to serve both audiences: one canonical MD source, multiple HTML views generated on demand. A 10-line script piping MD into Claude with different system prompts. No infrastructure, no lock-in.
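A minimal sketch of the "one canonical MD source, multiple HTML views" pattern, in Python rather than the source's shell script. `VIEW_TEMPLATES` and `render_view` are hypothetical names, and a static template stands in for the real step of piping the MD into Claude with a view-specific system prompt:

```python
import html

# One canonical MD source; each "view" is a different HTML rendering of it.
# In the source's pattern the render step is an LLM call with a view-specific
# system prompt; a static template stands in for that call here.
VIEW_TEMPLATES = {
    "report": "<article class='report'><pre>{body}</pre></article>",
    "onboarding": "<main class='guide'><pre>{body}</pre></main>",
}

def render_view(md_source: str, view: str) -> str:
    """Derive one HTML view from the canonical MD. The MD is never edited here."""
    template = VIEW_TEMPLATES[view]
    return template.format(body=html.escape(md_source))

canonical_md = "# Mesh: current work\n- migrate schema v2"
report_html = render_view(canonical_md, "report")
guide_html = render_view(canonical_md, "onboarding")
```

The point of the pattern survives the simplification: the views are regenerable and disposable, and only the MD source is edited or versioned.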
The Mesh serves both humans and agents. If agents load Mesh content as context, HTML format costs 3× more tokens and degrades retrieval. But if humans browse the Mesh, MD doesn't render well in a browser. These audiences pull in opposite directions — the architecture needs to resolve this deliberately.
HTML Effectiveness Gallery (thariqs.github.io)
20 working HTML artifacts across 9 categories that demonstrate the format in practice: Exploration & Planning, Code Review, Design, Prototyping, Illustrations & Diagrams, Decks, Research & Learning, Reports, and Custom Editing Interfaces.
The key observation from the collection: HTML's effectiveness comes from rendering spatial relationships and interactivity that markdown flattens. Side-by-side comparisons, live toggles, drag-and-drop boards, animation sandboxes, export buttons that close the loop back into Claude — none of these are possible in markdown. The artifacts are executable, not just readable.
Synthesis
What these four sources agree on, and what they don't.
All four sources independently arrive at the same conclusion: HTML and markdown are not competitors, they're layers. The disagreement is only about where the boundary between them sits. @the_smart_ape draws it at document type. @omarsar0 draws it at architectural role (wiki vs artifact). @trq212 draws it at audience (agent output vs agent input).
For the Mesh, these perspectives stack rather than conflict:
- The Mesh's core content is reference material that agents re-read → markdown wins (Thariq's rule)
- The Mesh is the LLM Wiki layer, not the artifact layer — artifacts are generated from it (omarsar's model)
- Mesh content is long-lived, edited over time, needs to survive grep and versioning → markdown wins (the_smart_ape's framework)
- But humans browsing the Mesh benefit from rich presentation, navigation, and visual structure → HTML wins for the interface to that content
Format Decision: MD vs HTML vs Hybrid
Mesh Content Matrix
Applying the 3-question framework to each section of the Mesh.
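The framework's voting logic can be sketched as a small function. The section names below are hypothetical placeholders, and the rule that a "both" readership votes MD (since agent loading dominates cost) is my assumption, not the source's:

```python
def vote_format(readers: str, edit_count: str, lifespan: str) -> str:
    """Each of the three questions casts one vote; the majority format wins."""
    votes = [
        "md" if readers in ("claude", "both") else "html",  # who reads it
        "md" if edit_count == "many" else "html",           # how often it's edited
        "md" if lifespan == "years" else "html",            # how long it lives
    ]
    return "md" if votes.count("md") >= 2 else "html"

# Hypothetical Mesh sections, not from the source.
sections = {
    "architecture-notes": ("claude", "many", "years"),
    "status-dashboard": ("humans", "once", "hours"),
}
matrix = {name: vote_format(*answers) for name, answers in sections.items()}
# → {"architecture-notes": "md", "status-dashboard": "html"}
```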
Key Considerations
The open questions and constraints the architecture must address.
1. Agent loading must be token-efficient
The primary agent use case — loading Mesh sections as context at the start of a session — is extremely sensitive to token cost. At 3× overhead, an HTML-first Mesh becomes expensive at scale. This alone makes MD the default for canonical content.
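To make the overhead concrete, a rough sketch; `approx_tokens` is a crude characters-per-token proxy, and the actual multiplier depends on the real tokenizer and how markup-heavy the HTML is:

```python
def approx_tokens(text: str) -> int:
    # Crude proxy: ~4 characters per token. A real measurement would use
    # the target model's tokenizer.
    return max(1, len(text) // 4)

# The same content, once as MD and once wrapped in typical markup.
md_doc = "## Decision\n- MD wins for agent context\n"
html_doc = (
    "<section id='decision'>\n"
    "  <h2 class='heading'>Decision</h2>\n"
    "  <ul><li>MD wins for agent context</li></ul>\n"
    "</section>\n"
)
overhead = approx_tokens(html_doc) / approx_tokens(md_doc)
```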
2. Retrieval quality matters if we add search
If the Mesh eventually supports semantic search or RAG (finding the right section for a given task), MD gives meaningfully better embedding quality. Markup noise in HTML vectors degrades relevance by 15–25%. This is a latent but real constraint.
3. Maintenance burden must stay low
The Mesh only works if it stays current. Anything that makes maintenance harder — noisy diffs, markup drift, duplicate formats to keep in sync — increases the risk it goes stale. The format must minimise friction to update, not maximise visual richness.
4. The two audiences need different things
Agents need lean, greppable, structured text. Humans need navigable, visual, shareable documents. A single format that tries to serve both will underserve one. The architecture should acknowledge this explicitly and route each audience to the right layer.
5. Artifacts should be derived, not canonical
Following @omarsar0's model: HTML artifacts generated from Mesh content are presentation and action surfaces — not the source of truth. They should be regenerable from the source, not treated as the thing itself. If an artifact drifts from the source, the source wins.
6. Bidirectional artifacts are worth designing for
@omarsar0's observation that artifacts can talk to agents and vice versa unlocks a class of workflows that pure documentation doesn't support: a "what we're working on" view that an agent can update directly, a planning board that exports back to prompts, a status artifact that refreshes from live data. This is a design space to explore, not close off.
7. The reversibility test is a useful quality gate
Any HTML artifact generated from Mesh content should pass the reversibility test: can the content be extracted back to clean text in one prompt? If not, content has been buried in markup, and the artifact has become unmaintainable. This is a concrete rule that can be enforced at creation time.
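A mechanical proxy for this gate, assuming the check runs over a list of key source facts; the source's one-prompt extraction is replaced here by Python's stdlib `HTMLParser`, and `passes_reversibility` is an illustrative name:

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect visible text content, discarding tags and attributes."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

def passes_reversibility(artifact_html: str, source_facts: list[str]) -> bool:
    """Gate: every source fact must survive extraction back to plain text."""
    parser = _TextExtractor()
    parser.feed(artifact_html)
    text = " ".join(parser.chunks)
    return all(fact in text for fact in source_facts)
```

Note the failure mode this catches: content stashed in attributes (e.g. `data-content='id: uuid'`) never reaches `handle_data`, so the gate fails — which is exactly the "buried in markup" condition the rule is meant to flag.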
Emerging Direction
This is where the research is pointing, not a finalised architecture. The principles document will test these against the ideals.
The evidence points toward a two-layer architecture:
- Layer 1 — The Mesh (MD source of truth): canonical content lives in well-structured markdown files. Optimised for agent loading, retrieval, version control, and longevity. Humans can read it; it's not beautiful, but it's reliable.
- Layer 2 — Mesh artifacts (HTML, generated on demand): human-facing views, planning boards, status reports, onboarding guides — generated from Layer 1 by agents for specific purposes. Ephemeral, regenerable, not version-controlled themselves.
This maps to:
- @omarsar0's LLM Wiki + artifact model
- @the_smart_ape's "one MD source, many HTML renders" pattern
- @trq212's observation that Claude Code can synthesise real project context into rich HTML outputs
The remaining open questions for the principles document:
- How are Mesh sections structured and chunked for selective agent loading?
- What's the right granularity — one file per section, or finer?
- How does "what we're working on" stay current — manual updates, agent-driven, or a mix?
- How do we prevent the MD source from drifting out of sync with generated artifacts?
- What does bidirectional artifact interaction look like concretely for this team?
Counter-model: HTML as the authoring surface
The model above is what the reference sources point toward. The model below is what the intended workflow points toward. They produce the same output for agents but differ fundamentally on what the canonical file is and who edits it.
Consider what actually gets created when the Mesh is in use: a schema for a data type with its fields and constraints, a breakdown of how a space definition tree is structured, a layout showing how column types map to rendering behaviour. These are not prose documents that happen to have some formatting. They are structured visual assets — the schema, the graph, or the tree layout is the deliverable. You look at it, iterate on it with Claude, and the HTML representation is what carries the meaning.
In this workflow, the human is always looking at a rendered HTML document — not a Markdown source file. "Update this section" means updating the HTML. "Add a column to this table" means editing the HTML table. The file you're iterating on is the file that gets stored. There is no generation step, no separate source format, no pipeline between authoring and what you see.
This inverts the two-layer model from the references:
- HTML is the canonical file — the thing Claude edits and the human reviews. Version-controlled as HTML. The structured layout, visual hierarchy, and spatial relationships are part of the content, not a presentation layer added afterwards.
- Plain text is derived on demand — when an agent needs to load a section into context, the system extracts clean text from the HTML at read-time. Agents get the meaning without the markup. No separate file to maintain, no sync to manage.
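A sketch of what read-time derivation could look like, using only the stdlib `HTMLParser`; `MeshTextExtractor` and the heading/list mapping are illustrative assumptions, not a spec:

```python
from html.parser import HTMLParser

class MeshTextExtractor(HTMLParser):
    """Derive agent-facing plain text from a canonical Mesh HTML file at read-time."""
    def __init__(self):
        super().__init__()
        self.lines = []
        self._prefix = ""

    def handle_starttag(self, tag, attrs):
        # Map structural tags to lightweight text conventions so the
        # extracted context keeps hierarchy without markup overhead.
        if tag in ("h1", "h2", "h3"):
            self._prefix = "#" * int(tag[1]) + " "
        elif tag == "li":
            self._prefix = "- "

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.lines.append(self._prefix + text)
            self._prefix = ""

def html_to_context(doc: str) -> str:
    parser = MeshTextExtractor()
    parser.feed(doc)
    return "\n".join(parser.lines)

context = html_to_context("<h2>Schema</h2><ul><li>id: uuid</li><li>name: text</li></ul>")
# → "## Schema\n- id: uuid\n- name: text"
```

The agent loads `context`, never the HTML file itself, so the 3× token overhead applies only to storage, not to the context window.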
The outcome is identical from the agent's perspective — it loads clean text. But the authoring experience and the canonical artifact are completely different. The human-Claude editing loop operates directly on the HTML document, which means the structure, hierarchy, and visual organisation the human sees is exactly what exists in the file.
The tension this introduces: HTML is harder to diff, more fragile across many edit cycles, and the text extraction needs to reliably recover meaning from markup. These are solvable engineering problems — but they need to be designed for explicitly rather than assumed away.