Format & Architecture for the Mesh
A digest of four reference sources on HTML artifacts, LLM wikis, and format decision-making — synthesized to surface the key considerations for how the Mesh should be built.
Sources
Four references were reviewed for this round of research, all focused on the question of format and presentation for agent-maintained knowledge bases.
Source Digests
Thariq (@trq212) — HTML as the agent output format
The core argument: markdown made sense as a human-edited communication layer, but as agents become the primary editors and humans become the primary readers, HTML is strictly better. It carries more information per file — tables, SVG illustrations, interactive elements, spatial layouts, code snippets — in ways markdown simply can't express.
The practical use cases are concrete: specs with embedded mockups, code review diffs with inline annotations, design prototypes with tunable sliders, research reports synthesised from multiple sources, and purpose-built throwaway editors that export back to prompts.
The honest downsides acknowledged: HTML takes 2–4× longer to generate, diffs are noisy and hard to review in version control, and it's a publication format rather than an iteration format.
Claude Code's ability to ingest context from the file system, MCPs (Slack, Linear), git history, and browser makes it uniquely capable of generating HTML artifacts from real project data — not synthetic examples. The Mesh's context is exactly this kind of real data.
@omarsar0 — The two-layer model: LLM Wikis + Artifacts
The most architecturally relevant source for the Mesh. The model is explicit: LLM Wikis capture the important information that lets agents do meaningful work. HTML artifacts sit on top and present that information in ways that enable action.
The wiki is the source layer. Artifacts are the interface layer. They're not the same thing and shouldn't be conflated. The important detail: artifacts are dynamic and can be extended as needs arise, and they're bidirectional — artifacts can talk to agents and agents can talk back to artifacts. This unlocks a class of workflows beyond static documents.
The Mesh is the LLM Wiki in this model. The HTML artifacts generated from it are separate, derived, dynamic. This distinction matters for architecture: the Mesh itself doesn't need to be HTML — it needs to be a well-structured knowledge base that artifacts can be generated from.
@the_smart_ape — The 3-question decision framework
The most rigorous treatment of the MD vs HTML question. Rather than advocating for a format, it provides a decision function: ask three questions about every document — who reads it (humans, Claude, or both), how many times it gets edited (once or many), and how long it lives (hours or years). The answers vote, and the format with the most votes wins.
The hard numbers that matter:
- roughly 3× the token cost of markdown when HTML is loaded as agent context
- 15–25% worse embedding relevance from markup noise
- 2–4× longer generation time for HTML
- markup drift setting in over 5–10 edit cycles
Two important failure modes identified: markup drift (HTML documents that get edited over 5–10 cycles end up with mismatched spacing systems and color schemes — invisible until you diff version 1 against version 8) and the grep failure (long-lived HTML is not reliably findable with standard tooling because content is embedded in markup).
The recommended pattern for documents that need to serve both audiences: one canonical MD source, multiple HTML views generated on demand. A 10-line script piping MD into Claude with different system prompts. No infrastructure, no lock-in.
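A minimal sketch of the "one canonical MD source, multiple HTML views" pattern, in Python rather than the source's shell script. `VIEW_TEMPLATES` and `render_view` are hypothetical names, and a static template stands in for the real step of piping the MD into Claude with a view-specific system prompt:

```python
import html

# One canonical MD source; each "view" is a different HTML rendering of it.
# In the source's pattern the render step is an LLM call with a view-specific
# system prompt; a static template stands in for that call here.
VIEW_TEMPLATES = {
    "report": "<article class='report'><pre>{body}</pre></article>",
    "onboarding": "<main class='guide'><pre>{body}</pre></main>",
}

def render_view(md_source: str, view: str) -> str:
    """Derive one HTML view from the canonical MD. The MD is never edited here."""
    template = VIEW_TEMPLATES[view]
    return template.format(body=html.escape(md_source))

canonical_md = "# Mesh: current work\n- migrate schema v2"
report_html = render_view(canonical_md, "report")
guide_html = render_view(canonical_md, "onboarding")
```

The point of the pattern survives the simplification: the views are regenerable and disposable, and only the MD source is edited or versioned.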
The Mesh serves both humans and agents. If agents load Mesh content as context, HTML format costs 3× more tokens and degrades retrieval. But if humans browse the Mesh, MD doesn't render well in a browser. These audiences pull in opposite directions — the architecture needs to resolve this deliberately.
HTML Effectiveness Gallery (thariqs.github.io)
20 working HTML artifacts across 9 categories that demonstrate the format in practice: Exploration & Planning, Code Review, Design, Prototyping, Illustrations & Diagrams, Decks, Research & Learning, Reports, and Custom Editing Interfaces.
The key observation from the collection: HTML's effectiveness comes from rendering spatial relationships and interactivity that markdown flattens. Side-by-side comparisons, live toggles, drag-and-drop boards, animation sandboxes, export buttons that close the loop back into Claude — none of these are possible in markdown. The artifacts are executable, not just readable.
Synthesis
What these four sources agree on, and what they don't.
All four sources independently arrive at the same conclusion: HTML and markdown are not competitors, they're layers. The disagreement is only about where the boundary between them sits. @the_smart_ape draws it at document type. @omarsar0 draws it at architectural role (wiki vs artifact). @trq212 draws it at audience (agent output vs agent input).
For the Mesh, these perspectives stack rather than conflict:
- The Mesh's core content is reference material that agents re-read → markdown wins (Thariq's rule)
- The Mesh is the LLM Wiki layer, not the artifact layer — artifacts are generated from it (omarsar's model)
- Mesh content is long-lived, edited over time, needs to survive grep and versioning → markdown wins (the_smart_ape's framework)
- But humans browsing the Mesh benefit from rich presentation, navigation, and visual structure → HTML wins for the interface to that content
Format Decision: MD vs HTML vs Hybrid
Mesh Content Matrix
Applying the 3-question framework to each section of the Mesh.
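The framework's voting logic can be sketched as a small function. The section names below are hypothetical placeholders, and the rule that a "both" readership votes MD (since agent loading dominates cost) is my assumption, not the source's:

```python
def vote_format(readers: str, edit_count: str, lifespan: str) -> str:
    """Each of the three questions casts one vote; the majority format wins."""
    votes = [
        "md" if readers in ("claude", "both") else "html",  # who reads it
        "md" if edit_count == "many" else "html",           # how often it's edited
        "md" if lifespan == "years" else "html",            # how long it lives
    ]
    return "md" if votes.count("md") >= 2 else "html"

# Hypothetical Mesh sections, not from the source.
sections = {
    "architecture-notes": ("claude", "many", "years"),
    "status-dashboard": ("humans", "once", "hours"),
}
matrix = {name: vote_format(*answers) for name, answers in sections.items()}
# → {"architecture-notes": "md", "status-dashboard": "html"}
```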
Key Considerations
The open questions and constraints the architecture must address.
1. Agent loading must be token-efficient
The primary agent use case — loading Mesh sections as context at the start of a session — is extremely sensitive to token cost. At 3× overhead, an HTML-first Mesh becomes expensive at scale. This alone makes MD the default for canonical content.
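To make the overhead concrete, a rough sketch; `approx_tokens` is a crude characters-per-token proxy, and the actual multiplier depends on the real tokenizer and how markup-heavy the HTML is:

```python
def approx_tokens(text: str) -> int:
    # Crude proxy: ~4 characters per token. A real measurement would use
    # the target model's tokenizer.
    return max(1, len(text) // 4)

# The same content, once as MD and once wrapped in typical markup.
md_doc = "## Decision\n- MD wins for agent context\n"
html_doc = (
    "<section id='decision'>\n"
    "  <h2 class='heading'>Decision</h2>\n"
    "  <ul><li>MD wins for agent context</li></ul>\n"
    "</section>\n"
)
overhead = approx_tokens(html_doc) / approx_tokens(md_doc)
```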
2. Retrieval quality matters if we add search
If the Mesh eventually supports semantic search or RAG (finding the right section for a given task), MD gives meaningfully better embedding quality. Markup noise in HTML vectors degrades relevance by 15–25%. This is a latent but real constraint.
3. Maintenance burden must stay low
The Mesh only works if it stays current. Anything that makes maintenance harder — noisy diffs, markup drift, duplicate formats to keep in sync — increases the risk it goes stale. The format must minimise friction to update, not maximise visual richness.
4. The two audiences need different things
Agents need lean, greppable, structured text. Humans need navigable, visual, shareable documents. A single format that tries to serve both will underserve one. The architecture should acknowledge this explicitly and route each audience to the right layer.
5. Artifacts should be derived, not canonical
Following @omarsar0's model: HTML artifacts generated from Mesh content are presentation and action surfaces — not the source of truth. They should be regenerable from the source, not treated as the thing itself. If an artifact drifts from the source, the source wins.
6. Bidirectional artifacts are worth designing for
@omarsar0's observation that artifacts can talk to agents and vice versa unlocks a class of workflows that pure documentation doesn't support: a "what we're working on" view that an agent can update directly, a planning board that exports back to prompts, a status artifact that refreshes from live data. This is a design space to explore, not close off.
7. The reversibility test is a useful quality gate
Any HTML artifact generated from Mesh content should pass the reversibility test: can the content be extracted back to clean text in one prompt? If not, content has been buried in markup, and the artifact has become unmaintainable. This is a concrete rule that can be enforced at creation time.
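A mechanical proxy for this gate, assuming the check runs over a list of key source facts; the source's one-prompt extraction is replaced here by Python's stdlib `HTMLParser`, and `passes_reversibility` is an illustrative name:

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect visible text content, discarding tags and attributes."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

def passes_reversibility(artifact_html: str, source_facts: list[str]) -> bool:
    """Gate: every source fact must survive extraction back to plain text."""
    parser = _TextExtractor()
    parser.feed(artifact_html)
    text = " ".join(parser.chunks)
    return all(fact in text for fact in source_facts)
```

Note the failure mode this catches: content stashed in attributes (e.g. `data-content='id: uuid'`) never reaches `handle_data`, so the gate fails — which is exactly the "buried in markup" condition the rule is meant to flag.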
Emerging Direction
This is where the research is pointing, not a finalised architecture. The principles document will test these against the ideals.
The evidence points toward a two-layer architecture:
- Layer 1 — The Mesh (MD source of truth): canonical content lives in well-structured markdown files. Optimised for agent loading, retrieval, version control, and longevity. Humans can read it; it's not beautiful, but it's reliable.
- Layer 2 — Mesh artifacts (HTML, generated on demand): human-facing views, planning boards, status reports, onboarding guides — generated from Layer 1 by agents for specific purposes. Ephemeral, regenerable, not version-controlled themselves.
This maps to:
- @omarsar0's LLM Wiki + artifact model
- @the_smart_ape's "one MD source, many HTML renders" pattern
- @trq212's observation that Claude Code can synthesise real project context into rich HTML outputs
The remaining open questions for the principles document:
- How are Mesh sections structured and chunked for selective agent loading?
- What's the right granularity — one file per section, or finer?
- How does "what we're working on" stay current — manual updates, agent-driven, or a mix?
- How do we prevent the MD source from drifting out of sync with generated artifacts?
- What does bidirectional artifact interaction look like concretely for this team?
Counter-model: HTML as the authoring surface
The model above is what the reference sources point toward. The model below is what the intended workflow points toward. They produce the same output for agents but differ fundamentally on what the canonical file is and who edits it.
Consider what actually gets created when the Mesh is in use: a schema for a data type with its fields and constraints, a breakdown of how a space definition tree is structured, a layout showing how column types map to rendering behaviour. These are not prose documents that happen to have some formatting. They are structured visual assets — the schema, the graph, or the tree layout is the deliverable. You look at it, iterate on it with Claude, and the HTML representation is what carries the meaning.
In this workflow, the human is always looking at a rendered HTML document — not a Markdown source file. "Update this section" means updating the HTML. "Add a column to this table" means editing the HTML table. The file you're iterating on is the file that gets stored. There is no generation step, no separate source format, no pipeline between authoring and what you see.
This inverts the two-layer model from the references:
- HTML is the canonical file — the thing Claude edits and the human reviews. Version-controlled as HTML. The structured layout, visual hierarchy, and spatial relationships are part of the content, not a presentation layer added afterwards.
- Plain text is derived on demand — when an agent needs to load a section into context, the system extracts clean text from the HTML at read-time. Agents get the meaning without the markup. No separate file to maintain, no sync to manage.
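A sketch of what read-time derivation could look like, using only the stdlib `HTMLParser`; `MeshTextExtractor` and the heading/list mapping are illustrative assumptions, not a spec:

```python
from html.parser import HTMLParser

class MeshTextExtractor(HTMLParser):
    """Derive agent-facing plain text from a canonical Mesh HTML file at read-time."""
    def __init__(self):
        super().__init__()
        self.lines = []
        self._prefix = ""

    def handle_starttag(self, tag, attrs):
        # Map structural tags to lightweight text conventions so the
        # extracted context keeps hierarchy without markup overhead.
        if tag in ("h1", "h2", "h3"):
            self._prefix = "#" * int(tag[1]) + " "
        elif tag == "li":
            self._prefix = "- "

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.lines.append(self._prefix + text)
            self._prefix = ""

def html_to_context(doc: str) -> str:
    parser = MeshTextExtractor()
    parser.feed(doc)
    return "\n".join(parser.lines)

context = html_to_context("<h2>Schema</h2><ul><li>id: uuid</li><li>name: text</li></ul>")
# → "## Schema\n- id: uuid\n- name: text"
```

The agent loads `context`, never the HTML file itself, so the 3× token overhead applies only to storage, not to the context window.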
The outcome is identical from the agent's perspective — it loads clean text. But the authoring experience and the canonical artifact are completely different. The human-Claude editing loop operates directly on the HTML document, which means the structure, hierarchy, and visual organisation the human sees is exactly what exists in the file.
The tension this introduces: HTML is harder to diff, more fragile across many edit cycles, and the text extraction needs to reliably recover meaning from markup. These are solvable engineering problems — but they need to be designed for explicitly rather than assumed away.