Extending the Markdown Pipeline
How to add a new Markdown / MDX feature to zfb — when a directive is enough, when to write a Rust visitor, and when the AST converter itself needs a new arm.
ℹ️ What this page covers
The engine-side extension surface for zfb-content’s Markdown
pipeline. When you want a new syntax-level feature — admonition
variants, heading rewrites, custom code-block treatment, link
resolvers — this page tells you where to put it and how to wire it
in.
The pipeline that turns Markdown / MDX into HTML lives in
crates/zfb-content. It is plugin-shaped: parse to mdast, run mdast
visitors, convert to hast, run hast visitors, serialize. There is no
runtime / userland plugin loader — every visitor compiles into the
binary — but the surface for adding a new visitor in-tree is small
and stable, and that is where almost every “I want a new Markdown
feature” change lives.
If your feature is just a new directive name (:::callout,
::youtube, :badge), you do not need this page — the directive
registry handles it without touching Rust beyond a one-line
register call. See Custom Directives
for that path.
Decision: directive, visitor, or AST arm
| You want to … | Path | Where it goes |
|---|---|---|
| Add :::name / ::name / :name syntax that compiles to a JSX component | Register a directive | Pipeline::with_defaults + DirectiveRegistry (Custom Directives) |
| Rewrite existing AST nodes (slugify headings, wrap code blocks, swap <img> for a JSX marker, normalise links) | Write a visitor | New file under crates/ |
| Surface a Markdown construct that markdown-rs parses but zfb currently drops (tables, footnotes, math, definitions, reference-style links) | Extend the AST converter | mdast_to_hast in crates/ |
| Add brand-new Markdown syntax that markdown-rs cannot parse | Out of scope | Upstream change against the markdown crate, then a converter arm in zfb |
Most contributions land in the second row — a fresh visitor.
The two-phase pipeline
The pipeline runs two distinct passes over two different ASTs:
- Parse the input into mdast (markdown::mdast::Node) using markdown-rs with MDX-aware options.
- mdast visitors: MdastVisitor implementations rewrite the markdown AST in place. Run first because some transforms only make sense before HTML structure exists. The directive registry is an mdast visitor: it folds runs of paragraphs delimited by :::name / ::: into a single MDX JSX element, which is much easier on mdast than after <p> tags have appeared.
- Convert mdast → hast via mdast_to_hast. hast is zfb’s minimal HTML AST (HastNode).
- hast visitors: HastVisitor implementations rewrite the HTML AST in place. Most rewrites that target HTML element structure (heading anchors, <figure> wrappers, <img> → JSX markers, syntax highlighting) live here.
- Serialize hast to an HTML string in zfb_content::serializer.
Pick the phase that matches what you are operating on. Rule of
thumb: if you need to look at the original Markdown structure (a
directive, a paragraph run, a particular link reference style),
use the mdast phase. If you need to look at HTML element structure
(a <pre>, a heading level, an <img> with a width attribute), use
the hast phase.
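To make the phase order concrete, here is a minimal, self-contained sketch of steps 2 to 5. Everything in it is a stand-in for illustration: ToyPipeline, the reduced MdastNode / HastNode enums, TrimParagraphs, and MarkHeadings are hypothetical and much smaller than the real types in zfb-content. Parsing (step 1) is assumed to have happened already.

```rust
// Stand-in types for illustration; the real nodes in zfb-content are richer.
#[derive(Debug, PartialEq)]
pub enum MdastNode {
    Root { children: Vec<MdastNode> },
    Heading { depth: u8, text: String },
    Paragraph { text: String },
}

#[derive(Debug, PartialEq)]
pub enum HastNode {
    Root { children: Vec<HastNode> },
    Element { tag: String, attrs: Vec<(String, String)>, children: Vec<HastNode> },
    Text(String),
}

pub trait MdastVisitor {
    fn visit(&mut self, node: &mut MdastNode);
}
pub trait HastVisitor {
    fn visit(&mut self, node: &mut HastNode);
}

#[derive(Default)]
pub struct ToyPipeline {
    mdast_visitors: Vec<Box<dyn MdastVisitor>>,
    hast_visitors: Vec<Box<dyn HastVisitor>>,
}

impl ToyPipeline {
    pub fn add_mdast_visitor(&mut self, v: Box<dyn MdastVisitor>) { self.mdast_visitors.push(v); }
    pub fn add_hast_visitor(&mut self, v: Box<dyn HastVisitor>) { self.hast_visitors.push(v); }

    /// Runs phases 2 to 5 on an already-parsed tree.
    pub fn render(&mut self, mut tree: MdastNode) -> String {
        for v in self.mdast_visitors.iter_mut() {
            v.visit(&mut tree); // each visitor is handed the root exactly once
        }
        let mut hast = mdast_to_hast(&tree);
        for v in self.hast_visitors.iter_mut() {
            v.visit(&mut hast);
        }
        serialize(&hast)
    }
}

fn leaf(tag: &str, text: &str) -> HastNode {
    let children = vec![HastNode::Text(text.to_string())];
    HastNode::Element { tag: tag.to_string(), attrs: Vec::new(), children }
}

fn mdast_to_hast(node: &MdastNode) -> HastNode {
    match node {
        MdastNode::Root { children } => {
            HastNode::Root { children: children.iter().map(mdast_to_hast).collect() }
        }
        MdastNode::Heading { depth, text } => leaf(&format!("h{depth}"), text),
        MdastNode::Paragraph { text } => leaf("p", text),
    }
}

// Toy serializer: no escaping, which a real serializer would need.
fn serialize(node: &HastNode) -> String {
    match node {
        HastNode::Root { children } => children.iter().map(serialize).collect(),
        HastNode::Element { tag, attrs, children } => {
            let attrs: String = attrs.iter().map(|(k, v)| format!(" {k}=\"{v}\"")).collect();
            let inner: String = children.iter().map(serialize).collect();
            format!("<{tag}{attrs}>{inner}</{tag}>")
        }
        HastNode::Text(t) => t.clone(),
    }
}

/// Hypothetical mdast visitor: trims whitespace from paragraph text.
pub struct TrimParagraphs;
impl MdastVisitor for TrimParagraphs {
    fn visit(&mut self, node: &mut MdastNode) {
        match node {
            MdastNode::Root { children } => children.iter_mut().for_each(|c| self.visit(c)),
            MdastNode::Paragraph { text } => *text = text.trim().to_string(),
            _ => {}
        }
    }
}

/// Hypothetical hast visitor: tags every <h1> with a class (does not descend into it).
pub struct MarkHeadings;
impl HastVisitor for MarkHeadings {
    fn visit(&mut self, node: &mut HastNode) {
        match node {
            HastNode::Element { tag, attrs, .. } if tag.as_str() == "h1" => {
                attrs.push(("class".to_string(), "title".to_string()));
            }
            HastNode::Root { children } | HastNode::Element { children, .. } => {
                children.iter_mut().for_each(|c| self.visit(c));
            }
            _ => {}
        }
    }
}
```

The point of the split is visible in the two visitors: TrimParagraphs works on Markdown-level structure (a paragraph), while MarkHeadings only makes sense once an <h1> element exists.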
Visitor trait shape
Both visitor traits are intentionally small — one method, called once on a node, mutate in place:
pub trait MdastVisitor {
    fn visit(&mut self, node: &mut MdastNode);
}

pub trait HastVisitor {
    fn visit(&mut self, node: &mut HastNode);
}
The pipeline calls visit exactly once, with the root node.
Recursion is the visitor’s responsibility — there is no auto-walk.
A typical hast visitor looks like:
use crate::pipeline::{HastNode, HastVisitor};

pub struct MyPlugin;

impl HastVisitor for MyPlugin {
    fn visit(&mut self, node: &mut HastNode) {
        match node {
            HastNode::Root { children }
            | HastNode::Element { children, .. } => {
                for child in children {
                    // mutate `child` here, then recurse
                    self.visit(child);
                }
            }
            _ => {}
        }
    }
}
Visitors can carry state (per-document slug counters, configuration
options, references to a shared resource). HeadingLinksPlugin keeps
a HashMap<String, usize> for github-slugger-equivalent dedup;
SyntectPlugin holds an Arc<Highlighter> so the syntax theme is
shared across all code blocks in the build.
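As an illustration of visitor state, the sketch below keeps a per-document HashMap<String, usize> and uses it to deduplicate heading ids. It is not the real HeadingLinksPlugin: SlugCounter, slugify, and the reduced HastNode enum are hypothetical, and the slug rules are a deliberate simplification of github-slugger (for example, it does not guard against a generated "setup-1" colliding with a literal heading of that name).

```rust
use std::collections::HashMap;

// Reduced stand-in for zfb's HastNode; the real enum has more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum HastNode {
    Root { children: Vec<HastNode> },
    Element { tag: String, attrs: Vec<(String, String)>, children: Vec<HastNode> },
    Text(String),
}

pub trait HastVisitor {
    fn visit(&mut self, node: &mut HastNode);
}

/// Hypothetical stateful visitor: one instance per document.
#[derive(Default)]
pub struct SlugCounter {
    seen: HashMap<String, usize>,
}

impl SlugCounter {
    /// First use of a slug is returned as-is; repeats get "-1", "-2", ...
    fn unique(&mut self, base: &str) -> String {
        let count = self.seen.entry(base.to_string()).or_insert(0);
        let n = *count;
        *count += 1;
        if n == 0 { base.to_string() } else { format!("{base}-{n}") }
    }
}

/// Simplified slug: lowercase, keep alphanumerics, spaces become hyphens.
fn slugify(text: &str) -> String {
    text.trim()
        .to_lowercase()
        .chars()
        .filter_map(|c| match c {
            c if c.is_alphanumeric() => Some(c),
            ' ' | '-' => Some('-'),
            _ => None,
        })
        .collect()
}

/// Concatenates all text found under the given nodes.
fn text_of(nodes: &[HastNode]) -> String {
    nodes
        .iter()
        .map(|n| match n {
            HastNode::Text(t) => t.clone(),
            HastNode::Root { children } | HastNode::Element { children, .. } => text_of(children),
        })
        .collect()
}

impl HastVisitor for SlugCounter {
    fn visit(&mut self, node: &mut HastNode) {
        if let HastNode::Element { tag, attrs, children } = node {
            if matches!(tag.as_str(), "h1" | "h2" | "h3" | "h4" | "h5" | "h6") {
                let slug = self.unique(&slugify(&text_of(children)));
                attrs.push(("id".to_string(), slug));
            }
        }
        // Recursion is the visitor's job: there is no auto-walk.
        if let HastNode::Root { children } | HastNode::Element { children, .. } = node {
            for child in children {
                self.visit(child);
            }
        }
    }
}
```

Because the counter lives on the visitor, a fresh instance is needed for each document; reusing one across documents would leak suffixes from one page into the next.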
When to add a Core vs. Opt-in feature
Both live as Rust visitors, but where you wire them depends on how many consumers need them.
Core (wire into Pipeline::with_defaults) when the behaviour is
universal: every content-collection consumer would want it the same way,
with no legitimate reason to opt out. Examples: HeadingLinksPlugin,
CodeTitlePlugin, SyntectPlugin.
Opt-in (wire into Pipeline::with_defaults_and_features) when the
feature is valuable but not universally needed, or when it requires
project-specific config (a source map, a feature flag, custom options).
Examples: all 13 features in zfb-md-extras.
The promotion threshold follows the three-consumer rule: don’t extract until the same pattern has been written by hand in three different zfb consumer projects. One project’s convenience is a recipe.
Where files go
Core plugins live under crates/. Opt-in
features live under crates/. The convention is
one file per feature:
crates/zfb-content/src/plugins/
├── cjk_friendly.rs
├── code_title.rs
├── directives.rs
├── external_links.rs
├── heading_links.rs
├── resolve_links.rs
├── strip_md_ext.rs
├── syntect_plugin.rs
├── toc.rs # heading-marker TOC (wired via features)
└── util/
crates/zfb-md-extras/src/
├── admonitions_preset.rs
├── code_enrichment.rs
├── code_tabs.rs
├── github_alerts.rs
├── github_autolinks.rs
├── heading_marker_toc.rs
├── image_dimensions.rs
├── link_validation.rs
├── mermaid.rs
├── reading_time.rs
├── ruby.rs
├── toc_export.rs
└── transclude.rs
For Core plugins, add your file and re-export from
crates/:
// in plugins.rs
pub mod my_plugin;
pub use my_plugin::MyPlugin;
For Opt-in features, add your file and expose the feature from
crates/, gated on the corresponding
MarkdownConfig::features flag.
Tests typically live in a #[cfg(test)] mod tests {} block alongside
the plugin, with cross-plugin integration cases in
crates/. The existing
tests/ is the reference shape.
Wiring into the default pipeline
Pipeline::with_defaults() is the project-wide default plugin chain.
Adding your visitor there means every caller that uses the defaults
picks it up automatically. Append it in the right phase:
// in crates/zfb-content/src/pipeline.rs, inside Pipeline::with_defaults()
let mut p = Self::with_mdx();
// mdast phase
p.add_mdast_visitor(Box::new(AdmonitionsPlugin::new()));
// hast phase
p.add_hast_visitor(Box::new(HeadingLinksPlugin::new()));
p.add_hast_visitor(Box::new(CodeTitlePlugin::new()));
p.add_hast_visitor(Box::new(MyPlugin)); // <-- new
p.add_hast_visitor(Box::new(MermaidPlugin::new()));
p.add_hast_visitor(Box::new(SyntectPlugin::new(highlighter)));
p
If your plugin is opt-in, do not put it in with_defaults().
Wire it into Pipeline::with_defaults_and_features(), which accepts
a MarkdownFeatures config struct and appends only the visitors whose
flags are set. That is how all 13 features in zfb-md-extras are wired.
ResolveLinksPlugin and StripMdExtensionPlugin are handled separately
because they need a project-specific source map, not just a feature flag.
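The flag-gated wiring can be pictured with the sketch below. It is only a model: the field names on MarkdownFeatures are assumptions (the real struct is not shown on this page), and ToyPipeline records plugin names as strings instead of boxing real visitors. The positions of the gated plugins follow the ordering rules in the next section.

```rust
/// Hypothetical subset of feature flags; the real field names may differ.
#[derive(Debug, Default, Clone, Copy)]
pub struct MarkdownFeatures {
    pub transclude: bool,
    pub github_alerts: bool,
    pub image_dimensions: bool,
    pub code_enrichment: bool,
}

/// Toy pipeline that records visitor names in registration order.
#[derive(Debug, Default)]
pub struct ToyPipeline {
    pub mdast: Vec<&'static str>,
    pub hast: Vec<&'static str>,
}

impl ToyPipeline {
    pub fn with_defaults_and_features(f: MarkdownFeatures) -> Self {
        let mut p = ToyPipeline::default();

        // mdast phase: included content must be merged before anything else runs.
        if f.transclude {
            p.mdast.push("TranscludePlugin");
        }
        if f.github_alerts {
            p.mdast.push("GithubAlertsPlugin");
        }
        p.mdast.push("AdmonitionsPlugin");

        // hast phase: Core visitors always run; opt-in ones slot in around them.
        p.hast.push("HeadingLinksPlugin");
        p.hast.push("CodeTitlePlugin");
        if f.image_dimensions {
            p.hast.push("ImageDimensionsPlugin");
        }
        p.hast.push("SyntectPlugin");
        if f.code_enrichment {
            // Needs the per-line spans that the highlighter emits.
            p.hast.push("CodeEnrichmentPlugin");
        }
        p
    }
}
```

With every flag off, this model yields the same chain as the defaults; each flag adds exactly one visitor at a fixed position, so callers never have to reason about ordering themselves.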
Ordering matters
Visitor order is load-bearing. The defaults document the full
rationale in Pipeline::with_defaults_and_features’s doc comment, but the
rules that bite most often:
- HeadingLinksPlugin runs first in the hast phase. Anything that mutates headings later sees the slugified id attributes. TocPlugin and TocExportPlugin depend on these id values.
- CodeTitlePlugin runs before SyntectPlugin. SyntectPlugin replaces the entire <pre> element with a HastNode::Raw HTML fragment; once that happens, the data-meta attribute that carries title="…" is no longer reachable as structured AST.
- MermaidPlugin runs before SyntectPlugin. MermaidPlugin flags mermaid code blocks with data-mermaid="true"; SyntectPlugin uses that flag to skip them rather than syntax-highlighting the diagram source.
- CodeEnrichmentPlugin runs after SyntectPlugin. It post-processes the per-line <span class="line"> structure that syntect emits; it cannot run before syntect produces those spans.
- ImageDimensionsPlugin runs in the hast phase before SyntectPlugin. It only touches <img> elements and is order-independent relative to heading / code-block visitors.
- GithubAlertsPlugin runs in the mdast phase, before the mdast → hast conversion. It rewrites blockquote nodes; the AdmonitionsPlugin (which also runs in the mdast phase) reads the results independently.
- TranscludePlugin runs first in the mdast phase. Included content must be merged into the AST before any other visitor sees it.
When inserting a new plugin, ask: do I need to see element shapes
that a later plugin will erase? Run before that plugin. Do I need
the results of an earlier plugin’s rewrite (a generated id, a
synthesised JSX element)? Run after it.
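The CodeTitlePlugin / SyntectPlugin rule can be reproduced in miniature. In this sketch FakeCodeTitle and FakeHighlighter are hypothetical stand-ins, not the real plugins: the first wraps a titled <pre> in a <figure> with a <figcaption>, the second replaces every <pre> with an opaque Raw fragment. The data-meta format (title="…") is taken from the text above; everything else is assumed.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum HastNode {
    Root { children: Vec<HastNode> },
    Element { tag: String, attrs: Vec<(String, String)>, children: Vec<HastNode> },
    Text(String),
    Raw(String),
}

pub trait HastVisitor {
    fn visit(&mut self, node: &mut HastNode);
}

/// Returns the title from a <pre data-meta='title="..."'> element, if any.
fn pre_title(node: &HastNode) -> Option<String> {
    let HastNode::Element { tag, attrs, .. } = node else { return None };
    if tag.as_str() != "pre" {
        return None;
    }
    let (_, meta) = attrs.iter().find(|(k, _)| k.as_str() == "data-meta")?;
    let (_, rest) = meta.split_once("title=\"")?;
    rest.split_once('"').map(|(title, _)| title.to_string())
}

/// Stand-in for a code-title visitor: needs the structured <pre> to still exist.
pub struct FakeCodeTitle;
impl HastVisitor for FakeCodeTitle {
    fn visit(&mut self, node: &mut HastNode) {
        if let Some(title) = pre_title(node) {
            let pre = std::mem::replace(node, HastNode::Raw(String::new()));
            let caption = HastNode::Element {
                tag: "figcaption".to_string(),
                attrs: Vec::new(),
                children: vec![HastNode::Text(title)],
            };
            *node = HastNode::Element {
                tag: "figure".to_string(),
                attrs: Vec::new(),
                children: vec![caption, pre],
            };
            return; // do not descend, or the wrapped <pre> would be wrapped again
        }
        if let HastNode::Root { children } | HastNode::Element { children, .. } = node {
            for child in children {
                self.visit(child);
            }
        }
    }
}

/// Stand-in for a highlighter: erases the <pre> element's structure.
pub struct FakeHighlighter;
impl HastVisitor for FakeHighlighter {
    fn visit(&mut self, node: &mut HastNode) {
        let is_pre = matches!(&*node, HastNode::Element { tag, .. } if tag.as_str() == "pre");
        if is_pre {
            *node = HastNode::Raw("<pre class=\"hl\">highlighted</pre>".to_string());
        } else if let HastNode::Root { children } | HastNode::Element { children, .. } = node {
            for child in children {
                self.visit(child);
            }
        }
    }
}

/// Applies visitors in slice order, as a pipeline would.
pub fn run(order: &mut [Box<dyn HastVisitor>], mut tree: HastNode) -> HastNode {
    for v in order.iter_mut() {
        v.visit(&mut tree);
    }
    tree
}

/// True if any <figcaption> survived in the tree.
pub fn has_caption(node: &HastNode) -> bool {
    match node {
        HastNode::Element { tag, .. } if tag.as_str() == "figcaption" => true,
        HastNode::Root { children } | HastNode::Element { children, .. } => {
            children.iter().any(has_caption)
        }
        _ => false,
    }
}
```

Running the title visitor first leaves a <figcaption> in the tree; swapping the order silently drops the title, because by then the <pre> is an opaque string. That silent failure mode is why ordering mistakes are easy to miss without a test.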
Adding genuinely new syntax
Some Markdown constructs that markdown-rs parses are currently
dropped by the mdast → hast converter. Look at mdast_to_hast
in crates/:
// Unhandled: degrade to empty Raw so we never crash on
// unsupported input. Tables, footnotes, definitions, math,
// reference links/images, ESM, frontmatter, etc. fall here.
_ => HastNode::Raw(String::new()),
If you want zfb to surface tables, footnotes, definitions, math, reference-style links, or anything else that lives in the catch-all, two changes are needed:
- Add a match arm to mdast_to_hast that turns the mdast variant into the right HastNode::Element (or Raw for passthrough JSX/HTML). Mirror the existing arms: handle children with convert_children, build attributes as Vec<(String, String)>.
- Possibly toggle markdown::ParseOptions if the construct needs an extension flag. The current Pipeline::with_mdx() uses markdown::ParseOptions::mdx(); you may need a custom ParseOptions with additional constructs.* fields enabled. Check the markdown crate’s docs for the exact flag.
Tests for converter changes belong in pipeline.rs’s own
#[cfg(test)] mod tests {} block (the file already covers headings,
code blocks, links, images, lists, blockquotes, MDX JSX) plus a
matching round-trip case in crates/.
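A new converter arm might look roughly like the sketch below, which adds table support to a toy converter. The MdastNode and HastNode enums here are simplified stand-ins (markdown-rs's real mdast nodes carry positions, column alignment, and more), and the signatures of convert_children and the element helper are assumptions modelled on the description above. It also emits plain <tr> / <td> without a <thead>, which a real arm would likely want.

```rust
// Toy stand-ins; not the real markdown-rs or zfb-content types.
#[derive(Debug, Clone, PartialEq)]
pub enum MdastNode {
    Root { children: Vec<MdastNode> },
    Paragraph { children: Vec<MdastNode> },
    Text(String),
    Table { children: Vec<MdastNode> },
    TableRow { children: Vec<MdastNode> },
    TableCell { children: Vec<MdastNode> },
    Math(String), // deliberately left unhandled below
}

#[derive(Debug, Clone, PartialEq)]
pub enum HastNode {
    Root { children: Vec<HastNode> },
    Element { tag: String, attrs: Vec<(String, String)>, children: Vec<HastNode> },
    Text(String),
    Raw(String),
}

fn convert_children(children: &[MdastNode]) -> Vec<HastNode> {
    children.iter().map(mdast_to_hast).collect()
}

/// Hypothetical helper: an attribute-less element wrapping converted children.
fn element(tag: &str, children: &[MdastNode]) -> HastNode {
    HastNode::Element {
        tag: tag.to_string(),
        attrs: Vec::new(),
        children: convert_children(children),
    }
}

pub fn mdast_to_hast(node: &MdastNode) -> HastNode {
    match node {
        MdastNode::Root { children } => HastNode::Root { children: convert_children(children) },
        MdastNode::Paragraph { children } => element("p", children),
        MdastNode::Text(t) => HastNode::Text(t.clone()),
        // New arms: without these, tables fall through to the catch-all.
        MdastNode::Table { children } => element("table", children),
        MdastNode::TableRow { children } => element("tr", children),
        MdastNode::TableCell { children } => element("td", children),
        // Unhandled: degrade to empty Raw so unsupported input never crashes.
        _ => HastNode::Raw(String::new()),
    }
}
```

Keeping the catch-all last means every construct that has not been given an arm still degrades to an empty fragment, so adding one arm cannot change the behaviour of the others.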
What about runtime / userland plugins?
There is no plugin loader today. Every visitor compiles into the
binary. The closest the user-facing config (zfb.config) gets to
plugins is a plugins: [] field reserved for future use; it is not
yet wired into the build pipeline.
For now the practical extension model is: add the visitor in-tree.
The visitor traits are stable across the workspace, so a feature
written as a fresh plugins/ file rarely needs follow-up changes
when the rest of the codebase moves.
See also
- Markdown Features: full catalog of Core and Opt-in features, with per-feature ordering notes.
- Custom Directives: author-facing story for :::name / ::name / :name syntax, no Rust required.
- Customizing Markdown: what the Markdown rendering pipeline looks like from a content-collection consumer’s perspective.
- crates/zfb-content/src/pipeline.rs: Pipeline, MdastVisitor / HastVisitor traits, and the Pipeline::with_defaults() ordering rationale in its doc comment.
- crates/zfb-content/src/plugins/code_title.rs: small, stateless HastVisitor example.
- crates/zfb-content/src/plugins/heading_links.rs: stateful HastVisitor (per-document slug counter) example.
- crates/zfb-content/src/plugins/admonitions.rs: façade over DirectiveRegistry, an MdastVisitor example.