Writing a Zenzic Adapter
This guide explains how to create a third-party adapter that teaches Zenzic to understand your documentation engine's project layout, navigation structure, and i18n conventions — without modifying Zenzic itself.
What Is an Adapter
An adapter is a Python class that extends the BaseAdapter abstract base class
(src/zenzic/core/adapters/_base.py). Zenzic's
scanner, orphan detector, and link validator talk exclusively to this interface —
they never import or call engine-specific code directly.
An adapter answers questions for each docs tree through a single API surface:
Metadata-Driven Routing
| Method | Question |
|---|---|
get_route_info(rel) | What is the canonical URL, route status, slug, and proxy flag for this source file? Returns a RouteMetadata instance. |
Common Methods
| Method | Question |
|---|---|
is_locale_dir(part) | Is this top-level directory a non-default locale? |
resolve_asset(missing_abs, docs_root) | Does a default-locale fallback exist for this missing asset? |
resolve_anchor(resolved_file, anchor, anchors_cache, docs_root) | Should this anchor miss be suppressed because the anchor exists in the default-locale equivalent? |
is_shadow_of_nav_page(rel, nav_paths) | Is this file a locale mirror of a nav-listed page? |
get_ignored_patterns() | Which filename globs should the orphan check skip? |
get_nav_paths() | Which .md paths are listed in this engine's nav config? |
has_engine_config() | Was a build-engine config file found on disk? (Controls orphan check activation.) |
provides_index(directory_path) | Does this directory have an engine-provided landing page? (Controls MISSING_DIRECTORY_INDEX emission.) |
Step 1 — Create the Adapter Class
# my_engine_adapter/adapter.py
from __future__ import annotations
from pathlib import Path
from typing import Any
from zenzic.core.adapters import RouteMetadata
from zenzic.core.adapters._base import BaseAdapter
from zenzic.models.vsm import RouteStatus
class MyEngineAdapter(BaseAdapter):
"""Adapter for MyEngine documentation projects."""
def __init__(
self,
config: dict[str, Any],
docs_root: Path,
) -> None:
self._docs_root = docs_root
self._config = config
# Extract whatever your engine's config format provides.
self._nav_paths: frozenset[str] = self._parse_nav()
# ── BaseAdapter protocol ───────────────────────────────────────────────
def is_locale_dir(self, part: str) -> bool:
"""Return True when *part* is a non-default locale directory.
If your engine does not support i18n, always return False.
"""
locales: list[str] = self._config.get("locales", [])
return part in locales
def resolve_asset(self, missing_abs: Path, docs_root: Path) -> Path | None:
"""Return the default-locale fallback path for a missing locale asset.
If your engine does not support i18n asset fallback, always return None.
"""
return None
def resolve_anchor(
self,
resolved_file: Path,
anchor: str,
anchors_cache: dict[Path, set[str]],
docs_root: Path,
) -> bool:
"""Return True if an anchor miss on a locale file should be suppressed.
Called when a link points to a heading anchor that exists in the
default-locale file but not in the locale translation (because
headings are translated). Return True to suppress the false positive.
If your engine does not support i18n, always return False.
"""
return False
def has_engine_config(self) -> bool:
"""Return True when a build-engine config was found and loaded.
When False, the orphan check is skipped — with no nav information
there is no reference set to compare the file list against.
Return True if your adapter successfully loaded a config file.
Return False only if no engine config exists (bare/standalone mode).
"""
return bool(self._config)
def is_shadow_of_nav_page(self, rel: Path, nav_paths: frozenset[str]) -> bool:
"""Return True when *rel* is a locale mirror of a nav-listed page.
Example: docs/fr/guide/index.md shadows guide/index.md.
If your engine does not support i18n, always return False.
"""
if not rel.parts or not self.is_locale_dir(rel.parts[0]):
return False
default_rel = Path(*rel.parts[1:]).as_posix()
return default_rel in nav_paths
def get_ignored_patterns(self) -> set[str]:
"""Return glob patterns for files the orphan check should skip.
For suffix-mode i18n plugins, return patterns like {'*.fr.md', '*.it.md'}.
"""
return set()
def get_nav_paths(self) -> frozenset[str]:
"""Return the set of .md paths listed in the engine's nav, relative to docs_root."""
return self._nav_paths
def get_metadata_files(self) -> frozenset[str]:
"""Return engine-owned metadata files to ignore in findings."""
return frozenset({"myengine.toml"})
def provides_index(self, directory_path: Path) -> bool:
"""Return whether this engine serves an index page for the directory."""
index_rel = (directory_path / "index.md").as_posix().lstrip("/")
return index_rel in self._nav_paths
def get_extra_content_roots(self, repo_root: Path) -> list[Path]:
"""Return additional markdown roots outside docs_root."""
return []
def get_locale_source_roots(self, repo_root: Path) -> list[tuple[Path, str]]:
"""Return locale roots as (root_path, locale_label) tuples."""
return []
def get_absolute_url_prefixes(self, repo_root: Path | None = None) -> list[str]:
"""Return project-owned absolute URL prefixes."""
return []
# ── Metadata-Driven Routing ────────────────────────────────────────────
def get_route_info(self, rel: Path) -> RouteMetadata:
"""Return unified routing metadata for a source file.
The VSM builder calls this to construct RouteMetadata for every
source file in a single pass.
"""
posix = rel.as_posix()
# Determine reachability from nav config.
if posix in self._nav_paths:
status: RouteStatus = "REACHABLE"
else:
status = "ORPHAN_BUT_EXISTING"
# Compute canonical URL (adjust to your engine's routing rules).
stem = rel.with_suffix("").as_posix()
canonical_url = f"/{stem}/"
return RouteMetadata(
canonical_url=canonical_url,
status=status,
)
# ── Private helpers ────────────────────────────────────────────────────
def _parse_nav(self) -> frozenset[str]:
nav = self._config.get("nav", [])
paths: set[str] = set()
for entry in nav:
if isinstance(entry, str) and entry.endswith(".md"):
paths.add(entry.lstrip("/"))
return frozenset(paths)
Step 2 — Register via Entry Points
Zenzic discovers adapters through the zenzic.adapters entry-point group.
Register your adapter in your package's pyproject.toml:
[project.entry-points."zenzic.adapters"]
myengine = "my_engine_adapter.adapter:MyEngineAdapter"
The key (left of =) becomes the engine name users pass to --engine or
set as engine in .zenzic.toml:
# In the user's .zenzic.toml
[build_context]
engine = "myengine"
Step 3 — Implement the Factory Hook (Optional)
By default, Zenzic instantiates your adapter by calling:
adapter_class(context, docs_root)
where context is a BuildContext instance.
If your adapter needs repository-aware config loading, implement
from_repo(context, docs_root, repo_root) and load engine config there.
If your adapter needs a different constructor signature, implement a
from_repo(context, docs_root, repo_root) classmethod and Zenzic will prefer it:
@classmethod
def from_repo(
cls,
context: "BuildContext",
docs_root: Path,
repo_root: Path,
) -> "MyEngineAdapter":
config_path = repo_root / "myengine.toml"
config = {}
if config_path.exists():
import tomllib
with config_path.open("rb") as f:
config = tomllib.load(f)
return cls(config, docs_root)
Step 4 — Validate with Zenzic
After installing your package (uv pip install -e . or pip install -e .),
verify the adapter is discovered:
# List all installed adapters
zenzic check orphans --engine myengine --help
# Run against a real docs tree
zenzic check orphans --engine myengine
zenzic check all --engine myengine
Step 5 — Custom Rules Are Engine-Independent
[[custom_rules]] in .zenzic.toml run on raw Markdown source and are
completely decoupled from the adapter layer. A rule that searches for DRAFT
will fire identically whether the adapter is MkDocs, Zensical, or your own
engine. No extra work is required to make custom rules compatible with a new
adapter.
Step 6 — Declare Link-Scheme Bypasses (Optional)
If your engine uses a non-standard URI scheme for internal links, implement
get_link_scheme_bypasses() to tell the Core which scheme names to exempt from
the Z105 absolute-path check and the unknown-scheme error (Rule R21 — Protocol
Sovereignty):
def get_link_scheme_bypasses(self) -> frozenset[str]:
"""Return URI scheme names this engine uses legitimately.
The validator adds ``<scheme>:`` to its skip list for each returned name,
suppressing both the unknown-scheme warning and the Z105 absolute-path check
for URLs that use that scheme.
Return ``frozenset()`` if your engine has no special link-scheme bypass.
"""
return frozenset()
Most engines return frozenset(). The built-in DocusaurusAdapter returns
frozenset({"pathname"}) because Docusaurus uses pathname:/// links for
static-asset references that bypass the React router — the leading / in the
path component is a URI convention artifact, not a server-absolute path.
The Core never hardcodes engine names. Engine-specific behaviour is declared in
the adapter and queried by the Core via this method. Adding a new adapter that
needs a link-scheme bypass requires zero changes to validator.py.
Adapter Contract Guarantees
Your adapter must satisfy these invariants, or Zenzic's scanner may produce incorrect results:
-
get_route_info()must return aRouteMetadatawith acanonical_urlthat starts and ends with
/. -
get_route_info()must setstatusto one ofREACHABLE,ORPHAN_BUT_EXISTING, orIGNORED. Never returnCONFLICT— that status is assigned later by_detect_collisions(). -
get_nav_paths()returns paths relative todocs_root, using forwardslashes, with no leading
/. -
get_nav_paths()returns only.mdfiles (other extensions are ignored bythe orphan checker).
-
is_locale_dir()must returnFalsefor the default locale. Onlynon-default locale directories should return
True. -
All methods must be pure: same inputs always produce the same outputs.
No I/O, no global-state mutation.
-
resolve_asset()must never raise — returnNoneon any failure. -
resolve_anchor()must never raise — returnFalseon any failure.The
anchors_cacheargument is read-only; do not mutate it. -
has_engine_config()must never raise — returnFalseon any failure. -
provides_index(directory_path)is the only method permitted to do I/O.It is called once per directory during the discovery phase — never inside per-link or per-file hot loops — so a single
Path.exists()call is acceptable. ReturnTrueif your engine will generate a landing page for the directory (e.g. viaindex.md,README.md, or a dynamic config entry like_category_.jsonwith"link": {"type": "generated-index"}). Never raise — returnFalseon any I/O failure. -
get_link_scheme_bypasses()must return afrozenset[str]of scheme names(without the trailing colon) — never
None, never raise. Returnfrozenset()if your engine has no special link-scheme bypass requirement.
Testing Your Adapter
Use zenzic.core.adapters.BaseAdapter as the typing target in your tests to
verify protocol compliance:
from zenzic.core.adapters import BaseAdapter
from my_engine_adapter.adapter import MyEngineAdapter
def test_satisfies_protocol() -> None:
adapter = MyEngineAdapter(config={}, docs_root=Path("/tmp/docs"))
assert isinstance(adapter, BaseAdapter)
def test_nav_paths_relative() -> None:
adapter = MyEngineAdapter(
config={"nav": ["index.md", "guide/setup.md"]},
docs_root=Path("/tmp/docs"),
)
paths = adapter.get_nav_paths()
assert "index.md" in paths
assert "guide/setup.md" in paths
assert all(not p.startswith("/") for p in paths)
Concrete Implementation Examples: Docusaurus vs. Standalone
To see how these adapter methods are implemented in practice, here is a comparison between the full engine-aware DocusaurusAdapter and the zero-assumption StandaloneAdapter.
provides_index() — Does this directory have a landing page
The Core calls provides_index(directory_path) once per directory during orphan detection. It answers: "Will the engine generate a browsable index for this directory, so that files inside it are not structurally orphaned?"
DocusaurusAdapter.provides_index() — full engine awareness:
def provides_index(self, directory_path: Path) -> bool:
# Physical index files — Docusaurus serves these directly.
index_files = ("index.md", "index.mdx", "README.md", "README.mdx")
if any((directory_path / f).exists() for f in index_files):
return True
# _category_.json with "generated-index" link — Docusaurus auto-generates
# a category landing page even without a physical index file.
category_json = directory_path / "_category_.json"
if category_json.exists():
try:
import json as _json
data = _json.loads(category_json.read_text(encoding="utf-8"))
link = data.get("link", {})
return isinstance(link, dict) and link.get("type") == "generated-index"
except Exception:
return True # conservative: assume it provides an index
return False
StandaloneAdapter.provides_index() — zero engine assumptions:
def provides_index(self, directory_path: Path) -> bool:
# No engine config — only a plain index.md signals a landing page.
return (directory_path / "index.md").exists()
Key difference: DocusaurusAdapter knows about _category_.json and README.mdx because those are Docusaurus conventions. StandaloneAdapter makes no assumptions — it recognises only the universal index.md convention.
get_nav_paths() — What files are discoverable
get_nav_paths() returns the set of file paths reachable via the site's navigation UI. A file absent from this set is a candidate for Z402 (ORPHAN_BUT_EXISTING).
DocusaurusAdapter.get_nav_paths() — three-source aggregation:
def get_nav_paths(self) -> frozenset[str]:
if self._sidebar_path is not None:
sidebar_paths = _parse_sidebars(self._sidebar_path, self._docs_root)
if sidebar_paths is not None:
# Explicit sidebar: merge with navbar paths.
# A file is REACHABLE if it appears in the sidebar OR the navbar.
return sidebar_paths | self._navbar_paths
# Autogenerated or no sidebar: all files are already REACHABLE.
return frozenset()
self._navbar_paths is populated by _parse_config_navigation() from docusaurus.config.* — it extracts to: URL paths and docId: attributes from navbar and footer items. A file linked only in the footer is still considered discoverable (UX-Discoverability Law, Rule R21).
StandaloneAdapter.get_nav_paths() — intentionally empty:
def get_nav_paths(self) -> frozenset[str]:
"""Empty frozenset — no engine config means no declared nav."""
return frozenset()
When get_nav_paths() returns an empty frozenset, get_route_info() treats all files as REACHABLE. This is intentional: in Standalone mode there is no navigation contract, so orphan detection (Z402) is disabled.
_classify_route() — Internal route classification
_classify_route(rel, nav_paths) is a private helper that get_route_info() calls internally to map a source file path to its route status.
DocusaurusAdapter._classify_route() — four classification rules:
def _classify_route(self, rel: Path, nav_paths: frozenset[str]) -> RouteStatus:
# Rule 1: Private/meta files (e.g. _category_.json) → IGNORED
non_sentinel_parts = [p for p in rel.parts if p != "_version_"]
if any(part.startswith("_") for part in non_sentinel_parts):
return "IGNORED"
# Version Ghost Routes: files under _version_/<label>/ are always REACHABLE.
if len(rel.parts) >= 2 and rel.parts[0] == "_version_":
return "REACHABLE"
# Rule 2: No explicit nav → autogenerated sidebar → all REACHABLE
if not nav_paths:
return "REACHABLE"
# Rule 3: Explicit nav match (sidebar or navbar)
if rel.as_posix() in nav_paths:
return "REACHABLE"
# Locale shadows inherit nav membership
if self.is_shadow_of_nav_page(rel, nav_paths):
return "REACHABLE"
# Ghost Routes: locale entry points (e.g. it/index.mdx)
if rel.name in ("index.md", "index.mdx") and len(rel.parts) == 2:
if rel.parts[0] in self._locale_dirs:
return "REACHABLE"
# Rule 4: File exists but is not discoverable via any UI entry point
return "ORPHAN_BUT_EXISTING"
StandaloneAdapter.get_route_info() — always reachable (no _classify_route() helper):
def get_route_info(self, rel: Path) -> RouteMetadata:
"""Return route metadata derived purely from the filesystem.
StandaloneAdapter has no engine config, no nav, no slug support.
Every file is ``REACHABLE`` with a filesystem-derived URL.
"""
from zenzic.core.adapters._base import RouteMetadata
return RouteMetadata(
canonical_url=self._map_url(rel),
status="REACHABLE",
)
The contrast is stark. DocusaurusAdapter implements a four-rule priority cascade in _classify_route(), which get_route_info() calls internally. StandaloneAdapter has no _classify_route() helper — get_route_info() returns REACHABLE directly because with no engine there is no navigation contract.
get_link_scheme_bypasses() — Engine-specific URI schemes
Rule R21 (Protocol Sovereignty) mandates that the Core never hardcodes engine names in validation logic. Engine-specific URI schemes are declared by the adapter and queried by the Core.
DocusaurusAdapter.get_link_scheme_bypasses():
def get_link_scheme_bypasses(self) -> frozenset[str]:
# Docusaurus uses pathname:/// links to reference static/ files that
# bypass the React router. The leading / is a URI convention artifact,
# not a server-absolute path — suppress Z105 for these links.
return frozenset({"pathname"})
StandaloneAdapter.get_link_scheme_bypasses():
def get_link_scheme_bypasses(self) -> frozenset[str]:
"""Standalone projects have no engine-specific link-scheme bypass."""
return frozenset()
When a link like pathname:///assets/brand.html is encountered in a Docusaurus project, the Core checks adapter.get_link_scheme_bypasses(). It finds "pathname" in the returned set and skips the Z105 (absolute path) check. In a Standalone project, the same link triggers Z105 — correctly, because pathname:/// is a Docusaurus-specific escape hatch with no meaning in a generic Markdown project.
Connect adapter code to deployment truth:
-
Register engine identity in project configuration via
[build_context] engine -
Validate adapter behavior under strict Zenzic policy:
zenzic check all --engine myengine --strict. For run controls, see CLI Commands: Global flags. -
If your engine generates synthetic locale routes, explicitly map Ghost Route
expectations against the VSM reference: Core Mechanics — VSM.