AI-Maintained Knowledge Bases

AI-maintained knowledge bases are structured markdown wikis built and updated by large language model (LLM) agents rather than by human editors. The paradigm was popularized by Andrej Karpathy in April 2026 through a GitHub Gist that introduced the "LLM Wiki" pattern, which frames knowledge management through a compiler metaphor: source documents are pre-processed into an interlinked wiki, and all subsequent reasoning works from this compiled output rather than re-deriving knowledge from scratch on every query.[^c2] In this model, the human curates what enters the system while the LLM handles all maintenance and cross-referencing.[^c1]

The core architecture consists of three layers: an immutable raw/ directory containing original source documents, a wiki/ directory of LLM-generated markdown pages with cross-references, and a schema configuration that governs how the LLM maintains the wiki.[^c4] The system supports three primary operations: Ingest (processing new sources into the wiki), Query (synthesizing answers from compiled pages), and Lint (health-checking for contradictions, orphans, and broken links). A single ingest operation typically updates between five and fifteen wiki pages as the agent traces implications across the knowledge graph.
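The structural half of the Lint operation (orphans and broken links) can be illustrated with a minimal sketch. It is not taken from any particular implementation and rests on several assumptions: pages are .md files in a single flat wiki/ directory, cross-references use `[[Page Name]]` wikilinks whose target matches a file stem, and a page named `index` is treated as the entry point and so is never reported as an orphan. The function name `lint_wiki` and the `roots` parameter are hypothetical. Contradiction detection is omitted because it would require an LLM call.

```python
import re
from pathlib import Path

# Assumption: cross-references are written as [[Page Name]], optionally with
# an alias ([[Page Name|label]]) or an anchor ([[Page Name#section]]).
# The source does not specify a link syntax.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def lint_wiki(wiki_dir, roots=("index",)):
    """Report broken links and orphan pages in a flat directory of .md pages.

    `roots` names entry-point pages (hypothetical convention) that are
    allowed to have no inbound links.
    """
    pages = {
        path.stem: path.read_text(encoding="utf-8")
        for path in Path(wiki_dir).glob("*.md")
    }
    inbound = {name: 0 for name in pages}  # inbound link count per page
    broken = []                            # (source page, missing target)

    for name, text in pages.items():
        # Deduplicate targets so repeated links from one page count once.
        targets = {match.strip() for match in WIKILINK.findall(text)}
        for target in targets:
            if target == name:
                continue  # a self-link does not rescue a page from orphanhood
            if target in pages:
                inbound[target] += 1
            else:
                broken.append((name, target))

    orphans = sorted(
        name for name, count in inbound.items()
        if count == 0 and name not in roots
    )
    return {"broken_links": sorted(broken), "orphans": orphans}
```

Calling `lint_wiki("wiki")` returns a dictionary with two lists, which an agent could use to decide which pages to repair or relink on its next maintenance pass.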

Empirical testing of the pattern has reported improvements in answer quality: wiki-assisted answers scored 4.9 out of 5, compared with 4.0 out of 5 for direct LLM queries. A set of ten source articles (~29,000 words) was processed into 57 interconnected wiki pages in approximately twenty minutes, with three contradictions automatically detected.[^c3][^c5] The pattern is distinguished from retrieval-augmented generation (RAG) by its emphasis on knowledge compounding over time, its use of immutable source verification, and its ability to operate with zero infrastructure beyond markdown files and an LLM.

The LLM Wiki concept rapidly spawned a broad ecosystem of implementations and extensions. Within a week of the original gist, community projects added features such as confidence scoring, typed relationship graphs, Ebbinghaus-inspired retention decay, multi-agent governance, and integration with the Model Context Protocol (MCP) for agent tool access.[^c6] Implementations range from minimal, zero-dependency tools to production-grade systems supporting over thirty input formats, hybrid search, contradiction detection, and team collaboration workflows.