Sketch-to-figure (TikZ MCP)

The motivation is the gap between a researcher's rough idea for a figure and the fiddly, error-prone work of hand-writing TikZ to get a clean, vector, publication-ready result. The original CLI covers the whole loop from idea to PDF. It detects whether the input is direct text, a text file, or an image, then asks the model to produce a detailed specification covering title, layout, elements with shapes and colors, connections with styles, logical groupings, and style notes. From that spec it generates a complete LaTeX document, strips any markdown fences the model wrapped around the code, and compiles the result. Compilation is deliberately constrained: it runs under a timeout and with shell-escape disabled, so a generated document can neither hang the run nor execute shell commands.
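The fence-stripping and constrained compile steps can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual code: the function names, the regular expression, and the 60-second default timeout are assumptions made for the example. The pdflatex flags are standard TeX Live options.

```python
import re
import subprocess
from pathlib import Path

# Matches a fenced block with an optional language tag (```latex, ```tex, or bare ```).
FENCE_RE = re.compile(r"```[A-Za-z]*[ \t]*\n(.*?)```", re.DOTALL)


def extract_latex(reply: str) -> str:
    """Return the body of the first fenced block, or the whole reply if unfenced."""
    match = FENCE_RE.search(reply)
    return (match.group(1) if match else reply).strip()


def build_compile_command(tex_path: Path, out_dir: Path) -> list[str]:
    """Assemble a pdflatex invocation with shell-escape explicitly disabled."""
    return [
        "pdflatex",
        "-no-shell-escape",          # forbid \write18 shell commands
        "-interaction=nonstopmode",  # never wait for keyboard input
        "-halt-on-error",            # stop at the first error
        f"-output-directory={out_dir}",
        str(tex_path),
    ]


def compile_tex(tex_path: Path, out_dir: Path, timeout_s: float = 60.0) -> bool:
    """Run pdflatex; report failure on a non-zero exit, a timeout, or a missing binary."""
    try:
        result = subprocess.run(
            build_compile_command(tex_path, out_dir),
            capture_output=True,
            timeout=timeout_s,  # assumed default; the source does not give a value
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False
    return result.returncode == 0
```

Passing the command as a list rather than a shell string keeps the generated file name out of any shell parsing, which fits the same safety goal as disabling shell-escape.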

The defining mechanism is the vision-in-the-loop critique. After each compile, the rendered figure is shown back to the model, which returns an approval flag, a numeric score out of ten, and written feedback, all saved to a per-iteration critique file. If the figure is approved or the score meets the configured threshold (default eight out of ten), the loop stops and emits the final PDF; otherwise the feedback feeds the next revision, up to a maximum iteration count. This makes quality control self-driving: the model judges its own output by actually looking at the pixels, not by re-reading its own code, which is the right signal for a visual artifact.
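As a sketch of the stopping rule described above, the loop might look something like this. The `Critique` fields mirror the approval flag, score, and feedback named in the text. Everything else is hypothetical: the `refine` function, its callback parameters, the `critique_N.json` file names, and the default cap of five iterations. The model calls are passed in as plain callables so the control flow can be read on its own.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class Critique:
    approved: bool
    score: int      # out of ten
    feedback: str


def refine(
    spec: str,
    generate: Callable[[str, Optional[str]], str],  # (spec, feedback) -> TikZ source
    render: Callable[[str], bytes],                 # TikZ source -> rendered image bytes
    critique: Callable[[bytes], Critique],          # rendered image -> model's verdict
    out_dir: Optional[Path] = None,                 # where to save critique files, if anywhere
    threshold: int = 8,                             # default threshold from the text
    max_iterations: int = 5,                        # assumed cap; the source gives no number
) -> tuple[str, list[Critique]]:
    """Revise until approved, the score meets the threshold, or the cap is hit."""
    history: list[Critique] = []
    feedback: Optional[str] = None
    source = ""
    if out_dir is not None:
        out_dir.mkdir(parents=True, exist_ok=True)
    for iteration in range(1, max_iterations + 1):
        source = generate(spec, feedback)
        verdict = critique(render(source))
        history.append(verdict)
        if out_dir is not None:
            # One critique file per iteration, as described above.
            path = out_dir / f"critique_{iteration}.json"
            path.write_text(json.dumps(asdict(verdict), indent=2))
        if verdict.approved or verdict.score >= threshold:
            break
        feedback = verdict.feedback  # feeds the next revision
    return source, history
```

Note that the critique callable receives only the rendered image, never the TikZ source, which is the point of the design: the verdict is based on what the figure looks like.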

The second form is a remote MCP server that exposes the same engine to Claude Desktop. Here the conversation itself becomes the critique loop: the user chats, and the model writes TikZ and calls a compile tool that returns the rendered image inline, so it can see the result and self-correct. Two further tools round out the interface: an export tool that produces publication-ready SVG and PDF for a chosen revision, and a session-listing tool. Work is organized into per-user sessions with zero-padded revision numbers, each revision saved as paired tex, pdf, and png files and tracked in SQLite. Access is gated by OAuth with Google as the identity provider; the server issues its own opaque tokens, and usage is bounded by a weekly session quota.
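The session and revision layout can be illustrated with a small sketch. The table schema, the `rev_NNN` naming, the three-digit padding width, and the directory layout are all assumptions chosen for the example; the source specifies only per-user sessions, zero-padded revision numbers, paired tex/pdf/png files, and SQLite.

```python
import sqlite3
from pathlib import Path

REVISION_WIDTH = 3  # assumed padding width; the source only says "zero-padded"


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the (hypothetical) revisions table if needed."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS revisions (
               user_id    TEXT NOT NULL,
               session_id TEXT NOT NULL,
               revision   INTEGER NOT NULL,
               PRIMARY KEY (user_id, session_id, revision)
           )"""
    )
    return conn


def add_revision(conn: sqlite3.Connection, user_id: str, session_id: str) -> int:
    """Record the next revision number for a session, starting at 1."""
    row = conn.execute(
        "SELECT COALESCE(MAX(revision), 0) + 1 FROM revisions "
        "WHERE user_id = ? AND session_id = ?",
        (user_id, session_id),
    ).fetchone()
    revision = row[0]
    conn.execute(
        "INSERT INTO revisions (user_id, session_id, revision) VALUES (?, ?, ?)",
        (user_id, session_id, revision),
    )
    conn.commit()
    return revision


def revision_paths(root: Path, user_id: str, session_id: str, revision: int) -> dict[str, Path]:
    """Return the paired tex/pdf/png paths for one revision."""
    stem = f"rev_{revision:0{REVISION_WIDTH}d}"
    base = root / user_id / session_id
    return {ext: base / f"{stem}.{ext}" for ext in ("tex", "pdf", "png")}
```

Zero-padding keeps revisions in order under a plain lexicographic directory listing, and an export tool can resolve "revision 2 of session s1" to concrete files with a single call to `revision_paths`.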

The limits are worth stating plainly. The output is only as good as the underlying LaTeX toolchain and the model's TikZ fluency. The system depends on a working pdflatex and image-conversion tools on the host, and complex figures can still need several revisions or hit the iteration cap without converging. The two entry points reuse the same compile-and-render core, but the autonomous scoring loop is the CLI's distinctive contribution, while the MCP server trades that automated loop for an interactive, human-in-the-conversation one.