====================
Performance Playbook
====================

Large File Guidance
===================

- Prefer ``--streaming`` for high-row-count CSV, JSONL, and parquet inputs.
- Start with ``--chunk-size 500`` for low-memory runners and increase to ``1000`` or ``2000`` on developer machines.
- Reuse bundled templates through the registry instead of repeatedly resolving custom paths.
- Use ``scripts/benchmark_large_batches.py`` to measure the current branch on representative data before changing chunk sizes.

CPU And Memory Trade-Offs
=========================

- Smaller chunks reduce peak memory but increase file count.
- Larger chunks reduce file count but increase render and validation time per output file.
- Validation dominates for complex schemas, so benchmark end-to-end rather than just template rendering.
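
The chunk-size trade-off described above can be illustrated with a minimal sketch. It is not the tool's implementation: ``iter_chunks`` and ``output_file_count`` are hypothetical helpers, the 4500-row CSV is synthetic, and the sketch assumes one output file per chunk. It shows that only one chunk of rows is held in memory at a time, and that the number of output files grows as the chunk size shrinks.

```python
import csv
import io
import math
from itertools import islice
from typing import Iterable, Iterator, List


def iter_chunks(rows: Iterable, chunk_size: int) -> Iterator[List]:
    """Yield lists of at most chunk_size rows without loading the whole input."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(rows)
    while True:
        # islice pulls only the next chunk_size rows from the underlying reader.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def output_file_count(total_rows: int, chunk_size: int) -> int:
    """Number of output files, assuming (hypothetically) one file per chunk."""
    return math.ceil(total_rows / chunk_size)


# Synthetic in-memory CSV with 4500 data rows, standing in for a large input file.
sample = io.StringIO("id,value\n" + "".join(f"{i},{i * 2}\n" for i in range(4500)))
reader = csv.DictReader(sample)

# Peak memory is bounded by the largest chunk, not by the full input.
chunk_lengths = [len(chunk) for chunk in iter_chunks(reader, 2000)]
print(chunk_lengths)  # [2000, 2000, 500]

# Smaller chunks mean more output files for the same input.
for size in (500, 1000, 2000):
    print(size, output_file_count(4500, size))
```

Under these assumptions, 4500 rows produce 9 files at a chunk size of 500, 5 files at 1000, and 3 files at 2000. Actual memory use and per-file validation cost depend on the schema and templates, so measure with ``scripts/benchmark_large_batches.py`` rather than relying on this arithmetic alone.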