Teams Preserve Projects Through Model Swaps

Projects that embed large language models break when the model changes unless the team preserves the surrounding processes. Christopher Meiklejohn details five model swaps across three weeks, showing that the invariant that kept the project running was not the model but the CI-enforced permit-and-proof workflow. Enforcing a permit file in .caucus/permits/ and a proof file in .caucus/proofs/ produced reproducible, auditable outputs that kept merges safe as models varied. Different models behaved like different junior engineers: some delivered complete, gated work upfront; others needed the gate to force compliance. The practical lesson is to invest in team-level tooling and process, not model-specific hacks.
What happened - Christopher Meiklejohn ran five model configurations across three weeks, including `Opus 4.6`, `Opus 4.6` with the 1M context window, `GPT Codex 5.3`, `Composer 2`, and `Opus 4.7`. He observed that the only durable part of the system was the human and tooling layer that enforced discipline, not the models themselves. The gate in CI kept the project healthy while the model changed frequently.
Technical details - Meiklejohn implemented a permit-and-proof pattern enforced by CI. Each PR had to include a permit file in .caucus/permits/ declaring scope and risk, plus a proof file in .caucus/proofs/ recording outputs of allowlisted test commands. The CI blocked merges until both files were present and fresh. Behavior differences were stark: GPT Codex 5.3 produced the feature, the permit, and the proof in one or two commits and landed with the gate green. Composer 2 often produced the feature commit alone and required a sequence of chore commits to add and refresh proofs, producing as many as twenty-three commits on a branch before merging. Neither approach was objectively wrong, but the gate changed the tradeoffs.
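A minimal sketch of such a gate, to make the "present and fresh" rule concrete. Only the two directories come from the write-up; everything else here is an assumption for illustration: the per-branch JSON file names, the `scope`, `risk`, `commit`, and `runs` fields, the contents of `ALLOWED_COMMANDS`, and the definition of "fresh" as "the proof records the current head commit". The actual implementation may differ on all of these.

```python
import json
from pathlib import Path

# Hypothetical allowlist; the source does not list the actual commands.
ALLOWED_COMMANDS = {"pytest -q", "ruff check ."}


def check_gate(repo_root, branch, head_sha):
    """Return a list of gate failures; an empty list means the merge may proceed."""
    root = Path(repo_root)
    # Assumed layout: one JSON file per branch in each directory.
    permit_path = root / ".caucus" / "permits" / f"{branch}.json"
    proof_path = root / ".caucus" / "proofs" / f"{branch}.json"
    failures = []

    # The permit must exist and declare scope and risk.
    if not permit_path.is_file():
        failures.append("missing permit")
    else:
        permit = json.loads(permit_path.read_text())
        for field in ("scope", "risk"):
            if not permit.get(field):
                failures.append(f"permit lacks {field}")

    # The proof must exist before freshness can be judged.
    if not proof_path.is_file():
        failures.append("missing proof")
        return failures

    proof = json.loads(proof_path.read_text())
    # Assumed freshness rule: the proof was recorded against the head commit.
    if proof.get("commit") != head_sha:
        failures.append("stale proof")
    # Every recorded run must be an allowlisted command that succeeded.
    for run in proof.get("runs", []):
        if run.get("command") not in ALLOWED_COMMANDS:
            failures.append(f"command not allowlisted: {run.get('command')}")
        elif run.get("exit_code") != 0:
            failures.append(f"command failed: {run['command']}")
    return failures
```

A CI job could call `check_gate(".", branch, head_sha)` and exit non-zero when the list is non-empty. Under this assumed freshness rule, any new feature commit invalidates the proof, which is one plausible way a model that commits the feature first would end up adding a trail of chore commits to refresh it.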
Observed configurations - The experiment covered multiple agent-model pairings and revealed behavioral variance:
- Opus 4.6 and Opus 4.6 with 1M context
- GPT Codex 5.3
- Composer 2 agents
- Opus 4.7
Context and significance - This is an operational lesson in ModelOps and developer experience. Treating models as interchangeable contributors, like junior engineers on short contracts, shifts responsibility to durable processes: reproducible proofs, auditable permits, and CI gates. Those processes preserve code quality across model drift, quota exhaustion, or regressions. The pattern mirrors established practices in safety-critical engineering where the toolchain, not a specific component, guarantees system properties.
What to watch - Expect more teams to codify permit-and-proof patterns, and expect CI tooling to add first-class primitives for model-generated proofs and signed attestations. Measure success by reduced rework, stable CI pass rates, and fewer model-specific hacks.
Scoring Rationale
This is a practical operational lesson for teams deploying LLM agents. It does not introduce a new model or algorithm, but it provides a reproducible pattern with immediate value for engineering workflows. Freshness adjustment applied.


