Quorum
Six models argue a question. A seventh judges. The truth is somewhere in the disagreement.
Quorum is a multi-model debate harness. You give it a question — usually something with genuine uncertainty, where reasonable people disagree — and it orchestrates a structured debate between six different language models. A seventh model acts as judge, scoring each argument on logical coherence, evidence use, and novelty.
I built it as a research tool for my own essays. When I am writing about a topic where I have a strong prior, Quorum forces me to encounter the strongest versions of the opposing arguments. It is uncomfortably effective at this.
The most useful thing about a multi-model debate is not the conclusion. It is finding out which argument you cannot refute.
How It Works
- OpenRouter for model access — debate rounds cycle through Claude, GPT-4, Gemini, Llama, Mistral, and Command R
- DSPy for prompt orchestration — each round’s prompts are compiled, not hand-written
- Python for the harness — simple, synchronous, easy to debug
- Structured output for scoring — the judge model returns JSON with per-argument scores and reasoning
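The pieces above can be sketched as a single synchronous loop. This is a minimal illustration, not Quorum's actual code: the model slugs, prompt wording, and the judge's JSON schema are all assumptions, and in the real harness the prompts are compiled by DSPy rather than hand-written as here. Only the OpenRouter chat-completions endpoint itself is real.

```python
# Sketch of one debate round plus judging via OpenRouter's chat-completions
# API. Model slugs, prompts, and the score schema are illustrative
# assumptions, not Quorum's real configuration.
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"

DEBATERS = [  # hypothetical slugs; any six OpenRouter models would do
    "anthropic/claude-3.5-sonnet", "openai/gpt-4", "google/gemini-pro",
    "meta-llama/llama-3-70b-instruct", "mistralai/mistral-large",
    "cohere/command-r",
]
JUDGE = "openai/gpt-4"

def build_prompt(question: str, transcript: list[str]) -> str:
    """Fold the debate so far into the next debater's prompt."""
    history = "\n\n".join(transcript) or "(no arguments yet)"
    return (f"Question: {question}\n\nArguments so far:\n{history}\n\n"
            "Give your strongest argument. Rebut prior points you disagree with.")

def call_model(model: str, prompt: str) -> str:
    """One blocking completion call -- the harness is deliberately synchronous."""
    body = json.dumps({"model": model,
                       "messages": [{"role": "user", "content": prompt}]}).encode()
    req = urllib.request.Request(API_URL, data=body, headers={
        "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
        "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

def run_round(question: str, transcript: list[str]) -> list[str]:
    """Cycle every debater once, appending each argument to the transcript."""
    for model in DEBATERS:
        transcript.append(f"[{model}]\n"
                          + call_model(model, build_prompt(question, transcript)))
    return transcript

def judge_round(question: str, transcript: list[str]) -> list[dict]:
    """Ask the judge for per-argument JSON scores (schema is an assumption)."""
    prompt = (f"Question: {question}\n\nArguments:\n" + "\n\n".join(transcript)
              + "\n\nScore each argument 1-10 on coherence, evidence, and "
                "novelty. Reply with only a JSON array of objects: "
                '{"model": str, "coherence": int, "evidence": int, '
                '"novelty": int, "reasoning": str}')
    return json.loads(call_model(JUDGE, prompt))
```

Keeping the loop synchronous trades speed for debuggability: each round is a plain list of strings you can print and inspect.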
What I Learned
The surprising finding from running hundreds of debates is that model diversity matters more than model quality. A debate between six copies of the best model produces homogeneous, less informative results than a debate between six different models of varying quality. The disagreements are where the signal lives.
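One crude way to make "where the signal lives" concrete is to measure how much the six answers actually differ. The lexical metric below is an illustrative assumption on my part, not Quorum's method (the real judge is itself a model), but it shows the shape of the idea: identical answers carry no disagreement signal.

```python
# Illustrative proxy for debate diversity: mean pairwise lexical
# disagreement between answers. Not Quorum's actual scoring.
from itertools import combinations

def jaccard_distance(a: str, b: str) -> float:
    """1 - |shared words| / |all words|, over lowercase word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 0.0
    return 1.0 - len(wa & wb) / len(wa | wb)

def mean_disagreement(answers: list[str]) -> float:
    """Average Jaccard distance over all answer pairs: 0 = clones, 1 = disjoint."""
    pairs = list(combinations(answers, 2))
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)
```

Six copies of one model drift toward a score near zero; six heterogeneous models stay measurably apart, which is exactly the regime where the judge has something to adjudicate.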
Quorum is open source under MIT. It is rough around the edges but functional. Several of the essays on this site — particularly the sovereign AI thesis — were stress-tested through it before publication.