MAKER
MAKER (Maximal Agentic decomposition, first-to-ahead-by-K Error correction, and Red-flagging) is a task-agnostic orchestrator for long-horizon problems. It decomposes work into many small steps; at each step it samples LLM outputs, discards bad ones (red-flagging), and commits only when one parsed answer leads the next-best by k votes (“first-to-ahead-by-k”).
This implementation follows the framework described in Solving a Million-Step LLM Task with Zero Errors (Meyerson et al., 2025) — arXiv:2511.09030.
Import: from swarms.structs.maker import MAKER
When to use MAKER
| Use MAKER when… | Consider something else when… |
|---|---|
| You can express the problem as a fixed or conditionally bounded sequence of steps | You need a fixed DAG of different agents (GraphWorkflow) |
| Each step should be a single focused LLM call with statistical agreement | You want multi-agent debate + judge (DebateWithJudge) |
| You care about per-step reliability (voting + validation) over raw speed | You only need one-shot or simple majority across agents (MajorityVoting) |
How it works
flowchart TD
T[Task + max_steps] --> S[For each step]
S --> V[Sample votes via Agent.run]
V --> R{Red-flag?}
R -->|invalid / exception| V
R -->|valid| P[parse_response → hashable result]
P --> C{Leader ahead by k?}
C -->|no| V
C -->|yes| U[update_state, append result]
U --> S
- MAD (maximal agentic decomposition): You run up to max_steps iterations; each iteration is one micro-step with a prompt built from the task, optional state, step index, and the previous step’s result (format_prompt).
- First-to-ahead-by-k voting: Parsed answers are counted until some candidate’s count is at least k greater than every other candidate (do_voting). Optional run_parallel_voting batches the first round of samples with a thread pool.
- Red-flagging: Before parsing, validate_response can reject outputs (the default rejects empty text and text that is too long relative to max_tokens).
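The voting rule and red-flagging described above can be sketched without any LLM in the loop. This is an illustrative stand-in, not the library's do_voting: the function name first_to_ahead_by_k, its parameters, and the scripted replies are all assumptions made for the example.

```python
from collections import Counter
from typing import Callable, Hashable


def first_to_ahead_by_k(
    sample: Callable[[], str],
    parse: Callable[[str], Hashable],
    is_valid: Callable[[str], bool],
    k: int,
    max_samples: int = 100,
) -> Hashable:
    """Draw samples until one parsed answer leads every other by at least k votes."""
    votes: Counter = Counter()
    for _ in range(max_samples):
        raw = sample()
        if not is_valid(raw):
            continue  # red-flagged: discard without counting a vote
        votes[parse(raw)] += 1
        ranked = votes.most_common(2)
        leader, top = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        if top - runner_up >= k:
            return leader
    raise RuntimeError(f"no answer led by {k} votes within {max_samples} samples")


# Scripted stand-in for an LLM (hypothetical data):
# one empty reply (red-flagged), one dissenting vote, then agreement.
replies = iter(["", "B", "A", "A", "A"])
winner = first_to_ahead_by_k(
    sample=lambda: next(replies),
    parse=str.strip,
    is_valid=lambda text: bool(text.strip()),
    k=2,
)
print(winner)
```

Note that a single early vote never wins on its own when k is 2 or more: the leader has to open a gap of k over the runner-up, so dissenting samples lengthen the race rather than being outvoted by a fixed-size majority.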
Constructor parameters
| Parameter | Role |
|---|---|
| model_name, system_prompt, max_tokens, temperature, temperature_first | Passed through to per-step Agent instances (first vote often uses temperature_first=0). |
| k | Votes a winner must lead the runner-up by (higher ⇒ more reliable, more cost). |
| format_prompt(task, state, step_idx, previous_result) | Builds the user prompt for the current step. |
| parse_response(text) | Turns raw LLM output into a hashable result for voting (strings, numbers, tuples of primitives, etc.). |
| validate_response(text, max_tokens) | Returns False to discard a sample. |
| update_state(state, result, step_idx) | Fold step output into state (default: unchanged). |
| initial_state | Starting state for run / run_until_condition. |
| max_workers | Thread pool size for run_parallel_voting (default: k). |
| max_retries_per_step | Cap on samples per step before RuntimeError. |
| agents | Optional list of pre-built Agents; votes cycle through this pool instead of creating fresh micro-agents. |
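To make the callback signatures in the table concrete, here is a minimal sketch of the four hooks working together on plain strings. The State alias, the capital-listing task, and the 4-characters-per-token estimate are assumptions for illustration; only the argument lists follow the table.

```python
from typing import Any, Optional, Tuple

State = Tuple[str, ...]  # hypothetical state shape: the answers accepted so far


def format_prompt(task: str, state: Optional[State], step_idx: int, previous_result: Any) -> str:
    # Show the model what has already been accepted so it does not repeat itself.
    done = ", ".join(state) if state else "none"
    return f"{task}\nAlready listed: {done}\nGive item {step_idx + 1} only."


def validate_response(text: str, max_tokens: int) -> bool:
    # Red-flag empty replies and replies that look longer than the budget
    # (4 characters per token is a rough assumption).
    return bool(text.strip()) and len(text) // 4 <= max_tokens


def parse_response(text: str) -> str:
    # Normalize so that equivalent answers collapse into the same vote.
    return text.strip().splitlines()[0].strip().rstrip(".").lower()


def update_state(state: Optional[State], result: str, step_idx: int) -> State:
    # Return a new tuple instead of mutating the old state.
    return (state or ()) + (result,)


# Drive the hooks by hand with two scripted replies (no LLM involved).
state = None
for step_idx, reply in enumerate(["  Paris.\n", "ROME"]):
    if validate_response(reply, max_tokens=16):
        state = update_state(state, parse_response(reply), step_idx)

print(state)
print(format_prompt("List EU capitals.", state, 2, state[-1]))
```

Normalizing inside parse_response matters for voting: two samples only count as the same candidate if their parsed results compare equal, so stray whitespace or punctuation would otherwise split the vote.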
Main methods
| Method | Description |
|---|---|
| run(task, max_steps) | Run exactly max_steps voting rounds; returns list of per-step results. |
| run_until_condition(task, stop_condition, max_steps=1000) | Like run, but before each step the loop checks stop_condition(state, results, step_idx); if true, it exits without running another vote for that index. |
| run_parallel_voting(task, max_steps) | Like run but uses parallel sampling for the first batch of votes per step. |
| get_statistics() | Copy of internal counters (samples, votes, red-flags, per-step vote/sample lists). |
| reset() | Clears stats and conversation. |
| estimate_cost(total_steps, target_success_probability=0.95) | Heuristic cost / k guidance from paper-style estimates (uses run statistics when available). |
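For intuition about how k trades cost for reliability, the sketch below uses the standard gambler's-ruin result under a simplifying assumption: every valid vote is independently correct with probability p, and all wrong votes land on a single competing answer. It is not a description of what estimate_cost computes internally; the helper names step_success and min_k are hypothetical.

```python
def step_success(p: float, k: int) -> float:
    """P(correct answer reaches a k-vote lead first) in a two-candidate random walk."""
    return p**k / (p**k + (1 - p) ** k)


def min_k(p: float, total_steps: int, target: float = 0.95, k_max: int = 50) -> int:
    """Smallest k whose per-step success, raised to total_steps, meets the target."""
    for k in range(1, k_max + 1):
        if step_success(p, k) ** total_steps >= target:
            return k
    raise ValueError("no k up to k_max meets the target; is p above 0.5?")


# Example: 90% per-vote accuracy, 100 steps, 95% chance of an error-free run.
print(min_k(p=0.9, total_steps=100))
```

Under this model the per-step error shrinks geometrically in k, which is why a small increase in k can cover a much longer task, while the number of samples per step grows only roughly linearly.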
Minimal example
from swarms.structs.maker import MAKER


def format_prompt(task, state, step_idx, previous_result):
    prev = f"\nPrevious: {previous_result}" if previous_result is not None else ""
    return f"{task}\nStep {step_idx + 1} of the plan. One short line only.{prev}"


def parse_response(response: str) -> str:
    return response.strip().splitlines()[0]


def validate_response(response: str, max_tokens: int) -> bool:
    if not response.strip():
        return False
    return len(response) // 4 <= max_tokens  # rough token estimate, same idea as default


maker = MAKER(
    name="LineByLine",
    model_name="gpt-4.1-mini",
    system_prompt="Answer in one short line per step.",
    format_prompt=format_prompt,
    parse_response=parse_response,
    validate_response=validate_response,
    k=2,
    verbose=True,
)

results = maker.run(task="List three benefits of unit tests, one per step.", max_steps=3)
print(results)
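When the number of steps is not known in advance, run_until_condition takes a stop_condition(state, results, step_idx) callback instead of a fixed step count. The sketch below shows one possible stop condition and a small stand-in loop that mirrors the documented check-before-each-step behavior. The "DONE" convention, stop_when_done, and simulate_loop are assumptions for illustration, and simulate_loop replaces each voting round with a scripted reply.

```python
from typing import Any, Callable, List


def stop_when_done(state: Any, results: List[str], step_idx: int) -> bool:
    # Hypothetical convention: the model answers "DONE" when the plan is finished.
    return bool(results) and results[-1].strip().upper() == "DONE"


def simulate_loop(
    step: Callable[[int], str],
    stop_condition: Callable[[Any, List[str], int], bool],
    max_steps: int = 1000,
) -> List[str]:
    """Stand-in for the documented control flow; `step` replaces a full voting round."""
    state, results = None, []
    for step_idx in range(max_steps):
        if stop_condition(state, results, step_idx):
            break  # checked before the step, so no vote runs for this index
        results.append(step(step_idx))
    return results


scripted = ["write test", "run test", "DONE", "never reached"]
results = simulate_loop(lambda i: scripted[i], stop_when_done)
print(results)
```

With the real class, the same callback would be passed along the lines of maker.run_until_condition(task=..., stop_condition=stop_when_done, max_steps=50), following the signature listed under Main methods.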
Related
- Source: swarms/structs/maker.py (module and class docstrings mirror this behavior).
- MajorityVoting: multi-agent loops with a consensus agent, not step-wise first-to-ahead-by-k on a decomposed trajectory.