Git as a task database: running AI agents across machines without a server

Hive-Claw uses atomic git push-with-rebase as a distributed task queue so AI agents on different laptops can claim and complete work without a central coordinator.


Published: 2026-05-07 · Tags: AI, Distributed Systems · 9 min read

The Coordination Problem

When you run AI coding agents on multiple machines (say, three laptops, each running a Claude or GPT-4 agent), you need them to divide work without stepping on each other. The obvious solution is a central server with a task queue (Redis, SQS, RabbitMQ). But a central server means infrastructure, authentication, uptime monitoring, and cost. What if the coordination mechanism were already in every developer's toolkit?

Git's atomic push-with-rebase gives you exactly that.

Git as a Distributed Lock

The key insight is that git push is atomic at the ref level: the remote moves a branch only if the pushed commit descends from the branch's current tip. If two agents try to push to the same branch simultaneously, only one succeeds; the other gets a non-fast-forward rejection. This is not a bug; it behaves like a compare-and-swap (CAS) operation built into the way Git updates refs.
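
As a rough illustration of the compare-and-swap rule, the sketch below models a remote branch in memory. It does not invoke Git; `FakeRemote`, its `push` method, and the commit ids are hypothetical names chosen for the example.

```javascript
// Toy model of a remote branch ref. Not real Git: it only illustrates the
// compare-and-swap rule behind a non-fast-forward rejection.
class FakeRemote {
  constructor(initialTip) {
    this.tip = initialTip; // commit id the branch currently points at
  }

  // parentTip: the tip the pusher built its new commit on.
  // Returns true if the ref moved, false if the push was rejected.
  push(parentTip, newTip) {
    if (parentTip !== this.tip) {
      return false; // someone else pushed first: non-fast-forward
    }
    this.tip = newTip;
    return true;
  }
}

const remote = new FakeRemote("c0");
// Both agents fetched "c0" and committed on top of it.
const agentAPushed = remote.push("c0", "a1");
const agentBPushed = remote.push("c0", "b1");
console.log(agentAPushed, agentBPushed, remote.tip); // true false a1
```

Agent B's push fails because the tip it expected ("c0") is no longer the tip the remote holds, which is the same check a CAS instruction performs on a memory word.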

Hive-Claw stores task state in a JSON file committed to a shared repository:

{
  "tasks": [
    { "id": "task-001", "status": "pending", "claimedBy": null },
    { "id": "task-002", "status": "in-progress", "claimedBy": "agent-laptop-B" },
    { "id": "task-003", "status": "done", "claimedBy": "agent-laptop-A" }
  ]
}

An agent claiming a task follows this sequence:

  1. git pull --rebase origin main — get latest state
  2. Find a pending task, set its status to in-progress, and set claimedBy to the agent's identifier
  3. Write the updated JSON and git commit -m "claim: task-001 by agent-laptop-C"
  4. git push origin main

If step 4 fails because another agent pushed first, the agent catches the rejection, pulls again, and tries the next available task. This retry loop is the entire "queue" implementation.
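
The claim sequence and its retry loop can be sketched roughly as follows. This is a minimal illustration, not Hive-Claw's actual code: `claimNextPending`, `claimWithRetry`, `maxAttempts`, and the injected `pull` and `push` callbacks are assumed names, with `pull` standing in for git pull --rebase and `push` for commit plus git push (returning false on a non-fast-forward rejection).

```javascript
// Pure claim step: returns the id of the claimed task, or null if none is
// pending. Mutates `state` in place. Field names follow the tasks.json example.
function claimNextPending(state, agentId) {
  const task = state.tasks.find((t) => t.status === "pending");
  if (!task) return null;
  task.status = "in-progress";
  task.claimedBy = agentId;
  return task.id;
}

// Retry loop. `pull` resolves to the latest state; `push` resolves to true on
// success and false on a rejected push. Both are injected (hypothetical) so
// the loop can be exercised without a real repository.
async function claimWithRetry({ pull, push, agentId, maxAttempts = 5 }) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const state = await pull();
    const taskId = claimNextPending(state, agentId);
    if (taskId === null) return null; // nothing left to claim
    if (await push(state, `claim: ${taskId} by ${agentId}`)) return taskId;
    // Rejected: another agent pushed first. Pull again and re-evaluate.
  }
  throw new Error(`gave up after ${maxAttempts} rejected pushes`);
}
```

In a real agent, `pull` and `push` would shell out to git; injecting them keeps the loop testable. The `maxAttempts` cap is an assumption added so that a contended queue cannot spin forever.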

Conflict Resolution: Two Agents, One Task

The race condition — two agents both read task-001 as pending and both try to claim it — resolves through push rejection. Agent A wins the push; Agent B gets a non-fast-forward error. On pull-rebase, Agent B sees the file now shows task-001 claimed by Agent A. Agent B's local commit is rebased on top of this, producing a merge conflict in tasks.json.

We resolve this deterministically: Hive-Claw's rebase strategy always takes the incoming (remote) state of tasks.json and reapplies the claiming logic against the updated file. If task-001 is now taken, the agent moves to task-002. The rebase script produces a clean commit that claims the next available task.

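# During rebase conflict resolution.
# In a rebase, "ours" is the upstream branch being rebased onto (the remote state)
# and "theirs" is the local commit being replayed, so --ours keeps the remote file.
git checkout --ours tasks.json     # accept remote state
node scripts/claim-next-task.js    # re-run claim logic against updated file
git add tasks.json
GIT_EDITOR=true git rebase --continue   # continue without opening an editor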
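
This note does not show scripts/claim-next-task.js, so the following is only a guess at its core: read tasks.json, claim the first pending task, and write the file back. The function name `claimInFile` and its parameters are assumptions.

```javascript
// Hypothetical core of scripts/claim-next-task.js; the real script is not
// shown in this note.
const fs = require("fs");

// Reads the task file, claims the first pending task for `agentId`, and
// rewrites the file. Returns the claimed task id, or null if none is pending
// (in which case the file is left untouched).
function claimInFile(filePath, agentId) {
  const state = JSON.parse(fs.readFileSync(filePath, "utf8"));
  const task = state.tasks.find((t) => t.status === "pending");
  if (!task) return null;
  task.status = "in-progress";
  task.claimedBy = agentId;
  fs.writeFileSync(filePath, JSON.stringify(state, null, 2) + "\n");
  return task.id;
}
```

A wrapper could call `claimInFile("tasks.json", agentId)` and exit non-zero when it returns null, so the shell steps stop before git add when no work is left.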

Why Not Redis?

The comparison is fair. Redis Streams or BullMQ would give you sub-millisecond claim latency, versus the 1–3 seconds Hive-Claw spends on a pull-and-push round-trip to GitHub. For tasks that take seconds to hours (code generation, test runs, PR reviews), the difference is irrelevant.

What Hive-Claw gains:

  • Zero infrastructure: agents coordinate through a GitHub repo they already have access to
  • Full audit trail: every claim, completion, and failure is a git commit with a timestamp and author
  • Human-readable state: open tasks.json in any editor to see what every agent is doing
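  • Offline-tolerant: an agent can keep working on a task it has already claimed while offline and sync when reconnected; state lives in the repository, so no messages are lost (claiming a new task still requires a successful push)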
  • Free: GitHub private repos are free; no Redis bill

The tradeoff is throughput. Hive-Claw is designed for tens of tasks per hour, not thousands per second. For AI agent workflows — where each task takes minutes — this is the right operating point.

Failure Modes and Recovery

When an agent crashes mid-task, the task stays in the in-progress state with a stale claimedBy. To recover, Hive-Claw records a claimedAt timestamp on each claim and reads a configurable taskTimeoutMinutes value. A watchdog agent (running as a cron job) scans for tasks that have been in-progress longer than the timeout and resets them to pending.

{
  "id": "task-007",
  "status": "in-progress",
  "claimedBy": "agent-C",
  "claimedAt": "2026-05-07T03:12:00Z"
}

The watchdog commits a reset with a descriptive message: "reset: task-007 timed out after 30min". This commit appears in git log, giving operators a clear recovery history without any external monitoring system.
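
One way the watchdog's scan might look, assuming the field names shown above. `resetStaleTasks` is a hypothetical helper; clearing claimedBy and claimedAt on reset is also an assumption, since this note only says the task returns to pending.

```javascript
// Sketch of the watchdog's reset pass. Takes the parsed tasks.json, the
// current time, and taskTimeoutMinutes; mutates `state` in place and returns
// one commit message per task it reset.
function resetStaleTasks(state, now, taskTimeoutMinutes) {
  const timeoutMs = taskTimeoutMinutes * 60 * 1000;
  const messages = [];
  for (const task of state.tasks) {
    if (task.status !== "in-progress" || !task.claimedAt) continue;
    const ageMs = now.getTime() - Date.parse(task.claimedAt);
    if (ageMs > timeoutMs) {
      task.status = "pending";
      task.claimedBy = null; // assumption: the stale claim is cleared
      delete task.claimedAt;
      messages.push(`reset: ${task.id} timed out after ${taskTimeoutMinutes}min`);
    }
  }
  return messages;
}

const state = {
  tasks: [
    { id: "task-007", status: "in-progress", claimedBy: "agent-C", claimedAt: "2026-05-07T03:12:00Z" },
    { id: "task-008", status: "in-progress", claimedBy: "agent-A", claimedAt: "2026-05-07T03:40:00Z" },
  ],
};
const messages = resetStaleTasks(state, new Date("2026-05-07T03:45:00Z"), 30);
console.log(messages); // [ 'reset: task-007 timed out after 30min' ]
```

Only task-007 is reset: at 03:45 it has been claimed for 33 minutes, while task-008 has been claimed for 5. The watchdog would presumably publish the result through the same pull, commit, push cycle as any agent, so a reset that races with another update is rejected like any other conflicting push.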
