
Arcanist

Code generation got cheap.
Shipping it didn’t.

tryarcanist.com

01 / The Change

The software factory
is now obvious.

Give an agent repo access, tools, a sandbox, and a path to open PRs.

One sentence becomes a candidate code change.

Devin, Claude Code, Codex, Cursor. Every team is buying or building the same architecture.

What was a research bet last year is the status quo today. The factory is no longer the differentiator.

Anatomy of the software factory
01 · Prompt: A request from Slack, a ticket, or a teammate.
02 · Agent: The model plans, edits, and coordinates.
03 · Sandbox: A live environment to build and run.
04 · Tools: GitHub, CI, logs, docs, runtime.
05 · Candidate PR: A change is opened. Trust decides if it ships.
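
As a rough illustration, the five stages above can be read as a small pipeline. This is a minimal sketch, not Arcanist's implementation: `CandidatePR`, `run_factory`, and both stage callables are hypothetical names, and the stages are stand-ins rather than a real model, sandbox, or toolchain.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CandidatePR:
    """A proposed change plus whatever evidence the run produced (hypothetical type)."""
    title: str
    diff: str
    evidence: list[str] = field(default_factory=list)


def run_factory(
    prompt: str,
    plan_and_edit: Callable[[str], str],         # agent stage: prompt -> diff
    run_in_sandbox: Callable[[str], list[str]],  # sandbox + tools stage: diff -> evidence
) -> CandidatePR:
    """Chain the stages: prompt -> agent -> sandbox/tools -> candidate PR."""
    diff = plan_and_edit(prompt)
    evidence = run_in_sandbox(diff)
    return CandidatePR(title=prompt, diff=diff, evidence=evidence)


# Stand-in stages; a real factory would call a model and a live environment here.
pr = run_factory(
    "Add Adyen as a payment provider",
    plan_and_edit=lambda p: f"# diff for: {p}",
    run_in_sandbox=lambda d: ["unit tests: pass"],
)
print(pr.title, pr.evidence)
```

The sketch stops where the factory stops: it produces a candidate. Whether the evidence attached to it is enough to merge is the trust question.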

02 / Winners & Losers

The factory writes code.
Arcanist ships it.

Without trust

Engineering teams
become QA teams.

  • More PRs, same merge rate.
  • Senior engineers spend their day reviewing untrusted output.
  • Backlog grows faster than throughput.
  • Async work collapses back to sync.

With trust

Same headcount.
3× merge rate.

  • Engineers approve work that's already been validated end-to-end.
  • Backlog shrinks. Velocity compounds.
  • Cross-functional teams stop being bottlenecked by engineering.
  • Async work stays async.

By the time your senior engineer signs off on the next PR, the team with verified work has shipped three.

03 / AI-Native Levels

Five levels of AI-native engineering.
Arcanist gets you to Level 4.

Level | Name | What AI does | Limitation
L0·L1 | Autocomplete & chat | Inline completion. Multi-file diffs in the IDE. | Can’t take action
L2 | Synchronous agents | Plans and executes while you watch. | Needs your attention ← Most teams
L3 | Background agents that can do the work | Async PR. Reasons about your code. | Can’t verify its own work ← Most teams
L4 | Background agents that can check their work | Async PR. Reasons about your code, runs your stack end-to-end, ships with evidence. | Needs you to coordinate
L5 | Agent teams | Agents claim, review, and verify each other’s work. | Can’t coordinate as a team

May 2026 · L4 ships today with Arcanist

04 / The Path

Climb the levels with five capabilities.

Arcanist Capability | What it adds | What you unlock
Capability 1: Basic agent ops | Static analysis. Reads the repo, opens a PR. The Devin / Claude Code / Codex baseline. | Stuck at Level 3. PRs need manual checkout.
Capability 2: Runs your application | Installs deps, runs unit + e2e tests, opens a browser, exercises the UI. Diagnoses and fixes failures. | Frontend PRs ship verified.
Capability 3: Frontend previews | Preview link of any FE change running against your staging backend. | Reviewers stop interrupting their day.
Capability 4: Per-PR dev deploys | Scaled FE + BE stack spun up per PR. Full e2e of frontend and backend changes from one preview link. | High-complexity PRs ship verified. Stripe Minions tier.
Capability 5: Agentic verification | Agents drive the preview, query e2e, catch regressions before a human ever looks. | Reach Level 5. Any-complexity PR auto-merges.

05 / Memory

Every PR compounds.

Arcanist captures three kinds of context. Every run feeds them. Tribal knowledge becomes codified.

Strategic · Conventions
The shape your codebase has settled into. Naming, structure, the patterns your team has converged on without ever writing them down. Learned silently from every PR Arcanist merges.

Tactical · Directives
The explicit rules you’d otherwise repeat in every review. “Use apiClient, not raw fetch.” “Run prettier before commit.” These rules can be said once and followed forever.

Gotchas · Don’t make this mistake again
The failures your team has already paid for. Webhook idempotency, the missing index, the migration order that broke staging. Captured the moment they happen, so no one trips on the same wire twice.

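
One way to picture the three kinds of context is as tagged entries that get recalled per task. The sketch below is an assumption for illustration only: `Kind`, `MemoryEntry`, `recall`, and the tag-matching rule are hypothetical and say nothing about how Arcanist actually stores or retrieves memory.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    CONVENTION = "Convention"  # strategic: patterns the codebase has settled into
    DIRECTIVE = "Directive"    # tactical: explicit rules stated once
    GOTCHA = "Gotcha"          # failures the team has already paid for


@dataclass(frozen=True)
class MemoryEntry:
    kind: Kind
    text: str
    tags: frozenset  # hypothetical topic tags used for matching


def recall(entries, task_tags):
    """Return entries sharing at least one tag with the task, ordered by kind."""
    wanted = frozenset(task_tags)
    order = list(Kind)
    hits = [e for e in entries if e.tags & wanted]
    return sorted(hits, key=lambda e: order.index(e.kind))


memory = [
    MemoryEntry(Kind.GOTCHA, "Webhook handlers must be idempotent", frozenset({"payments", "webhooks"})),
    MemoryEntry(Kind.DIRECTIVE, "Use apiClient, not raw fetch", frozenset({"frontend"})),
    MemoryEntry(Kind.CONVENTION, "Payment providers extend BaseProvider", frozenset({"payments"})),
]

for entry in recall(memory, {"payments"}):
    print(f"{entry.kind.value}: {entry.text}")
```

Tag overlap is the simplest possible retrieval rule; it is used here only to keep the sketch short.
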
# payments
Engineer 11:02 AM
@arcanist add Adyen as a payment provider for the new EU checkout.
Arcanist APP 11:09 AM
PR open. Pulled three things from memory:
· Convention — payment providers extend BaseProvider (matches Stripe, Braintree)
· Directive — idempotency keys required on webhook routes (ENG-2491)
· Gotcha — signature verification needs raw body, not parsed (incident 2026-03-12)
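
The raw-body gotcha is worth seeing concretely. The sketch below assumes a generic HMAC-SHA256 hex signature over the request body; real payment providers define their own schemes, so the secret, payload, and `sign` / `verify` helpers are hypothetical. It shows why verifying a parsed and re-serialized body fails even when the data is identical.

```python
import hashlib
import hmac
import json


def sign(secret: bytes, raw_body: bytes) -> str:
    """Hex HMAC-SHA256 of the exact bytes (assumed scheme, for illustration)."""
    return hmac.new(secret, raw_body, hashlib.sha256).hexdigest()


def verify(secret: bytes, raw_body: bytes, signature: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    return hmac.compare_digest(sign(secret, raw_body), signature)


secret = b"hypothetical-shared-secret"
# Bytes exactly as sent on the wire; note the uneven spacing.
raw_body = b'{"amount": 100,  "currency":"EUR"}'
signature = sign(secret, raw_body)

# Verifying the raw bytes succeeds.
assert verify(secret, raw_body, signature)

# Parsing and re-serializing normalizes whitespace, so the bytes differ
# and the signature no longer matches.
reserialized = json.dumps(json.loads(raw_body)).encode()
assert reserialized != raw_body
assert not verify(secret, reserialized, signature)
```

In practice this means reading the request body as bytes before any JSON middleware touches it, then parsing only after the signature checks out.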

06 / The Product

Tag @arcanist where you already work.

Describe what you need in plain English. Arcanist runs your full stack, exercises the change, fixes its own failures, and opens a PR with verified artifacts: screenshots, video, test runs.

No new IDE. No new platform. No new tab.

Slack · Linear · GitHub

The bridge. Coding is the central nervous system. Once your agent knows your company, your CX, PM, and marketing teams get engineering-grade answers without filing a ticket.

# engineering
Engineer 2:14 PM
@arcanist Checkout is throwing 500s on /payments. Pull Sentry, find the root cause, and open a PR.
Arcanist APP 2:17 PM
Failure reproduced in dev deploy. Fix verified end to end. Tests pass. Checkout flow recorded.
4 tests added · video of fix · CI green
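
The "exercises the change, fixes its own failures" behavior described in this section can be sketched as a bounded check-and-fix loop that keeps its evidence. This is a minimal illustration under stated assumptions: `RunResult`, `verify_with_retries`, and the two callables are hypothetical, and the stand-in checks below simulate a failure that one fix resolves.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    passed: bool
    artifacts: list  # one log line per attempt, kept as evidence for the PR
    attempts: int


def verify_with_retries(run_checks, attempt_fix, max_attempts=3):
    """Run checks; on failure, attempt a fix and re-run, up to max_attempts."""
    artifacts = []
    for attempt in range(1, max_attempts + 1):
        passed, log = run_checks()
        artifacts.append(f"attempt {attempt}: {log}")
        if passed:
            return RunResult(True, artifacts, attempt)
        if attempt < max_attempts:
            attempt_fix(log)  # no fix after the final failed attempt
    return RunResult(False, artifacts, max_attempts)


# Stand-ins: the first run fails, the fix flips the state, the second run passes.
state = {"fixed": False}


def run_checks():
    return (True, "tests pass") if state["fixed"] else (False, "checkout test failed")


def attempt_fix(log):
    state["fixed"] = True


result = verify_with_retries(run_checks, attempt_fix)
print(result.passed, result.attempts, result.artifacts)
```

Capping the attempts matters: a run that cannot verify its change should surface the failure with its logs rather than loop indefinitely.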

07 / Build vs. Buy

Build the product.
Buy the factory.

Build it yourself

A side project
that becomes your job.

  • Senior engineers and forward-deployed engineers (FDEs) work in parallel, embedding into every team to learn how they actually ship.
  • Fall a quarter behind the frontier and your engineers quietly switch back to Cursor.
  • Infra, product, and every edge case live on your roadmap forever.
  • Every problem the industry has already solved, you solve again from scratch.

Buy Arcanist

An on-call team
already in your Slack.

  • Capabilities, memory, and infra are already shipped. Verified PRs from week one.
  • An FDE team on the frontier full-time, so your team doesn’t have to be.
  • Edge cases, bugs, capability gaps. Our problem, not yours.
  • Memory compounds across every teammate we onboard.

Every quarter you spend building the factory is a quarter your competitor spent improving their product. The gap doesn’t close.

Where is your
team today?

Let’s get you to L4 and L5.

shivam@tryarcanist.com · josiah@tryarcanist.com