Open-source code intelligence for AI agents

Stop paying for tokens your agents never needed

Your coding agents read entire files to find one line. greplm indexes your codebase once and returns only what matters — same answers, a fraction of the cost.


Up to 99% fewer tokens
74× faster
MCP-native
Fully offline

Same query. Different cost. "Where is SegmentWriter used?"

grep + read files · 12 files · 22,700 tokens

greplm search · 12 files · 474 tokens

−97.9% tokens, same answer

Get started

From zero to indexed in one session

No signup. No cloud. Runs entirely on your hardware.

greplm launch sequence: macOS · Linux · Windows

Drop-in for your agent

Wired into the tools your team already uses

greplm speaks MCP, so it plugs into Cursor, Claude Code, Copilot, and any MCP-aware client — then teaches the agent to reach for it instead of grep.

Detected & configured

Cursor: .cursor/rules
Claude Code: CLAUDE.md
GitHub Copilot: .github/
Any MCP client: mcp config

three commands · zero config

$ greplm setup         # Index your repo and warm the daemon; auto-detects agent files
$ greplm mcp config    # Print ready-to-paste MCP JSON with resolved paths
$ greplm agent add     # Teach your editor to prefer greplm over grep

Auto-detects .cursor/, .claude/, and .github/, plus memory files like CLAUDE.md and AGENTS.md — falling back gracefully when nothing is found.

Beyond grep

It answers the questions an agent asks before it edits

One warm index walks the call graph, resolves typed go-to-definition, matches AST patterns, and assembles exactly the code a task needs — on a token budget.

$ greplm callers references --limit 3    # Who calls this function

cmd_refs  → references   main.rs:896:17
dispatch  → references   daemon.rs:263:32

$ greplm impact add_doc --depth 2    # Blast radius via the reverse call graph

d0  function  add_doc            segment.rs:132-165
d1  function  index_full         indexer.rs:187-262
d1  function  index_incremental  indexer.rs:265-454
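
To make "blast radius via the reverse call graph" concrete, here is a minimal sketch of a depth-limited breadth-first walk over inverted call edges. It is an illustration of the general technique, not greplm's implementation: the `CALLS` table is hypothetical, the `cmd_index` caller is invented so the depth limit has something to cut off, and only the three function names from the sample output come from the page.

```python
from collections import deque

# Hypothetical call edges (caller -> callees). Only add_doc, index_full and
# index_incremental appear in the sample output; cmd_index is invented.
CALLS = {
    "index_full": ["add_doc"],
    "index_incremental": ["add_doc"],
    "cmd_index": ["index_full", "index_incremental"],
}


def reverse_graph(calls):
    """Invert caller -> callee edges into callee -> callers."""
    reverse = {}
    for caller, callees in calls.items():
        for callee in callees:
            reverse.setdefault(callee, []).append(caller)
    return reverse


def impact(symbol, calls, depth):
    """Breadth-first walk over the reverse call graph, at most `depth` hops."""
    reverse = reverse_graph(calls)
    hops = {symbol: 0}
    queue = deque([symbol])
    while queue:
        current = queue.popleft()
        if hops[current] == depth:
            continue  # do not expand beyond the requested depth
        for caller in sorted(reverse.get(current, [])):
            if caller not in hops:
                hops[caller] = hops[current] + 1
                queue.append(caller)
    # Order by distance first, then by name, for stable output.
    return sorted(hops.items(), key=lambda item: (item[1], item[0]))


for name, distance in impact("add_doc", CALLS, depth=2):
    print(f"d{distance}  {name}")
```

Each symbol is reported once, at its shortest distance from the edited function, which is why a breadth-first walk fits better here than a depth-first one.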

$ greplm ast 'fn $NAME() {}' --lang rust    # Structural search regex can’t express

agent.rs:86-88   fn dest(&self, …) -> PathBuf   NAME=dest

$ greplm pack "how incremental indexing works" --budget 4000    # Task context pack, budgeted

# 15 items · ~3489 / 4000 tokens
## function index_incremental (match)  indexer.rs:265-454  [17.9]
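
One plausible way to fill a context pack under a token budget is greedy selection by relevance score. The sketch below assumes that approach; the page does not say how greplm ranks or selects items. The `Snippet` type, the snippet sizes, and the scores 9.1 and 6.4 are hypothetical (17.9 is the score shown in the sample output), and tokens are estimated as characters / 4, rounded up.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    name: str
    score: float  # relevance score; higher is better
    text: str


def estimate_tokens(text: str) -> int:
    """Rough token estimate: characters / 4, rounded up."""
    return (len(text) + 3) // 4


def pack(snippets, budget):
    """Greedily keep the highest-scoring snippets that still fit the budget."""
    chosen, used = [], 0
    for snippet in sorted(snippets, key=lambda s: s.score, reverse=True):
        cost = estimate_tokens(snippet.text)
        if used + cost <= budget:
            chosen.append(snippet)
            used += cost
    return chosen, used


# Hypothetical candidates: 8,000, 6,000 and 4,000 characters of code.
candidates = [
    Snippet("add_doc", 6.4, "x" * 4000),
    Snippet("index_incremental", 17.9, "x" * 8000),
    Snippet("index_full", 9.1, "x" * 6000),
]

chosen, used = pack(candidates, budget=4000)
print([s.name for s in chosen], used)
```

Greedy selection is simple and never exceeds the budget, but it can leave room unused when a large low-ranked item is skipped; a real packer might trim items or fall back to smaller ones.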

search · symbols · callers · callees · impact · def · refs · ast · pack · blame · history · changed

Business impact

Your agents are burning budget on re-reads

Every search that dumps whole files into context is a tax on your API bill and your team’s velocity. greplm returns jump-ready answers — so sessions stay sharp and spend stays low.

One agent session · 8 codebase searches 200k context window

Without greplm grep + read files

181,600 tokens

22,700 per search × 8

Context nearly exhausted — session ends early

With greplm greplm search

3,792 tokens

474 per search × 8

Room for dozens more searches in the same session
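
The session figures above follow from simple arithmetic, reproduced here as a short sketch. The per-search token counts, the number of searches, and the 200k window come from the comparison; the function names are illustrative, and the calculation assumes searches are the only thing consuming context.

```python
CONTEXT_WINDOW = 200_000  # tokens, from the comparison above
SEARCHES = 8


def session_usage(tokens_per_search, searches=SEARCHES, window=CONTEXT_WINDOW):
    """Summarise how much of the context window a run of searches consumes."""
    used = tokens_per_search * searches
    remaining = window - used
    return {
        "used": used,
        "share_of_window": used / window,
        "extra_searches_that_fit": remaining // tokens_per_search,
    }


def reduction_pct(baseline_tokens, candidate_tokens):
    """Percentage of tokens saved relative to the baseline."""
    return 100 * (1 - candidate_tokens / baseline_tokens)


baseline = session_usage(22_700)  # grep + read files
greplm = session_usage(474)       # greplm search

print(baseline)
print(greplm)
print(f"{reduction_pct(22_700, 474):.1f}% fewer tokens per search")
```

Under these assumptions the grep-and-read session fills about 91% of the window with no room for a ninth search, while the greplm session leaves room for 413 more searches of the same size.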

Up to 99%

Fewer tokens per search

Same files returned, compact output

74×

Faster warm queries

Milliseconds, not full-tree scans

100%

Recall on benchmarks

Same results as grep-and-read

No data leaves your machine

Fully offline · no telemetry

Languages

14 languages. One index.

Symbols, call graphs, and AST search across your entire stack — from kernel C to Flutter apps.

14 languages indexed

One warm index powers every query — symbols, callers, blast radius, and AST patterns without re-scanning the tree.

symbols · callers · impact · ast · refs

Fully offline — nothing leaves your machine. Index once, query forever.

Benchmarks

Proven on production-scale codebases

Same files as grep-and-read — for a fraction of the tokens, answered from a warm index in milliseconds while ripgrep re-scans the whole tree.

greplm benchmark results on large open-source projects
Project	Languages	Files	Index time (one-off)	Search tokens saved	Pack tokens saved	Recall	Warm query (median)	Speedup vs ripgrep
React	JS / TS	6,723	2.0 s	99.7%	97.4%	100%	9 ms	12×
Odoo 18	Py / JS / XML	41,142	19.8 s	99.9%	99.3%	100%	33 ms	33×
Linux kernel	C	93,362	66.5 s	99.9%	98.4%	100%	31 ms	75×

Content search across all three suites

grep-and-read: 218.7M tokens · greplm: 280.5k tokens · 99.9% fewer

Tokens are estimated as characters / 4, applied identically to both engines. ripgrep runs in literal mode so both engines look for the same string. Warm query = median latency against the always-on daemon.
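
The measurement rules in this note can be sketched in a few lines. This is an illustration of the stated method rather than the benchmark harness itself: the function names are hypothetical and the latency samples are made up, while the two token totals are the ones reported for content search across all three suites.

```python
from statistics import median


def estimate_tokens(text: str) -> float:
    """Token estimate used for both engines: characters / 4."""
    return len(text) / 4


def percent_saved(baseline_tokens: float, candidate_tokens: float) -> float:
    """Share of baseline tokens avoided, as a percentage."""
    return 100 * (1 - candidate_tokens / baseline_tokens)


# Totals reported for content search across all three suites.
saved = percent_saved(218.7e6, 280.5e3)
print(f"{saved:.1f}% fewer tokens")

# Made-up warm-query samples in milliseconds; the median resists the outlier.
latencies_ms = [8.1, 9.0, 9.4, 12.7, 30.2]
print(f"median warm query: {median(latencies_ms)} ms")
```

Reporting the median rather than the mean keeps a single slow query from skewing the headline latency.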

Full methodology & per-query tables →