Open-source code intelligence for AI agents

Stop paying for tokens your agents never needed

Your coding agents read entire files to find one line. greplm indexes your codebase once and returns only what matters — same answers, a fraction of the cost.

  • Up to 99% fewer tokens
  • 74× faster
  • MCP-native
  • Fully offline
Same query. Different cost.

Query: "Where is SegmentWriter used?"

  • grep + read files: 12 files, 22,700 tokens
  • greplm search: 12 files, 474 tokens (−97.9%, same answer)
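The −97.9% figure follows directly from the two token counts in this comparison. A minimal check in Python (the function name is illustrative and not part of greplm):

```python
def percent_saved(baseline_tokens: int, compact_tokens: int) -> float:
    """Return the percentage of tokens saved relative to the baseline."""
    return (baseline_tokens - compact_tokens) / baseline_tokens * 100


# Figures from the comparison: grep + read files vs. greplm search.
saved = percent_saved(22_700, 474)
print(f"{saved:.1f}% fewer tokens")  # prints "97.9% fewer tokens"
```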

Get started

From zero to indexed in one session

No signup. No cloud. Runs entirely on your hardware.

greplm — launch sequence
  • macOS
  • Linux
  • Windows

Drop-in for your agent

Wired into the tools your team already uses

greplm speaks MCP, so it plugs into Cursor, Claude Code, Copilot, and any MCP-aware client — then teaches the agent to reach for it instead of grep.

Detected & configured

  • Cursor: .cursor/rules
  • Claude Code: CLAUDE.md
  • GitHub Copilot: .github/
  • Any MCP client: mcp config

Speaks MCP — works with any MCP-aware client.

three commands · zero config
  1. $ greplm setup
     Index your repo and warm the daemon; auto-detects agent files.
  2. $ greplm mcp config
     Print ready-to-paste MCP JSON with resolved paths.
  3. $ greplm agent add
     Teach your editor to prefer greplm over grep.

Auto-detects .cursor/, .claude/, and .github/, plus memory files like CLAUDE.md and AGENTS.md — falling back gracefully when nothing is found.
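To make the detection step concrete, here is a rough sketch of how such a check could work. It is not greplm's actual implementation: the directory and file names come from the sentence above, while the function name, lookup order, and return shape are assumptions made for illustration.

```python
from pathlib import Path

# Names taken from the page; lookup order and return shape are assumptions.
AGENT_DIRS = (".cursor", ".claude", ".github")
MEMORY_FILES = ("CLAUDE.md", "AGENTS.md")


def detect_agent_files(repo_root: str) -> list[str]:
    """Return the agent directories and memory files present in repo_root."""
    root = Path(repo_root)
    found = [name + "/" for name in AGENT_DIRS if (root / name).is_dir()]
    found += [name for name in MEMORY_FILES if (root / name).is_file()]
    return found  # an empty list means nothing was found; the caller falls back


if __name__ == "__main__":
    hits = detect_agent_files(".")
    print(hits or "no agent files detected; using defaults")
```

Returning an empty list rather than raising keeps the "nothing found" case an ordinary code path, which is one simple way to fall back gracefully.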

Beyond grep

It answers the questions an agent asks before it edits

One warm index walks the call graph, resolves typed go-to-definition, matches AST patterns, and assembles exactly the code a task needs — on a token budget.
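As an illustration of what walking the call graph for blast radius can look like, the sketch below runs a breadth-first search over a reverse call graph with a depth limit. It shows the general technique, not greplm's internals; the edge list and the `impact` helper are hypothetical.

```python
from collections import defaultdict, deque


def impact(calls: list[tuple[str, str]], target: str, max_depth: int) -> dict[str, int]:
    """Map each function that can reach `target` within max_depth to its depth."""
    callers_of = defaultdict(set)  # reverse call graph: callee -> callers
    for caller, callee in calls:
        callers_of[callee].add(caller)

    depth = {target: 0}
    queue = deque([target])
    while queue:
        fn = queue.popleft()
        if depth[fn] == max_depth:
            continue  # do not expand beyond the requested depth
        for caller in sorted(callers_of[fn]):
            if caller not in depth:  # first visit is the shortest path
                depth[caller] = depth[fn] + 1
                queue.append(caller)
    return depth


# Hypothetical (caller, callee) edges, for illustration only.
edges = [("index_full", "add_doc"), ("index_incremental", "add_doc"), ("main", "index_full")]
print(impact(edges, "add_doc", max_depth=2))
# {'add_doc': 0, 'index_full': 1, 'index_incremental': 1, 'main': 2}
```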

Who calls this function:
$ greplm callers references --limit 3
cmd_refs  → references   main.rs:896:17
dispatch  → references   daemon.rs:263:32

Blast radius via the reverse call graph:
$ greplm impact add_doc --depth 2
d0  function  add_doc            segment.rs:132-165
d1  function  index_full         indexer.rs:187-262
d1  function  index_incremental  indexer.rs:265-454

Structural search that regex can’t express:
$ greplm ast 'fn $NAME() {}' --lang rust
agent.rs:86-88   fn dest(&self, …) -> PathBuf   NAME=dest

Task context pack, budgeted:
$ greplm pack "how incremental indexing works" --budget 4000
# 15 items · ~3489 / 4000 tokens
## function index_incremental (match)  indexer.rs:265-454  [17.9]

search · symbols · callers · callees · impact · def · refs · ast · pack · blame · history · changed
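One plausible way to fill a context pack under a token budget is a greedy pass over candidates sorted by relevance. The sketch below assumes that approach; how greplm actually ranks and selects items is not stated here, and every candidate, score, and token count is made up for illustration.

```python
def pack(items: list[tuple[str, float, int]], budget: int) -> tuple[list[str], int]:
    """Greedily select (name, score, tokens) items by score without exceeding budget."""
    chosen, used = [], 0
    for name, score, tokens in sorted(items, key=lambda it: it[1], reverse=True):
        if used + tokens <= budget:  # skip anything that would overflow the budget
            chosen.append(name)
            used += tokens
    return chosen, used


# Hypothetical candidates: (symbol, relevance score, estimated tokens).
candidates = [
    ("index_incremental", 17.9, 2400),
    ("index_full", 9.5, 1900),
    ("add_doc", 6.0, 700),
]
print(pack(candidates, budget=4000))  # (['index_incremental', 'add_doc'], 3100)
```

Greedy selection is simple and predictable, but it can leave budget unused; a real packer might trim items or use a knapsack-style search instead.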

Business impact

Your agents are burning budget on re-reads

Every search that dumps whole files into context is a tax on your API bill and your team’s velocity. greplm returns jump-ready answers — so sessions stay sharp and spend stays low.

One agent session · 8 codebase searches · 200k context window

Without greplm (grep + read files)
  • 181,600 tokens (22,700 per search × 8)
  • Context nearly exhausted; session ends early

With greplm
  • 3,792 tokens (474 per search × 8)
  • Room for dozens more searches in the same session
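The session arithmetic can be reproduced in a few lines. This sketch assumes "200k" means 200,000 tokens and ignores everything else that occupies the context window (prompts, edits, model output), so the counts are upper bounds.

```python
CONTEXT_WINDOW = 200_000  # assumes "200k" means 200,000 tokens


def searches_that_fit(tokens_per_search: int, window: int = CONTEXT_WINDOW) -> int:
    """Number of whole searches whose results fit in the context window."""
    return window // tokens_per_search


for label, per_search in (("grep + read files", 22_700), ("greplm", 474)):
    total = per_search * 8
    print(f"{label}: {total:,} tokens for 8 searches, "
          f"{searches_that_fit(per_search)} searches fit in the window")
# grep + read files: 181,600 tokens for 8 searches, 8 searches fit in the window
# greplm: 3,792 tokens for 8 searches, 421 searches fit in the window
```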

  • 99% fewer tokens per search: same files returned, compact output
  • 74× faster warm queries: milliseconds, not full-tree scans
  • 100% recall on benchmarks: same results as grep-and-read
  • No data leaves your machine: fully offline · no telemetry

Languages

14 languages. One index.

Symbols, call graphs, and AST search across your entire stack — from kernel C to Flutter apps.

14 languages indexed

One warm index powers every query — symbols, callers, blast radius, and AST patterns without re-scanning the tree.

symbols · callers · impact · ast · refs

Fully offline — nothing leaves your machine. Index once, query forever.

Benchmarks

Proven on production-scale codebases

Same files as grep-and-read — for a fraction of the tokens, answered from a warm index in milliseconds while ripgrep re-scans the whole tree.

greplm benchmark results on large open-source projects

Project                   Files    Index once  Search saved  Pack saved  Recall  Warm query vs ripgrep
React (JS / TS)           6,723    2.0s        99.7%         97.4%       100%    9 ms (12× faster)
Odoo 18 (Py / JS / XML)   41,142   19.8s       99.9%         99.3%       100%    33 ms (33× faster)
Linux kernel (C)          93,362   66.5s       99.9%         98.4%       100%    31 ms (75× faster)

Content search across all three suites

218.7M tokens → 280.5k tokens (99.9% fewer)

tokens ≈ characters / 4, applied identically to both engines; ripgrep runs in literal mode so both look for the same string. Warm query = median latency against the always-on daemon.
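For readers who want to apply the same estimate to their own output, a minimal sketch of the characters / 4 rule follows. The function names are illustrative, and rounding down to a whole token is an assumption; the methodology may round differently.

```python
def estimate_tokens(text: str) -> int:
    """Approximate token count using the characters / 4 rule (rounded down)."""
    return len(text) // 4


def savings(baseline_output: str, compact_output: str) -> float:
    """Percent of estimated tokens saved by the compact output."""
    base = estimate_tokens(baseline_output)
    return (base - estimate_tokens(compact_output)) / base * 100


# Stand-in strings; real inputs would be each engine's raw output.
full_file = "x" * 4000
snippet = "x" * 40
print(estimate_tokens(full_file), estimate_tokens(snippet))  # 1000 10
print(f"{savings(full_file, snippet):.1f}% fewer tokens")
```

Because the same estimator is applied to both outputs, the exact characters-per-token ratio largely cancels out of the percentage.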

Full methodology & per-query tables →