v0.0.2 — available now

stop hitting limits mid-session

// touch less grass

IndexQube sits between your terminal and Claude, deduplicating tokens before they're sent — 15–17% fewer tokens, measured on real sessions.

// one command install
curl -fsSL https://indexqube.com/install | bash
then run: iq claude / iq codex / iq gemini
session log — real 47-request session
before ↑ 72,320 tokens — repeated file reads in history
after   ↑ 59,848 tokens — duplicates stripped, 18 blocks pruned
17% tokens deduplicated
12,472 tokens saved / request
direct to Anthropic API
zero-alloc Rabin-Karp chunking
custom LSM storage engine
works with most agentic CLIs
credentials never leave your machine
// HOW IT WORKS
request pipeline — per turn
1  buffer + bound        http.MaxBytesReader, 8 MiB cap         protects against OOM on large prompts
2  Rabin-Karp chunk      64-byte window, ~4 KB chunks           content-defined boundaries, the rolling-hash approach behind rsync and Git deltas
3  SHA-256 → LSM lookup  Bloom filter gate, skiplist MemTable   O(1) negative lookup, avoids disk on cache miss
4  strip known blocks    re-marshal JSON, pointer swap          72 K → 59 K tokens typical on warm session
5  forward upstream      TLS 1.3, HTTP/2, keep-alive pool       your API key, your account, no relay
6  SSE stream back       http.Flusher, sub-ms flush             no buffering delay on streamed tokens
7  Prometheus record     histogram: p50 / p90 / p95 / p99       tail latency tracked per session, no sampling loss
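A minimal sketch of step 1, assuming a plain net/http handler on Go 1.19 or later. The handler name, the `statusFor` helper, and the `/v1/messages` path are hypothetical and only illustrate how an 8 MiB cap with `http.MaxBytesReader` rejects oversized bodies before they are fully buffered.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxBodyBytes mirrors the 8 MiB cap named in step 1.
const maxBodyBytes = 8 << 20

// handler is a hypothetical ingress handler: it bounds the request body,
// buffers it, and reports how many bytes it accepted.
func handler(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, maxBodyBytes)
	body, err := io.ReadAll(r.Body)

	var tooBig *http.MaxBytesError // returned once the cap is exceeded (Go 1.19+)
	switch {
	case errors.As(err, &tooBig):
		http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
	case err != nil:
		http.Error(w, "bad request", http.StatusBadRequest)
	default:
		fmt.Fprintf(w, "buffered %d bytes", len(body))
	}
}

// statusFor sends an n-byte body through handler and returns the HTTP status.
// The request path is an assumption made for this example.
func statusFor(n int) int {
	body := strings.NewReader(strings.Repeat("x", n))
	req := httptest.NewRequest(http.MethodPost, "/v1/messages", body)
	rec := httptest.NewRecorder()
	handler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("1 KiB body:", statusFor(1024))
	fmt.Println("8 MiB + 1 byte body:", statusFor(maxBodyBytes+1))
}
```

The cap is enforced while reading, so memory use stays bounded even when a client sends far more than the limit.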
Instant duplicate detection
// Rabin-Karp chunking · internal/chunker/
window      64 bytes, rolling hash
boundary    hash & mask == 0 → split
chunk size  ~4 KB target, variable
identity    SHA-256 content address
prior art   rsync, Git pack objects
allocs      zero per rolling step
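The card above can be sketched as follows. This is an illustration under assumptions, not the code in internal/chunker/: the multiplier, the 12-bit mask (one boundary per ~4 KiB on average), and the rule that a chunk must fill the 64-byte window before it can split are all choices made for this example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

const (
	window = 64            // rolling-hash window in bytes
	mask   = 1<<12 - 1     // 12 low bits must be zero: ~4 KiB average chunk
	base   = 1099511628211 // hypothetical odd multiplier, arithmetic is mod 2^64
)

// pow is base^(window-1), the weight of the byte that leaves the window.
var pow = func() uint64 {
	p := uint64(1)
	for i := 0; i < window-1; i++ {
		p *= base
	}
	return p
}()

// chunk splits data at content-defined boundaries. The rolling step only
// updates one integer, so it does not allocate; the returned slices alias data.
func chunk(data []byte) [][]byte {
	var chunks [][]byte
	var h uint64
	start := 0
	for i, b := range data {
		if i-start >= window {
			h -= uint64(data[i-window]) * pow // drop the oldest byte
		}
		h = h*base + uint64(b) // shift in the newest byte
		if i-start+1 >= window && h&mask == 0 {
			chunks = append(chunks, data[start:i+1])
			start, h = i+1, 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:])
	}
	return chunks
}

// chunkID is the SHA-256 content address of a chunk.
func chunkID(c []byte) string {
	sum := sha256.Sum256(c)
	return hex.EncodeToString(sum[:])
}

// sharedIDs counts chunks of b whose content address also appears in a.
func sharedIDs(a, b [][]byte) int {
	seen := make(map[string]bool, len(a))
	for _, c := range a {
		seen[chunkID(c)] = true
	}
	n := 0
	for _, c := range b {
		if seen[chunkID(c)] {
			n++
		}
	}
	return n
}

// pseudoRandom returns deterministic xorshift64 bytes used as test input.
func pseudoRandom(n int) []byte {
	out := make([]byte, n)
	x := uint64(0x9E3779B97F4A7C15)
	for i := range out {
		x ^= x << 13
		x ^= x >> 7
		x ^= x << 17
		out[i] = byte(x >> 32)
	}
	return out
}

func main() {
	data := pseudoRandom(256 << 10)
	orig := chunk(data)
	shifted := chunk(append([]byte("hello, "), data...))
	fmt.Printf("%d chunks, %d still match after a 7-byte prefix\n",
		len(orig), sharedIDs(orig, shifted))
}
```

Because boundaries depend on the bytes inside the window rather than on offsets, inserting text near the start of a prompt shifts the data but leaves most later chunks, and their SHA-256 addresses, unchanged.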
Near-instant cache lookups
// custom LSM engine · internal/store/lsm/
MemTable    Go skiplist, flush @ 4 MB
SSTable     4 KB page-aligned blocks
index       binary search on read path
filter      Bloom, double FNV-1a hash
compaction  leveled, background goroutine
writes      16× SQLite B-tree throughput
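A sketch of the Bloom filter gate, assuming "double FNV-1a hash" means double hashing in the Kirsch-Mitzenmacher style: two FNV-1a values are combined as h1 + i*h2 to derive k bit positions. How the second hash is seeded, the filter size, and all type and function names here are assumptions for illustration.

```go
package main

import "fmt"

const (
	fnvOffset64 = 14695981039346656037 // standard FNV-1a 64-bit offset basis
	fnvPrime64  = 1099511628211        // standard FNV-1a 64-bit prime
)

// fnv1a hashes key with FNV-1a, starting from the given basis.
func fnv1a(basis uint64, key []byte) uint64 {
	h := basis
	for _, b := range key {
		h ^= uint64(b)
		h *= fnvPrime64
	}
	return h
}

// Bloom is a fixed-size Bloom filter with k probes per key.
type Bloom struct {
	bits []uint64
	m    uint64 // number of bits
	k    uint64 // probes per key
}

// NewBloom rounds mBits up to a whole number of 64-bit words.
func NewBloom(mBits, k uint64) *Bloom {
	words := (mBits + 63) / 64
	return &Bloom{bits: make([]uint64, words), m: words * 64, k: k}
}

// hashes returns the two base hashes. Seeding the second pass with the first
// result is an assumption of this sketch; forcing h2 odd keeps probes distinct.
func hashes(key []byte) (uint64, uint64) {
	h1 := fnv1a(fnvOffset64, key)
	h2 := fnv1a(h1, key) | 1
	return h1, h2
}

// Add sets the k bits for key.
func (b *Bloom) Add(key []byte) {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= 1 << (pos % 64)
	}
}

// MayContain reports false only when key was definitely never added.
func (b *Bloom) MayContain(key []byte) bool {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(1<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

// key builds a hypothetical chunk key for the demo.
func key(i int) []byte { return []byte(fmt.Sprintf("chunk-%d", i)) }

func main() {
	f := NewBloom(1<<16, 4)
	for i := 0; i < 100; i++ {
		f.Add(key(i))
	}
	fmt.Println("added key present:", f.MayContain(key(42)))
	fmt.Println("unseen key present:", f.MayContain(key(100000)))
}
```

A negative answer is definitive, so a lookup for an unseen chunk can return without touching an SSTable. A positive answer still needs the real lookup, since false positives are possible.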
Zero-latency streaming
// L7 proxy + SSE · internal/proxy/
model         net/http goroutine per request
backpressure  io.LimitReader at ingress
streaming     http.Flusher, no buffering
telemetry     Prometheus histogram p50 → p99
overhead      < 15 ms typical proxy tax
concurrency   RWMutex concurrent reads
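The streaming row can be sketched like this: copy whatever the upstream delivers and flush immediately, so tokens are not held in a buffer. `streamSSE` and `relay` are hypothetical names, and the upstream is a plain `io.Reader` here rather than a real API response body.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// streamSSE relays an upstream event stream to the client, flushing after
// every read so each piece reaches the client as soon as it arrives.
func streamSSE(w http.ResponseWriter, upstream io.Reader) error {
	flusher, ok := w.(http.Flusher)
	if !ok {
		return errors.New("response writer does not support flushing")
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")

	buf := make([]byte, 4096)
	for {
		n, err := upstream.Read(buf)
		if n > 0 {
			if _, werr := w.Write(buf[:n]); werr != nil {
				return werr // client went away
			}
			flusher.Flush()
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// relay runs streamSSE against an in-memory recorder for demonstration.
func relay(events string) *httptest.ResponseRecorder {
	rec := httptest.NewRecorder()
	if err := streamSSE(rec, strings.NewReader(events)); err != nil {
		panic(err)
	}
	return rec
}

func main() {
	rec := relay("data: hello\n\ndata: world\n\n")
	fmt.Printf("flushed=%v body=%q\n", rec.Flushed, rec.Body.String())
}
```

Flushing per read trades a few extra syscalls for latency, which is usually the right trade for token streams where each event is small.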
// BENCHMARK — LSM vs SQLite (go test -bench, 100K ops each)
operation            n      LSM          SQLite        delta
sequential write     100 K  2,557 ns/op  42,498 ns/op  16× faster
write amplification  -      0.277×       6.92×         25× lower
Bloom lookup         -      32 ns/op     n/a           0 B/op, 0 allocs
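The delta column follows directly from the two measured columns. A quick check, with `ratio` as a hypothetical helper:

```go
package main

import "fmt"

// ratio returns how many times larger a is than b.
func ratio(a, b float64) float64 { return a / b }

func main() {
	// Figures copied from the benchmark table above.
	fmt.Printf("sequential write: %.1fx faster\n", ratio(42498, 2557))
	fmt.Printf("write amplification: %.1fx lower\n", ratio(6.92, 0.277))
}
```

This prints 16.6x and 25.0x, which the table rounds to 16× and 25×.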

Ready? Install in 30 seconds.

One command. No account. Works with Claude Code.

curl -fsSL https://indexqube.com/install | bash
then run: iq claude