Quickstart
RedHop has the same API in three languages: Python, JavaScript, and Rust. Each example on this page is shown in all three.
Install
```sh
pip install redhop                          # Python
npm install redhop                          # JavaScript
cargo add redhop --features files,semantic  # Rust
```

One package, no services, no vector DB. Document parsing (PDF/DOCX/PPTX/XLSX) and the optional semantic model are built in.
Reason over a document
Point RedHop at a file. It parses, chunks, and indexes it, then hands you back just the context your question needs, which you give to any LLM:
```python
import redhop
from openai import OpenAI

doc = redhop.Document.from_file("contract.pdf")  # parse + chunk + index
question = "What is the governing law of this contract?"
ctx = doc.context(question)

# Hand ctx.text() to any provider, no lock-in.
resp = OpenAI().chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"Use only this context:\n\n{ctx.text()}\n\nQ: {question}"}],
)
print(resp.choices[0].message.content)
print(ctx.report)  # the Decision Report ↓
```

```js
const { Document } = require("redhop");
const OpenAI = require("openai");

// CommonJS has no top-level await, so the call runs inside an async function.
async function main() {
  const doc = Document.fromFile("contract.pdf"); // parse + chunk + index
  const question = "What is the governing law of this contract?";
  const ctx = doc.context(question);

  // Hand ctx.text to any provider, no lock-in.
  const resp = await new OpenAI().chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: `Use only this context:\n\n${ctx.text}\n\nQ: ${question}` }],
  });
  console.log(resp.choices[0].message.content);
  console.log(ctx.report.rendered); // the Decision Report ↓
}

main();
```

```rust
let mut doc = redhop::read_file("contract.pdf")?; // parse + chunk + index
let question = "What is the governing law of this contract?";
let ctx = doc.context(question)?;

// Hand ctx.text() to any LLM (async-openai, reqwest, …), no lock-in.
let _prompt = format!("Use only this context:\n\n{}\n\nQ: {question}", ctx.text());
println!("{}", ctx.report.render(None)); // the Decision Report ↓
```

The Decision Report
Every call explains itself, including when RedHop deliberately does nothing:
```
RedHop Decision Report
══════════════════════

Decision: Auto → passthrough (left the context intact)
  Why:
    - input is small: 91 tokens ≤ 1500 gate
    - under headroom, pruning is measured to be wash-to-harmful
    - intervention predicted to add no signal density here
  Result:
    - kept all retrieved chunks — full evidence preserved
    - avoided unnecessary intervention

Economics    retrieved / final tokens, savings, density, retained evidence
Diagnostics  chunks, distractor ratio, second-hop rescues, …
```

The decision is also available programmatically:
```python
ctx.report.auto_decision  # "passthrough" | "prune"
ctx.report.total_tokens
ctx.report.retained_evidence_ratio
```

```js
ctx.report.autoDecision // "passthrough" | "prune"
ctx.report.totalTokens
ctx.report.retainedEvidenceRatio
```

```rust
ctx.report.auto_decision(); // AutoDecision::Passthrough | ::Prune
ctx.report.total_tokens;
ctx.report.retained_evidence_ratio;
```

Cite the evidence
Every selected chunk remembers where it came from, so you can show the model’s evidence trail, not just paste it:
```python
for c in ctx.citations:
    print(c["source"], c["page"])  # e.g. contract.pdf 3 → "from contract.pdf, p.3"
```

```js
for (const c of ctx.citations) {
  console.log(c.source, c.page); // e.g. contract.pdf 3 → "from contract.pdf, p.3"
}
```

```rust
for c in &ctx.chunks {
    // source + page/heading/line live on each chunk's metadata
    println!("{} {:?}", c.source, c.metadata.get("page"));
}
```

Other ways to get content in
Loading a file is the quickest start, but it’s one of several on-ramps; all return a `Document`:
```python
# Text you already have (your own parser/OCR, a DB field).
doc = redhop.Document.from_text(open("notes.md").read())
# Already chunked it yourself.
doc = redhop.Document.from_chunks(["clause one …", "clause two …"])
# A whole folder: one combined index, citations per file.
doc = redhop.Document.from_folder("./docs")
# Bytes from S3 / Azure / GCS / HTTP.
doc = redhop.Document.from_bytes(s3_object_bytes, source="contract.pdf")
```

```js
// Text you already have (your own parser/OCR, a DB field).
let doc = Document.fromText(fs.readFileSync("notes.md", "utf8"));
// Already chunked it yourself.
doc = Document.fromChunks(["clause one …", "clause two …"]);
// A whole folder: one combined index, citations per file.
doc = Document.fromFolder("./docs");
// Bytes from S3 / Azure / GCS / HTTP.
doc = Document.fromBytes(buffer, "contract.pdf");
```

```rust
// Text you already have (your own parser/OCR, a DB field).
let doc = redhop::Document::from_text("notes", text)?;
// A whole folder: one combined index, citations per file.
let doc = redhop::read_folder("./docs")?;
// Bytes from S3 / Azure / GCS / HTTP.
let doc = redhop::read_bytes(&bytes, "contract.pdf")?;
```

See all the loaders, including a persistent, incremental on-disk index over thousands of files.
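When the bytes come from a URL, the source name passed to the bytes loader is what citations will later display, so it helps to derive it from the URL instead of hard-coding it. The sketch below is one way to do that with the Python standard library only; `source_name` and `fetch` are hypothetical helpers, not part of RedHop, and the final call in the comment assumes the `from_bytes(..., source=...)` signature shown above.

```python
import posixpath
from urllib.parse import unquote, urlsplit
from urllib.request import urlopen


def source_name(url: str, default: str = "document") -> str:
    """Derive a citation-friendly file name from the URL path (query string ignored)."""
    name = posixpath.basename(unquote(urlsplit(url).path))
    return name or default


def fetch(url: str, timeout: float = 30.0) -> tuple[bytes, str]:
    """Download the raw bytes and pair them with a source name."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read(), source_name(url)


# Usage (assumes redhop is installed and the URL is reachable):
#   data, name = fetch("https://example.com/files/contract.pdf?sig=abc")
#   doc = redhop.Document.from_bytes(data, source=name)
```

Signed URLs usually carry credentials in the query string, which is why the name is taken from the path alone.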
Knobs (sane defaults, tune when needed)
```python
doc = redhop.Document.from_file(
    "contract.pdf",
    chunk_size=128,   # index-time: how the doc is split
    strategy="auto",  # size-gated: prune only under dilution
)
ctx = doc.context(query, budget=2000)  # query-time: vary freely, no re-indexing
```

```js
const doc = Document.fromFile("contract.pdf", {
  chunkSize: 128,   // index-time: how the doc is split
  strategy: "auto", // size-gated: prune only under dilution
});
const ctx = doc.context(query, 2000); // query-time: vary freely, no re-indexing
```

```rust
use redhop::{Document, DocumentConfig};

let cfg = DocumentConfig { target_tokens: 128, ..Default::default() };
let mut doc = Document::from_text_with("doc", text, cfg)?; // config-aware constructor
let ctx = doc.context_with(query, Some(2000), None)?;      // per-query budget
```

`chunk_size` is fixed at construction (it’s how the index is built); the per-query `budget` is free to vary. Every parameter has a default; see Options for the full list.
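To make the index-time versus query-time split concrete, here is a toy model of the three knobs. It is a sketch under stated assumptions and not RedHop's implementation: `ToyDocument` is a hypothetical class, tokens are approximated by whitespace-separated words, the 1500-token gate is borrowed from the sample Decision Report, and chunks are ranked by naive term overlap purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a document index; not RedHop's implementation."""
    text: str
    chunk_size: int = 128  # index-time knob: fixed once the index is built
    gate: int = 1500       # assumed size gate, taken from the sample report

    def __post_init__(self):
        words = self.text.split()
        # Build the "index" once: fixed-size chunks of whitespace tokens.
        self.chunks = [
            words[i:i + self.chunk_size]
            for i in range(0, len(words), self.chunk_size)
        ]

    def context(self, query: str, budget: int = 2000):
        """Query-time knob: budget may change on every call, with no re-chunking."""
        total = sum(len(c) for c in self.chunks)
        if total <= self.gate:
            # Small input: leave everything intact.
            return "passthrough", [" ".join(c) for c in self.chunks]
        terms = set(query.lower().split())
        # Rank chunks by naive term overlap; the stable sort keeps document order on ties.
        ranked = sorted(
            range(len(self.chunks)),
            key=lambda i: -sum(w.lower() in terms for w in self.chunks[i]),
        )
        kept, used = [], 0
        for i in ranked:
            size = len(self.chunks[i])
            if used + size > budget:
                continue  # this chunk does not fit in the remaining budget
            kept.append(i)
            used += size
        # Return the kept chunks in their original document order.
        return "prune", [" ".join(self.chunks[i]) for i in sorted(kept)]
```

Calling `context` twice on the same `ToyDocument` with different budgets reuses the chunks built in `__post_init__`, which mirrors why the budget can vary freely while the chunk size cannot.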
Next: Loaders — every way to get documents in · Overview — the one idea, and how it works · Retrieval options — when BM25 isn’t enough.