TL;DR
Hardcoding chunking method, embedding model, and retrieval mode into your pipeline means every config change is a code change. The fix is a strategy abstraction: define and benchmark configs in RAG Lab, save the ones that score well, then reference them by name in production via the SDK. The pipeline code does not change when the strategy does.
The first version of a RAG pipeline usually looks like a single function that does everything. It chunks the documents, embeds them with a specific model, runs vector search, and returns results. It works. And then you need to change something.
Maybe you want to try semantic chunking instead of recursive. Maybe you want to test whether hybrid search scores better than pure vector. Maybe you want to compare two embedding models side by side. In a hardcoded pipeline, each of those changes means editing the pipeline code itself.
You end up with commented-out lines, `if model == "small"` branches, and a function that nobody wants to touch. Testing requires running the whole pipeline. Reusing a config that worked on a previous project means copy-pasting code.
There is a better structure. It requires one mental shift: separate configuration from execution.
The Hardcoded Pattern
Here is the pipeline most teams end up with after the first working prototype:
```python
def run_rag(query, documents):
    # Chunking hardcoded
    chunks = recursive_chunk(documents, size=256, overlap=20)
    # Model hardcoded
    embeddings = openai_embed(chunks, model="text-embedding-3-small")
    # Retrieval mode hardcoded
    results = vector_search(embeddings, query, top_k=5)
    return results
```

This works for one config. The problems appear when you want a second one:
- Every change is a code change.
- Every test is a full pipeline run.

The config and the logic are fused together, and pulling them apart later is painful.
The Strategy Abstraction
The fix is to treat your pipeline configuration as a named, reusable artifact that lives outside your code. Define what chunking method, embedding model, and retrieval mode to use. Benchmark it against your actual documents. Save the configs that score well. Reference them by name in production.
RAG Lab is where the config lives. The SDK is how you execute against it. The two layers are deliberately separate.
| Layer | Where | What it does |
|---|---|---|
| Config | RAG Lab | Define chunking, model, retrieval mode. Benchmark with a gold set. Save the winner. |
| Execution | SDK | Reference the saved strategy by name. Embed texts. Get vectors back. |
Built-in Presets as a Starting Point
Four preset strategies are available without any configuration. Each covers a different point on the cost-quality curve:
- `ghost` (Economy)
- `balanced` (Balanced)
- `scholar` (High Accuracy)
- `hybrid` (Hybrid Search)

You can use any preset immediately with the SDK. No setup, no config file, no saved strategy required:
```python
from decompressed_sdk import DecompressedClient

dc = DecompressedClient(api_key="dck_your_key_here")

# Use a preset by ID
result = dc.lab.embed(
    texts=["Document 1", "Document 2"],
    preset_id="balanced"  # ghost=Economy | balanced=Balanced | scholar=High Accuracy | hybrid=Hybrid Search
)

print(f"Model: {result.model}")
print(f"Dimensions: {result.dimensions}")
print(f"Tokens used: {result.usage['token_count']}")
```

Start with Economy (`ghost`) for fast, cheap iteration and Balanced (`balanced`) when you need higher precision. Run both against your gold set in RAG Lab before committing to either in production.
Saving and Reusing Custom Strategies
Presets cover common cases. For production use, you want a strategy tuned to your specific corpus. The process is: benchmark in RAG Lab, save the config that wins on your Recall@K and MRR numbers, then reference it by name in your application code.
Once a strategy is saved in RAG Lab, you can reference it by name, by display name, or by its UUID:
```python
# Reference a saved strategy by name
result = dc.lab.embed(
    texts=["Document 1", "Document 2"],
    strategy="My High-Accuracy Config"
)

# Or by UUID for an exact, unambiguous reference
result = dc.lab.embed(
    texts=["Document 1", "Document 2"],
    strategy_id="abc-123-def-456"
)

print(f"Strategy used: {result.strategy_name}")
print(f"Base cost: ${result.usage['base_cost_usd']:.6f}")
print(f"Remaining tokens: {result.usage['remaining_tokens']}/{result.usage['token_limit']}")
```

The pipeline code does not change when you switch strategies. You change the name passed to `embed()`, not the pipeline logic. That is the separation that matters.
Listing Available Strategies
To see all presets and your saved strategies at any point:
```python
available = dc.lab.list_strategies()

# Built-in presets
for preset in available["presets"]:
    print(f"{preset['id']}: {preset['name']} ({preset['model']}, {preset['search_type']})")

# Your saved strategies
for strategy in available["saved_strategies"]:
    print(f"{strategy['name']} — used {strategy['usage_count']} times")
    print(f"  model: {strategy['model']}, search: {strategy['search_type']}")
```

The `usage_count` on a saved strategy is how many times it has been called via the SDK. It gives you a signal about which configs are actually being used in production versus which ones were tested and abandoned.
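That signal can be turned into a small housekeeping check. A minimal sketch, assuming the response follows the dict shape shown above, with `saved_strategies` entries that carry `name` and `usage_count` keys:

```python
def unused_strategies(available):
    """Return names of saved strategies never called via the SDK.

    Assumes `available` follows the dict shape returned by
    dc.lab.list_strategies(), where each "saved_strategies" entry
    carries "name" and "usage_count" keys.
    """
    return [s["name"] for s in available["saved_strategies"] if s["usage_count"] == 0]

# Example with a hand-built response:
available = {
    "presets": [],
    "saved_strategies": [
        {"name": "Legal Doc Retrieval v2", "usage_count": 41},
        {"name": "Abandoned Experiment", "usage_count": 0},
    ],
}
print(unused_strategies(available))  # ['Abandoned Experiment']
```

Run this periodically and you have a candidate list for cleanup before the strategy library drifts out of sync with what production actually uses.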
The Pipeline Wrapper Pattern
The separation between config and execution makes a clean pipeline wrapper possible. The pipeline function takes a strategy reference and a list of texts. It delegates config resolution to the SDK. Your application code never needs to know what model or chunking method is in use:
```python
from decompressed_sdk import DecompressedClient

dc = DecompressedClient(api_key="dck_your_key_here")

def embed_documents(texts, strategy_name):
    """Embed a list of texts using a named strategy from RAG Lab."""
    result = dc.lab.embed(
        texts=texts,
        strategy=strategy_name,
    )
    return result.embeddings

def embed_with_preset(texts, preset_id="balanced"):
    """Embed using a built-in preset. Default: balanced."""
    result = dc.lab.embed(
        texts=texts,
        preset_id=preset_id,
    )
    return result.embeddings

# Application code references the strategy name, not the config
embeddings = embed_documents(chunks, strategy_name="Legal Doc Retrieval v2")
```

When you want to swap to a new strategy, you update the name string. The embedding logic, error handling, and billing are all handled by the SDK. The only thing that changes is which config gets resolved.
Strategy names are case-sensitive when referenced by name. Use `strategy_id` for exact, stable references in production code. Names can change; UUIDs do not.
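One way to get UUID stability without hardcoding UUIDs is to resolve a name to its ID once at startup. A sketch under one assumption: that each saved-strategy entry returned by `list_strategies()` carries an `"id"` field holding the strategy UUID alongside `"name"` (that field name is not confirmed by the examples above).

```python
def pin_strategy_id(available, name):
    """Resolve a saved strategy's UUID from its case-sensitive name.

    Assumes each entry in available["saved_strategies"] has an "id"
    field holding the strategy UUID. Raises KeyError if no saved
    strategy matches the name exactly.
    """
    for strategy in available["saved_strategies"]:
        if strategy["name"] == name:
            return strategy["id"]
    raise KeyError(f"No saved strategy named {name!r}")

# Resolve once at startup, then pass strategy_id=... on every embed call:
# strategy_id = pin_strategy_id(dc.lab.list_strategies(), "Legal Doc Retrieval v2")
```

Failing fast at startup on a renamed strategy is usually better than silently embedding with the wrong config in production.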
What This Unlocks at Scale
For small projects, swapping a name string in one place is a minor convenience. At scale, the separation between config and execution becomes more important.
Different document types in the same application can use different strategies. Support tickets might perform best with `ghost` (fast, cheap, good enough for general text). Legal contracts might need `scholar` (hybrid search with reranking for precision). You reference each strategy by name where it is relevant. The pipeline code for both paths looks identical.
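One way to wire that up is a routing table from document type to embed arguments, so presets go through `preset_id` and saved strategies through `strategy`, exactly as the SDK examples above pass them. A minimal sketch; the document types and the saved-strategy name are hypothetical:

```python
# Hypothetical routing table: document type -> dc.lab.embed keyword arguments.
# Built-in presets are referenced via preset_id, saved strategies via strategy.
STRATEGY_BY_DOC_TYPE = {
    "support_ticket": {"preset_id": "ghost"},                 # fast, cheap
    "legal_contract": {"strategy": "Legal Doc Retrieval v2"}, # tuned saved config
}

def embed_kwargs(doc_type):
    """Pick the embed arguments for a document type, defaulting to a preset."""
    return STRATEGY_BY_DOC_TYPE.get(doc_type, {"preset_id": "balanced"})

# Both paths run the same pipeline code; only the resolved config differs:
# dc.lab.embed(texts=chunks, **embed_kwargs("legal_contract"))
```

Adding a new document type is then a one-line table entry, not a new pipeline branch.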
When you improve a strategy through re-evaluation in RAG Lab, production automatically uses the updated config on the next embed call. No deploy required for a config change.
The teams that rewrite their pipelines every time they want to try something new are the ones who delayed separating config from logic. The ones who move fast at scale separated them early, often before they fully understood why it mattered.
RAG Lab is the benchmarking layer. The SDK is the execution layer. Test chunking methods, embedding models, and retrieval modes side by side, save the winning config, and reference it by name in production. No pipeline rewrites required.
Build your first strategy