How to Build a RAG Pipeline with Fetchium and LangChain

A Retrieval-Augmented Generation (RAG) pipeline needs three things: a way to search the web, a way to extract clean content, and a way to pass that content to your LLM. Most teams stitch together a SERP scraper, a web crawler, and a token counter. Fetchium replaces all three with one API call.

Prerequisites

Python 3.11+
A Fetchium API key (free at app.fetchium.com)
langchain and fetchium Python packages

Step 1: Install the packages

pip install langchain fetchium

Step 2: Initialize the Fetchium retriever

from fetchium import FetchiumRetriever

retriever = FetchiumRetriever(
    api_key="your_api_key",
    k=5,                    # number of results
    token_budget=4096,      # max tokens per result
    extract_content=True    # full CEP extraction
)
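
The snippet above hardcodes the key, which is fine for a quick test. In shared code it is usually safer to read it from the environment. A minimal sketch, assuming a hypothetical FETCHIUM_API_KEY variable name (the name is an illustration only, not a documented Fetchium convention):

```python
import os


def load_api_key(env=os.environ, name="FETCHIUM_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing.

    The variable name is an assumption made for this example.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key
```

The result can then be passed as api_key=load_api_key() when constructing the retriever.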

Step 3: Build the RAG chain

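from langchain.chains import RetrievalQA
from langchain.llms import Anthropic

# "stuff" places all retrieved documents into a single prompt for the LLM
chain = RetrievalQA.from_chain_type(
    llm=Anthropic(model="claude-3-5-sonnet"),
    chain_type="stuff",
    retriever=retriever  # the FetchiumRetriever from Step 2
)

# Ask a question; the chain retrieves context first, then generates the answer
result = chain.run("What are the best async patterns in Rust?")
print(result)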

The retriever automatically handles multi-backend search, content extraction, token budgeting, and citation tracking. Your LLM receives clean, relevant content ready to use.
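
To make the "stuff" chain type concrete, here is a rough sketch of the idea: every retrieved passage is concatenated into one context block and sent to the model together with the question. The function name, the numbering scheme, and the prompt wording are assumptions for illustration; LangChain's actual template differs.

```python
def build_stuff_prompt(question, passages):
    """Join every retrieved passage into one context block, then append the question."""
    context = "\n\n".join(
        f"[{i}] {text}" for i, text in enumerate(passages, start=1)
    )
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Placeholder passages stand in for real retriever output
prompt = build_stuff_prompt(
    "What are the best async patterns in Rust?",
    ["First retrieved passage.", "Second retrieved passage."],
)
print(prompt)
```

Because everything lands in a single prompt, the total size of the retrieved content has to fit the model's context window, which is why a token budget on the retriever matters.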

What Fetchium does behind the scenes

Dispatches your query to 17 backends in parallel (DuckDuckGo, Brave, GitHub, StackOverflow, and more)
Ranks results using HyperFusion — 8 signals including BM25, semantic similarity, and source authority
Extracts clean content from each result URL using the 5-layer CEP pipeline
Packs the most relevant content into your 4,096-token budget using QATBE
Returns structured citations for every fact
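
The steps above can be sketched in plain Python to show the general shape of such a pipeline. This is not Fetchium's implementation: the backends are canned stand-ins, reciprocal rank fusion is swapped in for HyperFusion (whose signals are not specified here), and a greedy word-count packer stands in for QATBE. All function names are hypothetical.

```python
import asyncio
from collections import defaultdict


async def search_backend(name, query):
    """Stand-in for a real backend call; returns a ranked list of URLs."""
    await asyncio.sleep(0)  # a real implementation would await an HTTP request here
    canned = {
        "backend_a": ["https://example.com/1", "https://example.com/2"],
        "backend_b": ["https://example.com/2", "https://example.com/3"],
    }
    return canned[name]


async def fan_out(query, backends):
    """Query every backend concurrently and collect the ranked lists."""
    return await asyncio.gather(*(search_backend(b, query) for b in backends))


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each URL scores sum(1 / (k + rank)) over its appearances."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, url in enumerate(ranking, start=1):
            scores[url] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def pack_to_budget(passages, token_budget):
    """Greedily keep passages, in ranked order, while they fit in the budget.

    Token counts are approximated by whitespace-separated words.
    """
    kept, used = [], 0
    for text in passages:
        cost = len(text.split())
        if used + cost <= token_budget:
            kept.append(text)
            used += cost
    return kept


rankings = asyncio.run(fan_out("rust async", ["backend_a", "backend_b"]))
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

A URL returned by several backends accumulates score from each list, so agreement between sources pushes it toward the top. A production system would replace the word count with the target model's tokenizer.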
