Where LangChain Fits: Adding RAG Inside a LangGraph Workflow
In this series, we built a production-ready orchestration system using:
- LangGraph for stateful workflows
- MCP for tool isolation
- Ollama for drafting
- OpenAI for supervision
- Slack for human approval
But where does LangChain fit?
Short answer:
LangGraph controls the system. LangChain builds intelligence inside the steps.
Let's make that precise.
What Is LangChain?
LangChain is a Python framework for building LLM-powered components such as:
- Prompt templates
- Model wrappers
- Tool-calling pipelines
- Retrieval systems (RAG)
- Chains of operations
LangChain is excellent for linear composition.
Example mental model:
Input → Prompt → Model → Output
LangGraph is excellent for stateful control flow.
Example mental model:
Node A → Node B → (loop?) → (branch?) → interrupt → resume
LangChain builds cognition.
LangGraph builds orchestration.
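To make the contrast concrete, here is a minimal sketch of both styles (assuming langchain-core, langchain-ollama, and langgraph are installed; the prompt and model here are placeholders, not the ones from the series):

```python
from typing import TypedDict

from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
from langgraph.graph import StateGraph, START, END

# LangChain: linear composition via LCEL pipes
chain = ChatPromptTemplate.from_template("Summarise: {text}") | ChatOllama(model="kimi-k2.5")

# LangGraph: stateful control flow via typed state, nodes, and edges
class State(TypedDict):
    text: str
    summary: str

def summarise(state: State) -> dict:
    return {"summary": chain.invoke({"text": state["text"]}).content}

graph = StateGraph(State)
graph.add_node("summarise", summarise)
graph.add_edge(START, "summarise")
graph.add_edge("summarise", END)
app = graph.compile()
```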
How to Install LangChain
Inside your existing project:
pip install langchain langchain-core langchain-community chromadb
If using Ollama embeddings:
pip install langchain-ollama
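The embedding model used below must also exist locally. Assuming a standard Ollama install, pull it once:
ollama pull nomic-embed-text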
You do not replace LangGraph.
You add LangChain as a dependency for selected nodes.
Updated Stack Diagram
```
HUMAN GOVERNANCE
        |
LangGraph (Control Plane)
        |
        +-- Node: draft_with_rag
        |        |
        |        +-- LangChain RAG Chain
        |                 |
        |                 +-- Retriever (Vector Store)
        |                 +-- Prompt Template
        |                 +-- Model (Ollama)
        |
        +-- Supervisor Node (OpenAI)
        |
        +-- Slack Approval (Interrupt)
        |
        +-- MCP Tool Calls
```
LangChain lives inside a node.
That's the key design principle.
Adding Retrieval to the Newsletter Workflow
We upgrade drafting with contextual memory.
Goal:
- Maintain tone across newsletters
- Reference past posts
- Avoid repetition
- Improve continuity
Step 1 – Build a Vector Store
app/rag.py
```python
from pathlib import Path

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain_ollama import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

def build_vector_store(path="blog_posts"):
    # Load every markdown post in the corpus
    docs = []
    for file in Path(path).glob("*.md"):
        docs.extend(TextLoader(str(file)).load())

    # Split into overlapping chunks so retrieval stays focused
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=800,
        chunk_overlap=100,
    )
    splits = splitter.split_documents(docs)

    embeddings = OllamaEmbeddings(model="nomic-embed-text")
    return Chroma.from_documents(splits, embeddings)
```
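A quick sanity check before wiring it into the workflow (the query string is illustrative; this assumes blog_posts/ contains at least one markdown file):

```python
store = build_vector_store("blog_posts")
hits = store.similarity_search("approval gates", k=3)
for doc in hits:
    print(doc.metadata["source"], "->", doc.page_content[:80])
```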
Step 2 – Create a RAG Chain
```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama

def format_docs(docs):
    # Join retrieved chunks into one context string
    return "\n\n".join(doc.page_content for doc in docs)

def build_rag_chain(vector_store):
    retriever = vector_store.as_retriever(search_kwargs={"k": 4})
    prompt = ChatPromptTemplate.from_template("""
You are drafting a newsletter.
Maintain tone continuity using retrieved context snippets.
Context:
{context}
Intro:
{intro}
Links:
{links}
""")
    model = ChatOllama(model="kimi-k2.5")

    # Retrieve on the intro text; pass context, intro, and links
    # to the prompt together
    return (
        {
            "context": (lambda x: x["intro"]) | retriever | format_docs,
            "intro": lambda x: x["intro"],
            "links": lambda x: x["links"],
        }
        | prompt
        | model
    )
```
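The leading dict is coerced by LCEL into a RunnableParallel: every key is computed from the same input dict, which is how intro and links flow through to the prompt alongside the retrieved context. A plain retriever | prompt | model pipe would drop them, and would also hand the retriever a dict where it expects a query string.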
Step 3 – Integrate into a LangGraph Node
```python
vector_store = build_vector_store()
rag_chain = build_rag_chain(vector_store)

def node_draft_with_rag(state):
    result = rag_chain.invoke({
        "intro": state["intro_text"],
        "links": "\n".join(state["blog_links"]),
    })
    # Return a partial update; LangGraph merges it into the workflow state
    return {"newsletter_md": result.content}
```
Replace your draft node.
Everything else remains untouched.
That's architectural discipline.
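Concretely, the swap is a single registration in your graph definition. This is a hypothetical sketch; it assumes your existing StateGraph registered the old drafting step under the name "draft":

```python
graph.add_node("draft", node_draft_with_rag)  # same node name, new implementation
```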
LangChain vs LangGraph (Clear Comparison)
| Feature | LangChain | LangGraph |
|---|---|---|
| Linear chains | ✅ | ✅ |
| Loops | ❌ | ✅ |
| Conditional routing | ❌ | ✅ |
| Interrupt/resume | ❌ | ✅ |
| Checkpoint persistence | ❌ | ✅ |
| RAG support | ✅ | (via LangChain inside nodes) |
| Tool abstraction | ✅ | ✅ (via MCP boundary) |
LangChain builds components.
LangGraph builds systems.
Performance Considerations
Adding RAG introduces:
- Embedding cost
- Retrieval latency
- Vector store storage
Mitigation strategies:
- Pre-build vector store
- Cache embeddings
- Limit chunk size
- Retrieve top 3–5 chunks only
For newsletters, this overhead is small.
For large systems, consider:
- FAISS for in-memory speed
- Persistent Chroma storage
- Hybrid retrieval
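As one example, a persistent Chroma index avoids re-embedding on every run. A minimal sketch, assuming the splits from Step 1 and a persist_directory path of your choosing:

```python
from langchain_community.vectorstores import Chroma
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="nomic-embed-text")

# First run: build the index and write it to disk
store = Chroma.from_documents(splits, embeddings, persist_directory="./chroma_db")

# Later runs: load the existing index instead of re-embedding
store = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)
retriever = store.as_retriever(search_kwargs={"k": 4})  # cap retrieved chunks
```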
Why This Integration Is Clean
We did not:
- Replace orchestration
- Break approval gates
- Bypass MCP tools
- Remove logging
We enhanced one cognitive step.
That's good system design.
Generalising This Pattern
You can use this structure for:
- Code generation with repository RAG
- Legal drafting with precedent retrieval
- Support agents with knowledge base access
- Security scanning with vulnerability database context
The control plane stays stable.
The cognitive layer evolves.
Final Stack Summary
```
Human → Slack Approval
   |
LangGraph → Controls state, loops, retries
   |
LangChain → Builds RAG and prompt chains inside nodes
   |
Models → Generate + evaluate
   |
MCP → Executes side effects
   |
Artifacts → Stored per thread_id
```
That's a layered AI system.
Natural next steps for this series:
- Update the Series Overview blueprint to include LangChain explicitly
- Add a dedicated performance optimisation section
- Combine everything into a single final stack diagram