LangChain vs. direct API: when to use each
Use LangChain when you need conversation memory across multiple turns, vector store integration for RAG, complex multi-step chains where each step feeds the next, or agent frameworks that handle the ReAct loop automatically. Use the direct OpenAI API for simple, single-step AI calls; wrapping a straightforward request in LangChain adds complexity without value.
pip install langchain langchain-openai langchain-community chromadb
Chains: composing multi-step LLM workflows
The modern LangChain Expression Language (LCEL) uses the pipe operator to compose operations; each step's output becomes the next step's input:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser
prompt = ChatPromptTemplate.from_messages([
    # Double braces escape literal { } so they are not read as template variables
    ("system", 'Classify the email. Return JSON: {{"category": str, "urgency": int 1-5}}'),
    ("human", "{email_text}")
])
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
parser = JsonOutputParser()
# Compose with pipe operator
chain = prompt | model | parser
# Single item
result = chain.invoke({"email_text": "Subject: Invoice overdue..."})
print(result) # {'category': 'BILLING', 'urgency': 4}
# Batch processing with concurrency
results = chain.batch([{"email_text": e} for e in emails], config={"max_concurrency": 5})

RAG pipeline in 20 lines
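Before the LangChain version, the core retrieval step of RAG can be sketched in plain Python with toy vectors (an illustration of the idea only, not the library's implementation):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": in the real pipeline these come from an embedding model
doc_vectors = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "account setup": [0.0, 0.2, 0.9],
}

def retrieve(query_vector, k=2):
    """Return the k document keys most similar to the query vector."""
    ranked = sorted(doc_vectors, key=lambda d: cosine(query_vector, doc_vectors[d]), reverse=True)
    return ranked[:k]

print(retrieve([0.8, 0.2, 0.1], k=2))  # the "refund policy" doc ranks first
```

In the real pipeline below, OpenAIEmbeddings produces the vectors and Chroma stores and searches them; the retrieved chunks are then stuffed into the LLM prompt.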
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA
# Load and index
docs = DirectoryLoader("./kb/", glob="*.txt", loader_cls=TextLoader).load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100).split_documents(docs)
vectorstore = Chroma.from_documents(
    chunks,
    OpenAIEmbeddings(model="text-embedding-3-small"),
    persist_directory="./chroma_db"
)
# Create RAG chain
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4o", temperature=0),
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
    return_source_documents=True
)
result = qa.invoke({"query": "What is the refund policy?"})
print(result["result"])

Agents with custom tools
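Conceptually, the executor runs a loop like the following. This is a simplified pure-Python sketch with a scripted stand-in for the model, not LangChain's actual implementation:

```python
def run_agent(llm_step, tools, user_input, max_iterations=8):
    """Minimal tool-calling loop: ask the model, run any requested tool,
    feed the observation back, stop when the model returns a final answer."""
    scratchpad = [("user", user_input)]
    for _ in range(max_iterations):
        action = llm_step(scratchpad)           # model decides: call a tool or finish
        if action["type"] == "final":
            return action["output"]
        result = tools[action["tool"]](action["input"])  # execute the chosen tool
        scratchpad.append(("tool", result))     # observation goes back to the model
    return "Stopped: max iterations reached"

# A hypothetical scripted "model" that calls one tool, then answers
def scripted_llm(scratchpad):
    if not any(role == "tool" for role, _ in scratchpad):
        return {"type": "tool", "tool": "lookup_crm", "input": "Acme Corp"}
    return {"type": "final", "output": "Acme Corp is an active customer."}

tools = {"lookup_crm": lambda company: f"Company: {company} | Active"}
print(run_agent(scripted_llm, tools, "Research Acme Corp"))
```

In the real code below, the decide step is the LLM's tool-calling output, and AgentExecutor handles the scratchpad and iteration cap for you.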
from langchain.agents import create_openai_tools_agent, AgentExecutor
from langchain_core.tools import tool
from langchain import hub
@tool
def lookup_crm(company: str) -> str:
    """Look up a company in the CRM. Returns status and history."""
    return f"Company: {company} | Active | Revenue: $45K | Since: 2022"

@tool
def search_news(query: str) -> str:
    """Search for recent news about a topic."""
    # Your actual search API call here
    return f"Recent news for {query}: [results here]"
llm = ChatOpenAI(model="gpt-4o", temperature=0)
prompt = hub.pull("hwchase17/openai-tools-agent")
agent = create_openai_tools_agent(llm, [lookup_crm, search_news], prompt)
executor = AgentExecutor(agent=agent, tools=[lookup_crm, search_news], max_iterations=8)
result = executor.invoke({"input": "Research Acme Corp for our enterprise plan"})
print(result["output"])

Conversation memory for multi-turn systems
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
store = {}
def get_history(session_id):
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]
# Chain with per-session memory; the prompt needs a placeholder for prior turns
memory_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful customer support assistant."),
    MessagesPlaceholder("history"),
    ("human", "{input}")
])
chain_with_memory = RunnableWithMessageHistory(
    memory_prompt | model | StrOutputParser(),
    get_history,
    input_messages_key="input",
    history_messages_key="history"
)
# Context is maintained across turns for the same session_id
r1 = chain_with_memory.invoke(
    {"input": "My order #4521 has not arrived"},
    config={"configurable": {"session_id": "user_123"}}
)
r2 = chain_with_memory.invoke(
    {"input": "When was it supposed to arrive?"},  # references r1 context
    config={"configurable": {"session_id": "user_123"}}
)

For production, use SQLChatMessageHistory or RedisChatMessageHistory instead of the in-memory store to persist conversation history across server restarts.
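As a rough picture of what such a persistent store does, here is a minimal sqlite3 sketch. It is illustrative only; in practice use the library's SQLChatMessageHistory rather than rolling your own:

```python
import sqlite3

class SqliteHistory:
    """Minimal persistent message history keyed by session_id (illustrative sketch)."""
    def __init__(self, session_id, db_path="chat_history.db"):
        self.session_id = session_id
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages (session_id TEXT, role TEXT, content TEXT)"
        )

    def add_message(self, role, content):
        self.conn.execute(
            "INSERT INTO messages VALUES (?, ?, ?)", (self.session_id, role, content)
        )
        self.conn.commit()

    @property
    def messages(self):
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ?", (self.session_id,)
        ).fetchall()
        return [{"role": r, "content": c} for r, c in rows]

# ":memory:" is just for the demo; a file path would survive restarts
h = SqliteHistory("user_123", db_path=":memory:")
h.add_message("human", "My order #4521 has not arrived")
print(h.messages[0]["content"])
```

Because the history lives in the database rather than a Python dict, a restarted server picks up the conversation where it left off.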
Frequently asked questions
When should I use LangChain instead of the direct API?
Use LangChain for RAG pipelines (document loading, splitting, and retrieval chains are much faster to build), multi-step chains with complex data flow, agents with multiple tools, and systems that need conversation memory management. Use direct API calls for simple single-step processing such as classification, extraction, or generation from a single prompt; LangChain's overhead adds no value there.
Is LangChain stable enough for production?
LangChain's API has changed rapidly across major versions. Production best practices: pin your version in requirements.txt (e.g., langchain==0.3.x), test upgrades in staging before production, and follow the official migration guides for major version bumps. With pinned versions and proper testing, LangChain is reliable in production.
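A pinned requirements.txt might look like the following. The version numbers are illustrative; pin to whatever combination you have actually tested together:

```text
langchain==0.3.7
langchain-openai==0.2.5
langchain-community==0.3.5
chromadb==0.5.18
```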
ThinkForAI Editorial Team
Updated November 2024.

