- What is LangChain?
- Models — The Brain
- Messages — The Building Blocks of Conversation
- Prompts — Steering the Output
- Tools — The Hands
- Memory — Making the Agent Remember
- Putting It All Together
- Best Practices
- FAQ
- Summary
What is LangChain?
The Platform
LangChain was created by Harrison Chase in October 2022. It’s a platform for Agent Engineering — building AI agents that can reason, plan, and act.
LangChain is not just one framework. It’s a full platform with multiple components:
| Component | Purpose |
|---|---|
| LangChain | Build agents quickly, compatible with any model provider |
| LangGraph | Low-level agent control: Memory, Human-in-the-Loop |
| Deep Agents | Complex, multi-step task agents |
| LangSmith | Test, observe, evaluate, and deploy agents |
Official site: https://www.langchain.com
What is an Agent?
There’s no single standard answer. But LangChain’s founder Harrison Chase gives a technical definition:
An AI agent is a system that uses an LLM to decide the control flow of an application.
In simpler terms: an Agent is an intelligent system that can perceive its environment, reason, make autonomous decisions, and take action to achieve a goal.
| Feature | Traditional LLM | AI Agent |
|---|---|---|
| Interaction | Passive: ask one, answer one | Active: goal-driven planning |
| Execution | Text generation only | Operate software, send emails, analyze data |
| Autonomy | Needs explicit step-by-step instructions | Given a goal, finds its own path |
LLM = Brain. Agent = Brain + Hands + Logic.
LLM vs Agent: The Difference
Imagine building an AI Travel Assistant.
Traditional LLM approach:
- User: “Plan a 5-day Beijing trip, budget 8000 RMB, I love history.”
- LLM generates a simple plan from training data — no real-time prices, weather, or availability.
Agent approach:
- Same user request.
- Agent breaks it down:
- Calls flight/hotel APIs for real prices
- Calls weather API for forecast
- Searches attraction APIs for opening hours
- Adjusts plan dynamically (“The Forbidden City is closed Monday, moved to Tuesday”)
- Returns a concrete, executable plan within budget.
Models — The Brain
Models (LLMs) understand human language and generate content, translate, summarize, and answer questions.
Modern LLMs also have special capabilities:
| Capability | Description |
|---|---|
| Tool Calling | Call external tools (APIs, databases) and use results in responses |
| Structured Output | Constrain responses to a defined format (e.g., JSON) |
| Multimodality | Process and return non-text data: images, audio, video |
| Reasoning | Execute multi-step reasoning to reach conclusions |
LangChain supports most LLMs with a unified API, making it easy to switch providers.
Initializing a Model
The simplest way is init_chat_model:
1 | from langchain.chat_models import init_chat_model |
Test it:
1 | print(type(model)) # <class 'langchain_deepseek.chat_models.ChatDeepSeek'> |
Switching models? Just change the name — no other code changes needed.
Custom Models and Parameters
For providers not natively supported (like Alibaba’s Qwen via DashScope), you must specify parameters manually:
1 | import os |
Common model parameters:
| Parameter | What It Controls |
|---|---|
temperature |
Randomness: low = deterministic, high = creative |
max_tokens |
Maximum response length |
top_p |
Diversity of output |
timeout |
Request timeout |
max_retries |
Maximum retry attempts |
Using Model Classes
For community-supported models, use the specific Model class:
1 | uv add langchain-community dashscope |
1 | from langchain_community.chat_models.tongyi import ChatTongyi |
Calling the Model: invoke vs stream
Blocking call — wait for the full response:
1 | response = model.invoke("What is the capital of the moon?") |
Streaming call — see tokens in real time:
1 | stream = model.stream("What is the capital of the moon?") |
Using Models in an Agent
Pass a model name (auto-init) or a model object:
1 | from langchain.agents import create_agent |
Agent also supports invoke and stream:
1 | # Blocking |
Stream modes for Agent:
| Mode | Returns |
|---|---|
messages |
Each LLM token (for real-time output) |
updates |
Every Agent event (LLM calls, tool calls) |
custom |
Custom stream writer output |
Messages — The Building Blocks of Conversation
Every message sent to or from the LLM contains:
- role: who sent this message (system, user, assistant, tool)
- content: the message body
- metadata (optional): ID, token usage, etc.
Message Types
LangChain wraps messages into BaseMessage subclasses based on role:
| Class | Role | Purpose |
|---|---|---|
SystemMessage |
system |
Set model behavior and context |
HumanMessage |
user |
User input |
AIMessage |
assistant |
LLM response (text, tool calls, metadata) |
ToolMessage |
tool |
Result of a tool execution |
Example:
1 | from langchain.messages import HumanMessage, AIMessage, SystemMessage |
Output:
1 | ================================ System Message ================================ |
Tip: Manually passing message history gives the LLM “memory.” But this is tedious — we’ll learn automatic memory management in the Memory section.
Multimodal Messages
LangChain supports sending images, audio, and video to multimodal models (e.g., qwen3.5-plus, gpt-5-nano).
Online image (via URL):
1 | from langchain.chat_models import init_chat_model |
Local image (via base64):
1 | import base64 |
Prompts — Steering the Output
Everything sent to the LLM can be called a Prompt. The System Prompt (SystemMessage) is the most important — it sets the AI’s role, rules, and context.
System Prompts
Set it once when creating the agent:
1 | from langchain.agents import create_agent |
Without a system prompt, the AI uses its default persona. With Speak like a pirate, the output changes dramatically:
1 | Ahoy! I be yer parrot, an AI assistant sailing the digital seas! Want to chat about treasure, sailing, or tales of the seven seas? Bring it on, matey! |
Prompt Engineering
Prompt Engineering is the iterative process of optimizing the System Prompt for better outputs.
A well-structured prompt typically includes:
| Section | Purpose |
|---|---|
| Identity | Who is the AI? Communication style, overall goal |
| Instructions | Rules to follow. What to do and what NOT to do |
| Examples | Input/output pairs showing the desired format |
| Context | Extra information (RAG data, reference documents) |
Use Markdown for structure and XML tags to mark boundaries:
1 | system_prompt = """ |
Few-Shot Examples
When the desired style is hard to describe, show examples instead:
1 | system_prompt = """ |
Output:
1 | Aurum — a magnificent floating city above the sulfuric clouds, forged from reflective alloy, eternally refracting the dim yellow sunlight. |
Structured Output
Instead of parsing raw text, constrain the model to output structured data.
Step 1: Define a Pydantic model:
1 | from pydantic import BaseModel, Field |
Step 2: Pass it as response_format:
1 | agent = create_agent( |
Output:
1 | Lunara is located at the edge of the Aitken Basin on the lunar south pole. Vibe: A serene city blending high-tech with classical Eastern aesthetics, featuring transparent domed gardens and floating architecture. Economy: Helium-3 mining, quantum computing centers, space tourism, lunar agriculture and scientific research. |
Tools — The Hands
An agent needs at least two parts:
- Model: The brain (reasoning, planning)
- Tools: The hands (executing tasks, interacting with the outside world)
Defining Tools
Use the @tool decorator:
1 | from langchain.tools import tool |
Output:
1 | ================================ Human Message ================================= |
How it works:
- User asks a question
- LLM analyzes: “I don’t know the weather, I need the
get_weathertool” - LLM returns a JSON with the tool name and arguments
- LangChain parses it and calls the function
- Result goes back to the LLM
- LLM generates the final answer
1 | User Question → LLM Reasoning → Tool Call → Tool Result → LLM Final Answer |
Custom Tools
By default, tool metadata comes from:
| Info | Source |
|---|---|
| Tool name | Function name |
| Tool input | Function parameters |
| Tool description | Docstring |
Override with the decorator:
1 |
|
For detailed parameter constraints, use Pydantic:
1 | from pydantic import BaseModel, Field |
Important: Two parameter names are reserved in LangChain tools — config and runtime. Don’t use them as custom parameter names.
Predefined Tools: Tavily Search
LangChain includes many pre-built tools. A popular one is Tavily for web search.
Step 1: Register at https://www.tavily.com, get your API key.
Step 2: Add to .env:
1 | TAVILY_API_KEY=your_key_here |
Step 3: Install dependency:
1 | uv add langchain-tavily |
Step 4: Use it:
1 | from langchain_tavily import TavilySearch |
Combining Tools with an Agent
1 | from langchain.agents import create_agent |
With stream_mode="updates", you can see each step:
1 | for chunk in agent.stream( |
Output:
1 | step: model |
Memory — Making the Agent Remember
Models have no memory by default. Each call is independent.
Short-Term vs Long-Term Memory
Don’t confuse the names! Short-term ≠ temporary, Long-term ≠ permanent.
| Short-Term Memory | Long-Term Memory | |
|---|---|---|
| Lifecycle | Current session | Across sessions/tasks |
| Content | Current task state | Knowledge, experience, preferences |
| Cross-task | ❌ | ✅ |
| Storage | Redis / In-memory | DB / Vector DB |
Short-term memory = working context for the current conversation.
Long-term memory = accumulated experience and knowledge across conversations.
Short-Term Memory with InMemorySaver
LangChain automatically manages conversation history via Checkpointer objects.
1 | from langchain.agents import create_agent |
Output:
1 | ================================ Human Message ================================= |
Key point: Same thread_id = same conversation. Different thread_id = isolated conversations.
Persistent Memory (Optional)
InMemorySaver loses data on restart. For production, use a persistent checkpointer.
SQLite example:
1 | uv add langgraph-checkpoint-sqlite |
1 | import sqlite3 |
Other options: PostgresSaver, CosmosDBSaver.
Memory Management Strategies
As conversations grow, message history can exceed the LLM’s context window (e.g., 128K for DeepSeek). This causes:
- Context loss (forgotten earlier messages)
- Reduced response quality (“attention dilution”)
- Slower responses
Three strategies:
| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| Trim messages | Only send recent messages to the LLM | Simple, fast | Loses older context |
| Delete messages | Remove old messages from state entirely | Frees storage permanently | Irreversible |
| Summarize messages | Summarize old messages, keep recent ones raw | Preserves context + fits window | Extra LLM call for summarization |
SummarizationMiddleware
LangChain’s built-in solution:
1 | from langchain.agents import create_agent |
Output:
1 | ================================ Human Message ================================= |
Putting It All Together
1 | from langchain.agents import create_agent |
Best Practices
| Practice | Why It Matters |
|---|---|
Use thread_id per conversation |
Prevents memory leaking across users |
| Persist checkpointer in production | Don’t lose conversations on restart |
| Write clear tool docstrings | LLM relies on descriptions to choose tools |
| Use structured output | Eliminates fragile text parsing |
Set temperature=0 for factual tasks |
More deterministic, less hallucination |
| Use summarization for long conversations | Avoids context window overflow |
| Keep system prompts in Markdown + XML | Easier for LLM to parse boundaries |
FAQ
Q: Can I switch models without changing code?
A: Yes, just change the model name. LangChain handles the rest.
1 | # DeepSeek |
Q: What’s the difference between ChatModel and LLM?
A: ChatModel takes a list of messages and returns a message. LLM takes a string and returns a string. Most modern models use the Chat interface.
Q: Can an agent call multiple tools in one turn?
A: Yes. The LLM can return multiple tool calls simultaneously:
1 | # LLM may return both: |
Q: What happens if a tool fails?
A: The exception is caught and passed back to the LLM as a ToolMessage with the error. The LLM can then decide how to handle it (retry, report error, etc.).
Q: InMemorySaver vs persistent checkpointer?
A: InMemorySaver is for development. Use SqliteSaver, PostgresSaver, etc. for production where data must survive restarts.
Summary
We covered the five core components of LangChain:
| Component | Role | Key Takeaway |
|---|---|---|
| Model | The brain | Unified API, switch providers easily |
| Message | Conversation units | System/Human/AI/Tool types, supports multimodal |
| Prompt | Control output | System prompts + engineering + structured output |
| Tool | The hands | @tool decorator, custom + predefined, powers Agent action |
| Memory | Remember context | Short-term (checkpointer) + Long-term (cross-session) |
The Agent formula:
1 | Agent = Model + Tools + Memory + Prompts |
Next steps:
- Learn LangGraph for fine-grained Agent control
- Build multi-agent systems (MAS)
- Add RAG for domain-specific knowledge
- Deploy with LangSmith for monitoring