WhatsApp AI Agent with OpenClaw: The 2026 Agency Setup Guide After Meta's Chatbot Crackdown
A WhatsApp AI agent built on OpenClaw is a self-hosted assistant that links to a WhatsApp number through the multi-device protocol. Here is how to set one up in 2026 with the new replyToMode, per-group system prompts, and a complete CLI walkthrough.
Last updated: April 26, 2026
A WhatsApp AI agent built on OpenClaw is a self-hosted assistant that links to a WhatsApp number through the multi-device protocol and replies to inbound messages with a Claude-powered, tool-using AI worker. The setup uses the same QR code WhatsApp Web uses, so the agent acts as a linked companion device on the user's phone rather than a Meta Cloud API number. That distinction got loud in January 2026, when Meta banned open-ended AI chatbots from the official WhatsApp Business Cloud API and forced the entire ecosystem to rethink which path to take. OpenClaw's WhatsApp channel is the most popular workaround in the agency world right now, and the 2026.4.22 release made it considerably more useful for multi-tenant deployments.
Key takeaways
- OpenClaw connects to WhatsApp through Baileys, the open-source implementation of the WhatsApp Web multi-device protocol. No Meta Cloud API account is needed for inbound conversational AI.
- Meta banned open-ended AI chatbots from the Cloud API on January 15, 2026. Multi-device companion devices are the practical path for 1:1 reply-driven agents.
- OpenClaw 2026.4.22 added a replyToMode option, per-group and per-direct system prompts, and a fix for duplicate messages on reconnect.
- A working setup is one channel block in config.yml, one QR scan, and a persistent volume mount for the Baileys session keys.
- Cloud API still wins for high-volume template broadcast (over 50K/day). Multi-device wins for conversational, reply-first AI workers.
- The per-channel-peer dmScope default keeps every WhatsApp peer in their own isolated context, which is what regulated industries actually need.
What an OpenClaw WhatsApp AI agent actually does
The job is narrow and concrete. A user sends a WhatsApp message to a phone number. The OpenClaw gateway, running on a VPS or a home server, receives that message through the Baileys WebSocket connection. The gateway resolves a session key, loads the right context for that conversation, hands the message to Claude, Grok, or a local model, and writes the agent's reply back to WhatsApp. The user sees a normal chat thread, with read receipts and typing indicators that look like every other WhatsApp conversation.
The interesting part is what happens between the message arriving and the reply going out. The agent has access to the full OpenClaw tool surface: it can read a calendar, query a CRM, run a Stripe lookup, hit any MCP connector the gateway has wired up, and call into per-client skills. A dental front desk agent answers a "do you take Delta Dental" question by checking a Supabase row. A real estate agent answers "what's the price on 412 Maple" by hitting a custom MLS skill. The reply that comes back to WhatsApp is grounded in real data, not hallucinated.
This is the difference between a chatbot and a worker. The chatbot reads a message and produces a string. The worker reads a message, looks something up, takes an action, and then produces a string. WhatsApp is just the channel. The intelligence lives in the gateway.
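The message-to-reply path described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not OpenClaw's code: WorkerGateway, fake_insurance_lookup, and fake_model are hypothetical names, the lookup data is made up, and a real deployment would call an actual model and tool instead of these stand-ins.

```python
from typing import Callable, Dict, List

Model = Callable[[List[str], str], str]   # (history, prompt) -> reply
Lookup = Callable[[str], str]             # message text -> grounded facts


class WorkerGateway:
    """Toy gateway (hypothetical): one memory list per session key, one lookup tool."""

    def __init__(self, model: Model, lookup: Lookup) -> None:
        self.model = model
        self.lookup = lookup
        self.memory: Dict[str, List[str]] = {}

    def handle(self, peer: str, text: str) -> str:
        key = f"wa:{peer}"                          # resolve the session key
        history = self.memory.setdefault(key, [])   # load this peer's context only
        facts = self.lookup(text)                   # the "worker" step: look something up
        reply = self.model(history, f"{text}\n[facts: {facts}]")
        history.extend([text, reply])               # remember the turn
        return reply                                # the caller writes this back to WhatsApp


def fake_insurance_lookup(text: str) -> str:
    # Made-up data standing in for a database row.
    return "Delta Dental: accepted" if "delta dental" in text.lower() else "no match"


def fake_model(history: List[str], prompt: str) -> str:
    # Stand-in for a real LLM call; it simply repeats the facts it was handed.
    return "Based on our records: " + prompt.split("[facts: ")[1].rstrip("]")
```

A plain chatbot would skip the lookup line and pass the raw text straight to the model; the worker grounds the prompt in retrieved facts before the model sees it.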
Why 2026 changed the WhatsApp AI landscape
Three shifts hit the WhatsApp AI world inside a single quarter, and they pushed serious agencies toward the multi-device path.
The first was the AI chatbot ban. On January 15, 2026, Meta updated its WhatsApp Business Cloud API policy to require that automated bots have "clear, predictable results associated with business messaging." Open-ended AI chat is out. Support flows, booking flows, and order flows are in. If an agency built its product on top of the official Cloud API and routed every inbound message to Claude or GPT for a freeform reply, that product became a policy violation overnight.
The second was the pricing change. Meta moved to a per-template-message model on April 1, 2026, scrapping the old "free first 1,000 conversations per month" tier and pricing utility, authentication, and marketing templates separately by destination country. For an agency running an AI worker that holds dozens of multi-turn conversations per client per day, the per-message economics quickly stop working.
The third was the rollout of WhatsApp usernames, scheduled to begin in test countries in June 2026, with a new business-scoped user identifier (BSUID) replacing phone numbers in webhooks. That is a longer-arc change that complicates how Cloud API integrations resolve identity, while multi-device companion devices keep working unchanged because they ride on the existing WhatsApp Web protocol.
Add it up and the shape of the problem is clear. If your agency is running 1:1 conversational AI on WhatsApp in 2026, the multi-device protocol is the path with the fewest landmines. OpenClaw's WhatsApp channel was already built that way, which is why it became the default option for the kind of work agencies and GHL resellers were trying to do.
Cloud API vs multi-device: which path is yours
The honest answer is that they are different products with different jobs. Cloud API is a transactional broadcast pipe optimized for templates: shipping notifications, OTP codes, appointment reminders sent at scale. Multi-device is a conversational pipe optimized for replies inside an existing thread, the way a human would respond from their phone.
If your use case is "I need to send 250,000 marketing templates a month," Cloud API is correct, and you should accept the policy and pricing constraints. If your use case is "I need to be the AI front desk for fifteen dental practices, each with a phone that already has WhatsApp," multi-device is correct, and OpenClaw is the cleanest way to wire it up.
Most Kyra-style agencies live entirely in the second world, which is why this guide focuses there. For the broader picture of how the same gateway handles other channels, the session keys deep dive walks through how Slack, Discord, and Telegram fit alongside WhatsApp in the same install.
Step-by-step: connect OpenClaw to WhatsApp in fifteen minutes
The walkthrough below assumes you already have OpenClaw 2026.4.22 or later installed and the gateway listening on its default port. If you do not, the "What is OpenClaw" overview and the official WhatsApp channel docs cover the install side.
1. Confirm the gateway is running and reachable.
openclaw gateway status
# expected: listening on :18789, 0 active sessions
2. Create the persistent volume for Baileys session keys. This is the most common reason new installs lose the QR scan and ask you to scan again. Baileys writes the multi-device handshake material into a folder, and that folder must survive container restarts.
mkdir -p ~/.openclaw/whatsapp/auth
chmod 700 ~/.openclaw/whatsapp/auth
3. Add the WhatsApp channel block to your gateway config.
$EDITOR ~/.openclaw/config.yml
channels:
  - type: whatsapp
    sessionPath: ~/.openclaw/whatsapp/auth
    replyToMode: smart
    dmScope: per-channel-peer
    systemPrompt: |
      You are the front desk for Acme Dental.
      Answer scheduling questions and route insurance
      questions to a human if Delta Dental is mentioned.
    groups:
      enabled: true
      systemPromptOverrides:
        - groupId: "120363045xxx@g.us"
          prompt: |
            You are the staff coordinator for Acme Dental.
            Reply only when explicitly @mentioned.
4. Reload the gateway.
openclaw gateway reload
5. Trigger the QR scan and link the device.
openclaw whatsapp link --print-qr
# scan the QR with: WhatsApp > Settings > Linked Devices > Link a device
The QR code appears in the terminal as ASCII art. On the phone, open WhatsApp, navigate to Settings, then Linked Devices, then "Link a device," and point the camera at the terminal. The link completes in a few seconds and the gateway prints a "session bound" line.
6. Send a test message from a second WhatsApp account. Open WhatsApp on a different phone, send "hello" to the linked number, and watch the gateway log.
openclaw logs --channel whatsapp --tail
# expected: session resolved wa:+15551234567
# expected: agent reply dispatched in 1.4s
7. Verify isolation with a second peer. Send a different message from a third phone number. The two conversations must produce two different session keys (one per peer) and the agent must not leak context between them.
openclaw sessions list --channel whatsapp
# expected: two rows, one per peer phone number
That is the full setup. Seven steps, one config block, one QR scan. Every additional WhatsApp number is one more channel block with a different sessionPath, which is how an agency runs fifteen client phones on a single gateway.
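For agencies that manage many numbers, the channel blocks can be generated rather than hand-written. The sketch below is a hedged illustration: it reuses the field names from the config in step 3, but the per-client directory layout (~/.openclaw/whatsapp/<client>/auth) and the helper names channel_block and build_config are assumptions made for this example. It prints JSON because that is available in the standard library; converting to YAML is left to whatever tooling you already use.

```python
import json
from pathlib import PurePosixPath
from typing import Dict, List, Tuple


def channel_block(client_slug: str, system_prompt: str) -> Dict[str, object]:
    """Build one WhatsApp channel block using the field names from step 3."""
    base = PurePosixPath("~/.openclaw/whatsapp")  # assumed per-client layout
    return {
        "type": "whatsapp",
        "sessionPath": str(base / client_slug / "auth"),
        "replyToMode": "smart",
        "dmScope": "per-channel-peer",
        "systemPrompt": system_prompt,
    }


def build_config(clients: List[Tuple[str, str]]) -> Dict[str, List[Dict[str, object]]]:
    """Return a config dict with one channel per (slug, prompt) pair."""
    blocks = [channel_block(slug, prompt) for slug, prompt in clients]
    paths = [block["sessionPath"] for block in blocks]
    if len(set(paths)) != len(paths):
        # Two numbers sharing one auth folder would overwrite each other's keys.
        raise ValueError("every channel needs its own sessionPath")
    return {"channels": blocks}


if __name__ == "__main__":
    config = build_config([
        ("acme-dental", "You are the front desk for Acme Dental."),
        ("maple-realty", "You are the listings assistant for Maple Realty."),
    ])
    print(json.dumps(config, indent=2))
```

The duplicate-path check is the point of the exercise: each linked number needs its own session directory, so a copy-paste mistake should fail loudly before the gateway reloads.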
What is new in OpenClaw 2026.4.22 for WhatsApp
The April 22 release was a small one in line count and a large one in practical impact. Three changes are worth knowing.
The replyToMode option controls whether the agent quotes the original message when it replies. It takes three values: "off" never quotes, "always" quotes every reply, and "smart" quotes only in groups or when the reply is to a message that arrived more than 60 seconds earlier. The "smart" value is the right default for almost every deployment because it keeps DMs feeling like a normal one-on-one chat while still pinning replies in noisy group threads.
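The three replyToMode values reduce to a small decision function. The sketch below restates the rule from the paragraph above; the function name should_quote and its parameters are hypothetical, and the 60-second threshold is taken from the description rather than from OpenClaw's source.

```python
def should_quote(mode: str, is_group: bool, message_age_s: float,
                 stale_after_s: float = 60.0) -> bool:
    """Decide whether a reply should quote the message it answers."""
    if mode == "off":
        return False
    if mode == "always":
        return True
    if mode == "smart":
        # Quote in groups, or when the original message is old enough that
        # the reply would otherwise look detached from it.
        return is_group or message_age_s > stale_after_s
    raise ValueError(f"unknown replyToMode: {mode!r}")
```

For example, should_quote("smart", is_group=False, message_age_s=5) is False, so a quick DM reply stays unquoted, while the same call with is_group=True returns True.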
The per-group and per-direct system prompts let one WhatsApp connection serve multiple conversational personas. The agent can be a clinical front desk in the patient DM thread and a concise staff coordinator in the internal staff group, with two completely different system prompts loaded based on the session key. Before this release, you needed two separate WhatsApp connections to do that.
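Prompt selection by conversation can be sketched as a lookup keyed on the chat identifier. This is an assumption-laden illustration: select_system_prompt is a hypothetical helper, the fallback to the channel-level prompt for groups without an override is a guess at sensible behavior, and only the "@g.us" group suffix comes from the config example earlier in this guide.

```python
from typing import Dict, Optional


def select_system_prompt(chat_id: str,
                         default_prompt: str,
                         group_overrides: Dict[str, str],
                         groups_enabled: bool = True) -> Optional[str]:
    """Pick the system prompt for a chat, or None if the agent should stay silent."""
    is_group = chat_id.endswith("@g.us")
    if not is_group:
        return default_prompt            # DMs use the channel-level prompt
    if not groups_enabled:
        return None                      # groups switched off: do not reply
    # Assumed fallback: groups without an override reuse the default prompt.
    return group_overrides.get(chat_id, default_prompt)


overrides = {
    "120363045xxx@g.us": "You are the staff coordinator for Acme Dental.",
}
front_desk = "You are the front desk for Acme Dental."
```

With these values, a patient DM resolves to the front-desk prompt and the staff group resolves to the coordinator prompt, both over a single linked WhatsApp connection.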
The duplicate message fix closed a long-standing reconnect bug. When the Baileys WebSocket dropped and reconnected, pending message queues were re-driven by both the old and new connection, occasionally producing duplicate replies. The 2026.4.22 release adds an in-memory "active delivery claim" that prevents the two connections from racing for the same queue entry. The visible result is that the agent stops sending the same reply twice on a flaky connection.
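The idea behind an in-memory delivery claim can be shown with a lock-protected set. This is a minimal sketch of the general technique, not OpenClaw's implementation: DeliveryClaims and drain are hypothetical names, and a real gateway would also need to release claims on failed sends and expire old entries.

```python
import threading
from typing import Callable, Iterable, Set, Tuple


class DeliveryClaims:
    """Tracks which queue entries already have an owner (in memory only)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._claimed: Set[str] = set()

    def try_claim(self, entry_id: str) -> bool:
        """Return True for the first caller only; later callers get False."""
        with self._lock:
            if entry_id in self._claimed:
                return False
            self._claimed.add(entry_id)
            return True

    def release(self, entry_id: str) -> None:
        """Give a claim back, e.g. after a failed send that should be retried."""
        with self._lock:
            self._claimed.discard(entry_id)


def drain(queue: Iterable[Tuple[str, str]],
          claims: DeliveryClaims,
          send: Callable[[str], None]) -> int:
    """Send every entry this connection manages to claim; return the count sent."""
    sent = 0
    for entry_id, text in queue:
        if not claims.try_claim(entry_id):
            continue                     # another connection already owns it
        send(text)
        sent += 1
    return sent
```

If the old and new connections both call drain on the same pending queue, the second call finds every entry already claimed and sends nothing, which is the behavior the release notes describe.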
Session keys and per-peer isolation on WhatsApp
Every inbound WhatsApp message arrives at the gateway with a small metadata bundle: the linked-device account that received it, the peer phone number that sent it, and whether the message landed in a DM or a group. The session manager turns that bundle into a deterministic string. A typical WhatsApp DM produces a key like wa:+15551234567; a WhatsApp group produces something like wa:120363045xxx@g.us.
The default dmScope setting is per-channel-peer, which means each peer phone number gets its own session and its own conversational memory. A patient asking about an appointment last Tuesday gets the context from last Tuesday's chat, not the context from a different patient who happens to share the same dental practice. Group chats always get their own session regardless of dmScope, because every group is a first-class room with its own membership and its own privacy expectations.
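The mapping from message metadata to a session key can be sketched as a pure function. The key shapes follow the examples above; the function name session_key and its parameters are hypothetical, the linked-device account component is left out (as it is in the example keys), and only the per-channel-peer scope is modeled because it is the only one this guide describes.

```python
def session_key(chat_id: str, sender: str, is_group: bool,
                dm_scope: str = "per-channel-peer",
                channel: str = "wa") -> str:
    """Derive a deterministic session key from inbound message metadata."""
    if is_group:
        # Groups always get their own session, regardless of dmScope.
        return f"{channel}:{chat_id}"
    if dm_scope == "per-channel-peer":
        # One session, and one memory, per peer phone number.
        return f"{channel}:{sender}"
    raise ValueError(f"unsupported dmScope in this sketch: {dm_scope!r}")
```

Because the function is deterministic, the same peer always lands in the same session, and two different peers can never share one.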
This is not optional infrastructure. Regulated industries on WhatsApp (dental, legal, medical, financial) need the agent to never confuse one peer's data with another's, and the session key boundary is what enforces that at the routing layer before the model ever sees the prompt. The deeper mechanics are spelled out in the session keys explainer if you want the full algorithm.
Comparison: WhatsApp AI deployment patterns in 2026
Four patterns dominate the 2026 landscape for putting an AI agent on a WhatsApp number. They are not interchangeable, and the right pick depends on whether you care more about volume, conversation quality, compliance, or operational simplicity.
| Pattern | Connection | Typical use case | 2026 reality |
|---|---|---|---|
| OpenClaw + Baileys | Multi-device companion (QR scan) | Conversational AI worker, agency multi-tenant | Default for replies; no per-message fees |
| Meta Cloud API + classic flow | Official Business API, template-driven | Notifications, OTP, scheduled broadcasts | Required for >50K/day; AI-chat banned since Jan 15, 2026 |
| Cloud API via BSP middleware | Twilio/MessageBird/360dialog wrappers | Mid-volume mixed flows | Same Meta policies; vendor markup added |
| WhatsApp MCP server | Local Baileys + MCP protocol | Personal assistant for one operator | Great for solo use, weak for multi-tenant ops |
The two paths agencies should evaluate first are OpenClaw + Baileys for inbound conversational work and Meta Cloud API for outbound transactional templates. Many production deployments end up running both side by side: OpenClaw answers replies, Cloud API ships the appointment reminder that started the thread. The two systems do not conflict because each handles a different message direction.
When OpenClaw on WhatsApp is not for you
The multi-device path is not the right answer for every shop, and pretending otherwise is how agencies blow up a client account.
You need to send more than 50,000 outbound templates per day. Multi-device companion devices are rate-limited the same way a human user is. If your job is high-volume broadcast, you want Cloud API or a BSP, and you should accept the per-template pricing as the cost of the channel.
Your client is contractually required to use the Meta Business Platform. Some healthcare networks, financial institutions, and regulated brands require an officially provisioned Cloud API number with a green checkmark. Multi-device companion devices do not appear in the Business Platform dashboard, and audits will fail.
You need WhatsApp's official Flows or list-message templates. Some interactive UI primitives (carousels, native list pickers, structured forms) are exclusive to the Cloud API. Baileys can fake a lot of the experience with text and quick replies, but if your client demands the native flow widgets, the multi-device path will not deliver them.
You cannot keep the linking phone available. Multi-device sessions stay alive while the linking phone is off, but the phone must come online occasionally to refresh the link (WhatsApp logs linked devices out after roughly 14 days of phone inactivity). If the phone is permanently unavailable (lost, stolen, formatted), the session breaks and inbound messages go unanswered until you scan a new QR code. Cloud API has no such constraint.
Outside those four conditions, OpenClaw on WhatsApp is the path most agencies should default to in 2026.
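The four conditions above can be turned into a pre-sales checklist. The sketch below is only an encoding of this section's rules of thumb: the Deployment fields and the multi_device_blockers helper are hypothetical names, and the 50,000-per-day threshold is the figure used in this guide, not a limit published by Meta.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Deployment:
    outbound_templates_per_day: int
    requires_meta_business_platform: bool
    needs_native_flows: bool
    phone_can_come_online: bool


def multi_device_blockers(d: Deployment) -> List[str]:
    """Return the reasons (if any) to prefer Cloud API over multi-device."""
    blockers = []
    if d.outbound_templates_per_day > 50_000:
        blockers.append("high-volume template broadcast")
    if d.requires_meta_business_platform:
        blockers.append("contract requires the Meta Business Platform")
    if d.needs_native_flows:
        blockers.append("needs native Flows or list-message templates")
    if not d.phone_can_come_online:
        blockers.append("linking phone cannot come online to refresh the link")
    return blockers
```

An empty list means none of the four conditions applies; any entry in the list is a reason to look at Cloud API or a BSP before committing to the multi-device path.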
Frequently asked questions
Does OpenClaw need a verified Meta Business account?
No. The OpenClaw WhatsApp channel uses the multi-device protocol via Baileys, which is the same protocol WhatsApp Web uses. You scan a QR code with the phone that owns the number and the gateway acts as a linked companion device. There is no Meta Business onboarding, no display name approval, and no template review process. If the phone has WhatsApp installed and can scan a QR code, the agent can connect.
Will Meta ban my number for using a Baileys-based AI agent?
Meta does not publish enforcement criteria for multi-device clients, and bans do happen for accounts that send spam or template-style broadcast through the personal protocol. Conservative usage (replying to inbound messages, normal conversational pace, no mass cold outreach) has a long track record of staying inside the lines. Use the channel for replies, not for cold blast campaigns, and keep the per-day outbound volume in human ranges.
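One way to enforce "replies only, human-range volume" is a small guard in front of the send path. This is a sketch under loud assumptions: ReplyOnlyGuard is a hypothetical class, and the default cap of 200 messages per day is an arbitrary placeholder, since Meta publishes no threshold. It shows the shape of the check, not a number that guarantees safety.

```python
from collections import Counter
from datetime import date
from typing import Set


class ReplyOnlyGuard:
    """Allow sends only to peers who wrote first, under a daily cap."""

    def __init__(self, daily_cap: int = 200) -> None:  # placeholder value
        self.daily_cap = daily_cap
        self.inbound_peers: Set[str] = set()
        self.sent_per_day: Counter = Counter()

    def record_inbound(self, peer: str) -> None:
        self.inbound_peers.add(peer)

    def may_send(self, peer: str, today: date) -> bool:
        if peer not in self.inbound_peers:
            return False                 # never cold-message a peer
        return self.sent_per_day[today] < self.daily_cap

    def record_sent(self, today: date) -> None:
        self.sent_per_day[today] += 1
```

The gateway would call record_inbound for every message received, check may_send before each reply, and call record_sent afterwards.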
What happens if the Baileys session expires?
The gateway logs an "auth state invalidated" line and pauses the channel. Run openclaw whatsapp link --print-qr again, scan with the same phone, and the channel resumes. Existing conversations keep their session keys, so the agent's memory survives the relink. The persistent volume mount on sessionPath is what makes this safe; without it, every restart is a relink.
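Because a missing or misconfigured session directory forces a relink, it can be worth checking the directory before the gateway starts. The sketch below is a generic POSIX preflight check, not an OpenClaw command: check_session_dir is a hypothetical helper, and it only verifies what step 2 of the setup created (an existing, writable directory with no group or world access).

```python
import os
import stat
from pathlib import Path
from typing import List


def check_session_dir(path: Path) -> List[str]:
    """Return a list of problems with the session directory (empty means OK)."""
    if not path.is_dir():
        return [f"{path} does not exist or is not a directory"]
    problems = []
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & 0o077:
        # Session keys should not be readable by group or others (expect 700).
        problems.append(f"{path} is accessible to group/others (mode {mode:o})")
    if not os.access(path, os.W_OK):
        problems.append(f"{path} is not writable by the current user")
    return problems
```

A typical call would be check_session_dir(Path("~/.openclaw/whatsapp/auth").expanduser()), run inside the same container or user account as the gateway so the result reflects the volume mount it actually sees.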
Can one OpenClaw gateway run multiple WhatsApp numbers?
Yes. Add one channel block per number, each with its own sessionPath directory and its own systemPrompt. The session key includes the linked-device account identifier, so messages from different numbers route to different conversations and never overlap. This is the standard pattern for agencies hosting fifteen or twenty client numbers on a single gateway.
How does this interact with Anthropic's Claude Code Channels?
Anthropic's Claude Code Channels shipped in early 2026 with native Discord and Telegram support, but no first-party WhatsApp connector. The community is expected to build one through the open MCP standard, and OpenClaw already covers the WhatsApp side directly today. If your stack is Claude Code centric and you need WhatsApp now, OpenClaw is the bridge that exists.
Does the agent see WhatsApp end-to-end encrypted messages in plaintext?
Yes, by design. A linked companion device is an authorized endpoint inside WhatsApp's E2EE model, just like the user's phone or laptop. Messages are decrypted on the OpenClaw host and the model receives plaintext. This is what enables the agent to reply at all. It also means the host should be treated as sensitive infrastructure: full disk encryption, locked-down SSH, and no sharing the box with untrusted workloads.
The smallest piece of infrastructure your agency probably underestimates
WhatsApp is not exotic. It is the channel hundreds of millions of businesses already use, and the AI agent that lives inside it does not have to be exotic either. One config block, one QR scan, one persistent volume, and a Claude model on the back end is genuinely the whole thing. The work that used to take a Meta Business onboarding, a template approval queue, and a BSP contract now takes fifteen minutes. The 2026 changes to the official Cloud API made multi-device the right default for conversational AI, and OpenClaw 2026.4.22 made it cleaner to operate.
If you would rather skip the VPS and the QR refresh dance and just hand a working WhatsApp number to a client, Kyra runs the OpenClaw gateway, the Baileys session storage, and the per-client isolation on a managed host with the same defaults this guide describes. Industry-specific starting templates are ready for dental practices and real estate agencies, and the underlying primitives are documented in the OpenClaw WhatsApp reference and the OpenClaw repository on GitHub. WhatsApp is where most of your clients' customers already are. Putting an AI worker there in 2026 is no longer the hard part of the project.
The Kyra Team
Conversion System
We build white-label AI workforce infrastructure for digital agencies on top of OpenClaw. We publish practical guides on deploying AI agents, self-hosted AI, and multi-channel workforce design.