OpenClaw 3 Months — 4 Hidden Traps and How AI Helped Optimize
Three months running OpenClaw AI Trading — found 4 hidden bottlenecks. AI helped analyze, optimize multi-model routing, and cut costs while improving quality.


This account comes from real experiments in our lab, not theory, and we let the evidence speak for itself.
Pain point: systems tend to fail when nobody is watching, and by the time anyone notices, the damage is done.
Stakes: a silent outage is every team's nightmare, because the damage grows with every minute it goes unnoticed.
What we did in the lab: ran OpenClaw AI Trading for three months, found 4 hidden bottlenecks, and used AI to analyze them, optimize multi-model routing, and cut costs while improving quality.
The AI Agent Team running on OpenClaw + Lark + n8n had been performing well for 3 months — 10 Bots, 53 Workflows, 25 Skills covering every department. But after a full system audit, 4 hidden traps were discovered that caused overspending on API costs, crash loop risks, and preventable errors.
After fixing all 4 issues — API costs dropped by ~50% (measured by token usage over 30 days before/after optimization), response speed improved by ~35% (measured by average response time over 30 days), errors reduced by half (measured from error logs over 30 days), and crash loops were prevented entirely.
A system that "works" does not mean it "works well" — like a car driven daily but never serviced. After a thorough audit, 4 hidden issues were found that would only get worse over time.
Claude Sonnet 4 ($3/million tokens) was used for all 25 skills, whether the task was as simple as a "late attendance notification" or as complex as "Sales Pipeline analysis." That meant paying 30x more than necessary for the 76% of tasks that are routine.
Every container was set to restart: unless-stopped — if a service crashed, it would restart indefinitely. Each restart cycle called the API again, silently burning money while consuming CPU/RAM and slowing down other services.
The memory system had 22 files / 705 lines accumulated over 3 months with no summarization or cleanup. Every time a Bot started a task, it loaded the entire memory — the larger the memory, the slower the response and the higher the token cost.
Validation scripts existed (card_validator, verify-deployment) but required manual execution every time. When forgotten, Bots would report "done" while card formats were broken — forcing the team to fix and resend, wasting time on rework.
All 4 issues share a common root cause: "configured once and never revisited" — a universal trap for any AI system that scales quickly.
The diagram below shows the flow before and after optimization.
Every fix follows the same principle — prevention before occurrence is better than correction after the fact, because once the problem happens, API costs have already spiked and it is too late.
Routing rules were added to openclaw.json to assign models by task type: 6 skills use Sonnet 4 (analysis, complex reports) and 15 skills use Gemini Flash (notifications, summaries, routine work). The 76% of tasks that are routine now use a model that costs 30x less.
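The routing idea can be sketched in a few lines of Python. This is a minimal illustration, not the actual openclaw.json schema, which the article does not show. The skill names, model identifiers, and escalation keywords are all hypothetical placeholders.

```python
from dataclasses import dataclass

# Model identifiers are placeholders, not confirmed OpenClaw model names.
PREMIUM_MODEL = "claude-sonnet-4"
BUDGET_MODEL = "gemini-flash"

# Hypothetical skill names; the article lists only the categories.
PREMIUM_SKILLS = {"pipeline_analysis", "delivery", "exec_brief"}
# Hypothetical keywords that escalate a message to the premium model.
PREMIUM_KEYWORDS = ("analyze", "forecast", "root cause")


@dataclass(frozen=True)
class Route:
    model: str
    reason: str


def route(skill: str, message: str) -> Route:
    """Pick a model from the skill type first, then from keywords in the message."""
    if skill in PREMIUM_SKILLS:
        return Route(PREMIUM_MODEL, f"skill '{skill}' is on the premium list")
    lowered = message.lower()
    for keyword in PREMIUM_KEYWORDS:
        if keyword in lowered:
            return Route(PREMIUM_MODEL, f"message contains '{keyword}'")
    # Everything else is treated as routine work.
    return Route(BUDGET_MODEL, "routine task, default to budget model")


if __name__ == "__main__":
    print(route("late_attendance", "Notify the team about late check-ins"))
    print(route("pipeline_analysis", "Weekly pipeline report"))
```

Defaulting to the budget model and escalating only on an explicit match keeps the expensive path opt-in, which is the behavior the routing rules aim for.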
Docker restart policy was changed from unless-stopped to on-failure:5 with an additional rule in CLAUDE.md — "if a task fails 2 consecutive times, stop immediately and notify Admin." This prevents runaway bills from infinite restart loops.
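The "fail twice, then stop and notify" rule lives in CLAUDE.md as an instruction to the agent, but the same logic can be sketched in code. The example below is an assumption-laden illustration: `run_with_fail_stop`, `TaskHalted`, and the `notify_admin` callback are hypothetical names, not part of OpenClaw.

```python
from typing import Callable, Optional

MAX_CONSECUTIVE_FAILURES = 2  # from the rule: two failures in a row, then stop


class TaskHalted(Exception):
    """Raised when a task is stopped by the anti-loop rule."""


def run_with_fail_stop(
    task: Callable[[], str],
    notify_admin: Callable[[str], None],
    max_failures: int = MAX_CONSECUTIVE_FAILURES,
) -> str:
    """Run task; after max_failures consecutive errors, notify and stop."""
    last_error: Optional[Exception] = None
    for _ in range(max_failures):
        try:
            return task()  # success ends the loop immediately
        except Exception as exc:  # broad on purpose: any failure counts
            last_error = exc
    # No further retries: every extra attempt would call the paid API again.
    notify_admin(f"Task stopped after {max_failures} consecutive failures: {last_error}")
    raise TaskHalted(str(last_error))
```

Raising after the notification, rather than returning a default value, makes it hard for a caller to mistake a halted task for a completed one.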
New rules were established — memory files over 100 lines must be summarized, files older than 30 days must be reviewed, the index must stay under 50 lines, and duplicate memory creation is prohibited. This prevents context bloat and reduces tokens sent with every request.
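The size and age rules above are easy to check mechanically. The sketch below assumes the memory files are Markdown files in a single directory, that the index is called `index.md`, and that file age can be read from the modification time. None of these details are stated in the article. The duplicate-memory rule is not covered here.

```python
import time
from pathlib import Path
from typing import List, Optional

MAX_FILE_LINES = 100     # files longer than this must be summarized
MAX_AGE_DAYS = 30        # files older than this must be reviewed
MAX_INDEX_LINES = 50     # the index must stay under this many lines
INDEX_NAME = "index.md"  # hypothetical index filename


def count_lines(path: Path) -> int:
    with path.open(encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def audit_memory(memory_dir: Path, now: Optional[float] = None) -> List[str]:
    """Return one finding per rule violation found in memory_dir."""
    now = time.time() if now is None else now
    findings = []
    for path in sorted(memory_dir.glob("*.md")):
        lines = count_lines(path)
        if path.name == INDEX_NAME:
            if lines >= MAX_INDEX_LINES:
                findings.append(f"{path.name}: index has {lines} lines, trim below {MAX_INDEX_LINES}")
            continue
        if lines > MAX_FILE_LINES:
            findings.append(f"{path.name}: {lines} lines, summarize")
        age_days = (now - path.stat().st_mtime) / 86400
        if age_days > MAX_AGE_DAYS:
            findings.append(f"{path.name}: {age_days:.0f} days old, review")
    return findings
```

A report like this could run on a schedule so the rules do not depend on someone remembering to check.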
A mandatory 5-point checklist was added before every delivery — card_validator.py passes → JSON/YAML syntax valid → numbers have commas → text is not duplicated → links are working. Errors are caught before reaching the team.
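Several of the checklist items can be automated with the standard library. The sketch below is not card_validator.py, whose contents the article does not show. It assumes a hypothetical card shape (a JSON object with a `text` string and a `links` list) and only checks that links are well formed, since testing whether they actually resolve needs network access.

```python
import json
import re
from typing import List
from urllib.parse import urlparse

# A run of 4+ digits with no separator, e.g. "12500" instead of "12,500".
# Note: this also flags bare years such as "2026"; tune it for real cards.
UNFORMATTED_NUMBER = re.compile(r"(?<![\d,.])\d{4,}(?![\d,])")


def check_card(raw: str) -> List[str]:
    """Return a list of failed checks; an empty list means the card may ship."""
    try:
        card = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(card, dict):
        return ["card must be a JSON object"]

    failures = []
    lines = [ln.strip() for ln in str(card.get("text", "")).splitlines() if ln.strip()]
    for ln in lines:
        if UNFORMATTED_NUMBER.search(ln):
            failures.append(f"number without thousands separator: {ln!r}")
    if len(lines) != len(set(lines)):
        failures.append("duplicated text line")
    for link in card.get("links", []):
        parsed = urlparse(str(link))
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            failures.append(f"malformed link: {link!r}")
    return failures
```

Returning the failures as a list, instead of stopping at the first one, lets a Bot report every problem in a single pass before it claims the task is done.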
Optimizing an AI Agent system is like maintaining a luxury resort: every component must work in harmony, from infrastructure to cost management.
The key to reducing costs is sending tasks to the right model, not always the most expensive one. The Router analyzes skill type and keywords in the message, then decides automatically.
| Model | Price / M tokens | Used For | Skill Count | Share |
|---|---|---|---|---|
| Claude Sonnet 4 | $3.00 (Premium) | Pipeline analysis, Delivery, Exec Brief | 6 | 24% |
| Gemini Flash | $0.10 (Budget) | Notifications, summaries, health checks | 15 | 76% |
Not every task requires an expensive model — 76% of tasks in this system are routine work where Gemini Flash at $0.10 performs just as well.
The table below compares every aspect that changed — the "Improvement" column provides a quick overview at a glance.
| Aspect | ❌ Before | ✅ After | Improvement |
|---|---|---|---|
| Model Used | Sonnet 4 for all tasks (25 skills); $3/M for every skill | Sonnet 4 → 6 skills, Flash → 15 skills; $0.10/M for 76% | API cost down 40-60% |
| Restart Policy | unless-stopped on all containers; unlimited restarts | on-failure:5 for API services; stops after 5 | Crash loop prevention |
| Anti-Loop Rules | No fail-stop rule; unlimited retries | Task fails 2x → stop + notify Admin (circuit breaker) | Lower error-state bills |
| Memory | 22 files / 705 lines, never summarized; growing endlessly | Rule: > 100 lines → summarize, > 30 days → review; auto-managed | Token use down 15-25% |
| Session Pruning | None: context accumulated without limit; prompt growing endlessly | Context trimmed after 30 turns, max 20 tool results; compact prompt | Token use down 20-30% |
| Verification | Scripts exist but require manual runs; easy to forget | 5-point checklist mandatory before delivery; auto-verify | Errors down 40-60% |
| Auto-start | None: reboot required manual start; downtime | crontab @reboot → docker compose up; automatic in 30 seconds | Zero-touch reboot |
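The Session Pruning row in the table above (30 turns, 20 tool results) can be pictured with a short sketch. The article does not define how OpenClaw counts a turn or stores messages, so the message shape (`role` and `content` keys) and the choice to count each user or assistant message as one turn are assumptions.

```python
from typing import Dict, List

MAX_TURNS = 30         # user and assistant messages kept
MAX_TOOL_RESULTS = 20  # tool-result messages kept

Message = Dict[str, str]


def prune_session(messages: List[Message]) -> List[Message]:
    """Keep system messages plus the newest turns and tool results, in order."""
    turns_left = MAX_TURNS
    tools_left = MAX_TOOL_RESULTS
    kept_reversed = []
    # Walk from newest to oldest so the budget is spent on recent context.
    for message in reversed(messages):
        role = message.get("role")
        if role == "system":
            kept_reversed.append(message)  # rules and persona are never dropped
        elif role == "tool":
            if tools_left > 0:
                kept_reversed.append(message)
                tools_left -= 1
        elif turns_left > 0:
            kept_reversed.append(message)
            turns_left -= 1
    return list(reversed(kept_reversed))
```

A production version would also need to keep each tool result paired with the call that produced it, which this sketch ignores.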
The numbers below represent combined results from fixing all 4 issues — Multi-Model Routing had the largest impact as it directly reduces costs.
| Container | Status | Restart Policy | Before | Reason |
|---|---|---|---|---|
| 🤖 openclaw | ✅ healthy | on-failure:5 | unless-stopped | Uses API → must limit restarts |
| 🔗 lark-mcp | ✅ healthy | on-failure:5 | unless-stopped | Connects to Lark API → must limit |
| ⚙️ n8n | ✅ healthy | on-failure:5 | unless-stopped | Runs workflows → must limit |
| 🏛️ egp-solver | ✅ healthy | on-failure:3 | unless-stopped | Uses heavy Chromium → stricter limit |
| 📊 dashboard | ✅ Up | unless-stopped | unless-stopped | Lightweight nginx, no API → unchanged |
| 💾 duplicati | ✅ Up | unless-stopped | unless-stopped | Backup service → unchanged |
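The audit behind the table above amounts to asking which API-calling services can restart without limit. A minimal sketch, assuming the restart values have already been read from docker-compose.yml into a plain dictionary (the helper names are hypothetical):

```python
from typing import Dict, List, Optional, Tuple

# Restart policies that never give up on their own.
UNBOUNDED = {"always", "unless-stopped"}


def parse_restart(policy: str) -> Tuple[str, Optional[int]]:
    """Split a Compose restart value such as 'on-failure:5' into (name, max_retries)."""
    name, _, retries = policy.partition(":")
    return name, int(retries) if retries else None


def unbounded_api_services(
    restart_by_service: Dict[str, str], api_services: List[str]
) -> List[str]:
    """Return API-calling services whose restart policy has no retry cap."""
    flagged = []
    for service in api_services:
        # Compose defaults to "no" when a service sets no restart policy.
        name, max_retries = parse_restart(restart_by_service.get(service, "no"))
        if name in UNBOUNDED or (name == "on-failure" and max_retries is None):
            flagged.append(service)
    return flagged


if __name__ == "__main__":
    after = {"openclaw": "on-failure:5", "egp-solver": "on-failure:3", "dashboard": "unless-stopped"}
    print(unbounded_api_services(after, ["openclaw", "egp-solver"]))
```

Passing the list of API-calling services explicitly mirrors the table's reasoning: dashboard and duplicati keep unless-stopped because a restart there costs nothing per cycle.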
Key takeaways from this audit — applicable to any AI system, not just OpenClaw.
76% of tasks in this system are routine — notifications, summaries, health checks. Gemini Flash at $0.10 handles them just as well. Paying $3.00 every time is unnecessary. Choosing the model that "fits" the task matters more than choosing the "best" model.
The Bots worked every day, but hidden costs, hidden risks, and hidden latency were accumulating. Regular audits are like car maintenance — waiting until something breaks means the repair bill will be many times higher.
Anti-Loop + Auto-Verify = preventing problems before they occur. Once an issue happens, API costs have already spiked and the team has already wasted time on rework. By then, it is too late.
Without regular cleanup, memory bloats like a house that is never organized — the longer it goes, the slower, more expensive, and harder it becomes to find the information that actually matters.
Building a good AI Agent Team is not just about "creating many Bots" — it is about managing context, cost, and workflow to work together in harmony, just like a real team that needs rules, verification, and the right people assigned to the right tasks.
| File | Changes Made | Purpose |
|---|---|---|
| config/openclaw.json | Added routing rules (2 rules) + session pruning config | Multi-Model Routing + context trimming |
| docker-compose.yml | Changed restart policy for 4 containers + removed duplicate volume | Anti-Loop + bug fix |
| CLAUDE.md | Added sections 9 (Anti-Loop), 10 (Memory), 11 (Auto-Verify) | Rules preventing all 4 issues |
Three additional improvements are planned to take the system even further.
| Priority | Planned Work | Difficulty | Impact |
|---|---|---|---|
| 1 | Summarizer Agent — n8n workflow to auto-summarize memory every night | Medium | Token use down 15-25% |
| 2 | Auto-Test Pipeline — n8n workflow to test cards before actual delivery | Medium | Zero-defect delivery |
| 3 | Sub-Agent Spawning — Enabling Agents to handle multiple tasks in parallel | Hard | Throughput up 30-50% |
Gemini Flash at $0.10/M tokens is the most affordable option at this quality tier. It responds quickly with low latency, making it ideal for routine tasks like notifications and data summaries that do not require complex reasoning. If quality falls short, the task gets routed to Sonnet 4 instead.
No — on-failure:5 means the container can restart up to 5 times only when it fails. Normal operation is unaffected. It only stops when there are 5 consecutive failures, which indicates a real bug that needs fixing.
Yes — every time a Bot starts a task, the entire memory is loaded into the context window (consuming roughly 2,000-3,000 tokens). Summarizing down to 200 lines retains all critical information while using 3x fewer tokens.
76% of skills (15 out of 25) were switched from Sonnet 4 ($3.00/M) to Flash ($0.10/M) = 30x cheaper for that portion. Averaging across the entire system (including the 24% still on Sonnet 4) yields approximately 40-60% savings depending on request volume per skill.
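The dependence on request volume can be made concrete with a small calculation. The sketch assumes one flat per-token price per model, as quoted in the article, and treats the share of tokens routed to Flash as an input, because the article does not report per-skill token volumes.

```python
SONNET_PRICE = 3.00  # USD per million tokens, as quoted in the article
FLASH_PRICE = 0.10   # USD per million tokens, as quoted in the article


def blended_price(flash_token_share: float) -> float:
    """Average USD per million tokens when a share of tokens goes to Flash."""
    if not 0.0 <= flash_token_share <= 1.0:
        raise ValueError("flash_token_share must be between 0 and 1")
    return (1 - flash_token_share) * SONNET_PRICE + flash_token_share * FLASH_PRICE


def savings(flash_token_share: float) -> float:
    """Fractional saving versus sending every token to Sonnet 4."""
    return 1 - blended_price(flash_token_share) / SONNET_PRICE


if __name__ == "__main__":
    for share in (0.45, 0.60, 0.76):
        print(f"Flash share {share:.0%}: saving {savings(share):.1%}")
```

Under these flat prices the saving is the Flash token share multiplied by about 0.967, so a 40-60% saving corresponds to roughly 41% to 62% of tokens moving to Flash. The result is driven by token volume, not by the number of skills switched.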
Last updated: March 2026
What we got, and the principle behind it
Real results you can put to use, not just ideas. Our principle is to build repeatable systems that do not depend on human memory.