We scaled BikroyBuddy, an AI social-commerce agent for Bangladeshi e-commerce, to thousands of active merchants processing tens of thousands of messages per hour at peak. Here is the architecture we built, the problems we solved, and what the numbers actually look like, with no fabricated metrics.
Small businesses in Bangladesh run e-commerce over Facebook, WhatsApp, and Instagram rather than traditional web storefronts. A typical merchant with 500 followers might receive 100–300 customer messages per day: product availability queries, price negotiation, order placement, delivery status. Handling this manually is a full-time job. Handling it badly (a slow response, a wrong price quoted) loses the sale, because buyers move to the next seller instantly.
BikroyBuddy automates the entire customer interaction layer: it understands customer queries (in Bangla, English, or code-switched Banglish), responds accurately, captures order details, generates invoices, updates inventory, and escalates edge cases to the merchant.
Message ingestion: Webhooks from the Facebook Graph API, WhatsApp Business API, and Instagram Graph API feed into SQS. Message deduplication is handled at the SQS level, so the same customer writing on multiple channels produces one unified conversation thread rather than duplicate orders.
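As an illustration of the deduplication step, here is a minimal sketch that assumes a FIFO queue, where SQS drops messages that repeat a `MessageDeduplicationId` within its deduplication window and keeps order within a `MessageGroupId`. The function name, the payload fields, and the idea of a `customer_id` already resolved across channels are hypothetical and not taken from the BikroyBuddy codebase.

```python
import hashlib
import json


def build_sqs_entry(channel: str, platform_message_id: str,
                    merchant_id: str, customer_id: str, body: dict) -> dict:
    """Build keyword arguments for an SQS FIFO send_message call.

    Assumption: customer_id is a unified identity that an upstream step
    has already resolved across Facebook, WhatsApp, and Instagram.
    """
    # A webhook retry carries the same platform message id, so it hashes
    # to the same deduplication id and SQS discards the repeat.
    dedup_id = hashlib.sha256(
        f"{channel}:{platform_message_id}".encode("utf-8")
    ).hexdigest()

    # One group per merchant/customer pair keeps a conversation ordered,
    # whichever channel the message arrived on.
    group_id = hashlib.sha256(
        f"{merchant_id}:{customer_id}".encode("utf-8")
    ).hexdigest()

    return {
        "MessageBody": json.dumps(body, ensure_ascii=False, sort_keys=True),
        "MessageDeduplicationId": dedup_id,
        "MessageGroupId": group_id,
    }
```

The returned dictionary could be passed to a boto3 SQS client as `client.send_message(QueueUrl=..., **entry)`. Both ids are 64-character hex strings, which fits the 128-character limit SQS places on these fields.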
NLU layer: A fine-tuned LLM handles Banglish intent classification (order intent, price query, complaint, general inquiry). It was fine-tuned on 50K+ real Bangladeshi e-commerce conversations. Claude serves as the fallback for complex conversations and edge cases.
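One way to picture the split between the fine-tuned classifier and the fallback is a confidence gate, sketched below. The threshold, the `classify` callable, and the `Route` type are assumptions made for illustration. The actual rule that decides when a conversation goes to the fallback is not described here.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# The four intent classes named above.
INTENTS = ("order_intent", "price_query", "complaint", "general_inquiry")


@dataclass(frozen=True)
class Route:
    handler: str  # "fine_tuned" or "fallback"
    intent: str   # one of INTENTS, or "unknown"


def route_message(text: str,
                  classify: Callable[[str], Tuple[str, float]],
                  min_confidence: float = 0.8) -> Route:
    """Send a message to the fast path or to the fallback model.

    classify is a hypothetical wrapper around the fine-tuned model that
    returns (intent, confidence); min_confidence is an assumed threshold.
    """
    intent, confidence = classify(text)
    if intent not in INTENTS:
        return Route("fallback", "unknown")
    if confidence < min_confidence:
        return Route("fallback", intent)
    return Route("fine_tuned", intent)
```

Passing the classifier in as a function keeps the routing rule testable with a stub, without loading a model.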
CRM write-back: Confirmed orders write to a PostgreSQL CRM (customer record, order record, inventory decrement) within 2 seconds of message confirmation. Idempotent writes prevent double-orders from duplicate messages.
Peak load handling: ECS Fargate auto-scales based on SQS queue depth. During the Eid peak (10× normal load), the service scales from 2 to 20 tasks automatically in under 3 minutes. Message processing SLA: under 5 seconds end-to-end at all load levels.
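The scaling rule can be thought of as a backlog-per-task calculation, shown in the sketch below. The 2 and 20 task bounds come from the figures above; the target of 100 queued messages per task is a made-up number for illustration, and the real policy configuration may differ.

```python
import math


def desired_task_count(queue_depth: int,
                       target_backlog_per_task: int = 100,
                       min_tasks: int = 2,
                       max_tasks: int = 20) -> int:
    """Map SQS queue depth to a task count, clamped to the allowed range.

    target_backlog_per_task is an assumed tuning value: how many queued
    messages one task is expected to absorb.
    """
    if queue_depth < 0:
        raise ValueError("queue_depth must be non-negative")
    if target_backlog_per_task <= 0:
        raise ValueError("target_backlog_per_task must be positive")
    needed = math.ceil(queue_depth / target_backlog_per_task)
    return max(min_tasks, min(max_tasks, needed))
```

With these defaults, an empty queue keeps the floor of 2 tasks, a backlog of 750 messages asks for 8, and anything above 2,000 is capped at 20.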
The hardest technical problem was Banglish. A message like "bhai, eta ki available? price ta bolo" (bro, is this available? tell me the price) mixes Bangla words, English words, and informal abbreviations. Generic LLMs trained on clean corpora fail at this. We built a fine-tuning dataset from real merchant conversations (with consent) covering product queries, price negotiation patterns, order placement phrases, and complaint language. The fine-tuned model achieves 94% intent classification accuracy on held-out test data.
Written by Shihab Shahriar Antor — AI Engineer & Founder of Shahriar Labs. Building LetX, QuantumSketch, and open-source AI agent skills.