Building a message queue around Telegram's rate limits

Published on June 5, 2026

Written by Mihajlo

Every product I ship needs a way for users to reach me. Not a ticketing system. Not a $50/seat dashboard. Just messages, in Telegram, where I already am. That’s what Bubblegram does. Here’s the engineering behind it.

The queue

Every message from the widget goes into a BullMQ queue backed by Redis before anything touches the Telegram API.

import { Queue, Worker } from 'bullmq'
import { redis } from './redis'

export const telegramQueue = new Queue('telegram', {
  connection: redis,
  defaultJobOptions: {
    attempts: 5,
    backoff: { type: 'custom' },
    removeOnComplete: { count: 100 },
    removeOnFail: { count: 500 },
  },
})

One queue. One worker. Concurrency of 1. That last part is deliberate. Telegram limits a bot to 20 messages per minute in any one group, and parallel workers burn through that allowance almost immediately. Sequential processing is a feature here, not a bottleneck.

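The custom backoff is where it gets practical. When Telegram returns a 429, the response body includes a retry_after value (inside its parameters object): exactly how many seconds to wait. We use it directly instead of guessing:

const worker = new Worker(
  'telegram',
  async (job) => {
    await sendToTelegram(job.data)
  },
  {
    connection: redis,
    concurrency: 1,
    settings: {
      backoffStrategy: (attemptsMade, _type, err) => {
        // Telegram's 429 body: { ok: false, error_code: 429, parameters: { retry_after } }.
        // Assumes sendToTelegram throws an axios-style error exposing the body as err.response.data.
        const retryAfter = (err as any)?.response?.data?.parameters?.retry_after
        if (typeof retryAfter === 'number') {
          return retryAfter * 1000
        }
        // No hint from Telegram: exponential, capped at 60s
        return Math.min(1000 * 2 ** attemptsMade, 60_000)
      },
    },
  }
)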

Fixed backoff wastes retries and extends your outage window. Telegram tells you exactly how long to wait. Use it.
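To make the resulting schedule concrete, here is the same decision pulled out into a standalone function. This is only a sketch: backoffDelayMs is a hypothetical name, not part of BullMQ or Bubblegram, and the delays in the comments assume attemptsMade is 1 on the first retry.

```typescript
// Hypothetical helper mirroring the backoff decision described above.
const MAX_FALLBACK_MS = 60_000

function backoffDelayMs(attemptsMade: number, retryAfterSeconds?: number): number {
  // Prefer the wait Telegram asked for, when it sent one.
  if (typeof retryAfterSeconds === 'number' && retryAfterSeconds > 0) {
    return retryAfterSeconds * 1000
  }
  // Otherwise exponential backoff, capped at 60s.
  // attemptsMade = 1, 2, 3, 4 gives 2s, 4s, 8s, 16s.
  return Math.min(1000 * 2 ** attemptsMade, MAX_FALLBACK_MS)
}
```

With attempts set to 5 and no retry_after hint, the four retries would be spread over roughly 30 seconds in total.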

Topic routing

Every widget session maps to a Telegram forum topic. One session, one thread, permanently. The mapping lives in Postgres:

-- sessions table (simplified)
create table sessions (
  id          text primary key,
  project_id  text not null,
  topic_id    bigint,  -- Telegram message_thread_id, null until the first message
  email       text,
  created_at  timestamptz default now()
);

When a user sends their first message, we create a forum topic in Telegram named after their email (or Anon_abc123 for anonymous users), store the topic_id, and every subsequent message from that session targets the same thread. The operator sees a tidy per-user inbox inside their Telegram group.

async function getOrCreateTopic(session: Session, projectId: string): Promise<number> {
  if (session.topicId) return session.topicId

  const chatId = await getProjectChatId(projectId)
  const topic = await telegram.createForumTopic(chatId, {
    name: session.email ?? `Anon_${session.id.slice(0, 6)}`,
  })

  await db
    .update(sessions)
    .set({ topicId: topic.message_thread_id })
    .where(eq(sessions.id, session.id))

  return topic.message_thread_id
}

This works. For most apps at reasonable scale, this is perfectly fine and you can stop here.

But if two messages arrive simultaneously for a brand new session (slow connection, double submit), both will find topicId as null and both will attempt to create a topic. That’s a check-then-write race, and the last write wins, so the conversation can end up split across two topics. If you want to be bulletproof, make the write atomic at the database layer. Create the topic, then use UPDATE ... WHERE topic_id IS NULL so only the first request wins:

import { and, eq, isNull } from 'drizzle-orm'

async function claimSessionTopic(sessionId: string, topicId: number): Promise<boolean> {
  // The session row already exists at this point (getOrCreateTopic receives a loaded Session),
  // so there is nothing to insert. Column names match the sessions table above.

  // Atomic claim: only succeeds if no topic was set yet
  const [updated] = await db
    .update(sessions)
    .set({ topicId })
    .where(and(eq(sessions.id, sessionId), isNull(sessions.topicId)))
    .returning({ id: sessions.id })

  return !!updated
}

If claimSessionTopic returns false, another request already won the race. You read the existing topicId and continue. The worst outcome is an orphaned Telegram topic, not a split conversation.
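The calling side of that claim might look something like the sketch below. The TopicDeps interface and resolveTopic are hypothetical stand-ins for the real Telegram client and database layer, injected here so the flow can be exercised without either.

```typescript
// Hypothetical dependencies; the real ones would wrap Telegram and Postgres.
interface TopicDeps {
  createTopic(name: string): Promise<number> // resolves to the new message_thread_id
  claimSessionTopic(sessionId: string, topicId: number): Promise<boolean>
  readTopicId(sessionId: string): Promise<number | null>
}

async function resolveTopic(deps: TopicDeps, sessionId: string, name: string): Promise<number> {
  const existing = await deps.readTopicId(sessionId)
  if (existing !== null) return existing

  // Create first, then try to claim. Only one concurrent caller can win the claim.
  const created = await deps.createTopic(name)
  if (await deps.claimSessionTopic(sessionId, created)) return created

  // Lost the race: the topic we just created is orphaned, so use the winner's.
  const winner = await deps.readTopicId(sessionId)
  if (winner === null) {
    throw new Error(`session ${sessionId} has no topic after a lost claim`)
  }
  return winner
}
```

Both racing callers end up with the same thread ID, which is the property that matters. Cleaning up the orphaned topic is left out of this sketch.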

Topics also occasionally vanish. Operators delete them, Telegram glitches. The worker catches the “thread not found” error, recreates the topic, updates the session, and retries the job automatically. No manual intervention needed.
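A minimal sketch of that recovery path follows. It assumes the failure surfaces as an Error whose message contains “thread not found”, and it resends inline once rather than going back through the queue, which is a simplification. RecoveryDeps and sendWithTopicRecovery are hypothetical names.

```typescript
// Hypothetical dependencies standing in for the Telegram client and session store.
interface RecoveryDeps {
  send(threadId: number, text: string): Promise<void>
  // Creates a fresh topic, stores its id on the session, and resolves to that id.
  recreateTopic(sessionId: string): Promise<number>
}

// Assumption: a deleted topic shows up as an error message containing this phrase.
function isThreadNotFound(err: unknown): boolean {
  return err instanceof Error && /thread not found/i.test(err.message)
}

async function sendWithTopicRecovery(
  deps: RecoveryDeps,
  sessionId: string,
  threadId: number,
  text: string
): Promise<number> {
  try {
    await deps.send(threadId, text)
    return threadId
  } catch (err) {
    // Anything else (429s, network errors) is left to the queue's normal retry logic.
    if (!isThreadNotFound(err)) throw err
    const fresh = await deps.recreateTopic(sessionId)
    await deps.send(fresh, text)
    return fresh
  }
}
```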

The return path

When an operator replies in Telegram, we receive a webhook POST. Telegram doesn’t sign the payload. Instead, it echoes the secret token you set during webhook registration in the X-Telegram-Bot-Api-Secret-Token header of every request. We validate this on every request, no exceptions. If the header is missing or wrong, the request is dropped silently. Without this, anyone who discovers your webhook URL can inject fake replies into user conversations.
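The check itself can be very small. The sketch below assumes header names arrive lowercased (as Node’s http module delivers them) and that the expected token comes from your own config. isTelegramWebhook and safeEqual are hypothetical helpers, not part of any Telegram SDK.

```typescript
// Telegram sends the secret_token given to setWebhook back in this header.
const SECRET_HEADER = 'x-telegram-bot-api-secret-token'

// Compares every character instead of bailing at the first mismatch,
// so the comparison time does not reveal how much of the token matched.
function safeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length
  const len = Math.max(a.length, b.length)
  for (let i = 0; i < len; i++) {
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0)
  }
  return diff === 0
}

function isTelegramWebhook(
  headers: Record<string, string | undefined>,
  expectedToken: string
): boolean {
  const received = headers[SECRET_HEADER]
  if (!received || !expectedToken) return false
  return safeEqual(received, expectedToken)
}
```

On Node you could swap safeEqual for crypto.timingSafeEqual over two equal-length buffers. The hand-rolled version just keeps the example dependency-free.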

The payload includes the thread ID. We look up which session owns that topic, find their active WebSocket connection, and push the reply through. If they’re offline, we buffer it and flush when they reconnect.

Two in-memory maps handle this: one for active WebSocket connections, one for pending replies. Nothing fancy.
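Roughly, the two maps could work together like this. The Socket interface is a deliberately tiny stand-in for a real WebSocket, and the function names are hypothetical. The thread-to-session lookup is assumed to have happened already.

```typescript
// Minimal stand-in for a WebSocket connection.
interface Socket {
  send(data: string): void
}

const connections = new Map<string, Socket>() // sessionId -> live socket
const pending = new Map<string, string[]>() // sessionId -> replies waiting for a reconnect

function deliverReply(sessionId: string, reply: string): void {
  const socket = connections.get(sessionId)
  if (socket) {
    socket.send(reply)
    return
  }
  // User is offline: buffer the reply in arrival order.
  const queue = pending.get(sessionId) ?? []
  queue.push(reply)
  pending.set(sessionId, queue)
}

function onConnect(sessionId: string, socket: Socket): void {
  connections.set(sessionId, socket)
  // Flush anything that arrived while the user was away.
  for (const reply of pending.get(sessionId) ?? []) {
    socket.send(reply)
  }
  pending.delete(sessionId)
}

function onDisconnect(sessionId: string): void {
  connections.delete(sessionId)
}
```

Being in-memory, buffered replies do not survive a process restart. That trade-off is implied by the design above, not something this sketch tries to solve.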

Rate limiting

We use a token bucket. No library, around 40 lines. A bucket holds a token count and a last-refill timestamp. Each request consumes one token. Tokens refill at a steady rate. Empty bucket returns 429 with a Retry-After header.

interface Bucket {
  tokens: number
  lastRefill: number
}

function consume(bucket: Bucket, capacity: number, refillRate: number): boolean {
  const now = Date.now()
  const elapsed = (now - bucket.lastRefill) / 1000
  bucket.tokens = Math.min(capacity, bucket.tokens + elapsed * refillRate)
  bucket.lastRefill = now

  if (bucket.tokens < 1) return false
  bucket.tokens -= 1
  return true
}

Four policies, each tuned for a different constraint:

Per-session: 8 burst, 1 token per 8s. Stops a single user from spamming.
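Per-project: 20 burst, 1 token per 4s. The critical one. The sustained rate of 15 messages per minute keeps a project under Telegram’s 20/min group limit across all its concurrent users. A full burst on top of that can briefly overshoot, and the queue’s retry_after handling absorbs the difference.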
Magic link auth: 3 burst, 1 per 5 minutes. You have no reason to authenticate more than that.
Dashboard API: 30 burst, 1 per 2s. Generous but bounded.
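Written as data, the four policies above might look like the table below, together with one way to compute the Retry-After value for an empty bucket. The field names and helper functions are assumptions for illustration. Storing seconds per token rather than tokens per second keeps the arithmetic in whole numbers.

```typescript
interface Policy {
  capacity: number // burst size
  secondsPerToken: number // one token refills every N seconds
}

type PolicyName = 'session' | 'project' | 'magicLink' | 'dashboard'

const policies: Record<PolicyName, Policy> = {
  session: { capacity: 8, secondsPerToken: 8 },
  project: { capacity: 20, secondsPerToken: 4 },
  magicLink: { capacity: 3, secondsPerToken: 300 },
  dashboard: { capacity: 30, secondsPerToken: 2 },
}

// Tokens per second, the refillRate argument a consume() function would take.
function refillRate(policy: Policy): number {
  return 1 / policy.secondsPerToken
}

// Long-run throughput once the initial burst is spent.
function sustainedPerMinute(policy: Policy): number {
  return 60 / policy.secondsPerToken
}

// Whole seconds until the bucket holds at least one token again.
function retryAfterSeconds(tokens: number, policy: Policy): number {
  const missing = Math.max(0, 1 - tokens)
  return Math.ceil(missing * policy.secondsPerToken)
}
```

For the per-project policy, an empty bucket yields Retry-After: 4, and the long-run rate works out to 15 messages per minute.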

Every bucket key is scoped by identity, not IP. CDNs and corporate proxies share IPs across thousands of people. Keying by IP punishes the wrong users. Widget traffic uses apiKey + sessionId, dashboard uses userId.

Stale buckets clean up every 5 minutes. Anything inactive for 10 minutes gets dropped.
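The keying and cleanup described above could be sketched like this. The key formats, getBucket, and sweepStale are hypothetical. The one real assumption is that lastRefill doubles as a last-activity timestamp, since every request touches it.

```typescript
interface Bucket {
  tokens: number
  lastRefill: number // ms timestamp, updated on every request
}

const SWEEP_INTERVAL_MS = 5 * 60_000
const MAX_IDLE_MS = 10 * 60_000

const buckets = new Map<string, Bucket>()

// Keys are built from identity, never from the caller's IP.
const widgetKey = (apiKey: string, sessionId: string): string => `widget:${apiKey}:${sessionId}`
const dashboardKey = (userId: string): string => `dashboard:${userId}`

// New buckets start full so a first-time caller gets the whole burst.
function getBucket(key: string, capacity: number, now: number): Bucket {
  let bucket = buckets.get(key)
  if (!bucket) {
    bucket = { tokens: capacity, lastRefill: now }
    buckets.set(key, bucket)
  }
  return bucket
}

// Drops every bucket idle for longer than MAX_IDLE_MS and reports how many went.
function sweepStale(now: number): number {
  let removed = 0
  buckets.forEach((bucket, key) => {
    if (now - bucket.lastRefill > MAX_IDLE_MS) {
      buckets.delete(key)
      removed += 1
    }
  })
  return removed
}

// Call once at startup; not invoked here so the example has no side effects.
function startSweeper(): ReturnType<typeof setInterval> {
  return setInterval(() => sweepStale(Date.now()), SWEEP_INTERVAL_MS)
}
```

Dropping an idle bucket is harmless: after 10 minutes of inactivity any of the four policies would have refilled most or all of its burst, and a recreated bucket starts full.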

Takeaways

If you’re building on top of a rate-limited third-party API, a few things worth stealing from this setup:

Match your throughput to the constraint. Sequential queue processing sounds like a bottleneck until you realize Telegram’s rate limit is the actual bottleneck. Concurrency of 1 isn’t a limitation here. It’s the design.
Use the API’s own backoff values. When Telegram returns retry_after: 12, wait 12 seconds. Don’t guess with exponential backoff when the service tells you exactly how long.
Write your own rate limiter if you need multiple policies. Most libraries assume one policy, one key, one middleware. We needed four different policies on different endpoints with different key strategies. 40 lines, no dependencies, full control.
Scope rate limits by identity, not IP. If your traffic goes through a CDN or corporate proxy, IP-based limits punish the wrong users.
Lazy init is fine until it writes. If two requests race through a check-then-write path, you get duplicates. An atomic UPDATE ... WHERE field IS NULL is a one-line fix that makes it safe.
