Rate limiting in Bun.js: in-memory, Redis, sliding window, and production API edge cases

A practical deep dive into rate limiting middleware in Bun.js: fixed window, sliding window, token bucket, Redis, distributed limits, 429 Retry-After, abuse protection, Hono/Elysia integrations, best practices, and bad practices.

14 May 2026 · 15 min read · Technology

Best for: backend engineers, full-stack developers, tech leads, and teams building public APIs, SaaS, or webhook endpoints on Bun

Series: Bun.js Middleware Production Guide 2026

A series about production middleware in Bun.js: overview, security, performance, observability, rate limiting, body parsing, WebSocket/SSE, and request pipeline testing.

A bad rate limiter does not stop the attack. It stops your customers

A naive limit like “100 requests per IP per minute” looks fine until the first office behind NAT, the first mobile carrier, the first webhook retry storm, or the first enterprise tenant with hundreds of users behind one egress IP.

Rate limiting has to answer more than “how many requests?”. It must answer: “who exactly is being limited?”, “which route?”, “which tenant?”, “which credential?”, “what happens when Redis is down?”, “is there a Retry-After?”, and “are we breaking a legitimate burst?”.

Bun makes the HTTP layer fast, but rate limiting always depends on key design, storage atomicity, and failure policy. That is what we break down here.

An in-memory limiter is suitable only for a single process or a local baseline.

Redis is needed when the API has multiple instances or horizontal scaling.

Sliding window and token bucket solve different problems.

429 without Retry-After makes life harder for good clients.

What we cover

This is the third chapter in the Bun middleware series. After the overview and auth, the next logical step is abuse protection: rate limiting is often what stands between your API and an expensive wave of unnecessary requests.

The mental model of rate limiting middleware: identify key, choose policy, check counter, allow or reject.

Fixed window, sliding window, and token bucket: where each algorithm works well and where it creates edge cases. [5][6]

A native Bun in-memory limiter example for a single process.

Redis-backed limiter for distributed APIs and why atomicity matters. [5][6]

Hono/Elysia options: when to use ready-made middleware/plugin. [1][2]

HTTP 429 Too Many Requests and Retry-After: what the client should receive. [7]

Bad practices: IP-only limits, global limits for all routes, fail-open without alerts, plaintext API key in limiter key.

The mental model of rate limiting middleware

A rate limiter should be a small state machine, not a random check inside the handler. Once you split it into steps, most mistakes become visible before the code exists.

Define the identity key

This can be user id, API key id, tenant id, route group, IP, or a combination. For an authenticated API key, apiKeyId + routeGroup is usually better than plain IP.

Choose the policy

Different routes need different limits: login, search, export, webhook, public read, admin mutation. One global limit is almost always either too weak or too aggressive.

Atomically update the counter

In one process, that can be a Map. In a distributed API, that is Redis or another shared store. With Redis, increment + expiry or sliding-window update must be atomic.

Return a useful `429`

The client should receive a stable JSON error, Retry-After, and preferably rate-limit headers. Otherwise, good clients do not know when to retry.

Log without secrets

Log limiter key hash/prefix, route group, tenant, decision, remaining, and reset time. Do not log the full API key or Authorization header.

Summary

If you do not have a clear answer for each of these steps, the limiter is not production-ready yet.
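The first three steps can be sketched for a single process as follows. This is a minimal illustration, not a production limiter: the names `RouteClass`, `POLICIES`, and `decide` are hypothetical, and the limits are placeholder numbers chosen only to show that each route class carries its own policy.

```typescript
// Hypothetical names and illustrative limits; real values depend on route cost and traffic.
type RouteClass = "login" | "search" | "export" | "publicRead";
type Policy = { limit: number; windowMs: number };
type Decision = { allowed: boolean; remaining: number; resetAt: number };

const POLICIES: Record<RouteClass, Policy> = {
  login: { limit: 5, windowMs: 60_000 },
  search: { limit: 60, windowMs: 60_000 },
  export: { limit: 3, windowMs: 600_000 },
  publicRead: { limit: 300, windowMs: 60_000 },
};

const counters = new Map<string, { count: number; resetAt: number }>();

// Step 1: build the key. Step 2: pick the policy. Step 3: update the counter.
// The Map makes this safe only inside one process.
function decide(identity: string, route: RouteClass, now: number = Date.now()): Decision {
  const policy = POLICIES[route];
  const key = `${route}:${identity}`;
  const entry = counters.get(key);

  if (!entry || entry.resetAt <= now) {
    const resetAt = now + policy.windowMs;
    counters.set(key, { count: 1, resetAt });
    return { allowed: true, remaining: policy.limit - 1, resetAt };
  }
  if (entry.count >= policy.limit) {
    return { allowed: false, remaining: 0, resetAt: entry.resetAt };
  }
  entry.count += 1;
  return { allowed: true, remaining: policy.limit - entry.count, resetAt: entry.resetAt };
}
```

Because the route class is part of the key, exhausting the login budget does not consume the search budget for the same identity.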

Fixed window, sliding window, token bucket: what to choose

The algorithm defines not only accuracy, but also UX. Two clients may have the same number of requests per minute, but one creates a burst at the window boundary while the other is evenly distributed.

Algorithm	How it works	When it fits	Weak point
Fixed window	One counter per fixed window, for example 100 requests per minute	Simple internal endpoints, low-risk APIs, cheap baseline	Boundary burst: a client can send up to twice the limit in a short span that straddles a window boundary
Sliding window log	Stores request timestamps and counts only those inside the moving window	Critical APIs, login, checkout, expensive operations	More storage and cleanup work per request
Sliding window counter	Approximates a moving window by weighting the counts of the current and previous windows	A balance of accuracy and cost for high-traffic APIs	Less accurate than the log variant and needs careful reset math
Token bucket	The client has a bucket of tokens that refills over time	APIs where a short burst is fine but average rate must be controlled	Capacity and refill rate must be chosen correctly

Fixed window is simple, sliding window is more accurate, and token bucket handles legitimate bursts better.

Summary

Start with fixed window for a simple baseline, but public expensive routes usually benefit from sliding window or token bucket.
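To make the sliding window counter concrete, here is a small in-memory sketch. It assumes one common formulation of the approximation: the previous window's count is weighted by how much of it still overlaps the moving window, then added to the current count. The names `WindowState` and `slidingWindowAllow` are hypothetical, and the state lives in a local Map, so this is single-process only.

```typescript
type WindowState = { windowStart: number; current: number; previous: number };
const states = new Map<string, WindowState>();

function slidingWindowAllow(
  key: string,
  limit: number,
  windowMs: number,
  now: number = Date.now(),
): boolean {
  const windowStart = Math.floor(now / windowMs) * windowMs;
  let state = states.get(key);

  if (!state) {
    state = { windowStart, current: 0, previous: 0 };
    states.set(key, state);
  } else if (state.windowStart !== windowStart) {
    // The old count only matters if it belongs to the immediately preceding window.
    const wasAdjacent = windowStart - state.windowStart === windowMs;
    state.previous = wasAdjacent ? state.current : 0;
    state.current = 0;
    state.windowStart = windowStart;
  }

  // Fraction of the current window that has already elapsed (0 to 1).
  const elapsed = (now - windowStart) / windowMs;
  const estimated = state.previous * (1 - elapsed) + state.current;

  // Reject if counting this request would push the estimate over the limit.
  if (estimated + 1 > limit) return false;
  state.current += 1;
  return true;
}
```

With a limit of 10 per minute, a client that spends all 10 requests at second 59 is still rejected at second 61 here, because the previous window carries a weight of 59/60. A plain fixed window would have handed out a fresh budget of 10 at that point.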

In-memory limiter in native Bun: useful baseline, not distributed protection

For a single Bun process, you can write a simple in-memory fixed-window limiter. It is useful for local dev, internal tools, single-instance deployments, or as a fallback, but it does not synchronize across instances.

A minimal example:

type LimitEntry = { count: number; resetAt: number };
const limits = new Map<string, LimitEntry>();

const WINDOW_MS = 60_000;
const MAX_REQUESTS = 120;

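function rateLimitKey(req: Request) {
  // Assumes upstream auth middleware has validated the credential and set x-api-key-id.
  const apiKeyId = req.headers.get("x-api-key-id");
  // x-forwarded-for is client-controlled unless a trusted proxy overwrites it.
  const ip = req.headers.get("x-forwarded-for")?.split(",")[0]?.trim() ?? "unknown";
  return apiKeyId ? `api:${apiKeyId}` : `ip:${ip}`;
}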

function checkLimit(key: string, now = Date.now()) {
  const current = limits.get(key);

  if (!current || current.resetAt <= now) {
    limits.set(key, { count: 1, resetAt: now + WINDOW_MS });
    return { allowed: true, remaining: MAX_REQUESTS - 1, resetAt: now + WINDOW_MS };
  }

  if (current.count >= MAX_REQUESTS) {
    return { allowed: false, remaining: 0, resetAt: current.resetAt };
  }

  current.count += 1;
  return { allowed: true, remaining: MAX_REQUESTS - current.count, resetAt: current.resetAt };
}

Bun.serve({
  async fetch(req) {
    const decision = checkLimit(rateLimitKey(req));

    if (!decision.allowed) {
      // Clamp to at least 1 second so the client never receives Retry-After: 0.
      const retryAfter = Math.max(1, Math.ceil((decision.resetAt - Date.now()) / 1000));
      return Response.json(
        { error: "rate_limited", retryAfter },
        { status: 429, headers: { "Retry-After": String(retryAfter) } },
      );
    }

    return Response.json({ ok: true, remaining: decision.remaining });
  },
});

This is a fixed-window baseline. It does not actively clean old keys, does not work across multiple processes, does not have tenant-specific policies, and does not protect against boundary bursts. But it shows the right shape: key, counter, decision, 429, Retry-After.

Where it fits

An in-memory limiter is fine for one process or as a cheap local guard. A production API with autoscaling needs a shared store.
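The missing cleanup of old keys can be covered with a periodic sweep. The sketch below assumes the same `{ count, resetAt }` entry shape as the example above; `sweepExpired` is a hypothetical helper name.

```typescript
type LimitEntry = { count: number; resetAt: number };

// Deletes entries whose window has already ended and returns how many were removed.
function sweepExpired(store: Map<string, LimitEntry>, now: number = Date.now()): number {
  let removed = 0;
  // Deleting the entry currently being visited is safe during Map iteration.
  store.forEach((entry, key) => {
    if (entry.resetAt <= now) {
      store.delete(key);
      removed += 1;
    }
  });
  return removed;
}
```

One way to use it is a timer such as `setInterval(() => sweepExpired(limits), 60_000)` in the process that owns the Map. The sweep is linear in the number of keys, so the interval should be tuned to how many distinct keys the process sees.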

Redis-backed rate limiting: when the API has more than one instance

Once a Bun API runs on multiple instances, in-memory counters stop being a global limit. One client can distribute requests across instances and get a multiplier on its allowance. Redis solves this as a shared counter store.

The critical detail: the counter update must be atomic. For fixed window, this is often INCR + EXPIRE, but the expiry must be set in the same atomic step as the first increment. If the process dies between the two commands, the key never expires and the client stays blocked until someone deletes it. For sliding window log, sorted sets and cleanup of old timestamps are common. For token bucket, a Lua script or another atomic mechanism is often needed so refill and consume happen in one operation. Redis rate limiting patterns usually rely on atomic counters or Lua. [5][6]
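For the fixed-window case, one way to get that atomicity is a short Lua script, since Redis runs a script as a single unit. The sketch below is deliberately client-agnostic: `EvalFn` is a hypothetical adapter type that you would implement by wrapping your Redis client's EVAL call, and `redisFixedWindow` is an illustrative name, not part of any library.

```typescript
// INCR and PEXPIRE run inside one script, so no other command can interleave between them.
const FIXED_WINDOW_LUA = `
local count = redis.call('INCR', KEYS[1])
if count == 1 then
  redis.call('PEXPIRE', KEYS[1], ARGV[1])
end
return { count, redis.call('PTTL', KEYS[1]) }
`;

// Hypothetical adapter: wrap your Redis client's EVAL command in this shape.
type EvalFn = (script: string, keys: string[], args: string[]) => Promise<unknown>;

type RedisDecision = { allowed: boolean; remaining: number; retryAfterSec: number };

async function redisFixedWindow(
  evalScript: EvalFn,
  key: string,
  limit: number,
  windowMs: number,
): Promise<RedisDecision> {
  const reply = await evalScript(FIXED_WINDOW_LUA, [key], [String(windowMs)]);
  if (!Array.isArray(reply) || reply.length < 2) {
    throw new Error("unexpected script reply");
  }
  const count = Number(reply[0]);
  const ttlMs = Number(reply[1]);
  // PTTL returns a negative value when the key has no expiry; fall back to the full window.
  const waitMs = ttlMs > 0 ? ttlMs : windowMs;

  return {
    allowed: count <= limit,
    remaining: Math.max(0, limit - count),
    retryAfterSec: Math.max(1, Math.ceil(waitMs / 1000)),
  };
}
```

Note that rejected requests still increment the counter in this version. That is harmless for a fixed window because the key expires with the window, but it matters if you later reuse the count for analytics.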

Redis also adds a failure mode: what happens when it is unavailable. For public expensive routes, fail closed or degraded limiting is often safer. For a critical internal control plane, the policy may differ. But any fail-open mode needs alerts, because otherwise the rate limiter disappears exactly when it is needed most.
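A failure policy like this can be made explicit in code rather than left to whatever the catch block happens to do. The following is a sketch under assumptions: `FailPolicy`, `checkWithFailPolicy`, and the callback shapes are hypothetical, and the `local` mode stands for a stricter per-process fallback limiter.

```typescript
type FailPolicy = "open" | "closed" | "local";

async function checkWithFailPolicy(
  sharedCheck: (key: string) => Promise<boolean>, // resolves true when the request is allowed
  localCheck: (key: string) => boolean,           // degraded per-process fallback
  policy: FailPolicy,
  key: string,
  onStoreError: (err: unknown) => void,           // hook for alerting and metrics
): Promise<boolean> {
  try {
    return await sharedCheck(key);
  } catch (err) {
    // Always surface the failure, whichever policy applies.
    onStoreError(err);
    if (policy === "open") return true;
    if (policy === "closed") return false;
    return localCheck(key);
  }
}
```

The policy is passed per call so that each route class can choose its own behavior, and the error hook fires in every mode so fail-open never becomes silent.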

For multiple Bun instances, the limit must live in a shared store. Otherwise, each instance gives the client a separate allowance.

Practical baseline

A distributed limiter needs a shared store, atomic update, key design, TTL cleanup, latency budget, and fail policy.

Hono and Elysia: when to use a ready-made limiter

Ready-made middleware or a plugin is useful when the task is typical. But it does not decide key strategy, tenant policy, or distributed storage for you.

The Hono ecosystem has rate limiter middleware with configurable window, limit, key generator, and store options. This is good for a quick baseline in a Hono app. [1]

The Elysia ecosystem has a rate-limit plugin for Bun-first Elysia apps. It fits naturally into the Elysia lifecycle/plugins model. [2]

OWASP API Security Top 10 identifies unrestricted resource consumption as a separate risk. Rate limiting should protect CPU, memory, storage, network, and downstream resources. [3]

Summary

A ready-made limiter reduces boilerplate. Production quality depends on keys, storage, policies, observability, and edge cases.

Key strategy: IP-only is almost never enough

The rate limit key is the main decision. If the key is wrong, the algorithm will not save you. IP-only limits often punish normal users behind NAT and fail to catch authenticated abuse.

For public anonymous routes, IP can be a starting key. For authenticated APIs, it is better to limit by subject id, API key id, tenant id, or a combination like tenantId + routeGroup. For login flows, you may need IP limit, account/email limit, and device fingerprint policy at the same time.

For multi-tenant SaaS, a user-only limit can be too soft because one tenant with many users can overload a resource. A tenant-only limit can be too strict because one noisy user blocks the whole company. Often you need a hierarchy: per-user, per-tenant, per-route, and global emergency limit.

Public read: IP + route group.

API key: key id + route group + tenant.

SaaS tenant: tenant budget + per-user budget.

Expensive exports/search: a separate narrow route limit.
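One way to express this hierarchy is a function that returns every key a request should be counted against. The shape below is an assumption for illustration: `RequestIdentity` and `limiterKeys` are hypothetical names, and the key prefixes are arbitrary.

```typescript
type RequestIdentity = {
  routeGroup: string;
  ip: string;
  tenantId?: string;
  userId?: string;
  apiKeyId?: string; // an identifier for the key, never the secret itself
};

// Returns one key per budget the request must fit into.
function limiterKeys(id: RequestIdentity): string[] {
  const keys: string[] = [];
  if (id.apiKeyId) keys.push(`key:${id.apiKeyId}:${id.routeGroup}`);
  if (id.userId) keys.push(`user:${id.userId}:${id.routeGroup}`);
  if (id.tenantId) keys.push(`tenant:${id.tenantId}:${id.routeGroup}`);
  // Anonymous traffic falls back to IP; authenticated traffic is not keyed by IP.
  if (keys.length === 0) keys.push(`ip:${id.ip}:${id.routeGroup}`);
  return keys;
}
```

A request is then allowed only if every returned key is under its own budget, which lets a noisy user hit the per-user limit long before the tenant budget is exhausted.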

`429` and `Retry-After`: help good clients behave correctly

HTTP 429 Too Many Requests means the user has sent too many requests in a given time period. MDN notes that the response may include Retry-After, which tells the client how long to wait before retrying. [7]

In a production API, 429 without Retry-After forces good clients to guess. They either retry too quickly or use exponential backoff where they could simply wait until reset. This hurts UX and increases unnecessary load.

Beyond Retry-After, many APIs add rate-limit headers such as RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset, or custom X-RateLimit-*. The important part is a stable contract and documentation for clients.

429 should be useful: the client should know when to retry instead of guessing backoff.

Summary

A rate limiter should be understandable for the client. 429 without a retry contract creates repeated traffic and poor integration.
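A small helper keeps that contract consistent across routes. The sketch below uses only the standard `Response` constructor. `Rejection` and `tooManyRequests` are hypothetical names, and emitting `RateLimit-Reset` as seconds until reset is an assumption; whichever format you choose should match your client documentation.

```typescript
type Rejection = { limit: number; resetAt: number }; // resetAt is epoch milliseconds

function tooManyRequests(rejection: Rejection, now: number = Date.now()): Response {
  // Round up and clamp so the client is never told to retry in 0 seconds.
  const retryAfter = Math.max(1, Math.ceil((rejection.resetAt - now) / 1000));

  return new Response(JSON.stringify({ error: "rate_limited", retryAfter }), {
    status: 429,
    headers: {
      "Content-Type": "application/json",
      "Retry-After": String(retryAfter),
      "RateLimit-Limit": String(rejection.limit),
      "RateLimit-Remaining": "0",
      // Assumption: seconds until reset, not an absolute timestamp.
      "RateLimit-Reset": String(retryAfter),
    },
  });
}
```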

Edge cases to solve before production

Rate limiting has many unpleasant details that are invisible on the happy path. These are the ones that most often appear after launch.

NAT and corporate networks

An IP-only limit can block dozens of normal users behind one egress IP. For authenticated APIs, key by subject/API key/tenant instead.

Webhook retry storms

A partner may honestly retry failed webhooks and hit the limiter. Webhooks need separate policies, idempotency, and a retry-aware contract.

Clock skew

If the limiter is spread across systems, reset time and sliding windows must be calculated consistently. Redis server time or a centralized store is often more reliable than local clocks.

Burst after deploy

After downtime or a deploy, many clients may retry at the same moment. Token bucket or queued backoff may be better than a hard fixed window.

Admin and internal routes

Do not give internal tools unlimited access by default. They often run the heaviest exports and batch operations.

Redis failure

Fail-open without alerts makes protection invisible. Fail-closed without degradation can take the product down. Policy should differ by route class.
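Since token bucket comes up as the answer to legitimate bursts, here is a minimal in-memory sketch of the idea. `Bucket` and `takeToken` are hypothetical names; the bucket refills continuously based on elapsed time and lives in a local Map, so a distributed version would need the refill and consume steps to run atomically in the shared store.

```typescript
type Bucket = { tokens: number; updatedAt: number };
const buckets = new Map<string, Bucket>();

function takeToken(
  key: string,
  capacity: number,
  refillPerSec: number,
  now: number = Date.now(),
): boolean {
  // A new client starts with a full bucket, which is what permits the initial burst.
  const bucket = buckets.get(key) ?? { tokens: capacity, updatedAt: now };

  // Refill in proportion to elapsed time, never above capacity.
  const elapsedSec = Math.max(0, now - bucket.updatedAt) / 1000;
  bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSec * refillPerSec);
  bucket.updatedAt = now;
  buckets.set(key, bucket);

  if (bucket.tokens < 1) return false;
  bucket.tokens -= 1;
  return true;
}
```

Capacity controls how large a burst is tolerated, and the refill rate controls the sustained average. A client that retries everything right after a deploy spends its burst once and is then held to the refill rate.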

Bad rate limiting middleware practices in Bun

These mistakes are not unique to Bun, but in Bun APIs they are often hidden behind a fast runtime and a simple middleware wrapper.

One global limit for all routes.

IP-only limit for authenticated APIs.

In-memory limiter in multi-instance production.

Redis INCR without correct TTL or atomicity.

No Retry-After in the 429 response.

Limits do not account for tenant, API key, or route cost.

Full API key or Authorization header enters logs as the limiter key.

Fail-open during Redis outage without alerting.

Rate limiter runs after body parsing for expensive payload routes.

No tests for boundary burst, reset, Redis failure, and concurrent requests.

Review rule

The rate limiter should run early, have the right key, an atomic counter, a clear 429, and an observable decision.
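The "run early" part of that rule is easy to get wrong in a handler. The sketch below shows the intended order for an expensive payload route: the limit check happens on headers only, and the body is read after the request is admitted. `handleExport` and `AllowFn` are hypothetical names, the `x-api-key-id` header is assumed to be set by upstream auth, and the fixed `Retry-After` of 60 is a placeholder.

```typescript
type AllowFn = (key: string) => boolean;

async function handleExport(req: Request, allow: AllowFn): Promise<Response> {
  const key = `key:${req.headers.get("x-api-key-id") ?? "anonymous"}:export`;

  if (!allow(key)) {
    // Rejected before req.json(), so the body is never read or parsed.
    return new Response(JSON.stringify({ error: "rate_limited" }), {
      status: 429,
      headers: { "Content-Type": "application/json", "Retry-After": "60" },
    });
  }

  // Only admitted requests pay the cost of reading and parsing the payload.
  const payload = await req.json();
  return new Response(JSON.stringify({ ok: true, received: payload }), {
    headers: { "Content-Type": "application/json" },
  });
}
```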

Production checklist for Bun rate limiting

Before launching rate limiting in staging or production, go through this list. It helps find problems before customers do.

Route classes are defined

Public, auth, login, webhook, export, admin, and internal routes have different policies.

Key strategy is not IP-only

For authenticated routes, use subject/API key/tenant/route group, and keep IP as an additional signal.

Distributed store exists for multi-instance

If the Bun API has multiple instances, counters live in Redis or another shared store.

Operations are atomic

Increment, expiry, sliding window cleanup, or token consume happen without race conditions.

`429` has a retry contract

The response contains a stable JSON error and Retry-After; rate-limit headers are documented.

Limiter runs before expensive operations

Rate check happens before body parsing, DB calls, remote calls, and heavy transforms when the route allows it.

Redis failure policy is defined

For each route class, fail-open or fail-closed is known, and alerting exists.

Observability exists

Decision, route group, limiter key hash, remaining, reset time, storage latency, and Redis failures are logged.
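For the limiter key hash, a short digest is usually enough to correlate log lines without exposing the identifier. This sketch uses `createHash` from `node:crypto`, which Bun supports; `logSafeKey` and `limiterLogLine` are hypothetical names, and the 16-character truncation is an arbitrary choice. If the underlying keys are guessable, an HMAC with a server-side secret is the safer variant.

```typescript
import { createHash } from "node:crypto";

// Short, stable digest of the limiter key for correlation in logs.
function logSafeKey(limiterKey: string): string {
  return createHash("sha256").update(limiterKey).digest("hex").slice(0, 16);
}

type LimiterLog = {
  keyHash: string;
  routeGroup: string;
  decision: "allow" | "reject";
  remaining: number;
  resetAt: number;
  storeLatencyMs: number;
};

// Builds one structured log line; the raw limiter key never enters the output.
function limiterLogLine(limiterKey: string, fields: Omit<LimiterLog, "keyHash">): string {
  const entry: LimiterLog = { keyHash: logSafeKey(limiterKey), ...fields };
  return JSON.stringify(entry);
}
```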

Conclusion: rate limiting in Bun starts with key strategy, not Map or Redis

Bun gives you a fast HTTP runtime, but rate limiting is not a runtime feature you can add with one line and forget. It is a security and reliability policy that must know who it limits, for which route, with which storage, which algorithm, and which retry contract.

For one process, an in-memory fixed window can be a reasonable baseline. For production with multiple instances, you need Redis or another shared store. For user-facing APIs, fixed window is often too rough; sliding window or token bucket gives a better UX.

Most importantly: do not punish real users with a bad key. IP-only limits, one global limit, and 429 without Retry-After usually create more problems than they solve.

FAQ

Is an in-memory rate limiter enough in Bun.js?

Only for one process, local dev, internal tools, or a simple baseline. If the API has multiple instances, an in-memory limit is multiplied by the number of instances and is not global protection.

Which is better: fixed window, sliding window, or token bucket?

Fixed window is the simplest, but it has boundary bursts. Sliding window is more accurate for critical routes, but more expensive. Token bucket allows short legitimate bursts while controlling average rate.

Why is IP-only rate limiting bad?

An IP-only limit can block normal users behind NAT, a corporate proxy, or a mobile carrier, while still working poorly for authenticated abuse. For APIs, it is better to limit by subject, API key, tenant, and route group.

What should a Bun API return when the limit is exceeded?

A stable JSON error with HTTP `429`, the `Retry-After` header, and preferably rate-limit headers such as remaining/reset. This helps good clients retry correctly. [7]

Do I need Redis for rate limiting?

Not necessarily for one process. For production with multiple instances or serverless/concurrent deployment, Redis or another shared store is practically required so counters are shared.

What should happen if Redis is unavailable?

Define the fail policy in advance. For expensive public routes, fail closed or a degraded strict local limit is often safer. For some internal/control-plane routes, fail-open may be acceptable, but only with alerting and audit.

Sources

These sources confirm ready-made middleware/plugin options, the security rationale for resource limiting, Redis rate limiting patterns, and HTTP semantics for 429.

Reviewed: 14 May 2026

Applies to: Bun 1.3.x, Bun.serve routes, Hono 4.x, Elysia 1.x, Redis-backed APIs

Tested with: Bun.serve middleware composition, Redis counters, Hono rate limiter middleware, Elysia rate-limit plugin, HTTP 429 Retry-After response
