Designing a URL Shortener on AWS: From Zero to Production
A complete walkthrough of designing a production-ready URL shortener on AWS — covering hashing strategies, database selection, caching, and scaling to billions of redirects.
Building a URL shortener is one of the most popular system design interview questions — and for good reason. It touches almost every fundamental: hashing, database choice, caching, rate limiting, analytics, and horizontal scaling.
In this post, we'll design a URL shortener that can handle 10,000 requests per second and store 10 billion URLs, deployed entirely on AWS.
Requirements
Functional
- Given a long URL, return a short URL (7-character code)
- Given a short URL, redirect to the original long URL
- Support custom aliases (e.g. go.co/launch)
- Track click analytics (count, location, device)
- URLs expire after a configurable TTL
Non-Functional
- Read-heavy: 100:1 read/write ratio
- 99.99% availability for redirects
- <50ms p99 redirect latency
- Globally distributed (optimised for India + US)
Capacity Estimation
Let's work through the numbers:
Write rate: 100 URLs/sec → 8.64M URLs/day
Read rate: 10,000 redirects/sec
Storage per URL: ~500 bytes (URL + metadata)
Storage 5 years: 8.64M × 365 × 5 × 500B ≈ 7.9TB
Cache memory: 80% of traffic hits 20% of URLs (Pareto)
Assuming ~10M distinct URLs are accessed per day, the hot 20% is 10M × 0.2 × 500B ≈ 1GB of hot data
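These figures are easy to sanity-check with a few lines of Python. A quick sketch: the 500-byte item size comes from the estimate above, while the ~10M distinct URLs accessed per day is an assumption used to size the hot set.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_URL = 500          # URL + metadata, per the estimate above

writes_per_sec = 100
urls_per_day = writes_per_sec * SECONDS_PER_DAY         # 8,640,000
storage_bytes = urls_per_day * 365 * 5 * BYTES_PER_URL  # 5-year storage

# Hot set: assume ~10M distinct URLs accessed daily; the Pareto 20% is "hot".
hot_cache_bytes = 10_000_000 * 0.2 * BYTES_PER_URL

print(f"URLs/day:       {urls_per_day:,}")
print(f"5-year storage: {storage_bytes / 1e12:.1f} TB")   # ~7.9 TB
print(f"Hot cache:      {hot_cache_bytes / 1e9:.1f} GB")  # ~1.0 GB
```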
URL Shortening Strategy
The most critical decision is how we generate the 7-character short code.
Option 1: MD5/SHA256 Hash
Hash the long URL, take the first 7 characters:
import hashlib
import base64
def shorten(long_url: str) -> str:
digest = hashlib.md5(long_url.encode()).digest()
encoded = base64.urlsafe_b64encode(digest).decode()
return encoded[:7]
Problem: Hash collisions become likely at scale. With a 7-char Base62 code (62^7 ≈ 3.5 trillion combinations), the birthday bound puts the first expected collision at only ~2.4M URLs (≈ √(π/2 · 62^7)), far below our 10-billion-URL target, so collision handling (check-and-insert, retry with a salt) becomes unavoidable.
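A quick calculation makes the problem concrete. This is a sketch using the standard birthday-paradox approximations, nothing specific to this design:

```python
import math

KEYSPACE = 62 ** 7  # ~3.5 trillion possible 7-char Base62 codes

def collision_probability(n: int, keyspace: int = KEYSPACE) -> float:
    """P(at least one collision) after n random codes (birthday approximation)."""
    return 1 - math.exp(-n * n / (2 * keyspace))

# Expected number of URLs before the first collision: sqrt(pi/2 * keyspace)
expected_first_collision = math.sqrt(math.pi / 2 * KEYSPACE)

print(f"{expected_first_collision:,.0f}")              # a little over 2 million
print(f"{collision_probability(10_000_000_000):.3f}")  # ~1.000 at 10B URLs
```

At the 10-billion-URL target a collision is effectively guaranteed, which is why the counter-based scheme avoids random codes entirely.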
Option 2: Counter-Based (Our Choice)
Use a global counter and convert to Base62:
public class Base62Encoder {
private static final String CHARS =
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
public static String encode(long id) {
StringBuilder sb = new StringBuilder();
while (id > 0) {
sb.insert(0, CHARS.charAt((int)(id % 62)));
id /= 62;
}
// Pad to 7 characters
while (sb.length() < 7) sb.insert(0, 'a');
return sb.toString();
}
}
For distributed counter generation without a single point of failure, we use range-based allocation: each application server is pre-allocated a range of IDs (e.g., 1–10M, 10M–20M) from a Redis counter. When a server exhausts its range, it claims the next.
Architecture on AWS
┌─────────────────────────────────────────────────────────┐
│ CloudFront (CDN) │
│ Asia Pacific + US edge nodes │
└────────────────────────┬────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────┐
│ API Gateway (REST) │
│ Rate limiting: 10K req/sec per API key │
└──────────────┬─────────────────────┬────────────────────┘
│ │
┌──────────▼──────┐ ┌─────────▼─────────┐
│ Write Service │ │ Redirect Service │
│ (ECS Fargate) │ │ (Lambda@Edge) │
└──────────┬──────┘ └─────────┬──────────┘
│ │
┌──────────▼──────┐ ┌─────────▼─────────┐
│ DynamoDB │ │ ElastiCache │
│ (Primary store)│◄──│ Redis (hot URLs) │
└─────────────────┘ └────────────────────┘
Why DynamoDB?
- Key-value access pattern: We always look up by short code (partition key)
- Auto-scaling: Handles 10K+ reads/sec without manual intervention
- TTL support: Built-in expiration via the ttl attribute
- Global Tables: Multi-region replication for <50ms latency worldwide
// DynamoDB item schema
interface ShortUrl {
pk: string; // short code (partition key)
longUrl: string; // original URL
userId: string; // owner
clickCount: number; // atomic counter
ttl: number; // Unix timestamp for expiry
createdAt: string; // ISO 8601
customAlias: boolean; // was it a custom alias?
}
Redirect Service with Lambda@Edge
For the hot redirect path, we use Lambda@Edge — it runs at CloudFront edge nodes, meaning the redirect logic executes within ~10ms of the user:
// Lambda@Edge Viewer Request handler
exports.handler = async (event) => {
const request = event.Records[0].cf.request;
const shortCode = request.uri.slice(1); // remove leading /
// 1. Check CloudFront cache (via custom header or KV)
// 2. If miss, call DynamoDB (via VPC endpoint or public endpoint)
// 3. Return 301 redirect
const item = await dynamodb.getItem({
TableName: 'short-urls',
Key: { pk: { S: shortCode } },
ConsistentRead: false, // eventual consistency is fine for redirects
}).promise();
if (!item.Item) {
return { status: '404', body: 'URL not found' };
}
return {
status: '301',
headers: {
location: [{ key: 'Location', value: item.Item.longUrl.S }],
'cache-control': [{ key: 'Cache-Control', value: 'max-age=86400' }],
},
};
};
Caching Strategy
We use a read-through cache with ElastiCache Redis:
@Service
public class RedirectService {
@Autowired
private RedisTemplate<String, String> redis;
@Autowired
private DynamoDbClient dynamoDb;
public String resolve(String shortCode) {
// 1. Check Redis
String cached = redis.opsForValue().get("url:" + shortCode);
if (cached != null) return cached;
// 2. DynamoDB fallback
String longUrl = fetchFromDynamo(shortCode);
if (longUrl == null) throw new NotFoundException();
// 3. Cache for 1 hour
redis.opsForValue().set("url:" + shortCode, longUrl,
Duration.ofHours(1));
return longUrl;
}
}
Cache eviction: LRU policy on Redis. The 1GB cache holds ~2M URLs. Given our Pareto distribution (20% URLs = 80% traffic), this covers the hot tier comfortably.
Analytics
We use an async write pattern to avoid blocking the redirect:
- Redirect returns immediately
- A Kinesis Data Stream event is published with {shortCode, timestamp, userAgent, ip}
- A Lambda consumer processes the stream and writes to DynamoDB with an ADD 1 atomic increment
- Aggregated stats are pre-computed and cached in Redis
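The published event might look like this. A sketch: build_click_event is a hypothetical helper and url-clicks an illustrative stream name; the actual Kinesis put_record call is shown in a comment since it requires AWS credentials.

```python
import json
import time

STREAM_NAME = "url-clicks"  # illustrative stream name

def build_click_event(short_code: str, user_agent: str, ip: str) -> dict:
    """Serialize one click into a Kinesis record (fire-and-forget)."""
    payload = {
        "shortCode": short_code,
        "timestamp": int(time.time() * 1000),
        "userAgent": user_agent,
        "ip": ip,
    }
    return {
        "Data": json.dumps(payload).encode(),
        "PartitionKey": short_code,  # keeps one code's clicks in order
    }

# In the redirect path (boto3 client creation omitted):
#   kinesis.put_record(StreamName=STREAM_NAME, **build_click_event(code, ua, ip))

record = build_click_event("aXb9cQ2", "Mozilla/5.0", "203.0.113.7")
```

Partitioning by short code means all clicks for one URL land on the same shard, so the Lambda consumer can increment its counter without cross-shard coordination.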
Rate Limiting
// Sliding window rate limiter using Redis ZADD.
// Note: the count-then-add sequence below is not atomic across concurrent
// callers; in production, wrap it in a Lua script (EVAL) and use a unique
// ZSET member (e.g. timestamp + request ID) so same-millisecond requests
// aren't collapsed into one entry.
public boolean isAllowed(String userId, int limit, int windowSecs) {
long now = System.currentTimeMillis();
long windowStart = now - (windowSecs * 1000L);
String key = "rl:" + userId;
// Remove expired entries
redis.opsForZSet().removeRangeByScore(key, 0, windowStart);
// Count current window
Long count = redis.opsForZSet().count(key, windowStart, now);
if (count != null && count >= limit) return false;
// Add current request
redis.opsForZSet().add(key, String.valueOf(now), now);
redis.expire(key, windowSecs, TimeUnit.SECONDS);
return true;
}
Handling Failures
DynamoDB Throttling
- Use exponential backoff with jitter
- Enable auto-scaling on both read and write capacity
- Consider DAX (DynamoDB Accelerator) for microsecond reads if needed
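Exponential backoff with full jitter fits in a few lines. An illustrative sketch following AWS's published full-jitter recipe; is_throttled and the retry counts are placeholders you would tune per client:

```python
import random
import time

def backoff_delays(max_retries: int = 5, base: float = 0.05, cap: float = 2.0):
    """Full-jitter delays (seconds): uniform over [0, min(cap, base * 2^attempt))."""
    for attempt in range(max_retries):
        yield random.uniform(0, min(cap, base * 2 ** attempt))

def with_backoff(call, is_throttled, max_retries: int = 5):
    """Retry `call` on throttling; `is_throttled` inspects the exception."""
    for delay in backoff_delays(max_retries):
        try:
            return call()
        except Exception as exc:
            if not is_throttled(exc):
                raise
            time.sleep(delay)
    return call()  # final attempt; any exception propagates to the caller
```

Full jitter spreads retries uniformly across the window, which prevents throttled clients from retrying in lockstep and re-saturating the table.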
Redis Failure
- Fall back to DynamoDB directly (slower, but correct)
- Redis Sentinel or ElastiCache Multi-AZ for HA
Cascading Failure Prevention
- Circuit breaker on DynamoDB calls
- Timeouts: 100ms on cache reads, 300ms on DynamoDB reads
- Fallback: return 503 if both fail (better than hanging forever)
AWS Cost Estimate (Monthly, India traffic)
| Service | Usage | Cost |
|---|---|---|
| API Gateway | 25M requests | ₹1,800 |
| Lambda@Edge | 250M invocations | ₹3,600 |
| DynamoDB | 10GB + 25M reads | ₹4,500 |
| ElastiCache (r6g.large) | 1 node | ₹7,200 |
| CloudFront | 5TB transfer | ₹3,000 |
| Total | | ~₹20,100/month |
Summary
| Decision | Choice | Reason |
|---|---|---|
| Short code generation | Counter + Base62 | No collisions, predictable |
| Primary database | DynamoDB | Key-value pattern, auto-scale |
| Cache | ElastiCache Redis | Sub-millisecond hot redirects |
| Redirect execution | Lambda@Edge | <10ms global latency |
| Analytics | Kinesis + async writes | Non-blocking, scalable |
This architecture handles 10K+ redirects/second with a p99 latency under 50ms, costs roughly ₹20K/month, and scales horizontally by adding more ECS tasks or Lambda concurrency.
If you want to discuss the architecture for your specific use case, book a consultation.
Ravi Kant Shukla
Backend architect helping developers and startups build production-grade systems on AWS. 8+ years of experience in system design, microservices, and AI/ML deployment.