30 System Design Interview Questions from FAANG (With Approach Hints)

30 real system design questions asked at Google, Amazon, Meta, and top tech companies, grouped by difficulty with approach hints and key concepts for each.


Here’s something nobody tells you about FAANG system design interviews: the questions repeat. Not word-for-word, but the underlying patterns cycle through a surprisingly small set. Google’s “Design YouTube” and Meta’s “Design Instagram Reels” are, at the architecture level, the same problem wearing different hats.

I figured this out the hard way — after prepping for about twenty unique-sounding questions and realizing I kept drawing the same three or four core architectures. The specifics change. The bones don’t. Once I started seeing the patterns instead of the product names, the whole thing clicked.

This list is the result of that realization. Thirty questions, grouped by difficulty, each with a short explanation of why it’s asked and a quick approach hint to get you started. These aren’t meant to be full solutions — if you want a structured process for working through them, check out the URDGE framework for system design interviews. This list is your problem set.

Beginner (Questions 1-10)

These are the “warm-up” tier. Most candidates at the mid-level will see at least one of these. They test fundamental concepts — read/write patterns, caching, basic scaling, API design. Don’t let the word “beginner” fool you; a shallow answer here still gets you rejected.

1. Design a URL shortener (like Bitly)

Why it’s asked: This is the “Hello World” of system design. It tests your understanding of hashing, key-value storage, read-heavy access patterns, and basic API design. Almost every FAANG company has used some version of this.

Key concepts: Hashing strategies (base62 vs MD5 truncation), key-value stores, read-through caching, 301 vs 302 redirects, collision handling.

Approach hint: Start with the API contract (create short URL, redirect). Estimate the scale — writes per second, read-to-write ratio. A key-value store like DynamoDB fits naturally. Add Redis caching for hot URLs, since traffic follows a Pareto distribution (a small fraction of links gets most of the hits). The interesting part is your key generation strategy — discuss trade-offs between auto-increment IDs with base62 encoding versus hashing.
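To make the base62 option concrete, here's a minimal sketch. The alphabet ordering and function names are illustrative choices, not from any particular implementation:

```python
# Turn an auto-increment ID into a short code via base62 encoding.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

def decode_base62(s: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Seven base62 characters cover about 3.5 trillion IDs, which is why short codes stay short for a very long time.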

2. Design a paste tool (like Pastebin)

Why it’s asked: It’s a URL shortener with a twist — now you’re storing content, not just mappings. This tests your thinking about object storage, content size limits, and TTL-based expiration.

Key concepts: Object storage (S3), metadata vs content separation, TTL expiration, access control (public vs private pastes), rate limiting.

Approach hint: Separate metadata (title, expiry, author) in a database from the actual paste content in object storage. Generate unique keys similarly to a URL shortener. Add a cleanup service that expires old pastes. The wrinkle most candidates miss: what happens when a paste goes viral? CDN caching for public pastes solves that.

3. Design a rate limiter

Why it’s asked: It’s deceptively simple and tests your understanding of distributed coordination. Anyone can describe token bucket in theory; the challenge is making it work across multiple application servers.

Key concepts: Token bucket vs sliding window algorithms, Redis for distributed state, race conditions with concurrent requests, per-user vs per-IP vs per-endpoint limits.

Approach hint: Pick a specific algorithm — token bucket is easiest to explain. Use Redis with atomic operations (INCR + EXPIRE) to store counters. Discuss where the limiter lives (API gateway vs middleware). The real depth comes from edge cases: what about distributed rate limiting across multiple data centers? How do you handle bursts gracefully?
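To make the token bucket concrete, here's a minimal single-process sketch; a production version would keep this state in Redis behind atomic operations or a Lua script so every app server shares it. The class and parameter names are illustrative:

```python
import time

class TokenBucket:
    """Single-process token bucket. Tokens refill continuously at
    refill_rate per second, capped at capacity; each request spends one."""
    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens per second
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self, now=None) -> bool:
        # `now` is injectable for testing; defaults to the real clock.
        now = time.monotonic() if now is None else now
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Note that bursts up to `capacity` are allowed by design, which is exactly the graceful burst handling interviewers ask about.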

4. Design a key-value store

Why it’s asked: This question probes your database internals knowledge. It’s common at Google and Amazon, where you’re expected to understand what happens under the hood, not just use managed services.

Key concepts: LSM trees vs B-trees, write-ahead logs, compaction, replication (leader-follower vs leaderless), consistent hashing for partitioning, CAP theorem trade-offs.

Approach hint: Clarify the requirements first — is this optimized for reads or writes? For write-heavy, go with an LSM tree approach (memtable + SSTables + compaction). For distributed scenarios, use consistent hashing to partition data across nodes. Discuss replication strategy and what happens during node failure. The trade-off conversation (consistency vs availability) is where points are earned.

5. Design a notification system

Why it’s asked: It touches multiple delivery channels (push, SMS, email) and tests your ability to design reliable, asynchronous systems that degrade gracefully.

Key concepts: Message queues, fan-out patterns, delivery guarantees (at-least-once vs exactly-once), user preference management, rate limiting per channel, retry with exponential backoff.

Approach hint: Start with a notification service that accepts requests and routes them to channel-specific workers via message queues (one queue per channel). Use a preference store so users control what they receive. The key design decision is reliability: how do you ensure a notification is delivered at least once without spamming users? Idempotency keys and a delivery log solve this.

6. Design a unique ID generator (distributed)

Why it’s asked: Seemingly trivial, but it reveals whether you understand coordination overhead in distributed systems. Auto-increment doesn’t work across multiple servers, and this question explores why.

Key concepts: Snowflake IDs (timestamp + machine ID + sequence), UUID trade-offs (storage, sortability), clock synchronization issues, pre-allocated ID ranges.

Approach hint: Twitter’s Snowflake approach is the standard answer: 64-bit IDs composed of timestamp (41 bits), machine ID (10 bits), and sequence number (12 bits). This gives you sortable, unique IDs without coordination between servers. Discuss the clock drift problem and how NTP helps but doesn’t fully solve it. Compare against UUIDs — larger, not sortable, but zero coordination needed.
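The bit layout is easy to sketch. An illustrative Python version (the epoch constant is the value commonly cited as Twitter's; any fixed past timestamp works):

```python
EPOCH_MS = 1_288_834_974_657  # custom epoch in ms; illustrative value

def make_id(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Pack a Snowflake-style 64-bit ID: 41 bits of time, 10 of machine, 12 of sequence."""
    assert 0 <= machine_id < 1024 and 0 <= sequence < 4096
    return ((timestamp_ms - EPOCH_MS) << 22) | (machine_id << 12) | sequence

def unpack_id(snowflake: int):
    """Recover (timestamp_ms, machine_id, sequence) from an ID."""
    return (
        (snowflake >> 22) + EPOCH_MS,  # timestamp in ms
        (snowflake >> 12) & 0x3FF,     # machine ID (10 bits)
        snowflake & 0xFFF,             # sequence (12 bits)
    )
```

Because the timestamp occupies the high bits, IDs sort chronologically, which is the property UUIDv4 gives up.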

7. Design a task queue (like Celery or SQS)

Why it’s asked: It tests your understanding of asynchronous processing, which is fundamental to any system operating at scale. Amazon asks variations of this regularly.

Key concepts: At-least-once delivery, visibility timeouts, dead letter queues, priority queues, worker scaling, exactly-once processing challenges.

Approach hint: Core components: producers push tasks to a queue, consumers pull and process them. Use a visibility timeout — if a worker doesn’t acknowledge within N seconds, the task becomes available again. Add a dead letter queue for tasks that fail repeatedly. The interesting discussion is ordering guarantees: strict FIFO is expensive at scale, so most systems offer best-effort ordering. Name that trade-off.

8. Design an API rate limiter for a cloud service

Why it’s asked: This extends the basic rate limiter to a multi-tenant SaaS context. Now you’re dealing with different tiers, quotas, and the business logic around throttling paying customers differently.

Key concepts: Tiered rate limits, distributed counting, quota management, graceful degradation (429 responses with Retry-After headers), analytics for usage tracking.

Approach hint: Layer the limiting: global limits, per-tenant limits, per-endpoint limits. Store counters in Redis, keyed by tenant ID + endpoint + time window. Use sliding window counters for smoother behavior than fixed windows. The business-critical design decision: what happens when a premium customer hits their limit? Queue overflow requests briefly rather than hard-reject? That’s the kind of product-aware thinking interviewers love.
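The sliding window counter math is simple enough to show inline. A hedged sketch, assuming the previous and current fixed-window counts have already been fetched from Redis:

```python
import math

def sliding_window_allow(prev_count, curr_count, now, window_len, limit):
    """Approximate sliding-window count: weight the previous fixed
    window by how much of it still overlaps the sliding window."""
    window_start = math.floor(now / window_len) * window_len
    elapsed_frac = (now - window_start) / window_len
    estimated = prev_count * (1 - elapsed_frac) + curr_count
    return estimated < limit
```

Halfway through the current minute, half of last minute's count still weighs in, so the limiter never sees the double-burst that fixed windows permit at boundary crossings.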

9. Design a content delivery network (CDN)

Why it’s asked: CDNs are invisible infrastructure that candidates often use without understanding. This question tests whether you grasp geographic distribution, caching hierarchies, and cache invalidation.

Key concepts: Edge servers, origin pull vs push, cache hierarchies (edge, regional, origin shield), TTL and cache invalidation, DNS-based routing, consistent hashing for cache nodes.

Approach hint: Start with the read path: user request hits DNS, gets routed to nearest edge server. Cache hit returns immediately; cache miss pulls from origin (possibly via a regional tier to reduce origin load). Discuss invalidation strategies — TTL-based is simple but stale, purge APIs are precise but complex. The depth move: how do you handle cache stampedes when a popular object expires simultaneously across edges?

10. Design a URL crawler (basic web crawler)

Why it’s asked: Crawling is a classic distributed systems problem. It requires coordination, politeness (not hammering sites), deduplication, and handling the messiness of the real web.

Key concepts: BFS vs DFS crawling strategy, URL frontier, politeness policies (robots.txt, per-domain rate limiting), URL deduplication (bloom filters), distributed task coordination.

Approach hint: The URL frontier is the core data structure — a priority queue of URLs to visit with per-domain politeness constraints. Workers pull URLs, fetch pages, extract links, and push new URLs back. Use bloom filters for deduplication (have we seen this URL before?). Discuss how to scale: partition the frontier by domain so each worker handles a subset. The nuance is handling traps — infinite pagination, dynamic URLs, crawler traps.
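A bloom filter for URL dedup can be sketched in a few lines. The sizes here are toy values; a real crawler sizes the bit array from the expected URL count and target false-positive rate:

```python
import hashlib

class BloomFilter:
    """Probabilistic set: no false negatives, small chance of false positives."""
    def __init__(self, size_bits=1 << 20, num_hashes=5):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, url):
        # Derive k bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{url}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, url):
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, url):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(url))
```

A "maybe seen" answer occasionally skips a fresh URL, which is an acceptable trade for never re-crawling a known one and never storing full URLs in memory.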

Intermediate (Questions 11-20)

These questions show up in senior engineer interviews. They involve multiple interacting subsystems, real-time requirements, or tricky consistency challenges. You need to demonstrate trade-off thinking, not just component assembly.

11. Design a chat system (like WhatsApp or Slack)

Why it’s asked: Real-time messaging combines persistent connections, message ordering, delivery guarantees, and presence detection. Meta and Amazon ask this frequently.

Key concepts: WebSockets for persistent connections, message ordering (per-conversation sequence numbers), read receipts, presence service (heartbeat-based), message storage (write-heavy, time-ordered), group chat fan-out.

Approach hint: Each user maintains a WebSocket connection to a chat server. A connection manager tracks which server each user is connected to. For 1:1 messages, route directly via the connection map. For group chat, fan out to all members’ servers. Store messages in a time-series optimized database (Cassandra works well). The hard part: what happens when a user is offline? Queue messages and deliver on reconnect. Discuss end-to-end encryption implications if prompted.

12. Design a news feed (like Facebook or Twitter)

Why it’s asked: The fan-out problem is central to social networks. This question tests your ability to reason about push vs pull trade-offs at massive scale.

Key concepts: Fan-out-on-write vs fan-out-on-read, celebrity problem (users with millions of followers), timeline caching, ranking algorithms, eventual consistency.

Approach hint: Hybrid approach: fan-out-on-write for normal users (pre-compute and push to followers’ timelines), fan-out-on-read for celebrities (fetch and merge at read time). Store timelines in Redis lists, capped at a few hundred entries. The ranking layer sits between raw timeline and user — it re-orders, deduplicates, and inserts ads. Discuss what “eventual consistency” means here: your friend’s post might take a few seconds to appear, and that’s acceptable.
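The read-time half of the hybrid is essentially a k-way merge of newest-first lists. A minimal sketch, assuming posts are (timestamp, post_id) tuples:

```python
import heapq
import itertools

def merged_feed(precomputed, celebrity_feeds, limit=10):
    """Merge a follower's pre-pushed timeline with followed celebrities'
    recent posts at read time. Every input list is sorted newest-first."""
    streams = [precomputed] + celebrity_feeds
    merged = heapq.merge(*streams, key=lambda post: post[0], reverse=True)
    return list(itertools.islice(merged, limit))
```

The merge is lazy, so pulling the top 10 items touches only a handful of entries per stream regardless of how long the underlying timelines are.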

13. Design Instagram

Why it’s asked: It’s a media-heavy social platform that combines content storage, feed generation, and real-time interactions. Tests your ability to handle large binary objects alongside structured data.

Key concepts: Image/video storage and CDN delivery, feed generation (similar to news feed), like/comment services, explore/discovery algorithms, image processing pipeline (resize, filter, thumbnail).

Approach hint: Separate the upload path from the read path. Uploads go to object storage (S3), trigger an async processing pipeline (resize to multiple dimensions, generate thumbnails), then update metadata in the database. Reads are served from CDN. The feed is a variant of the news feed problem — fan-out-on-write for most users. The explore feature adds a recommendation layer based on engagement signals. Discuss how you’d handle a user uploading a 50MB image on a flaky mobile connection (chunked uploads, resumability).

14. Design a search autocomplete system

Why it’s asked: It’s latency-critical (results must appear as you type) and data-intensive (billions of queries). Google asks this for obvious reasons.

Key concepts: Trie data structure, top-K frequent queries, pre-computation vs real-time ranking, data collection pipeline, personalization layer, serving from memory.

Approach hint: Pre-compute the most popular completions using a trie where each node stores the top N suggestions. Serve from memory for sub-10ms latency. A separate offline pipeline processes search logs, aggregates frequencies, and rebuilds the trie periodically. The nuance: how do you handle trending queries that spike suddenly? A real-time layer that detects frequency anomalies and injects trending completions into the pre-computed set.
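The per-node top-N idea can be sketched directly. An illustrative version, assuming query frequencies were already aggregated by the offline pipeline:

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.top = []  # top suggestions cached at this prefix

def build_trie(query_counts, top_n=3):
    """Insert queries in descending frequency order so each node's
    cached list fills with the most popular completions first."""
    root = TrieNode()
    ranked = sorted(query_counts.items(), key=lambda kv: -kv[1])
    for query, _count in ranked:
        node = root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
            if len(node.top) < top_n:
                node.top.append(query)
    return root

def complete(root, prefix):
    """Serving is one walk down the trie plus a list read."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []
    return node.top
```

This is why the serving path can hit sub-10ms latency: all ranking work happened at build time.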

15. Design a web crawler at scale (Google-scale)

Why it’s asked: This extends the basic crawler to billions of pages. Now you’re dealing with prioritization, freshness, duplicate content detection, and massive distributed coordination.

Key concepts: URL prioritization (PageRank-informed), content fingerprinting (simhash for near-duplicate detection), recrawl scheduling based on change frequency, distributed crawl frontier, DNS caching, politeness at scale.

Approach hint: Partition the URL space across crawler nodes by domain. Each node maintains a local priority queue. A central scheduler assigns domains and manages global priorities. Use simhash to detect near-duplicate content (different URLs, same article). Recrawl frequency should be adaptive — pages that change hourly get crawled hourly; static pages get crawled monthly. The depth discussion: how do you handle the politeness constraint without under-utilizing your crawler fleet?

16. Design a file storage system (like Google Drive or Dropbox)

Why it’s asked: It tests real-time sync, conflict resolution, chunking for large files, and offline support. It’s a favorite at Google and Meta.

Key concepts: File chunking and deduplication, sync protocols (operational transform or CRDT for conflicts), metadata service, block storage, versioning, client-side change detection (file system watchers).

Approach hint: Split files into fixed-size chunks and store them by content hash (enables deduplication across users). A metadata service tracks file structure, permissions, and chunk mappings. Sync uses a notification channel — when a file changes, the server notifies connected clients to pull updated chunks. Conflict resolution: for simultaneous edits, keep both versions and let the user resolve. The performance insight: only sync changed chunks, not entire files. A 1GB file with a one-byte change should transfer one chunk, not 1GB.
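Content-addressed chunking is the core trick, and it fits in a short sketch. The 4 MiB chunk size is illustrative; real systems tune it, and some use content-defined boundaries rather than fixed ones:

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB; illustrative

def chunk_file(data: bytes, chunk_size=CHUNK_SIZE):
    """Split a file into fixed-size chunks keyed by content hash.
    Identical chunks, within or across files, map to the same key."""
    chunks = {}
    manifest = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        key = hashlib.sha256(chunk).hexdigest()
        chunks[key] = chunk          # dedup: same content, same key
        manifest.append(key)
    return manifest, chunks
```

The manifest (an ordered list of chunk keys) is what the metadata service stores; sync compares manifests and transfers only the keys the other side lacks.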

17. Design a video streaming platform (like YouTube or Netflix)

Why it’s asked: Video involves massive storage, transcoding pipelines, adaptive bitrate streaming, and global delivery. It tests your end-to-end system thinking.

Key concepts: Video transcoding pipeline (multiple resolutions/codecs), adaptive bitrate streaming (HLS/DASH), CDN for delivery, pre-signed URLs for access control, recommendation engine, view counting at scale.

Approach hint: Upload path: client uploads raw video to object storage, which triggers a transcoding pipeline producing multiple resolution/bitrate variants. Store metadata (title, thumbnails, formats) in a database. Playback: client requests a manifest file listing available qualities, then adaptively selects segments based on bandwidth. Serve video segments from CDN. The interesting design decision: how do you count views reliably at YouTube scale? Batch processing with deduplication, not real-time counters.

18. Design a ride-sharing service (like Uber or Lyft)

Why it’s asked: Real-time location matching, geospatial queries, and dynamic pricing create a rich design problem. Common at Amazon and Google.

Key concepts: Geospatial indexing (geohash, S2 cells), real-time location tracking, matching algorithm, ETA estimation, surge pricing, trip state machine, driver/rider communication.

Approach hint: Drivers continuously report location; store positions in a geospatial index (geohash grid). When a rider requests, query nearby drivers using the geohash, rank by distance and ETA, and dispatch. A trip service manages the state machine (requested, matched, en-route, in-progress, completed). The depth topic: surge pricing requires real-time demand/supply ratio per geographic area. Discuss how you’d avoid oscillation (prices spike, drivers flood in, prices crash, drivers leave).

19. Design a metrics monitoring system (like Datadog or Prometheus)

Why it’s asked: It tests your understanding of time-series data, high-write throughput, aggregation, alerting, and storage efficiency.

Key concepts: Time-series databases, write-ahead logs, downsampling for older data, push vs pull collection models, aggregation pipelines, alerting rules engine, dashboard query optimization.

Approach hint: Agents on each host push metrics to a collection service, which writes to a time-series database (partition by metric name + time range). Recent data at full resolution; older data downsampled (1-minute averages become 1-hour averages). Alerting evaluates rules against recent data and triggers notifications. The storage insight: time-series data compresses extremely well because consecutive values are often similar — use delta encoding or gorilla compression.
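Delta encoding is simple to demonstrate. A sketch of the idea (Gorilla-style compression goes further with delta-of-delta timestamps and XOR-encoded values, but the principle is the same):

```python
def delta_encode(values):
    """Store the first value plus successive differences; time-series
    neighbors are usually close, so deltas are small and compress well."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    """Rebuild the original series by accumulating the deltas."""
    out = []
    running = 0
    for d in deltas:
        running += d
        out.append(running)
    return out
```

A gauge that hovers around the same value produces a stream of near-zero deltas, which downstream compression shrinks dramatically.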

20. Design a collaborative document editor (like Google Docs)

Why it’s asked: Real-time collaboration is one of the hardest distributed systems problems in consumer tech. It tests conflict resolution, operational transforms, and real-time sync.

Key concepts: Operational Transform (OT) or CRDTs, cursor and presence awareness, WebSocket connections, document versioning, undo/redo in collaborative context, access control.

Approach hint: Each client maintains a local copy and sends operations (insert, delete) to the server. The server uses OT to transform concurrent operations so they converge to the same state. Broadcast transformed operations to all connected clients. Store the operation log for undo/redo and version history. The key trade-off: OT is proven but complex; CRDTs are simpler to reason about but can have larger state overhead. Pick one and justify it.

Advanced (Questions 21-30)

These are staff/senior-staff territory. They involve cross-system orchestration, complex consistency requirements, global-scale distribution, or domain-specific depth. Partial credit is expected — the goal is to see how far your reasoning extends.

21. Design a distributed message queue (like Kafka)

Why it’s asked: It tests deep understanding of distributed systems fundamentals: replication, ordering, partitioning, consumer groups, and exactly-once semantics.

Key concepts: Topic partitioning, consumer groups, offset management, replication (ISR — in-sync replicas), log compaction, exactly-once delivery (idempotent producers + transactional writes), zero-copy transfer.

Approach hint: Topics are split into partitions, each an append-only log stored on a broker. Producers write to partition leaders; replicas in the ISR set replicate synchronously. Consumers in a group each own a subset of partitions (rebalanced on join/leave). Offsets track consumption progress. The advanced discussion: how do you achieve exactly-once? Idempotent producers (sequence numbers per partition) plus transactional writes that atomically commit offsets and output records.

22. Design a distributed cache (like Memcached/Redis cluster)

Why it’s asked: Caching seems simple until you distribute it. This tests consistent hashing, cache invalidation strategies, thundering herd protection, and hot key handling.

Key concepts: Consistent hashing with virtual nodes, cache-aside vs read-through vs write-through patterns, thundering herd (lock-based vs probabilistic early expiration), hot key replication, cache warming, eviction policies.

Approach hint: Use consistent hashing to distribute keys across cache nodes with virtual nodes for balance. For reads: cache-aside pattern (app checks cache, falls back to database, populates cache). Thundering herd protection: when a popular key expires, use a distributed lock so only one request rebuilds it while others wait. Hot keys: replicate to multiple nodes and load-balance reads across them. Discuss the invalidation nightmare: what strategy ensures cache consistency with the database?
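Consistent hashing with virtual nodes fits in a short sketch. The names and the vnode count are illustrative:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Each physical node owns many points on the ring, so load spreads
    evenly and removing a node moves only that node's keys."""
    def __init__(self, nodes, vnodes=100):
        self.ring = []  # sorted (hash, node) pairs
        for node in nodes:
            for i in range(vnodes):
                self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()
        self.hashes = [h for h, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def node_for(self, key):
        # A key belongs to the first vnode clockwise from its hash.
        idx = bisect.bisect_right(self.hashes, self._hash(key)) % len(self.hashes)
        return self.ring[idx][1]
```

The point worth stating in the interview: when a node dies, only keys whose successor vnode belonged to it move; everything else keeps its owner.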

23. Design a payment system (like Stripe)

Why it’s asked: Financial systems demand extreme correctness. This tests idempotency, exactly-once processing, distributed transactions, and auditability. Common at Amazon and fintech-adjacent roles.

Key concepts: Idempotency keys, distributed transactions (saga pattern), double-entry bookkeeping, PCI compliance, payment state machine, webhook delivery, reconciliation, retry safety.

Approach hint: Every payment request carries an idempotency key to prevent double charges. Use a state machine: created, processing, succeeded, failed. The payment service coordinates between merchant verification, fraud detection, and payment processor integration via the saga pattern (each step has a compensating action for rollback). Double-entry bookkeeping ensures every debit has a corresponding credit. The depth topic: reconciliation — how do you detect and resolve discrepancies between your records and the payment processor’s at end of day?
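The idempotency-key pattern is worth seeing in miniature. A toy sketch with an in-memory dict standing in for a durable store with a unique-key constraint:

```python
class PaymentService:
    """First request with a given key executes the charge; retries with
    the same key return the stored result instead of charging again."""
    def __init__(self):
        self.results = {}
        self.charges_executed = 0

    def charge(self, idempotency_key, amount_cents):
        if idempotency_key in self.results:
            return self.results[idempotency_key]   # safe retry
        self.charges_executed += 1                 # the actual charge
        result = {"status": "succeeded", "amount": amount_cents}
        self.results[idempotency_key] = result
        return result
```

The client generates the key before the first attempt, so even a timeout followed by a blind retry cannot double-charge.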

24. Design a web search engine (like Google)

Why it’s asked: It’s the ultimate full-stack system design problem. Crawling, indexing, ranking, serving — each is a system in itself. Usually asked at Google and sometimes at Amazon.

Key concepts: Inverted index, TF-IDF and BM25 ranking, PageRank, index partitioning (by document vs by term), query parsing, spelling correction, snippet generation, tiered serving.

Approach hint: Three major subsystems: crawl (collect pages), index (build inverted index mapping terms to documents with positions and frequencies), serve (parse query, look up index, rank results). Partition the index by document ranges across servers; scatter-gather at query time. Ranking combines text relevance (BM25) with link analysis (PageRank) and freshness signals. The interesting depth: tiered indexing — serve results from a small “hot” index first (popular pages), only hit the full index if needed. This dramatically reduces serving cost.
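The inverted index at the heart of the serve path can be sketched in a few lines, mapping each term to the documents and positions where it appears:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """docs maps doc_id -> text. Returns term -> {doc_id: [positions]}.
    Tokenization here is naive whitespace splitting, for illustration."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index
```

Positions are what make phrase queries possible ("new york" requires adjacent positions), and per-document term frequencies fall straight out of the position lists for BM25 scoring.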

25. Design a hotel/flight booking system (like Booking.com)

Why it’s asked: Inventory systems face the double-booking problem, which is a distributed consistency challenge. It also involves search across complex attribute spaces.

Key concepts: Inventory locking (pessimistic vs optimistic), eventual consistency for search vs strong consistency for booking, search indexing with filters, pricing engine, overbooking strategies, payment integration.

Approach hint: Separate the search path (read-heavy, eventually consistent, served from Elasticsearch with denormalized data) from the booking path (strongly consistent, serialized through the database). For booking: optimistic locking with version checks — read the room’s current version, attempt the write, retry if someone else booked first. The business nuance: hotels intentionally overbook by a small percentage, so your system should support configurable overbooking thresholds. Discuss how search results show “3 rooms left” without being a real-time inventory guarantee.
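Optimistic locking with version checks is compact enough to sketch. An illustrative single-row version; a real system folds the compare-and-set into one SQL UPDATE guarded by the version column:

```python
class RoomInventory:
    """A booking succeeds only if the version the client read is still
    current; otherwise someone else booked first and the caller retries."""
    def __init__(self, rooms_left):
        self.rooms_left = rooms_left
        self.version = 0

    def read(self):
        return self.rooms_left, self.version

    def book(self, expected_version):
        if expected_version != self.version:
            return False            # stale read: retry with fresh state
        if self.rooms_left == 0:
            return False            # sold out
        self.rooms_left -= 1
        self.version += 1
        return True
```

Under low contention this beats pessimistic locking because nobody waits on locks; under a flash sale, the retry loop becomes the bottleneck, which is worth saying out loud.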

26. Design a proximity service (like Yelp or Google Maps nearby)

Why it’s asked: Geospatial search at scale is non-trivial. This tests your knowledge of spatial indexing, ranking by distance plus relevance, and handling varying query radii.

Key concepts: Geospatial indexing (geohash, quadtree, S2 geometry), radius queries, ranking (distance, rating, relevance), caching by region, dynamic radius expansion, business data ingestion pipeline.

Approach hint: Index businesses using geohash — each business gets a geohash prefix based on its location. For a “restaurants near me” query, compute the user’s geohash and query the surrounding cells. Use a quadtree for variable-density areas (more granularity in Manhattan, less in Montana). Rank results by a weighted score of distance, rating, and relevance. The depth insight: geohash has edge cases at cell boundaries — a restaurant 50 meters away might be in a different cell. Query neighboring cells to handle this.

27. Design a global-scale distributed database (like Spanner)

Why it’s asked: This is the deepest systems question you can get. It tests understanding of consensus protocols, clock synchronization, and the fundamental limits of distributed computing.

Key concepts: Paxos/Raft consensus, TrueTime (GPS + atomic clocks for bounded clock uncertainty), externally consistent transactions, shard management, two-phase commit across shards, read-only transactions at a timestamp.

Approach hint: Data is sharded across regions, each shard replicated via Paxos for fault tolerance. Within a shard, a leader handles reads and writes. Cross-shard transactions use two-phase commit coordinated by a transaction manager. The breakthrough insight from Spanner: TrueTime provides bounded clock uncertainty, enabling externally consistent reads without cross-region coordination by waiting out the uncertainty window. Discuss why this matters — it lets you do globally consistent reads at any replica.

28. Design a real-time gaming leaderboard

Why it’s asked: It sounds simple but involves ranking millions of players with real-time updates. Tests your understanding of sorted data structures at scale and query patterns.

Key concepts: Redis sorted sets, partitioning strategies for global rankings, approximate vs exact rankings, score update throughput, regional vs global leaderboards, time-windowed leaderboards.

Approach hint: Redis sorted sets give O(log N) updates and O(log N) rank queries — perfect for leaderboards up to tens of millions of entries on a single node. For larger scale, partition by score ranges and maintain a count per partition to compute global rank. Time-windowed leaderboards (daily, weekly) need separate sorted sets with TTL-based cleanup. The interesting trade-off: exact global ranking at 100M+ players is expensive — would an approximate rank (“top 1%”) be acceptable for non-competitive contexts?
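The sorted-set mechanics are easy to emulate for illustration. A toy Python stand-in for what Redis ZADD and ZREVRANK provide (rank 0 is the top player; score ties share the same rank computation here):

```python
import bisect

class Leaderboard:
    """Scores kept in an ascending sorted list; rank is the count of
    strictly higher scores, so the best player has rank 0."""
    def __init__(self):
        self.scores = {}          # player -> score
        self.sorted_scores = []   # ascending score list

    def set_score(self, player, score):
        if player in self.scores:
            old = self.scores[player]
            self.sorted_scores.pop(bisect.bisect_left(self.sorted_scores, old))
        self.scores[player] = score
        bisect.insort(self.sorted_scores, score)

    def rank(self, player):
        score = self.scores[player]
        return len(self.sorted_scores) - bisect.bisect_right(self.sorted_scores, score)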

29. Design a distributed task scheduler (like Airflow at scale)

Why it’s asked: Workflow orchestration involves DAG execution, dependency management, failure recovery, and resource allocation. Common at Google and Amazon where data pipelines are critical infrastructure.

Key concepts: DAG representation, topological sort for execution order, task state machine, distributed locking for task assignment, heartbeat-based failure detection, retry policies, backfill support, resource pool management.

Approach hint: Store workflow DAGs in a database. A scheduler service evaluates DAGs periodically: for each, check if dependencies are met, and enqueue ready tasks to a distributed task queue. Workers pull tasks, execute them, and report status. Use heartbeats to detect stuck workers and reassign their tasks. The complexity: what happens when a task in the middle of a DAG fails? Support configurable policies — retry N times, skip and continue downstream, or fail the entire DAG. Discuss how you’d handle backfills (re-running historical DAG executions).
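Dependency resolution is a topological sort. A sketch using Kahn's algorithm, where the DAG maps each task to the tasks it depends on:

```python
from collections import deque

def ready_order(dag):
    """Repeatedly release tasks whose dependencies are all complete.
    Raises if the graph contains a cycle (an invalid workflow)."""
    indegree = {t: len(deps) for t, deps in dag.items()}
    dependents = {t: [] for t in dag}
    for t, deps in dag.items():
        for d in deps:
            dependents[d].append(t)
    queue = deque(sorted(t for t, n in indegree.items() if n == 0))
    order = []
    while queue:
        task = queue.popleft()
        order.append(task)
        for nxt in dependents[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(dag):
        raise ValueError("cycle detected in DAG")
    return order
```

In the real scheduler the "release" step enqueues the task to workers rather than appending to a list, but the dependency bookkeeping is exactly this.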

30. Design a stock exchange matching engine

Why it’s asked: This is arguably the hardest system design question. It demands extreme low-latency, deterministic ordering, and correctness under high throughput. Asked at top-tier financial tech interviews.

Key concepts: Order book (price-time priority), matching algorithms (price-time, pro-rata), sequencer for deterministic ordering, event sourcing, co-location for latency, market data dissemination, circuit breakers.

Approach hint: The core is a single-threaded matching engine processing orders sequentially from a sequencer (deterministic ordering is non-negotiable). The order book maintains buy and sell sides sorted by price, then time. When an incoming order matches (buy price >= sell price), execute the trade and emit events. Event sourcing means the order log IS the source of truth; state can be rebuilt by replaying it. The scaling challenge: you can’t horizontally scale the matching engine for a single instrument because ordering must be deterministic. Scale by sharding across instruments. Discuss how market data (price updates) is disseminated to millions of subscribers with minimal latency.
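The price-time priority book can be sketched with two heaps. A toy single-instrument version; trades here always execute at the resting sell's price, which is a simplification of real exchange conventions:

```python
import heapq

class OrderBook:
    """Buys are a max-heap on price, sells a min-heap; the sequence
    number breaks ties so earlier orders at a price match first."""
    def __init__(self):
        self.buys = []   # entries: [-price, seq, qty]
        self.sells = []  # entries: [price, seq, qty]
        self.seq = 0
        self.trades = [] # (price, qty) fills

    def submit(self, side, price, qty):
        self.seq += 1
        if side == "buy":
            heapq.heappush(self.buys, [-price, self.seq, qty])
        else:
            heapq.heappush(self.sells, [price, self.seq, qty])
        self._match()

    def _match(self):
        # Cross while the best bid meets or beats the best ask.
        while self.buys and self.sells and -self.buys[0][0] >= self.sells[0][0]:
            buy, sell = self.buys[0], self.sells[0]
            traded = min(buy[2], sell[2])
            self.trades.append((sell[0], traded))
            buy[2] -= traded
            sell[2] -= traded
            if buy[2] == 0:
                heapq.heappop(self.buys)
            if sell[2] == 0:
                heapq.heappop(self.sells)
```

Notice there is no locking anywhere: the engine is single-threaded by design, and the sequencer in front of it is what makes replay deterministic.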

How to Practice These Questions Effectively

Collecting questions is the easy part. What separates people who get offers from people who collect rejection emails is how they practice.

Use a framework, not a blank canvas. If you sit down with “Design YouTube” and just start drawing, you’ll spend ten minutes staring at the whiteboard. Use a structured approach — scope the problem, estimate scale, define APIs, then architect. If you don’t have a framework yet, the URDGE method is a solid starting point.

Time yourself. Real interviews are 35-45 minutes. Practice under the same constraint. An elegant solution that takes ninety minutes is worthless in an interview context.

Speak out loud. System design is a verbal exercise. You’re not writing code; you’re narrating your thinking. Practice talking through trade-offs, even if it feels awkward alone. Record yourself if you can stomach listening to it afterward.

Go three deep on key topics. Don’t just know that Redis exists. Know when you’d pick Redis over Memcached, what eviction policies it supports, and what happens to your cache during a failover. Three levels of “why” on any technology will cover most interviewer follow-ups.

Practice in pairs. Grab a friend who’s also preparing and take turns interviewing each other. The interviewer role teaches you as much as the candidate role — you start noticing what makes an answer convincing versus hand-wavy.

And keep track of your answers. For a fill-in-the-blank structure you can reuse across all thirty questions, check out our system design answer template.


FAQ

Q: Do FAANG companies actually ask these exact questions?

Most of them, yes — or close variants. “Design a chat system” might become “Design Facebook Messenger” or “Design Slack,” but the core problem is identical. Companies rotate their question pools, but the underlying patterns have been stable for years. Preparing for the pattern gives you more coverage than memorizing company-specific questions.

Q: How many of these should I practice before interviewing?

Eight to ten, spread across all three difficulty tiers. The goal isn’t to have a canned answer for every question — it’s to build fluency with the building blocks (caching, queuing, partitioning, replication) so you can assemble them on the fly. After eight well-practiced problems, you’ll recognize the patterns in any new question.

Q: Should I memorize architectures or understand the building blocks?

Building blocks, without question. Memorized architectures crumble under follow-up questions. If you truly understand consistent hashing, you can apply it to a distributed cache, a CDN, or a partition strategy — even if you’ve never seen the specific question before. The building blocks are the transferable skill; the architectures are just assemblies.

Q: What’s the biggest mistake candidates make in system design interviews?

Jumping to the solution. They hear “Design Uber” and immediately start drawing microservices. The best candidates spend the first five to ten minutes asking questions, defining scope, and estimating scale. By the time they pick up the marker, they know exactly what they’re building and why. That upfront investment saves time and produces a focused, coherent design instead of a sprawling mess.


System design is one piece of the puzzle. For a complete prep plan covering coding rounds, behavioral interviews, and logistics, read how to prepare for a technical interview. And when you’re ready to structure your practice answers, grab the system design answer template.


