When to Use What
A Decision Guide for Data Structures, Algorithms & System Design Patterns
Data Structures
Hash Table / HashMap
IF YOU SEE: Need O(1) lookup by key, frequency counting, deduplication, grouping by key, two-sum style problems, caching key-value pairs
USE: Hash Table / HashMap / Dictionary
WHY: O(1) average insert, lookup, delete. Perfect for key-based access when you don't need ordering.
TRADE-OFF: No ordering, O(n) worst case with collisions, more memory than arrays
EXAMPLES: LRU cache, symbol tables, database indexes (hash index), counting word frequency, detecting duplicates
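The two-sum and frequency-counting patterns above each reduce to a few lines with a hash map — a minimal Python sketch (function and variable names are illustrative):

```python
from collections import Counter

def two_sum(nums, target):
    """Return indices (i, j) with nums[i] + nums[j] == target, else None."""
    seen = {}  # value -> index of an earlier element
    for j, x in enumerate(nums):
        if target - x in seen:       # O(1) average lookup
            return seen[target - x], j
        seen[x] = j
    return None

# Frequency counting: Counter is a dict specialized for tallying.
word_counts = Counter("the quick brown fox jumps over the lazy dog".split())
```

Both run in O(n) time because every element is inserted and looked up at most once.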
Array / Dynamic Array
IF YOU SEE: Need O(1) access by index, sequential iteration, cache-friendly processing, storing ordered elements, sliding window problems
USE: Array / ArrayList / Vector
WHY: O(1) random access, excellent cache locality, simple and memory-efficient
TRADE-OFF: O(n) insert/delete in middle, fixed size (or amortized resize cost)
EXAMPLES: Pixel buffers, lookup tables, ring buffers, sliding window algorithms
Linked List
IF YOU SEE: Frequent insertions/deletions at arbitrary positions, need to maintain insertion order with O(1) removal, implementing LRU cache (with hash map)
USE: Doubly Linked List
WHY: O(1) insert/delete when you have a pointer to the node, no shifting of elements
TRADE-OFF: O(n) search, poor cache locality, extra memory for pointers
EXAMPLES: LRU cache (HashMap + DLL), undo history, music playlist navigation
Stack (LIFO)
IF YOU SEE: Matching parentheses, undo/redo, DFS traversal, expression evaluation, backtracking, "most recent" access pattern, nested structures
USE: Stack
WHY: O(1) push/pop, naturally handles nested/recursive structures, last-in-first-out
TRADE-OFF: Only access top element, O(n) to search
EXAMPLES: Function call stack, browser back button, parsing expressions, valid parentheses check
Queue (FIFO)
IF YOU SEE: BFS traversal, task scheduling, maintaining order of arrival, producer-consumer pattern, level-order tree traversal
USE: Queue / Deque
WHY: O(1) enqueue/dequeue, first-in-first-out guarantees fairness
TRADE-OFF: Only access front/back, O(n) to search
EXAMPLES: BFS, job queues (Celery, SQS), print queue, request buffering
Heap / Priority Queue
IF YOU SEE: "K largest/smallest", scheduling by priority, median finding, merge K sorted lists, Dijkstra's algorithm, "top K" problems
USE: Min-Heap or Max-Heap (Priority Queue)
WHY: O(log n) insert, O(1) peek min/max, O(log n) extract min/max. Efficient for maintaining extremes.
TRADE-OFF: O(n) search for arbitrary element, not sorted (only root is min/max)
EXAMPLES: Task scheduler, Dijkstra's shortest path, find K closest points, running median
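The "top K" pattern above is commonly solved with a size-K min-heap: the root is always the K-th largest seen so far, so anything smaller can be discarded immediately. A sketch using Python's `heapq`:

```python
import heapq

def k_largest(nums, k):
    """Return the k largest elements, descending, in O(n log k) time."""
    heap = []  # min-heap holding the k largest seen so far
    for x in nums:
        heapq.heappush(heap, x)
        if len(heap) > k:
            heapq.heappop(heap)  # discard the smallest of the k+1
    return sorted(heap, reverse=True)
```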
Binary Search Tree (BST) / Balanced Tree
IF YOU SEE: Need sorted order + fast lookup, range queries, predecessor/successor queries, sorted iteration
USE: Balanced BST (Red-Black Tree, AVL Tree) or TreeMap/TreeSet
WHY: O(log n) insert/search/delete + maintains sorted order. Supports range queries.
TRADE-OFF: Slower than hash table for simple lookups, more complex implementation
EXAMPLES: Database indexes (B-Tree), in-memory sorted collections, interval trees
B-Tree / B+ Tree
IF YOU SEE: Disk-based storage, database indexes, need to minimize disk I/O, range scans on large datasets
USE: B-Tree or B+ Tree
WHY: High branching factor minimizes tree height → fewer disk reads. Leaf linking enables efficient range scans.
TRADE-OFF: More complex than BST, overhead for small datasets
EXAMPLES: PostgreSQL/MySQL indexes, filesystem directories (ext4, NTFS), key-value stores
Trie (Prefix Tree)
IF YOU SEE: Autocomplete, prefix matching, spell checking, IP routing (longest prefix match), word search in dictionary
USE: Trie
WHY: O(k) lookup where k = key length. Shared prefixes save space. Natural for prefix queries.
TRADE-OFF: High memory overhead (many pointers), only useful for string/sequence data
EXAMPLES: Search autocomplete, DNS lookup, phone directory, word games
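A minimal trie supporting the two core operations (insert and prefix query) — method names mirror the quick-reference table, and the node layout is one of several reasonable choices:

```python
class Trie:
    def __init__(self):
        self.children = {}   # char -> child Trie node
        self.is_word = False

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def starts_with(self, prefix):
        """True if any inserted word begins with prefix — O(len(prefix))."""
        node = self
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True
```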
Graph (Adjacency List)
IF YOU SEE: Relationships between entities, networks, paths/routes, dependencies, social connections, sparse connections
USE: Adjacency List
WHY: O(V + E) space, efficient for sparse graphs (most real-world graphs), fast neighbor iteration
TRADE-OFF: O(degree) to check if edge exists, slower than matrix for dense graphs
EXAMPLES: Social networks, road maps, web crawling, dependency resolution
Bloom Filter
IF YOU SEE: "Probably exists" checks, avoiding expensive lookups, membership testing at scale, can tolerate false positives but NOT false negatives
USE: Bloom Filter
WHY: O(k) lookup, extremely space-efficient (~10 bits per element for 1% false positive). "No" is definitive.
TRADE-OFF: False positives possible, cannot delete elements, cannot retrieve stored values
EXAMPLES: Database query optimization (avoid disk reads), malicious URL checking, CDN cache checks, spell checkers
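A toy Bloom filter showing the mechanics — k hash positions per item, set on add, all-checked on query. The sizing here is illustrative, not production-tuned, and real implementations use faster non-cryptographic hashes:

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = 0  # a big integer used as a bit array

    def _positions(self, item):
        # Derive k positions by salting the key; illustrative, not optimal.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        """False means definitely absent; True means probably present."""
        return all(self.bits >> pos & 1 for pos in self._positions(item))
```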
Skip List
IF YOU SEE: Need sorted data with simpler implementation than balanced tree, concurrent access, probabilistic balance
USE: Skip List
WHY: O(log n) average operations, simpler than red-black trees, easy to make lock-free
TRADE-OFF: Probabilistic (not guaranteed O(log n)), more space than BST
EXAMPLES: Redis sorted sets, LevelDB/RocksDB memtables, concurrent data structures
Data Structure Quick Reference
| Problem Pattern | Data Structure | Key Operations |
|---|---|---|
| O(1) key lookup | Hash Table | get, put, delete |
| Sorted data + range queries | BST / TreeMap | floor, ceiling, range |
| Min/max tracking | Heap | insert, extractMin/Max |
| LIFO / nested structures | Stack | push, pop, peek |
| FIFO / level-by-level | Queue | enqueue, dequeue |
| Prefix matching | Trie | insert, startsWith |
| Relationships / paths | Graph | addEdge, neighbors, BFS/DFS |
| Membership at scale | Bloom Filter | add, mightContain |
| Disk-based indexes | B-Tree | search, insert, range |
| O(1) insert + O(1) remove | HashMap + DLL (LRU) | get, put, evict |
Algorithms
Binary Search
IF YOU SEE: Sorted array, "find position", "find first/last", monotonic function, "minimum/maximum that satisfies condition"
USE: Binary Search
WHY: O(log n) - eliminates half the search space each iteration
TRADE-OFF: Requires sorted data or monotonic property
EXAMPLES: Search in sorted array, find insert position, search rotated array, find peak element, capacity scheduling
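The "minimum that satisfies a condition" variants all reduce to binary search over a monotonic predicate (False…False True…True). A generic sketch — the helper name is illustrative:

```python
def first_true(lo, hi, pred):
    """Smallest x in [lo, hi) with pred(x) True; returns hi if none.

    Assumes pred is monotonic: once True, it stays True.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid        # mid works; answer is mid or earlier
        else:
            lo = mid + 1    # mid fails; answer must be later
    return lo
```

"Find insert position" is then just `first_true(0, len(a), lambda i: a[i] >= target)`, and capacity-scheduling problems replace the array lookup with a feasibility check.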
Two Pointers
IF YOU SEE: Sorted array, find pairs summing to target, remove duplicates, partition array, palindrome check, container with most water
USE: Two Pointers (start/end or slow/fast)
WHY: O(n) single pass, O(1) space, exploits sorted property or positional relationships
TRADE-OFF: Usually requires sorted input or specific structure
EXAMPLES: Two sum (sorted), 3sum, remove duplicates, linked list cycle detection, merge sorted arrays
Sliding Window
IF YOU SEE: Contiguous subarray/substring, "maximum/minimum of size K", "longest substring with condition", stream processing
USE: Sliding Window (fixed or variable size)
WHY: O(n) - each element enters and exits window once, avoids recomputation
TRADE-OFF: Only works for contiguous sequences
EXAMPLES: Max sum subarray of size K, longest substring without repeating chars, minimum window substring
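The variable-size variant, shown on "longest substring without repeating characters": the right edge advances every step, and the left edge jumps forward only when the window invariant breaks, so each index is processed O(1) times:

```python
def longest_unique_substring(s):
    """Length of the longest substring of s with no repeated character."""
    last_seen = {}   # char -> most recent index
    left = best = 0
    for right, ch in enumerate(s):
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1   # shrink past the duplicate
        last_seen[ch] = right
        best = max(best, right - left + 1)
    return best
```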
BFS (Breadth-First Search)
IF YOU SEE: Shortest path (unweighted), level-order traversal, "minimum steps", nearest neighbor, spreading/infection simulation
USE: BFS with Queue
WHY: Explores level by level, guarantees shortest path in unweighted graphs
TRADE-OFF: O(V + E) time, O(V) space for queue (can be large for wide graphs)
EXAMPLES: Shortest path in maze, word ladder, binary tree level order, rotten oranges, social network degrees
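The maze example as a sketch — BFS over a grid of 0s (open) and 1s (walls), where marking cells visited at enqueue time is what guarantees each cell enters the queue once:

```python
from collections import deque

def shortest_path_steps(grid, start, goal):
    """Minimum 4-directional steps from start to goal; -1 if unreachable."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist   # first arrival is shortest (unweighted)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return -1
```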
DFS (Depth-First Search)
IF YOU SEE: Explore all paths, detect cycles, topological sort, connected components, backtracking, tree traversals (in/pre/post-order)
USE: DFS with Stack/Recursion
WHY: Natural for exhaustive search, uses O(height) stack space, simple recursive implementation
TRADE-OFF: Doesn't guarantee shortest paths, can overflow the stack on deep graphs
EXAMPLES: Detect cycle, topological sort, number of islands, path sum, generate permutations
Dijkstra's Algorithm
IF YOU SEE: Shortest path with non-negative weights, single-source shortest path, network routing
USE: Dijkstra's Algorithm with Min-Heap
WHY: O((V + E) log V) with heap, optimal for graphs with non-negative weights
TRADE-OFF: Doesn't work with negative weights (use Bellman-Ford instead)
EXAMPLES: GPS navigation, network routing (OSPF), cheapest flights
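A compact heap-based Dijkstra over an adjacency-list graph; the "stale entry" check is the standard trick that avoids a decrease-key operation:

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source; graph: node -> [(neighbor, weight)].

    Weights must be non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```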
Dynamic Programming
IF YOU SEE: Optimal substructure + overlapping subproblems, "minimum/maximum cost", "number of ways", "is it possible", sequence alignment
USE: Dynamic Programming (top-down memoization or bottom-up tabulation)
WHY: Avoids recomputation of subproblems, reduces exponential to polynomial time
TRADE-OFF: Extra space for memoization table, need to identify state and transitions
EXAMPLES: Fibonacci, knapsack, longest common subsequence, edit distance, coin change
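Coin change, bottom-up, as a sketch of the state/transition framing: the state is "amount a", and the transition tries every coin as the last one used:

```python
def coin_change(coins, amount):
    """Fewest coins summing to amount; -1 if impossible. O(amount * len(coins))."""
    INF = amount + 1                  # more coins than any real answer needs
    dp = [0] + [INF] * amount         # dp[a] = min coins to make amount a
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a:
                dp[a] = min(dp[a], dp[a - c] + 1)
    return dp[amount] if dp[amount] <= amount else -1
```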
Topological Sort
IF YOU SEE: Dependencies, prerequisites, build order, task scheduling with constraints, course schedule
USE: Topological Sort (Kahn's BFS or DFS)
WHY: O(V + E), produces valid ordering respecting dependencies
TRADE-OFF: Only works on DAGs (directed acyclic graphs)
EXAMPLES: Build systems (Make, Gradle), course scheduling, package dependencies
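Kahn's BFS variant in sketch form — repeatedly emit nodes with no remaining prerequisites; an incomplete ordering signals a cycle:

```python
from collections import deque

def topo_sort(num_nodes, edges):
    """Kahn's algorithm; edges are (before, after) pairs. [] means a cycle."""
    graph = {i: [] for i in range(num_nodes)}
    indegree = [0] * num_nodes
    for u, v in edges:
        graph[u].append(v)
        indegree[v] += 1
    queue = deque(i for i in range(num_nodes) if indegree[i] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            indegree[v] -= 1
            if indegree[v] == 0:   # all prerequisites of v are done
                queue.append(v)
    return order if len(order) == num_nodes else []
```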
Union-Find (Disjoint Set)
IF YOU SEE: "Are X and Y connected?", grouping/clustering, connected components, cycle detection in undirected graph, Kruskal's MST
USE: Union-Find with path compression + union by rank
WHY: Nearly O(1) per operation (amortized), efficient for dynamic connectivity
TRADE-OFF: Can't efficiently split groups, only tracks connectivity (not paths)
EXAMPLES: Number of connected components, accounts merge, redundant connection
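Both optimizations mentioned above in one small class — path compression (here the path-halving variant) flattens trees during `find`, and union by rank keeps them shallow:

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the sets of a and b; False if already connected."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra               # attach shorter tree under taller
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True
```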
Greedy
IF YOU SEE: Local optimal choice leads to global optimal, interval scheduling, Huffman coding, "minimum number of X to cover Y"
USE: Greedy Algorithm
WHY: Simple, often O(n log n), makes locally optimal choices
TRADE-OFF: Must prove greedy choice property, doesn't always give optimal solution
EXAMPLES: Activity selection, Huffman encoding, minimum platforms, jump game
Backtracking
IF YOU SEE: Generate all combinations/permutations, constraint satisfaction (Sudoku, N-Queens), "find all solutions", decision tree exploration
USE: Backtracking (DFS + pruning)
WHY: Systematically explores solution space, prunes invalid branches early
TRADE-OFF: Can be exponential, effectiveness depends on pruning quality
EXAMPLES: N-Queens, Sudoku solver, generate parentheses, word search, subsets
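"Generate parentheses" is a compact illustration of DFS plus pruning: the two `if` guards prune every branch that could never become valid, so no dead-end string is ever completed:

```python
def generate_parentheses(n):
    """All valid strings of n pairs of parentheses."""
    results = []

    def backtrack(current, opened, closed):
        if len(current) == 2 * n:
            results.append(current)
            return
        if opened < n:                # prune: never open more than n
            backtrack(current + "(", opened + 1, closed)
        if closed < opened:           # prune: never close an unopened pair
            backtrack(current + ")", opened, closed + 1)

    backtrack("", 0, 0)
    return results
```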
Algorithm Selection Quick Reference
| Problem Type | Algorithm | Time Complexity |
|---|---|---|
| Search in sorted data | Binary Search | O(log n) |
| Shortest path (unweighted) | BFS | O(V + E) |
| Shortest path (weighted, non-negative) | Dijkstra | O((V + E) log V) |
| Shortest path (negative weights) | Bellman-Ford | O(V × E) |
| All-pairs shortest path | Floyd-Warshall | O(V³) |
| Minimum spanning tree | Kruskal / Prim | O(E log E) |
| Contiguous subarray | Sliding Window | O(n) |
| Optimal substructure | Dynamic Programming | Varies (polynomial) |
| Dependencies/ordering | Topological Sort | O(V + E) |
| Dynamic connectivity | Union-Find | O(α(n)) ≈ O(1) |
| All combinations/permutations | Backtracking | O(2ⁿ) or O(n!) |
Rate Limiting
Token Bucket
IF YOU SEE: Need to allow bursts, smooth out traffic, API rate limiting with burst capacity, network traffic shaping
USE: Token Bucket Algorithm
WHY: Allows temporary bursts up to bucket capacity while maintaining average rate. Tokens refill at constant rate.
TRADE-OFF: Allows bursts (might not want this), need to tune bucket size and refill rate
EXAMPLES: AWS API Gateway, network traffic shaping, Stripe API rate limiting
Bucket: [●●●●●○○○○○] capacity=10, tokens=5
Request → take token → [●●●●○○○○○○]
Refill every 100ms → [●●●●●○○○○○]
Burst of 5 requests → [○○○○○○○○○○] (allowed!)
Next request → DENIED (no tokens)
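The diagram above as a sketch. Rather than refilling on a timer, this common variant computes the refill lazily from elapsed time on each request (parameter names are illustrative):

```python
import time

class TokenBucket:
    def __init__(self, capacity, refill_per_sec):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity          # start full: bursts allowed
        self.last = time.monotonic()

    def allow(self):
        """Consume one token if available; deny otherwise."""
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```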
Leaky Bucket
IF YOU SEE: Need constant output rate, smoothing bursty input, network congestion control, queue-based processing
USE: Leaky Bucket Algorithm
WHY: Processes requests at constant rate regardless of input rate. Smooths traffic perfectly.
TRADE-OFF: Doesn't allow bursts, requests may wait in queue, can drop requests if queue full
EXAMPLES: Network traffic shaping, video streaming bitrate control
Input (bursty): ●●●●●...........●●●●●
Leaky Bucket: ●.●.●.●.●.●.●.●.●.●.●
Output: Constant rate processing
Fixed Window Counter
IF YOU SEE: Simple rate limiting, "X requests per minute/hour", don't need precision at window boundaries
USE: Fixed Window Counter
WHY: Simple to implement, O(1) space per user, easy to understand
TRADE-OFF: Boundary problem - 2x burst possible at window edges (100 at :59, 100 at :00)
EXAMPLES: Basic API rate limiting, simple request counters
Window: [00:00 - 01:00] limit=100
Requests at 00:59: 100 (allowed)
Requests at 01:00: 100 (allowed - new window!)
= 200 requests in 2 seconds (boundary burst)
Sliding Window Log
IF YOU SEE: Need precise rate limiting, no boundary bursts, exact "last N minutes" counting
USE: Sliding Window Log (store timestamps)
WHY: Most accurate - stores timestamp of each request, counts requests in exact sliding window
TRADE-OFF: O(n) space per user (stores all timestamps), expensive for high-volume
EXAMPLES: High-precision rate limiting, security-critical limits
Sliding Window Counter
IF YOU SEE: Need smooth rate limiting without boundary bursts, memory-efficient, good approximation
USE: Sliding Window Counter (weighted average of windows)
WHY: Combines fixed window simplicity with sliding window smoothness. O(1) space.
TRADE-OFF: Approximation (not exact), slightly more complex than fixed window
EXAMPLES: Most production rate limiters (Redis-based), Cloudflare
Current window: 70% elapsed
Previous window count: 40
Current window count: 30
Weighted estimate: 40 × (1 − 0.7) + 30 = 42 requests in the sliding window
(30% of the previous window still overlaps the sliding window, so 30% of its count is included)
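The weighted-average calculation above in sketch form, with a fake clock parameter so the behavior is deterministic (a real limiter would use the wall clock and per-user keys):

```python
import time

class SlidingWindowCounter:
    """Approximate 'limit requests per window_sec' from two fixed windows."""

    def __init__(self, limit, window_sec):
        self.limit = limit
        self.window = window_sec
        self.counts = {}  # fixed-window index -> request count

    def allow(self, now=None):
        now = time.time() if now is None else now
        idx = int(now // self.window)
        elapsed = (now % self.window) / self.window  # fraction of current window
        prev = self.counts.get(idx - 1, 0)
        curr = self.counts.get(idx, 0)
        # Weight the previous window by how much of it still overlaps.
        estimate = prev * (1 - elapsed) + curr
        if estimate < self.limit:
            self.counts[idx] = curr + 1
            return True
        return False
```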
Rate Limiting Selection
| Requirement | Algorithm | Memory |
|---|---|---|
| Allow bursts, smooth average | Token Bucket | O(1) |
| Constant output rate | Leaky Bucket | O(queue size) |
| Simple, approximate | Fixed Window | O(1) |
| Precise, no bursts | Sliding Window Log | O(requests) |
| Smooth, memory-efficient | Sliding Window Counter | O(1) |
Caching Strategies
Cache-Aside (Lazy Loading)
IF YOU SEE: Read-heavy workload, can tolerate cache miss latency, data that's expensive to compute/fetch
USE: Cache-Aside Pattern
WHY: Only caches data that's actually requested, application controls caching logic
TRADE-OFF: Cache miss is slow (DB read + cache write), potential stale data, cache stampede risk
EXAMPLES: User profile caching, product catalog, session storage
Read:
1. Check cache → HIT? Return
2. MISS? Read from DB
3. Write to cache
4. Return data
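The four read steps as a sketch, using an in-process dict with TTLs as a stand-in for Redis; the function and parameter names are illustrative:

```python
import time

cache = {}  # key -> (value, expires_at); stand-in for Redis
TTL_SECONDS = 300

def get_user(user_id, db_fetch):
    """Cache-aside read: check cache, fall back to DB, populate cache."""
    entry = cache.get(user_id)
    if entry is not None and entry[1] > time.time():
        return entry[0]                              # 1. cache HIT
    value = db_fetch(user_id)                        # 2. MISS: read from DB
    cache[user_id] = (value, time.time() + TTL_SECONDS)  # 3. write to cache
    return value                                     # 4. return data
```

Note the stampede risk mentioned above: if many requests miss the same key at once, they all hit the database; production systems add a lock or probabilistic early refresh.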
Write-Through
IF YOU SEE: Need strong consistency, can't tolerate stale reads, write latency acceptable
USE: Write-Through Cache
WHY: Cache always consistent with database, no stale reads
TRADE-OFF: Higher write latency (write to both cache + DB), may cache unused data
EXAMPLES: User settings, shopping cart, inventory counts
Write:
1. Write to cache
2. Cache writes to DB (synchronously)
3. Return success
Write-Behind (Write-Back)
IF YOU SEE: Write-heavy workload, can tolerate eventual consistency, need low write latency
USE: Write-Behind Cache
WHY: Very fast writes (only to cache), batches DB writes, reduces DB load
TRADE-OFF: Risk of data loss if cache fails before DB write, complexity, eventual consistency
EXAMPLES: Gaming leaderboards, analytics counters, activity feeds
Write:
1. Write to cache → Return immediately
2. Background: Batch writes to DB periodically
Read-Through
IF YOU SEE: Want cache to manage DB reads, simpler application code, read-heavy
USE: Read-Through Cache
WHY: Cache handles DB reads transparently, application only talks to cache
TRADE-OFF: First request always slow, cache must understand data model
EXAMPLES: ORM-level caching, CDN origin fetching
LRU (Least Recently Used)
IF YOU SEE: Limited cache size, recent items more likely to be accessed, general-purpose eviction
USE: LRU Eviction Policy
WHY: Keeps frequently accessed items, good for most workloads, O(1) with HashMap + DLL
TRADE-OFF: Can evict items about to be reused (scan resistance), doesn't consider frequency
EXAMPLES: Redis (approximated LRU), Memcached, CPU caches, browser cache
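The HashMap + DLL combination is exactly what Python's `OrderedDict` provides, so an O(1) LRU cache can be sketched in a few lines:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)      # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```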
LFU (Least Frequently Used)
IF YOU SEE: Some items much more popular than others, want to keep hot items, frequency matters more than recency
USE: LFU Eviction Policy
WHY: Keeps popular items in cache, better hit rate when popularity varies significantly
TRADE-OFF: Slow to adapt to changing popularity, more complex than LRU
EXAMPLES: CDN caching for viral content, DNS caching
TTL (Time-To-Live)
IF YOU SEE: Data has known staleness tolerance, need automatic expiration, time-based invalidation
USE: TTL-based Expiration
WHY: Simple, automatic invalidation, bounds staleness
TRADE-OFF: May evict still-valid data, need to choose TTL carefully
EXAMPLES: Session tokens, API responses, DNS records
Caching Strategy Selection
| Workload | Strategy | Consistency |
|---|---|---|
| Read-heavy, occasional writes | Cache-Aside | Eventual |
| Must be consistent | Write-Through | Strong |
| Write-heavy, latency-sensitive | Write-Behind | Eventual |
| General eviction | LRU | N/A |
| Hot/cold data | LFU | N/A |
| Known staleness tolerance | TTL | Time-bounded |
Database Selection
Relational Database (PostgreSQL, MySQL)
IF YOU SEE: ACID transactions needed, complex queries with JOINs, structured data with relationships, data integrity critical
USE: Relational Database (PostgreSQL, MySQL)
WHY: ACID guarantees, powerful SQL queries, mature tooling, enforced schema
TRADE-OFF: Harder to scale horizontally, schema changes can be painful, vertical scaling limits
EXAMPLES: Financial systems, e-commerce orders, user accounts, inventory management
Document Database (MongoDB, CouchDB)
IF YOU SEE: Flexible/evolving schema, document-oriented data, rapid development, denormalized data
USE: Document Database (MongoDB)
WHY: Schema flexibility, horizontal scaling, JSON-native, fast development iteration
TRADE-OFF: No JOINs (denormalize), eventual consistency (often), data duplication
EXAMPLES: Content management, user profiles, product catalogs, event logging
Key-Value Store (Redis, DynamoDB)
IF YOU SEE: Simple key → value access, extreme performance needed, caching, session storage, real-time data
USE: Key-Value Store (Redis, DynamoDB)
WHY: O(1) operations, sub-millisecond latency (Redis), massive horizontal scale (DynamoDB)
TRADE-OFF: No complex queries, limited data modeling, no relationships
EXAMPLES: Session storage, caching, rate limiting counters, leaderboards
Wide-Column Store (Cassandra, HBase)
IF YOU SEE: Write-heavy workload, time-series data, high availability priority, known query patterns, massive scale
USE: Wide-Column Store (Cassandra, HBase)
WHY: Optimized for writes, horizontal scaling, tunable consistency, great for time-series
TRADE-OFF: Limited query flexibility, must design for access patterns, eventual consistency
EXAMPLES: IoT sensor data, messaging apps (Discord), analytics, activity logs
Graph Database (Neo4j, Neptune)
IF YOU SEE: Highly connected data, relationship queries, traversals, "friends of friends", recommendation engines
USE: Graph Database (Neo4j)
WHY: Native graph storage, efficient traversals, intuitive for relationship-heavy domains
TRADE-OFF: Not good for non-graph queries, scaling can be complex, smaller ecosystem
EXAMPLES: Social networks, fraud detection, knowledge graphs, recommendation engines
Search Engine (Elasticsearch)
IF YOU SEE: Full-text search, fuzzy matching, faceted search, log aggregation, relevance scoring
USE: Search Engine (Elasticsearch, Solr)
WHY: Inverted index for fast text search, relevance ranking, aggregations, near real-time
TRADE-OFF: Not a primary database, eventual consistency, complex to operate
EXAMPLES: Product search, log analysis (ELK stack), autocomplete
Time-Series Database (InfluxDB, TimescaleDB)
IF YOU SEE: Metrics, monitoring data, IoT sensors, timestamps are primary key, downsampling needed
USE: Time-Series Database
WHY: Optimized for time-based writes and queries, automatic downsampling, compression
TRADE-OFF: Limited for non-time-series queries, specialized use case
EXAMPLES: Application metrics, infrastructure monitoring, stock prices, IoT telemetry
Database Selection Quick Reference
| Use Case | Database Type | Examples |
|---|---|---|
| Transactions, complex queries | Relational | PostgreSQL, MySQL |
| Flexible schema, documents | Document | MongoDB, CouchDB |
| Caching, sessions | Key-Value | Redis, Memcached |
| Massive scale, simple access | Key-Value (managed) | DynamoDB, Cassandra |
| Time-series, IoT | Wide-Column / TSDB | Cassandra, InfluxDB |
| Relationships, traversals | Graph | Neo4j, Neptune |
| Full-text search | Search Engine | Elasticsearch, Solr |
Messaging & Queues
Message Queue (RabbitMQ, SQS)
IF YOU SEE: Task distribution, work queues, one consumer per message, decoupling services, job processing
USE: Message Queue (RabbitMQ, SQS)
WHY: Reliable delivery, message deleted after consumption, good for task distribution
TRADE-OFF: Messages consumed once, no replay, limited retention
EXAMPLES: Email sending, image processing, order fulfillment, background jobs
Event Streaming (Kafka)
IF YOU SEE: Event sourcing, multiple consumers need same events, replay needed, high throughput, log aggregation
USE: Event Streaming (Kafka, Kinesis)
WHY: Persistent log, multiple consumer groups, replay from any offset, massive throughput
TRADE-OFF: More complex to operate, consumers manage their offsets
EXAMPLES: Event sourcing, activity streams, log aggregation, real-time analytics, CDC
Pub/Sub (Redis, Google Pub/Sub)
IF YOU SEE: Broadcast to multiple subscribers, real-time notifications, fan-out pattern, fire-and-forget
USE: Pub/Sub System
WHY: Simple pub/sub semantics, broadcast to all subscribers, real-time
TRADE-OFF: No persistence (messages lost if no subscriber), at-most-once delivery
EXAMPLES: Chat room broadcasts, live updates, cache invalidation
Messaging Selection
| Pattern | System | Delivery | Replay |
|---|---|---|---|
| Work queue (one consumer) | RabbitMQ, SQS | At-least-once | No |
| Event log (multi-consumer) | Kafka, Kinesis | At-least-once | Yes |
| Broadcast (fan-out) | Redis Pub/Sub | At-most-once | No |
| Managed, serverless | SQS, SNS, Pub/Sub | Configurable | Limited |
Distributed Systems Patterns
Consistent Hashing
IF YOU SEE: Distributing data across nodes, need to add/remove nodes without reshuffling all data, load balancing
USE: Consistent Hashing
WHY: Only ~K/n of the K keys move when adding/removing one of n nodes (vs nearly ALL keys with modulo hashing)
TRADE-OFF: More complex than modulo, need virtual nodes for even distribution
EXAMPLES: Distributed caches (Memcached), CDNs, database sharding, load balancers
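A sketch of a hash ring with virtual nodes: each physical node is hashed many times onto the ring, and a key is owned by the first node clockwise from its hash. MD5 here is only an example of a stable hash; names are illustrative:

```python
import bisect
import hashlib

class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self.ring = []  # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):   # vnodes smooth out the distribution
                self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def node_for(self, key):
        """First node clockwise from the key's position on the ring."""
        h = self._hash(key)
        idx = bisect.bisect(self.ring, (h,)) % len(self.ring)
        return self.ring[idx][1]
```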
Sharding
IF YOU SEE: Database too large for single server, need to scale writes, data can be partitioned by key
USE: Horizontal Sharding
WHY: Linear write scalability, each shard handles subset of data
TRADE-OFF: Cross-shard queries expensive, rebalancing complex, application complexity
EXAMPLES: Shard by user_id, tenant_id, geographic region
Leader Election
IF YOU SEE: Need single coordinator, distributed lock, scheduled job runner, avoiding split-brain
USE: Leader Election (Raft, Zookeeper, etcd)
WHY: Ensures single leader for coordination, handles leader failure
TRADE-OFF: Leader is bottleneck, failover takes time, complexity
EXAMPLES: Database primary election, job scheduler, distributed locks
Circuit Breaker
IF YOU SEE: Cascading failures, failing service calls, need to fail fast, protect downstream services
USE: Circuit Breaker Pattern
WHY: Stops calling failing service, allows recovery time, fails fast
TRADE-OFF: Need to tune thresholds, may reject valid requests during half-open state
EXAMPLES: Hystrix, Resilience4j, service mesh circuit breaking
States:
CLOSED → (failures > threshold) → OPEN
OPEN → (timeout) → HALF-OPEN
HALF-OPEN → (success) → CLOSED
HALF-OPEN → (failure) → OPEN
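The state machine above as a sketch (thresholds and names are illustrative; libraries like Resilience4j add half-open probe limits, sliding failure windows, and metrics):

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.state = "CLOSED"
        self.opened_at = 0.0

    def call(self, fn, *args):
        if self.state == "OPEN":
            if time.monotonic() - self.opened_at >= self.reset_timeout:
                self.state = "HALF-OPEN"    # let one probe request through
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.state == "HALF-OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        self.state = "CLOSED"               # success closes the circuit
        return result
```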
Saga Pattern
IF YOU SEE: Distributed transactions across services, need compensation on failure, long-running transactions
USE: Saga Pattern (choreography or orchestration)
WHY: Maintains consistency without distributed locks, each step has compensating action
TRADE-OFF: Eventually consistent, complex failure handling, need idempotent operations
EXAMPLES: Order → Payment → Inventory → Shipping (rollback on failure)
CQRS (Command Query Responsibility Segregation)
IF YOU SEE: Different read/write patterns, need to optimize reads and writes separately, complex queries on different model
USE: CQRS Pattern
WHY: Optimize read and write models independently, scale reads/writes separately
TRADE-OFF: Eventual consistency between models, more complex architecture
EXAMPLES: E-commerce (write orders, read product views), reporting systems
Event Sourcing
IF YOU SEE: Need complete audit trail, time-travel queries, rebuild state from events, domain-driven design
USE: Event Sourcing
WHY: Complete history, can replay events, natural audit log
TRADE-OFF: Eventual consistency, event schema evolution, higher storage
EXAMPLES: Banking transactions, version control, shopping cart history
Search & Indexing
B-Tree Index
IF YOU SEE: Range queries, ordered access, equality + comparison operators, most database queries
USE: B-Tree Index (default in most databases)
WHY: Efficient for both equality and range queries, sorted order
TRADE-OFF: Slower than hash for pure equality, write overhead to maintain
EXAMPLES: WHERE date BETWEEN x AND y, ORDER BY, WHERE status = 'active'
Hash Index
IF YOU SEE: Only equality queries, no range queries needed, exact key lookup
USE: Hash Index
WHY: O(1) lookup for equality, slightly faster than B-Tree for exact matches
TRADE-OFF: Cannot do range queries, ORDER BY, or partial matching
EXAMPLES: WHERE id = 123, primary key lookups
Inverted Index
IF YOU SEE: Full-text search, "find documents containing word X", text search
USE: Inverted Index (Elasticsearch, full-text search)
WHY: Maps terms → documents, enables fast text search
TRADE-OFF: Storage overhead, index maintenance, eventual consistency
EXAMPLES: Search engines, document search, log analysis
Bitmap Index
IF YOU SEE: Low cardinality columns, analytics queries, many AND/OR conditions
USE: Bitmap Index
WHY: Efficient for low cardinality, fast bitwise operations for complex conditions
TRADE-OFF: Poor for high cardinality, write-heavy workloads
EXAMPLES: WHERE status IN ('A', 'B') AND region = 'US', data warehouses
GIN Index (Generalized Inverted)
IF YOU SEE: JSONB queries, array contains, full-text search in PostgreSQL
USE: GIN Index
WHY: Handles composite values (arrays, JSONB), efficient containment queries
TRADE-OFF: Slower writes, larger index size
EXAMPLES: WHERE tags @> '{"featured"}', WHERE data->>'type' = 'x'
Real-Time Communication
Polling
IF YOU SEE: Simple requirements, infrequent updates, can tolerate delay, legacy systems
USE: Polling (Short Polling)
WHY: Simple to implement, works everywhere, no special server support
TRADE-OFF: Wasteful (many empty responses), higher latency, server load
EXAMPLES: Checking for email, simple status updates
Long Polling
IF YOU SEE: Need lower latency than polling, can't use WebSockets, moderate real-time needs
USE: Long Polling
WHY: Server holds request until data available, lower latency than polling
TRADE-OFF: Ties up server connections, more complex than short polling, still client-initiated
EXAMPLES: Chat applications (legacy), notifications
Server-Sent Events (SSE)
IF YOU SEE: Server → client only, live feeds, notifications, simple real-time updates
USE: Server-Sent Events (SSE)
WHY: Simple, auto-reconnect, event IDs for resumption, HTTP-based
TRADE-OFF: One-way only (server to client), limited browser connections per domain
EXAMPLES: Live scores, stock tickers, social media feeds, notifications
WebSockets
IF YOU SEE: Bidirectional real-time, chat, gaming, collaborative editing, low latency critical
USE: WebSockets
WHY: Full-duplex, low overhead after handshake, true real-time
TRADE-OFF: More complex, need sticky sessions or pub/sub for scaling, firewall issues
EXAMPLES: Chat apps, multiplayer games, collaborative docs, live trading
Real-Time Selection
| Requirement | Technology | Direction |
|---|---|---|
| Simple, tolerates delay | Polling | Client → Server |
| Lower latency, fallback | Long Polling | Client → Server |
| Server push, simple | SSE | Server → Client |
| Bidirectional, low latency | WebSockets | Both directions |
Consistency Models
Strong Consistency
IF YOU SEE: Financial transactions, inventory counts, can't tolerate stale reads, read-after-write must work
USE: Strong Consistency (synchronous replication, consensus)
WHY: All reads see latest write, linearizable operations
TRADE-OFF: Higher latency, reduced availability during partitions (CAP theorem)
EXAMPLES: Bank transfers, ticket booking, primary database
Eventual Consistency
IF YOU SEE: High availability priority, can tolerate temporary staleness, global distribution
USE: Eventual Consistency
WHY: Higher availability, lower latency, better for geo-distribution
TRADE-OFF: May read stale data, conflict resolution needed, harder to reason about
EXAMPLES: Social media likes, DNS, shopping cart, user preferences
Causal Consistency
IF YOU SEE: Need to preserve cause-effect ordering, comments after posts, replies in order
USE: Causal Consistency (vector clocks, logical timestamps)
WHY: Preserves causality without full strong consistency overhead
TRADE-OFF: More complex than eventual, less available than eventual
EXAMPLES: Comment threads, collaborative editing, messaging
Read-Your-Writes Consistency
IF YOU SEE: User must see their own updates immediately, profile updates, settings changes
USE: Read-Your-Writes (session consistency, read from leader)
WHY: User sees their changes, acceptable for others to see eventual state
TRADE-OFF: May need sticky sessions, read from leader increases load
EXAMPLES: Profile updates, post submission, settings changes
Consistency Selection
| Requirement | Model | CAP Trade-off |
|---|---|---|
| Must see latest data | Strong (CP) | Sacrifice availability |
| High availability priority | Eventual (AP) | Sacrifice consistency |
| Preserve causality | Causal | Middle ground |
| User sees own writes | Read-Your-Writes | Per-user consistency |
Quick Decision Flowchart
Data Structure Selection
Need O(1) lookup by key? → Hash Table
Need sorted data? → BST / TreeMap
Need min/max quickly? → Heap
Need LIFO? → Stack
Need FIFO? → Queue
Need prefix matching? → Trie
Need relationships? → Graph
Need membership test at scale? → Bloom Filter
Database Selection
Need ACID transactions? → RDBMS (PostgreSQL)
Need flexible schema? → Document DB (MongoDB)
Need extreme speed, simple access? → Key-Value (Redis)
Need massive write throughput? → Wide-Column (Cassandra)
Need relationship queries? → Graph DB (Neo4j)
Need full-text search? → Search Engine (Elasticsearch)
Caching Strategy
Read-heavy, can tolerate miss? → Cache-Aside
Must be consistent? → Write-Through
Write-heavy, eventual OK? → Write-Behind
General eviction? → LRU
Hot/cold data? → LFU
Rate Limiting
Allow bursts? → Token Bucket
Constant rate output? → Leaky Bucket
Simple, approximate? → Fixed Window
Smooth, memory-efficient? → Sliding Window Counter
Real-Time Communication
Server → Client only? → SSE
Bidirectional needed? → WebSockets
Simple, tolerates delay? → Polling
Fallback for WebSocket? → Long Polling