When to Use What

A Decision Guide for Data Structures, Algorithms & System Design Patterns

Data Structures

Hash Table / HashMap

IF YOU SEE: Need O(1) lookup by key, frequency counting, deduplication, grouping by key, two-sum style problems, caching key-value pairs
USE: Hash Table / HashMap / Dictionary
WHY: O(1) average insert, lookup, delete. Perfect for key-based access when you don't need ordering.
TRADE-OFF: No ordering, O(n) worst case with collisions, more memory than arrays
EXAMPLES: LRU cache, symbol tables, database indexes (hash index), counting word frequency, detecting duplicates
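Two of the patterns above (two-sum and frequency counting) reduce to a single dictionary pass; a minimal Python sketch:

```python
from collections import Counter

def two_sum(nums, target):
    """One pass with a value -> index map: for each element, check
    whether its complement was already seen. O(n) time, O(n) space."""
    seen = {}
    for i, x in enumerate(nums):
        if target - x in seen:
            return [seen[target - x], i]
        seen[x] = i
    return None

# Frequency counting: Counter is a dict subclass built for exactly this.
freq = Counter("mississippi")
```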

Array / Dynamic Array

IF YOU SEE: Need O(1) access by index, sequential iteration, cache-friendly processing, storing ordered elements, sliding window problems
USE: Array / ArrayList / Vector
WHY: O(1) random access, excellent cache locality, simple and memory-efficient
TRADE-OFF: O(n) insert/delete in middle, fixed size (or amortized resize cost)
EXAMPLES: Pixel buffers, lookup tables, ring buffers, sliding window algorithms

Linked List

IF YOU SEE: Frequent insertions/deletions at arbitrary positions, need to maintain insertion order with O(1) removal, implementing LRU cache (with hash map)
USE: Doubly Linked List
WHY: O(1) insert/delete when you have a pointer to the node, no shifting of elements
TRADE-OFF: O(n) search, poor cache locality, extra memory for pointers
EXAMPLES: LRU cache (HashMap + DLL), undo history, music playlist navigation
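A sketch of the HashMap + DLL combination, using Python's `OrderedDict` as a stand-in (it maintains insertion order and supports O(1) move-to-end and pop-oldest, which is exactly what the linked list provides):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```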

Stack (LIFO)

IF YOU SEE: Matching parentheses, undo/redo, DFS traversal, expression evaluation, backtracking, "most recent" access pattern, nested structures
USE: Stack
WHY: O(1) push/pop, naturally handles nested/recursive structures, last-in-first-out
TRADE-OFF: Only access top element, O(n) to search
EXAMPLES: Function call stack, browser back button, parsing expressions, valid parentheses check
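The valid-parentheses check is the canonical stack example: push openers, pop and match on closers. A short Python sketch:

```python
def is_valid(s):
    """Return True if every bracket in s is properly nested and matched."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)
        elif ch in pairs:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # leftover openers mean an unmatched bracket
```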

Queue (FIFO)

IF YOU SEE: BFS traversal, task scheduling, maintaining order of arrival, producer-consumer pattern, level-order tree traversal
USE: Queue / Deque
WHY: O(1) enqueue/dequeue, first-in-first-out guarantees fairness
TRADE-OFF: Only access front/back, O(n) to search
EXAMPLES: BFS, job queues (Celery, SQS), print queue, request buffering

Heap / Priority Queue

IF YOU SEE: "K largest/smallest", scheduling by priority, median finding, merge K sorted lists, Dijkstra's algorithm, "top K" problems
USE: Min-Heap or Max-Heap (Priority Queue)
WHY: O(log n) insert, O(1) peek min/max, O(log n) extract min/max. Efficient for maintaining extremes.
TRADE-OFF: O(n) search for arbitrary element, not sorted (only root is min/max)
EXAMPLES: Task scheduler, Dijkstra's shortest path, find K closest points, running median
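For "K largest" problems, the standard trick is a min-heap of size K: the root is the smallest of the current top K, so any larger element replaces it. A sketch with `heapq`:

```python
import heapq

def k_largest(nums, k):
    """Return the k largest values in descending order, O(n log k)."""
    heap = []  # min-heap holding the current top-k candidates
    for x in nums:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)  # pop root, push x in one step
    return sorted(heap, reverse=True)
```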

Binary Search Tree (BST) / Balanced Tree

IF YOU SEE: Need sorted order + fast lookup, range queries, predecessor/successor queries, sorted iteration
USE: Balanced BST (Red-Black Tree, AVL Tree) or TreeMap/TreeSet
WHY: O(log n) insert/search/delete + maintains sorted order. Supports range queries.
TRADE-OFF: Slower than hash table for simple lookups, more complex implementation
EXAMPLES: Database indexes (B-Tree), in-memory sorted collections, interval trees

B-Tree / B+ Tree

IF YOU SEE: Disk-based storage, database indexes, need to minimize disk I/O, range scans on large datasets
USE: B-Tree or B+ Tree
WHY: High branching factor minimizes tree height → fewer disk reads. Leaf linking enables efficient range scans.
TRADE-OFF: More complex than BST, overhead for small datasets
EXAMPLES: PostgreSQL/MySQL indexes, filesystem directories (ext4, NTFS), key-value stores

Trie (Prefix Tree)

IF YOU SEE: Autocomplete, prefix matching, spell checking, IP routing (longest prefix match), word search in dictionary
USE: Trie
WHY: O(k) lookup where k = key length. Shared prefixes save space. Natural for prefix queries.
TRADE-OFF: High memory overhead (many pointers), only useful for string/sequence data
EXAMPLES: Search autocomplete, DNS lookup, phone directory, word games
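A minimal trie sketch in Python (class and method names are illustrative): each edge consumes one character, so lookup cost depends only on key length, not on how many words are stored.

```python
class TrieNode:
    __slots__ = ("children", "is_word")
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def starts_with(self, prefix):
        """O(k) in the prefix length: walk one edge per character."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True
```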

Graph (Adjacency List)

IF YOU SEE: Relationships between entities, networks, paths/routes, dependencies, social connections, sparse connections
USE: Adjacency List
WHY: O(V + E) space, efficient for sparse graphs (most real-world graphs), fast neighbor iteration
TRADE-OFF: O(degree) to check if edge exists, slower than matrix for dense graphs
EXAMPLES: Social networks, road maps, web crawling, dependency resolution

Bloom Filter

IF YOU SEE: "Probably exists" checks, avoiding expensive lookups, membership testing at scale, can tolerate false positives but NOT false negatives
USE: Bloom Filter
WHY: O(k) lookup, extremely space-efficient (~10 bits per element for 1% false positive). "No" is definitive.
TRADE-OFF: False positives possible, cannot delete elements, cannot retrieve stored values
EXAMPLES: Database query optimization (avoid disk reads), malicious URL checking, CDN cache checks, spell checkers
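A toy Bloom filter sketch (sizes are illustrative, not tuned): the k bit positions are derived from one SHA-256 digest via double hashing, a common trick for simulating k independent hash functions.

```python
import hashlib

class BloomFilter:
    def __init__(self, m=1024, k=7):
        self.m, self.k = m, k
        self.bits = bytearray(m)  # one byte per bit for simplicity

    def _positions(self, item):
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def might_contain(self, item):
        # False is definitive ("definitely absent"); True is only "probably".
        return all(self.bits[p] for p in self._positions(item))
```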

Skip List

IF YOU SEE: Need sorted data with simpler implementation than balanced tree, concurrent access, probabilistic balance
USE: Skip List
WHY: O(log n) average operations, simpler than red-black trees, easy to make lock-free
TRADE-OFF: Probabilistic (not guaranteed O(log n)), more space than BST
EXAMPLES: Redis sorted sets, LevelDB/RocksDB memtables, concurrent data structures

Data Structure Quick Reference

Problem Pattern             | Data Structure      | Key Operations
O(1) key lookup             | Hash Table          | get, put, delete
Sorted data + range queries | BST / TreeMap       | floor, ceiling, range
Min/max tracking            | Heap                | insert, extractMin/Max
LIFO / nested structures    | Stack               | push, pop, peek
FIFO / level-by-level       | Queue               | enqueue, dequeue
Prefix matching             | Trie                | insert, startsWith
Relationships / paths       | Graph               | addEdge, neighbors, BFS/DFS
Membership at scale         | Bloom Filter        | add, mightContain
Disk-based indexes          | B-Tree              | search, insert, range
O(1) insert + O(1) remove   | HashMap + DLL (LRU) | get, put, evict

Algorithms

Binary Search

IF YOU SEE: Sorted array, "find position", "find first/last", monotonic function, "minimum/maximum that satisfies condition"
USE: Binary Search
WHY: O(log n) - eliminates half the search space each iteration
TRADE-OFF: Requires sorted data or monotonic property
EXAMPLES: Search in sorted array, find insert position, search rotated array, find peak element, capacity scheduling
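A "find first position" variant in Python (the same half-open-interval template covers insert position and first/last occurrence):

```python
def lower_bound(a, target):
    """First index i with a[i] >= target; equals len(a) if none.
    Invariant: the answer always lies in [lo, hi]."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < target:
            lo = mid + 1   # answer is strictly to the right of mid
        else:
            hi = mid       # mid itself may be the answer
    return lo
```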

Two Pointers

IF YOU SEE: Sorted array, find pairs summing to target, remove duplicates, partition array, palindrome check, container with most water
USE: Two Pointers (start/end or slow/fast)
WHY: O(n) single pass, O(1) space, exploits sorted property or positional relationships
TRADE-OFF: Usually requires sorted input or specific structure
EXAMPLES: Two sum (sorted), 3sum, remove duplicates, linked list cycle detection, merge sorted arrays

Sliding Window

IF YOU SEE: Contiguous subarray/substring, "maximum/minimum of size K", "longest substring with condition", stream processing
USE: Sliding Window (fixed or variable size)
WHY: O(n) - each element enters and exits window once, avoids recomputation
TRADE-OFF: Only works for contiguous sequences
EXAMPLES: Max sum subarray of size K, longest substring without repeating chars, minimum window substring
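A variable-size window sketch for "longest substring without repeating characters": grow the right edge, and jump the left edge past any repeated character. Each index enters and leaves the window once.

```python
def longest_unique_substring(s):
    """Length of the longest substring of s with all-distinct characters."""
    last_seen = {}       # char -> most recent index
    left = best = 0
    for right, ch in enumerate(s):
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1   # shrink past the duplicate
        last_seen[ch] = right
        best = max(best, right - left + 1)
    return best
```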

BFS (Breadth-First Search)

IF YOU SEE: Shortest path (unweighted), level-order traversal, "minimum steps", nearest neighbor, spreading/infection simulation
USE: BFS with Queue
WHY: Explores level by level, guarantees shortest path in unweighted graphs
TRADE-OFF: O(V + E) time, O(V) space for queue (can be large for wide graphs)
EXAMPLES: Shortest path in maze, word ladder, binary tree level order, rotten oranges, social network degrees
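A BFS sketch over an adjacency-list dict: the first time the goal is dequeued, it was reached via a minimum-edge path, because BFS explores strictly level by level.

```python
from collections import deque

def shortest_path_length(graph, start, goal):
    """Number of edges on a shortest path from start to goal, or -1."""
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)           # mark on enqueue, not dequeue
                queue.append((nxt, dist + 1))
    return -1  # unreachable
```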

DFS (Depth-First Search)

IF YOU SEE: Explore all paths, detect cycles, topological sort, connected components, backtracking, tree traversals (in/pre/post-order)
USE: DFS with Stack/Recursion
WHY: Natural for exhaustive search, uses O(depth) stack space, simple recursive implementation
TRADE-OFF: Doesn't guarantee shortest paths, recursion can overflow the stack on deep graphs
EXAMPLES: Detect cycle, topological sort, number of islands, path sum, generate permutations

Dijkstra's Algorithm

IF YOU SEE: Shortest path with non-negative weights, single-source shortest path, network routing
USE: Dijkstra's Algorithm with Min-Heap
WHY: O((V + E) log V) with heap, optimal for graphs with non-negative weights
TRADE-OFF: Doesn't work with negative weights (use Bellman-Ford instead)
EXAMPLES: GPS navigation, network routing (OSPF), cheapest flights
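A compact lazy-deletion Dijkstra sketch with `heapq` (stale heap entries are skipped rather than decreased in place, which is the usual Python idiom):

```python
import heapq

def dijkstra(graph, source):
    """graph: node -> list of (neighbor, weight), weights >= 0.
    Returns a dict of shortest distances from source."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter path was already found
        for nxt, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist
```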

Dynamic Programming

IF YOU SEE: Optimal substructure + overlapping subproblems, "minimum/maximum cost", "number of ways", "is it possible", sequence alignment
USE: Dynamic Programming (top-down memoization or bottom-up tabulation)
WHY: Avoids recomputation of subproblems, reduces exponential to polynomial time
TRADE-OFF: Extra space for memoization table, need to identify state and transitions
EXAMPLES: Fibonacci, knapsack, longest common subsequence, edit distance, coin change
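A bottom-up tabulation sketch for coin change, showing the two things DP requires you to identify: the state (the amount) and the transition (which coin was used last).

```python
def coin_change(coins, amount):
    """Fewest coins summing to amount, or -1 if impossible.
    dp[a] = min coins for amount a; O(amount * len(coins))."""
    INF = float("inf")
    dp = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and dp[a - c] + 1 < dp[a]:
                dp[a] = dp[a - c] + 1
    return dp[amount] if dp[amount] != INF else -1
```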

Topological Sort

IF YOU SEE: Dependencies, prerequisites, build order, task scheduling with constraints, course schedule
USE: Topological Sort (Kahn's BFS or DFS)
WHY: O(V + E), produces valid ordering respecting dependencies
TRADE-OFF: Only works on DAGs (directed acyclic graphs)
EXAMPLES: Build systems (Make, Gradle), course scheduling, package dependencies
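A sketch of Kahn's BFS variant: repeatedly emit nodes whose in-degree has dropped to zero. If some nodes are never emitted, the graph has a cycle and no valid ordering exists.

```python
from collections import deque

def topo_sort(graph):
    """graph: node -> list of nodes that depend on it.
    Returns a valid ordering, or None if the graph has a cycle."""
    indegree = {n: 0 for n in graph}
    for deps in graph.values():
        for d in deps:
            indegree[d] = indegree.get(d, 0) + 1
    queue = deque(n for n, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for d in graph.get(n, []):
            indegree[d] -= 1
            if indegree[d] == 0:
                queue.append(d)
    return order if len(order) == len(indegree) else None
```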

Union-Find (Disjoint Set)

IF YOU SEE: "Are X and Y connected?", grouping/clustering, connected components, cycle detection in undirected graph, Kruskal's MST
USE: Union-Find with path compression + union by rank
WHY: Nearly O(1) per operation (amortized), efficient for dynamic connectivity
TRADE-OFF: Can't efficiently split groups, only tracks connectivity (not paths)
EXAMPLES: Number of connected components, accounts merge, redundant connection
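A Union-Find sketch with both optimizations; note how `union` returning False doubles as undirected-cycle detection (the two endpoints were already connected):

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path compression
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False              # already connected: adding a-b makes a cycle
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra           # union by rank: attach shorter under taller
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True
```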

Greedy

IF YOU SEE: Local optimal choice leads to global optimal, interval scheduling, Huffman coding, "minimum number of X to cover Y"
USE: Greedy Algorithm
WHY: Simple, often O(n log n), makes locally optimal choices
TRADE-OFF: Must prove greedy choice property, doesn't always give optimal solution
EXAMPLES: Activity selection, Huffman encoding, minimum platforms, jump game
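Activity selection is the textbook case where the greedy choice property holds: always take the earliest-finishing compatible activity (an exchange argument shows this never hurts). A sketch:

```python
def max_activities(intervals):
    """Max number of non-overlapping (start, end) intervals.
    Sort by finish time, then greedily take each compatible one."""
    count = 0
    current_end = float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= current_end:
            count += 1
            current_end = end
    return count
```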

Backtracking

IF YOU SEE: Generate all combinations/permutations, constraint satisfaction (Sudoku, N-Queens), "find all solutions", decision tree exploration
USE: Backtracking (DFS + pruning)
WHY: Systematically explores solution space, prunes invalid branches early
TRADE-OFF: Can be exponential, effectiveness depends on pruning quality
EXAMPLES: N-Queens, Sudoku solver, generate parentheses, word search, subsets
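The subsets problem shows the backtracking skeleton in its simplest form: make a choice, recurse, then undo the choice before trying the alternative.

```python
def subsets(nums):
    """All subsets of nums via include/exclude branching."""
    result, path = [], []

    def backtrack(i):
        if i == len(nums):
            result.append(path[:])   # record a copy of this choice path
            return
        backtrack(i + 1)             # branch 1: exclude nums[i]
        path.append(nums[i])         # branch 2: include nums[i]
        backtrack(i + 1)
        path.pop()                   # undo the choice (the "backtrack")

    backtrack(0)
    return result
```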

Algorithm Selection Quick Reference

Problem Type                           | Algorithm           | Time Complexity
Search in sorted data                  | Binary Search       | O(log n)
Shortest path (unweighted)             | BFS                 | O(V + E)
Shortest path (weighted, non-negative) | Dijkstra            | O((V + E) log V)
Shortest path (negative weights)       | Bellman-Ford        | O(V × E)
All-pairs shortest path                | Floyd-Warshall      | O(V³)
Minimum spanning tree                  | Kruskal / Prim      | O(E log E)
Contiguous subarray                    | Sliding Window      | O(n)
Optimal substructure                   | Dynamic Programming | Varies (polynomial)
Dependencies / ordering                | Topological Sort    | O(V + E)
Dynamic connectivity                   | Union-Find          | O(α(n)) ≈ O(1)
All combinations / permutations        | Backtracking        | O(2ⁿ) or O(n!)

Rate Limiting

Token Bucket

IF YOU SEE: Need to allow bursts, smooth out traffic, API rate limiting with burst capacity, network traffic shaping
USE: Token Bucket Algorithm
WHY: Allows temporary bursts up to bucket capacity while maintaining average rate. Tokens refill at constant rate.
TRADE-OFF: Allows bursts (might not want this), need to tune bucket size and refill rate
EXAMPLES: AWS API Gateway, network traffic shaping, Stripe API rate limiting
Bucket: [●●●●●○○○○○]  capacity=10, tokens=5
Request → take token → [●●●●○○○○○○]
Refill every 100ms → [●●●●●○○○○○]
Burst of 5 requests → [○○○○○○○○○○] (allowed!)
Next request → DENIED (no tokens)
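The diagram above can be sketched as a small class. One common simplification (used here) is lazy refill: tokens are topped up based on elapsed time at each call rather than by a background timer. The injectable `clock` is just for testability.

```python
import time

class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # max burst size
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Lazy refill, capped at bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```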

Leaky Bucket

IF YOU SEE: Need constant output rate, smoothing bursty input, network congestion control, queue-based processing
USE: Leaky Bucket Algorithm
WHY: Processes requests at constant rate regardless of input rate. Smooths traffic perfectly.
TRADE-OFF: Doesn't allow bursts, requests may wait in queue, can drop requests if queue full
EXAMPLES: Network traffic shaping, video streaming bitrate control
Input (bursty): ●●●●●...........●●●●●
Leaky Bucket:   ●.●.●.●.●.●.●.●.●.●.●
Output:         Constant rate processing

Fixed Window Counter

IF YOU SEE: Simple rate limiting, "X requests per minute/hour", don't need precision at window boundaries
USE: Fixed Window Counter
WHY: Simple to implement, O(1) space per user, easy to understand
TRADE-OFF: Boundary problem - 2x burst possible at window edges (100 at :59, 100 at :00)
EXAMPLES: Basic API rate limiting, simple request counters
Window: [00:00 - 01:00] limit=100
Requests at 00:59: 100 (allowed)
Requests at 01:00: 100 (allowed - new window!)
= 200 requests in 2 seconds (boundary burst)

Sliding Window Log

IF YOU SEE: Need precise rate limiting, no boundary bursts, exact "last N minutes" counting
USE: Sliding Window Log (store timestamps)
WHY: Most accurate - stores timestamp of each request, counts requests in exact sliding window
TRADE-OFF: O(n) space per user (stores all timestamps), expensive for high-volume
EXAMPLES: High-precision rate limiting, security-critical limits

Sliding Window Counter

IF YOU SEE: Need smooth rate limiting without boundary bursts, memory-efficient, good approximation
USE: Sliding Window Counter (weighted average of windows)
WHY: Combines fixed window simplicity with sliding window smoothness. O(1) space.
TRADE-OFF: Approximation (not exact), slightly more complex than fixed window
EXAMPLES: Most production rate limiters (Redis-based), Cloudflare
Current window: 70% through
Previous window count: 40
Current window count: 30
Weighted: 40 × 0.3 + 30 = 42 requests in sliding window
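The weighting above in code form: the previous window contributes in proportion to how much of it still overlaps the sliding window.

```python
def sliding_window_count(prev_count, curr_count, elapsed_fraction):
    """Approximate request count in the sliding window.
    elapsed_fraction: how far we are through the current fixed window."""
    return prev_count * (1 - elapsed_fraction) + curr_count

def allow(prev_count, curr_count, elapsed_fraction, limit):
    return sliding_window_count(prev_count, curr_count, elapsed_fraction) < limit
```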

Rate Limiting Selection

Requirement                  | Algorithm              | Memory
Allow bursts, smooth average | Token Bucket           | O(1)
Constant output rate         | Leaky Bucket           | O(queue size)
Simple, approximate          | Fixed Window           | O(1)
Precise, no bursts           | Sliding Window Log     | O(requests)
Smooth, memory-efficient     | Sliding Window Counter | O(1)

Caching Strategies

Cache-Aside (Lazy Loading)

IF YOU SEE: Read-heavy workload, can tolerate cache miss latency, data that's expensive to compute/fetch
USE: Cache-Aside Pattern
WHY: Only caches data that's actually requested, application controls caching logic
TRADE-OFF: Cache miss is slow (DB read + cache write), potential stale data, cache stampede risk
EXAMPLES: User profile caching, product catalog, session storage
Read:
1. Check cache → HIT? Return
2. MISS? Read from DB
3. Write to cache
4. Return data
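The four steps above as a function; `cache` and `db` are hypothetical stand-ins (e.g. a Redis client and a data-access object), and `ttl` bounds staleness:

```python
def get_user(user_id, cache, db, ttl=300):
    """Cache-aside read path: check cache, fall back to DB, populate cache."""
    user = cache.get(user_id)
    if user is not None:
        return user                    # 1. HIT: return
    user = db.fetch_user(user_id)      # 2. MISS: read from DB
    cache.set(user_id, user, ttl)      # 3. write to cache
    return user                        # 4. return data
```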

Write-Through

IF YOU SEE: Need strong consistency, can't tolerate stale reads, write latency acceptable
USE: Write-Through Cache
WHY: Cache always consistent with database, no stale reads
TRADE-OFF: Higher write latency (write to both cache + DB), may cache unused data
EXAMPLES: User settings, shopping cart, inventory counts
Write:
1. Write to cache
2. Cache writes to DB (synchronously)
3. Return success

Write-Behind (Write-Back)

IF YOU SEE: Write-heavy workload, can tolerate eventual consistency, need low write latency
USE: Write-Behind Cache
WHY: Very fast writes (only to cache), batches DB writes, reduces DB load
TRADE-OFF: Risk of data loss if cache fails before DB write, complexity, eventual consistency
EXAMPLES: Gaming leaderboards, analytics counters, activity feeds
Write:
1. Write to cache → Return immediately
2. Background: Batch writes to DB periodically

Read-Through

IF YOU SEE: Want cache to manage DB reads, simpler application code, read-heavy
USE: Read-Through Cache
WHY: Cache handles DB reads transparently, application only talks to cache
TRADE-OFF: First request always slow, cache must understand data model
EXAMPLES: ORM-level caching, CDN origin fetching

LRU (Least Recently Used)

IF YOU SEE: Limited cache size, recent items more likely to be accessed, general-purpose eviction
USE: LRU Eviction Policy
WHY: Keeps frequently accessed items, good for most workloads, O(1) with HashMap + DLL
TRADE-OFF: Can evict items about to be reused (not scan-resistant), doesn't consider frequency
EXAMPLES: Redis (approximated LRU), Memcached, CPU caches, browser cache

LFU (Least Frequently Used)

IF YOU SEE: Some items much more popular than others, want to keep hot items, frequency matters more than recency
USE: LFU Eviction Policy
WHY: Keeps popular items in cache, better hit rate when popularity varies significantly
TRADE-OFF: Slow to adapt to changing popularity, more complex than LRU
EXAMPLES: CDN caching for viral content, DNS caching

TTL (Time-To-Live)

IF YOU SEE: Data has known staleness tolerance, need automatic expiration, time-based invalidation
USE: TTL-based Expiration
WHY: Simple, automatic invalidation, bounds staleness
TRADE-OFF: May evict still-valid data, need to choose TTL carefully
EXAMPLES: Session tokens, API responses, DNS records

Caching Strategy Selection

Workload                       | Strategy      | Consistency
Read-heavy, occasional writes  | Cache-Aside   | Eventual
Must be consistent             | Write-Through | Strong
Write-heavy, latency-sensitive | Write-Behind  | Eventual
General eviction               | LRU           | N/A
Hot/cold data                  | LFU           | N/A
Known staleness tolerance      | TTL           | Time-bounded

Database Selection

Relational Database (PostgreSQL, MySQL)

IF YOU SEE: ACID transactions needed, complex queries with JOINs, structured data with relationships, data integrity critical
USE: Relational Database (PostgreSQL, MySQL)
WHY: ACID guarantees, powerful SQL queries, mature tooling, enforced schema
TRADE-OFF: Harder to scale horizontally, schema changes can be painful, vertical scaling limits
EXAMPLES: Financial systems, e-commerce orders, user accounts, inventory management

Document Database (MongoDB, CouchDB)

IF YOU SEE: Flexible/evolving schema, document-oriented data, rapid development, denormalized data
USE: Document Database (MongoDB)
WHY: Schema flexibility, horizontal scaling, JSON-native, fast development iteration
TRADE-OFF: Limited JOIN support (denormalize instead), eventual consistency (often), data duplication
EXAMPLES: Content management, user profiles, product catalogs, event logging

Key-Value Store (Redis, DynamoDB)

IF YOU SEE: Simple key → value access, extreme performance needed, caching, session storage, real-time data
USE: Key-Value Store (Redis, DynamoDB)
WHY: O(1) operations, sub-millisecond latency (Redis), massive horizontal scale (DynamoDB)
TRADE-OFF: No complex queries, limited data modeling, no relationships
EXAMPLES: Session storage, caching, rate limiting counters, leaderboards

Wide-Column Store (Cassandra, HBase)

IF YOU SEE: Write-heavy workload, time-series data, high availability priority, known query patterns, massive scale
USE: Wide-Column Store (Cassandra, HBase)
WHY: Optimized for writes, horizontal scaling, tunable consistency, great for time-series
TRADE-OFF: Limited query flexibility, must design for access patterns, eventual consistency
EXAMPLES: IoT sensor data, messaging apps (Discord), analytics, activity logs

Graph Database (Neo4j, Neptune)

IF YOU SEE: Highly connected data, relationship queries, traversals, "friends of friends", recommendation engines
USE: Graph Database (Neo4j)
WHY: Native graph storage, efficient traversals, intuitive for relationship-heavy domains
TRADE-OFF: Not good for non-graph queries, scaling can be complex, smaller ecosystem
EXAMPLES: Social networks, fraud detection, knowledge graphs, recommendation engines

Search Engine (Elasticsearch)

IF YOU SEE: Full-text search, fuzzy matching, faceted search, log aggregation, relevance scoring
USE: Search Engine (Elasticsearch, Solr)
WHY: Inverted index for fast text search, relevance ranking, aggregations, near real-time
TRADE-OFF: Not a primary database, eventual consistency, complex to operate
EXAMPLES: Product search, log analysis (ELK stack), autocomplete

Time-Series Database (InfluxDB, TimescaleDB)

IF YOU SEE: Metrics, monitoring data, IoT sensors, timestamps are primary key, downsampling needed
USE: Time-Series Database
WHY: Optimized for time-based writes and queries, automatic downsampling, compression
TRADE-OFF: Limited for non-time-series queries, specialized use case
EXAMPLES: Application metrics, infrastructure monitoring, stock prices, IoT telemetry

Database Selection Quick Reference

Use Case                      | Database Type       | Examples
Transactions, complex queries | Relational          | PostgreSQL, MySQL
Flexible schema, documents    | Document            | MongoDB, CouchDB
Caching, sessions             | Key-Value           | Redis, Memcached
Massive scale, simple access  | Key-Value (managed) | DynamoDB
Time-series, IoT              | Wide-Column / TSDB  | Cassandra, InfluxDB
Relationships, traversals     | Graph               | Neo4j, Neptune
Full-text search              | Search Engine       | Elasticsearch, Solr

Messaging & Queues

Message Queue (RabbitMQ, SQS)

IF YOU SEE: Task distribution, work queues, one consumer per message, decoupling services, job processing
USE: Message Queue (RabbitMQ, SQS)
WHY: Reliable delivery, message deleted after consumption, good for task distribution
TRADE-OFF: Messages consumed once, no replay, limited retention
EXAMPLES: Email sending, image processing, order fulfillment, background jobs

Event Streaming (Kafka)

IF YOU SEE: Event sourcing, multiple consumers need same events, replay needed, high throughput, log aggregation
USE: Event Streaming (Kafka, Kinesis)
WHY: Persistent log, multiple consumer groups, replay from any offset, massive throughput
TRADE-OFF: More complex to operate, consumers manage their offsets
EXAMPLES: Event sourcing, activity streams, log aggregation, real-time analytics, CDC

Pub/Sub (Redis, Google Pub/Sub)

IF YOU SEE: Broadcast to multiple subscribers, real-time notifications, fan-out pattern, fire-and-forget
USE: Pub/Sub System
WHY: Simple pub/sub semantics, broadcast to all subscribers, real-time
TRADE-OFF: No persistence (messages lost if no subscriber), at-most-once delivery
EXAMPLES: Chat room broadcasts, live updates, cache invalidation

Messaging Selection

Pattern                    | System            | Delivery      | Replay
Work queue (one consumer)  | RabbitMQ, SQS     | At-least-once | No
Event log (multi-consumer) | Kafka, Kinesis    | At-least-once | Yes
Broadcast (fan-out)        | Redis Pub/Sub     | At-most-once  | No
Managed, serverless        | SQS, SNS, Pub/Sub | Configurable  | Limited

Distributed Systems Patterns

Consistent Hashing

IF YOU SEE: Distributing data across nodes, need to add/remove nodes without reshuffling all data, load balancing
USE: Consistent Hashing
WHY: Only K/n keys move when adding/removing nodes (vs ALL keys with modulo hashing)
TRADE-OFF: More complex than modulo, need virtual nodes for even distribution
EXAMPLES: Distributed caches (Memcached), CDNs, database sharding, load balancers
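A hash-ring sketch with virtual nodes (counts and hash choice are illustrative): each physical node is hashed onto the ring many times so load spreads evenly, and a key maps to the first node hash clockwise from the key's own hash.

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each (hash, node) pair is one virtual node on the ring.
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

    def get_node(self, key):
        # First virtual node clockwise from the key's hash (wrap with %).
        i = bisect.bisect(self.keys, self._hash(key)) % len(self.keys)
        return self.ring[i][1]
```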

Sharding

IF YOU SEE: Database too large for single server, need to scale writes, data can be partitioned by key
USE: Horizontal Sharding
WHY: Linear write scalability, each shard handles subset of data
TRADE-OFF: Cross-shard queries expensive, rebalancing complex, application complexity
EXAMPLES: Shard by user_id, tenant_id, geographic region

Leader Election

IF YOU SEE: Need single coordinator, distributed lock, scheduled job runner, avoiding split-brain
USE: Leader Election (Raft, Zookeeper, etcd)
WHY: Ensures single leader for coordination, handles leader failure
TRADE-OFF: Leader is bottleneck, failover takes time, complexity
EXAMPLES: Database primary election, job scheduler, distributed locks

Circuit Breaker

IF YOU SEE: Cascading failures, failing service calls, need to fail fast, protect downstream services
USE: Circuit Breaker Pattern
WHY: Stops calling failing service, allows recovery time, fails fast
TRADE-OFF: Need to tune thresholds, may reject valid requests during half-open state
EXAMPLES: Hystrix, Resilience4j, service mesh circuit breaking
States:
CLOSED → (failures > threshold) → OPEN
OPEN → (timeout) → HALF-OPEN
HALF-OPEN → (success) → CLOSED
HALF-OPEN → (failure) → OPEN
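The state machine above as a minimal sketch (thresholds are illustrative; `clock` is injectable for testing):

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.state == "OPEN":
            if self.clock() - self.opened_at >= self.reset_timeout:
                self.state = "HALF-OPEN"   # let one probe request through
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # HALF-OPEN failure reopens immediately; CLOSED opens at threshold.
            if self.state == "HALF-OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = self.clock()
            raise
        self.failures = 0
        self.state = "CLOSED"
        return result
```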

Saga Pattern

IF YOU SEE: Distributed transactions across services, need compensation on failure, long-running transactions
USE: Saga Pattern (choreography or orchestration)
WHY: Maintains consistency without distributed locks, each step has compensating action
TRADE-OFF: Eventually consistent, complex failure handling, need idempotent operations
EXAMPLES: Order → Payment → Inventory → Shipping (rollback on failure)

CQRS (Command Query Responsibility Segregation)

IF YOU SEE: Different read/write patterns, need to optimize reads and writes separately, complex queries on different model
USE: CQRS Pattern
WHY: Optimize read and write models independently, scale reads/writes separately
TRADE-OFF: Eventual consistency between models, more complex architecture
EXAMPLES: E-commerce (write orders, read product views), reporting systems

Event Sourcing

IF YOU SEE: Need complete audit trail, time-travel queries, rebuild state from events, domain-driven design
USE: Event Sourcing
WHY: Complete history, can replay events, natural audit log
TRADE-OFF: Eventual consistency, event schema evolution, higher storage
EXAMPLES: Banking transactions, version control, shopping cart history

Indexing Strategies

B-Tree Index

IF YOU SEE: Range queries, ordered access, equality + comparison operators, most database queries
USE: B-Tree Index (default in most databases)
WHY: Efficient for both equality and range queries, sorted order
TRADE-OFF: Slower than hash for pure equality, write overhead to maintain
EXAMPLES: WHERE date BETWEEN x AND y, ORDER BY, WHERE status = 'active'

Hash Index

IF YOU SEE: Only equality queries, no range queries needed, exact key lookup
USE: Hash Index
WHY: O(1) lookup for equality, slightly faster than B-Tree for exact matches
TRADE-OFF: Cannot do range queries, ORDER BY, or partial matching
EXAMPLES: WHERE id = 123, primary key lookups

Inverted Index

IF YOU SEE: Full-text search, "find documents containing word X", text search
USE: Inverted Index (Elasticsearch, full-text search)
WHY: Maps terms → documents, enables fast text search
TRADE-OFF: Storage overhead, index maintenance, eventual consistency
EXAMPLES: Search engines, document search, log analysis

Bitmap Index

IF YOU SEE: Low cardinality columns, analytics queries, many AND/OR conditions
USE: Bitmap Index
WHY: Efficient for low cardinality, fast bitwise operations for complex conditions
TRADE-OFF: Poor for high cardinality, write-heavy workloads
EXAMPLES: WHERE status IN ('A', 'B') AND region = 'US', data warehouses

GIN Index (Generalized Inverted)

IF YOU SEE: JSONB queries, array contains, full-text search in PostgreSQL
USE: GIN Index
WHY: Handles composite values (arrays, JSONB), efficient containment queries
TRADE-OFF: Slower writes, larger index size
EXAMPLES: WHERE tags @> '{"featured"}', WHERE data->>'type' = 'x'

Real-Time Communication

Polling

IF YOU SEE: Simple requirements, infrequent updates, can tolerate delay, legacy systems
USE: Polling (Short Polling)
WHY: Simple to implement, works everywhere, no special server support
TRADE-OFF: Wasteful (many empty responses), higher latency, server load
EXAMPLES: Checking for email, simple status updates

Long Polling

IF YOU SEE: Need lower latency than polling, can't use WebSockets, moderate real-time needs
USE: Long Polling
WHY: Server holds request until data available, lower latency than polling
TRADE-OFF: Ties up connections, more complex than polling, still one-way initiated
EXAMPLES: Chat applications (legacy), notifications

Server-Sent Events (SSE)

IF YOU SEE: Server → client only, live feeds, notifications, simple real-time updates
USE: Server-Sent Events (SSE)
WHY: Simple, auto-reconnect, event IDs for resumption, HTTP-based
TRADE-OFF: One-way only (server to client), limited browser connections per domain
EXAMPLES: Live scores, stock tickers, social media feeds, notifications

WebSockets

IF YOU SEE: Bidirectional real-time, chat, gaming, collaborative editing, low latency critical
USE: WebSockets
WHY: Full-duplex, low overhead after handshake, true real-time
TRADE-OFF: More complex, need sticky sessions or pub/sub for scaling, firewall issues
EXAMPLES: Chat apps, multiplayer games, collaborative docs, live trading

Real-Time Selection

Requirement                | Technology   | Direction
Simple, tolerates delay    | Polling      | Client → Server
Lower latency, fallback    | Long Polling | Client → Server
Server push, simple        | SSE          | Server → Client
Bidirectional, low latency | WebSockets   | Both directions

Consistency Models

Strong Consistency

IF YOU SEE: Financial transactions, inventory counts, can't tolerate stale reads, read-after-write must work
USE: Strong Consistency (synchronous replication, consensus)
WHY: All reads see latest write, linearizable operations
TRADE-OFF: Higher latency, reduced availability during partitions (CAP theorem)
EXAMPLES: Bank transfers, ticket booking, primary database

Eventual Consistency

IF YOU SEE: High availability priority, can tolerate temporary staleness, global distribution
USE: Eventual Consistency
WHY: Higher availability, lower latency, better for geo-distribution
TRADE-OFF: May read stale data, conflict resolution needed, harder to reason about
EXAMPLES: Social media likes, DNS, shopping cart, user preferences

Causal Consistency

IF YOU SEE: Need to preserve cause-effect ordering, comments after posts, replies in order
USE: Causal Consistency (vector clocks, logical timestamps)
WHY: Preserves causality without full strong consistency overhead
TRADE-OFF: More complex than eventual, less available than eventual
EXAMPLES: Comment threads, collaborative editing, messaging

Read-Your-Writes Consistency

IF YOU SEE: User must see their own updates immediately, profile updates, settings changes
USE: Read-Your-Writes (session consistency, read from leader)
WHY: User sees their changes, acceptable for others to see eventual state
TRADE-OFF: May need sticky sessions, read from leader increases load
EXAMPLES: Profile updates, post submission, settings changes

Consistency Selection

Requirement                | Model            | CAP Trade-off
Must see latest data       | Strong (CP)      | Sacrifice availability
High availability priority | Eventual (AP)    | Sacrifice consistency
Preserve causality         | Causal           | Middle ground
User sees own writes       | Read-Your-Writes | Per-user consistency

Quick Decision Flowchart

Data Structure Selection

Need O(1) lookup by key? → Hash Table
Need sorted data? → BST / TreeMap
Need min/max quickly? → Heap
Need LIFO? → Stack
Need FIFO? → Queue
Need prefix matching? → Trie
Need relationships? → Graph
Need membership test at scale? → Bloom Filter

Database Selection

Need ACID transactions? → RDBMS (PostgreSQL)
Need flexible schema? → Document DB (MongoDB)
Need extreme speed, simple access? → Key-Value (Redis)
Need massive write throughput? → Wide-Column (Cassandra)
Need relationship queries? → Graph DB (Neo4j)
Need full-text search? → Search Engine (Elasticsearch)

Caching Strategy

Read-heavy, can tolerate miss? → Cache-Aside
Must be consistent? → Write-Through
Write-heavy, eventual OK? → Write-Behind
General eviction? → LRU
Hot/cold data? → LFU

Rate Limiting

Allow bursts? → Token Bucket
Constant rate output? → Leaky Bucket
Simple, approximate? → Fixed Window
Smooth, memory-efficient? → Sliding Window Counter

Real-Time Communication

Server → Client only? → SSE
Bidirectional needed? → WebSockets
Simple, tolerates delay? → Polling
Fallback for WebSocket? → Long Polling