ISP Services & Internet Infrastructure

Why DNS Was Invented (1983)

The Problem

Pre-DNS (1970s-1983): HOSTS.TXT file maintained by SRI-NIC (Stanford Research Institute). Every computer downloaded this file to resolve names.

Problems:

Single point of failure (SRI server)
Doesn't scale (file grew too large, downloads every night)
No namespace management (name conflicts)
Manual updates (email changes to SRI admin)

Paul Mockapetris' Solution (1983): Hierarchical, distributed database. No single server knows everything. Delegation enables scalability.

How DNS Works - The Complete Flow

sequenceDiagram participant User as Browser participant Resolver as ISP DNS Resolver participant Root as Root Server (.) participant TLD as TLD Server (.com) participant Auth as Authoritative (example.com) User->>Resolver: Resolve www.example.com Note over Resolver: Check cache (miss) Resolver->>Root: Query www.example.com? Root->>Resolver: Ask .com server (192.5.6.30) Resolver->>TLD: Query www.example.com? TLD->>Resolver: Ask example.com NS (ns1.example.com, 93.184.216.34) Resolver->>Auth: Query www.example.com? Auth->>Resolver: A record: 93.184.216.34, TTL=3600 Resolver->>User: 93.184.216.34 (cache for 1 hour)

Step-by-Step Explanation:

User types www.example.com: Browser asks OS resolver (or configured DNS like 8.8.8.8)
Recursive resolver: Your ISP's DNS server (or public DNS). Does the heavy lifting.
Query Root Server: 13 root servers (a.root-servers.net through m.root-servers.net, actually hundreds of anycast instances). Root doesn't know example.com, but knows who handles .com
Query TLD Server: .com nameservers (managed by Verisign). Don't know example.com, but know its authoritative nameservers
Query Authoritative NS: example.com's nameservers (might be AWS Route53, Cloudflare, or self-hosted). Has the actual record
Return Answer: Resolver caches result (TTL=3600 = 1 hour), returns to user

DNS Record Types

Record Type	Purpose	Example
A	IPv4 address	example.com → 93.184.216.34
AAAA	IPv6 address	example.com → 2606:2800:220:1:248:1893:25c8:1946
CNAME	Canonical name (alias)	www.example.com → example.com
MX	Mail exchanger	example.com → mail.example.com (priority 10)
NS	Nameserver	example.com → ns1.example.com
TXT	Arbitrary text (SPF, DKIM, verification)	"v=spf1 include:_spf.google.com ~all"
PTR	Reverse DNS (IP → name)	34.216.184.93.in-addr.arpa → example.com
SOA	Start of Authority (zone metadata)	Primary NS, admin email, serial, refresh times
SRV	Service location	_sip._tcp.example.com → sipserver.example.com:5060

DNS Caching & TTL

TTL (Time To Live) = How long to cache a record

Example: example.com A record, TTL=3600 (1 hour)

  T=0: First query, resolver asks authoritative, caches result
  T=30min: Second query, resolver returns cached result (fast!)
  T=70min: TTL expired, resolver queries authoritative again

Short TTL (60-300 sec): For records that change often (CDNs, failover)
Long TTL (86400 = 1 day): For stable records (reduces query load)

Trade-off: Short TTL = more queries but faster updates
           Long TTL = fewer queries but slow propagation of changes

DNS Protocol

UDP Port 53 (queries/responses)
TCP Port 53 (zone transfers, responses > 512 bytes)

Why UDP? Fast for small queries. Single request/response.
Why TCP fallback? Large responses (DNSSEC, many records), zone transfers (AXFR).

DNS Message Format:
  Header: ID, flags (query/response, recursion desired, authoritative)
  Question: What are you asking? (www.example.com, type A)
  Answer: The actual records
  Authority: NS records for the domain
  Additional: "Glue records" (A records for NS servers)

Glue Records - Solving the Chicken-and-Egg Problem

Problem:
  example.com NS → ns1.example.com
  To resolve example.com, need ns1.example.com's IP
  But ns1.example.com is IN example.com (circular dependency!)

Solution: Glue Records
  Parent (.com server) includes A record for ns1.example.com
  in "Additional" section when returning NS records

  Query: example.com
  Answer from .com server:
    Authority: ns1.example.com
    Additional: ns1.example.com → 93.184.216.1 (glue record)

DNS Security Issues & DNSSEC

DNS Vulnerabilities:

Cache Poisoning: Attacker injects fake records into resolver's cache
Man-in-the-Middle: Intercept and modify DNS responses
No authentication: Original DNS has no way to verify responses are legitimate

DNSSEC (DNS Security Extensions):

Cryptographically signs DNS records
Chain of trust from root to TLD to domain
Resolvers can verify authenticity
Downside: Larger responses, more complexity, not widely adopted

Modern DNS: DoH & DoT

DoT (DNS over TLS): Encrypts DNS queries over port 853. ISP can still see you're doing DNS.
DoH (DNS over HTTPS): Encrypts DNS inside HTTPS (port 443). ISP can't distinguish from web traffic.
Why it matters: Privacy. Traditional DNS is plaintext - ISP sees every site you visit. DoH/DoT prevent that.

Why DHCP Was Invented (1993)

The Problem: Manually configuring IP, subnet mask, gateway, DNS on every device doesn't scale. Dialup/broadband ISPs needed to assign IPs dynamically to thousands of users.

Predecessor: BOOTP (Bootstrap Protocol) - simpler, no automatic reclamation of addresses.

How DHCP Works: DORA Process

sequenceDiagram participant Client participant Server as DHCP Server Note over Client,Server: DORA Process Client->>Server: DISCOVER (broadcast: who has IPs?) Server->>Client: OFFER (I have 192.168.1.100 for you) Client->>Server: REQUEST (I want 192.168.1.100) Server->>Client: ACK (OK, it's yours for 24 hours) Note over Client: Configured! IP, Gateway, DNS, etc.

Step-by-Step:

DISCOVER: Client broadcasts (255.255.255.255) "I need an IP!" Uses UDP port 67 (server), 68 (client)
OFFER: DHCP server(s) respond with available IP and config
REQUEST: Client chooses one offer (if multiple servers), broadcasts acceptance
ACK: Server confirms, client configures interface

DHCP Options - More Than Just IP

Option 1: Subnet Mask (255.255.255.0)
Option 3: Default Gateway (192.168.1.1)
Option 6: DNS Servers (8.8.8.8, 8.8.4.4)
Option 15: Domain Name (example.com)
Option 42: NTP Servers
Option 66: TFTP Server (for phone/cable modem config)
Option 150: Cisco TFTP Server
Option 51: Lease Time (86400 seconds = 24 hours)

You used Option 66 for cable modem head-ends!
  Modem boots, DHCP gives IP + TFTP server
  Modem downloads config file from TFTP
  Modem registers with CMTS

DHCP Relay Agent

Problem: DHCP uses broadcast, doesn't cross routers

Solution: DHCP Relay (ip helper-address)
  Client broadcasts DISCOVER
  Router receives it, converts to unicast
  Router forwards to DHCP server on different subnet
  Server's response relayed back to client

Why it matters: One central DHCP server can serve many subnets

Lease Management

Lease Time: How long client can use IP (typical: 24 hours - 7 days)

T1 (50% of lease): Client tries to renew with original server
T2 (87.5% of lease): If no response, client broadcasts renewal to any server
Lease expires: Client must stop using IP, restart DORA

Why leases matter: Reclaims IPs from devices that left network
  Without leases, IP pool exhaustion (especially dialup era)

How Email Works - The Complete Journey

sequenceDiagram participant Sender as alice@company.com participant SendMTA as company.com MTA participant DNS participant RecvMTA as gmail.com MTA participant Mailbox as Gmail Mailbox participant Recipient as bob@gmail.com Sender->>SendMTA: SMTP (port 587): Send email to bob@gmail.com SendMTA->>DNS: MX lookup for gmail.com DNS->>SendMTA: MX: gmail-smtp-in.l.google.com (priority 5) SendMTA->>RecvMTA: SMTP (port 25): Deliver email RecvMTA->>Mailbox: Store in bob's mailbox Recipient->>Mailbox: POP3/IMAP: Retrieve email Mailbox->>Recipient: Email delivered

SMTP: Simple Mail Transfer Protocol

How SMTP Works:

Sender → Sending MTA (Mail Transfer Agent) → Receiving MTA → Mailbox

SMTP Conversation:
  Client: EHLO company.com
  Server: 250-gmail-smtp-in.l.google.com
  Server: 250-SIZE 35882577
  Server: 250 STARTTLS

  Client: MAIL FROM:
  Server: 250 OK

  Client: RCPT TO:
  Server: 250 OK

  Client: DATA
  Server: 354 Start mail input

  Client: From: alice@company.com
  Client: To: bob@gmail.com
  Client: Subject: Meeting tomorrow
  Client:
  Client: Hi Bob, let's meet at 10am.
  Client: .
  Server: 250 OK Message accepted

  Client: QUIT
  Server: 221 Bye

SMTP Ports:

Port 25: Server-to-server (MTA to MTA). Often blocked by ISPs (spam prevention)
Port 587: Submission (client to MTA). Requires authentication. Modern standard.
Port 465: SMTPS (SMTP over SSL). Deprecated but still used.

MX Records & Mail Routing

MX Record: Specifies mail server for domain

$ dig gmail.com MX
gmail.com.  3600  IN  MX  5 gmail-smtp-in.l.google.com.
gmail.com.  3600  IN  MX  10 alt1.gmail-smtp-in.l.google.com.
gmail.com.  3600  IN  MX  20 alt2.gmail-smtp-in.l.google.com.

Lower priority number = higher priority
Try 5 first, if down, try 10, then 20 (fallback/redundancy)

SPF, DKIM, DMARC - Fighting Spam & Spoofing

SPF (Sender Policy Framework): TXT record listing authorized sending IPs

example.com TXT "v=spf1 ip4:192.0.2.0/24 include:_spf.google.com ~all"

Meaning: Emails from example.com should come from:
  - 192.0.2.0/24
  - Google's servers (G Suite/Workspace)
  - ~all = softfail (suspicious but don't reject)

DKIM (DomainKeys Identified Mail): Cryptographic signature in email headers

Sending server signs email with private key
Receiving server verifies with public key (in DNS TXT record)
Proves email hasn't been tampered with

DMARC (Domain-based Message Authentication): Policy for SPF/DKIM failures

example.com TXT "v=DMARC1; p=reject; rua=mailto:dmarc@example.com"

p=reject: Reject emails that fail SPF and DKIM
p=quarantine: Mark as spam
p=none: Just monitor (rua = aggregate reports)

POP3 vs IMAP

Feature	POP3 (Port 110/995)	IMAP (Port 143/993)
Email Storage	Downloads to client, deletes from server (default)	Stays on server, synced to clients
Multiple Devices	Poor (email on one device only)	Excellent (sync across all devices)
Folders	Local only	Server-side folders, synced
Offline Access	Yes (email is local)	Depends on client caching
Bandwidth	Downloads entire mailbox	Downloads headers first, body on demand
Server Storage	Minimal (client stores email)	High (server stores all email)

Modern Usage: IMAP dominates (Gmail, Outlook, etc.). POP3 mostly obsolete except for legacy systems.

FTP: File Transfer Protocol

How FTP Works:

Two Channels: Control (port 21) + Data (port 20 or ephemeral)
Active vs Passive Mode

Active FTP (Original, Problematic)

Control: Client → Server port 21
Data: Server → Client (server initiates!)

Problem: Firewalls block incoming connections to clients
  Client behind NAT/firewall can't receive server's data connection

Passive FTP (Modern, Firewall-Friendly)

Control: Client → Server port 21
Client: PASV command
Server: Responds with IP:Port (e.g., 192.0.2.1:51234)
Data: Client → Server port 51234 (client initiates)

Why it matters: Works through firewalls/NAT (client initiates both connections)

FTP Security Issues

Plaintext credentials: Username/password sent unencrypted
Plaintext data: Files transferred unencrypted
Solution: FTPS (FTP over TLS) or SFTP (SSH File Transfer Protocol, completely different protocol)

TFTP: Trivial File Transfer Protocol

Why TFTP Exists:

Simple: No authentication, no directory browsing, UDP-based
Small: Fits in boot ROM (for diskless workstations, network booting)
Your use case: Cable modem configuration files!

How TFTP Works

UDP Port 69

Read Request (RRQ): Client requests file
Data: Server sends 512-byte blocks
ACK: Client acknowledges each block
Last block < 512 bytes signals end

No authentication, no encryption
Used for: PXE boot, network device configs (routers, switches, cable modems)

Cable Modem Head-End Workflow (Your Experience):

1. Cable modem boots, no config
2. DHCP: Modem gets IP, gateway, DNS, TFTP server (Option 66)
3. TFTP: Modem downloads config file from TFTP server
   - Contains: upload/download speeds, QoS settings, etc.
4. Registration: Modem registers with CMTS (Cable Modem Termination System)
5. Online: Modem ready for customer use

TFTP perfect for this: Simple, fast, doesn't require complex TCP stack in modem firmware

LDAP: Lightweight Directory Access Protocol

Why LDAP Exists:

Centralized authentication: Single source of truth for users/groups
Hierarchical data: Organized like a tree (companies, departments, users)
Read-optimized: Queries are fast, updates are rare

LDAP Structure (DIT: Directory Information Tree)

dc=example,dc=com (root)
  ├─ ou=people
  │   ├─ cn=John Doe,ou=people,dc=example,dc=com
  │   └─ cn=Jane Smith,ou=people,dc=example,dc=com
  └─ ou=groups
      ├─ cn=engineers,ou=groups,dc=example,dc=com
      └─ cn=sales,ou=groups,dc=example,dc=com

Components:
  dc = domain component (example.com → dc=example,dc=com)
  ou = organizational unit (departments, containers)
  cn = common name (users, groups)
  dn = distinguished name (full path to object)

LDAP Operations

Bind: Authenticate (login)
Search: Query directory (filter: uid=jdoe, ou=people)
Compare: Check if attribute matches value
Add/Modify/Delete: Update directory

Active Directory - Microsoft's LDAP Implementation

AD = LDAP + Kerberos + DNS + SMB + Group Policy

Domain Controllers: Servers that store AD database
Kerberos: Authentication protocol (tickets instead of passwords)
Global Catalog: Subset of AD replicated to all DCs (fast cross-domain lookups)
Group Policy: Centralized configuration management (deploy software, enforce settings)

Why AD Dominates Enterprise: Integrates authentication, authorization, and configuration management. Single pane of glass for IT admins.

Why Caching Matters

The Problem: Databases, APIs, and origin servers are slow and expensive. Serving every request from source doesn't scale.

The Solution: Cache frequently accessed data closer to users. Trade-off: Freshness vs Performance.

Caching Hierarchy

Layer	Location	Latency	Use Case
Browser Cache	Client	~1ms	Static assets (CSS, JS, images)
CDN (Edge Cache)	Globally distributed	10-50ms	Static content, streaming video
Reverse Proxy (Varnish, Nginx)	In front of app servers	1-5ms	Full page cache, API responses
Application Cache (Redis, Memcached)	Same datacenter as app	1-10ms	Session data, query results
Database Query Cache	Database server	10-100ms	Repeated queries

Cache Strategies

1. Cache-Aside (Lazy Loading):

Application checks cache first:
  Hit: Return cached data
  Miss: Query database, store in cache, return data

Best for: Read-heavy workloads, data that changes infrequently

Trade-off: First request always hits database (cold cache)

2. Write-Through:

Write to cache AND database simultaneously:
  Application writes data
  → Write to cache
  → Write to database
  → Return success

Best for: Data that's read immediately after write

Trade-off: Write latency (waiting for both cache and DB)

3. Write-Behind (Write-Back):

Write to cache immediately, database later:
  Application writes data
  → Write to cache (fast!)
  → Async job writes to database later
  → Return success

Best for: Write-heavy workloads, acceptable data loss risk

Trade-off: Data loss if cache crashes before DB write

4. Refresh-Ahead:

Proactively refresh cache before expiration:
  Cache entry has TTL
  Before expiration, background job refreshes from DB
  Avoids cache miss latency

Best for: Predictable access patterns, expensive queries

Trade-off: Wastes resources refreshing unused data

When to Use Which Technology

Technology	Best For	Not Good For
Redis	Session storage, pub/sub, leaderboards, real-time analytics. Supports complex data structures (lists, sets, sorted sets).	Large objects (> 1MB), durable storage (primarily in-memory)
Memcached	Simple key-value cache, multi-threaded (better CPU utilization), lower memory overhead	Complex data structures, persistence, pub/sub
CDN (Cloudflare, AWS CloudFront)	Static assets, images, videos, API responses (with proper cache headers), global users	User-specific data (unless you use edge computing), real-time data
Varnish	HTTP reverse proxy cache, full page cache, handling traffic spikes	Complex application logic, user-specific content (without ESI)
Browser Cache	Immutable assets (versioned CSS/JS), rarely-changing content	Dynamic content, personalized data

Cache Invalidation - The Hard Problem

"There are only two hard things in Computer Science: cache invalidation and naming things." - Phil Karlton

Strategies:

TTL-based: Data expires after time period. Simple but stale data possible.
Event-based: Invalidate cache on database write. Complex but fresh data.
Versioning: Include version in cache key (user:123:v2). Never invalidate, just stop using old version.
Cache tagging: Tag related entries, invalidate by tag (e.g., tag:user:123).

HTTP Cache Headers

Cache-Control: max-age=3600, public
  Browser/CDN can cache for 1 hour, shareable between users

Cache-Control: max-age=3600, private
  Only browser can cache (user-specific data)

Cache-Control: no-store
  Don't cache at all (sensitive data)

ETag: "abc123"
  Content hash. Browser sends If-None-Match, server returns 304 Not Modified if unchanged

Why Rate Limiting Exists

The Problem:

DoS attacks: Overwhelm service with requests
Scrapers: Abuse API, steal data
Noisy neighbors: One user hogs resources
Cost control: Third-party API bills per request

Rate Limiting Algorithms

1. Token Bucket (Most Common)

Bucket holds N tokens, refills at rate R tokens/second
Each request consumes 1 token
If tokens available: Allow request, decrement counter
If tokens = 0: Reject request (429 Too Many Requests)

Example: 100 tokens, refill 10/sec
  → Allows bursts of 100 requests
  → Sustained rate of 10 requests/sec

Implementation (Redis):
  INCR user:123:requests
  EXPIRE user:123:requests 60  (reset every minute)
  If counter > limit: Reject

2. Leaky Bucket

Requests enter bucket (queue) at any rate
Requests "leak" out at constant rate

Smooths traffic, no bursts allowed
Good for: QoS, traffic shaping
Bad for: Legitimate bursts (e.g., page load)

3. Fixed Window

Count requests in fixed time window (e.g., per minute)

Example: 100 requests per minute
  00:00 - 00:59 → 100 requests allowed
  01:00 - 01:59 → Counter resets

Problem: Burst at window boundary
  00:59 → 100 requests
  01:00 → 100 requests (200 in 1 second!)

Simplest to implement but least fair

4. Sliding Window

Track requests with timestamps, count within rolling window

Example: 100 requests per minute
  At 01:30, count requests from 00:30 - 01:30

More accurate than fixed window
More expensive (store timestamps, not just counter)

When to Use Which Algorithm

Algorithm	Best For	Trade-offs
Token Bucket	APIs, web services (allows bursts)	Most common, good balance
Leaky Bucket	Traffic shaping, QoS, video streaming	No bursts, can queue requests
Fixed Window	Simple rate limiting, low precision OK	Burst at boundaries, easy to implement
Sliding Window	High-precision rate limiting	More memory/CPU, better accuracy

Implementation Technologies

Redis (Recommended):

# Token bucket with Redis
SCRIPT:
  local key = KEYS[1]
  local limit = tonumber(ARGV[1])
  local window = tonumber(ARGV[2])

  local current = redis.call('INCR', key)
  if current == 1 then
    redis.call('EXPIRE', key, window)
  end

  if current > limit then
    return 0  -- Rate limited
  else
    return 1  -- Allowed
  end

# Sliding window with sorted sets
ZADD user:123:requests  
ZREMRANGEBYSCORE user:123:requests 0 
ZCARD user:123:requests  (if > limit: reject)

Application-Level (In-Memory):

Fast but doesn't scale across multiple servers
Use for single-instance apps or per-server limits

API Gateway (Kong, AWS API Gateway):

Centralized rate limiting for all APIs
Built-in features, easy to configure

Best Practices

Return proper headers:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1640000000
Retry-After: 60  (in 429 response)

Different limits for different tiers: Free (100/hour), Pro (1000/hour), Enterprise (unlimited)
Multiple dimensions: Per user, per IP, per API endpoint
Whitelist: Exempt trusted IPs/users from rate limiting

Types of Firewalls

1. Packet Filtering Firewall (Layer 3/4)

How it works: Examines IP header (source/dest IP) and TCP/UDP header (source/dest port). Allows or denies based on rules.

Example ACL (Access Control List):
  Rule 1: Allow TCP from 10.0.0.0/8 to any port 80 (HTTP)
  Rule 2: Allow TCP from any to 192.168.1.100 port 443 (HTTPS to web server)
  Rule 3: Allow UDP from any to 8.8.8.8 port 53 (DNS)
  Rule 4: Deny all

Pros: Fast, low overhead
Cons: No application awareness, easy to spoof source IP

2. Stateful Firewall

How it works: Tracks connection state (TCP handshake, established connections). Automatically allows reply traffic.

Connection Table:
  Source IP:Port | Dest IP:Port | State     | Timeout
  10.0.1.5:5000  | 93.184.2.1:80| ESTABLISHED | 3600
  10.0.1.6:5001  | 1.1.1.1:443  | SYN_SENT    | 60

Outbound SYN → Automatically allow SYN-ACK, ACK (return traffic)
Don't need explicit "allow inbound" rule for replies

Pros: More secure (tracks state), fewer rules needed
Cons: More memory/CPU (state table)

3. Application Layer Firewall (Layer 7)

How it works: Deep packet inspection (DPI). Understands HTTP, SMTP, FTP protocols. Can block based on URL, SQL injection patterns, etc.

Examples:
  Block HTTP requests with "SELECT * FROM" in URL (SQL injection)
  Block access to *.facebook.com
  Allow SMTP but block attachments > 10MB
  Inspect SSL/TLS traffic (decrypt, inspect, re-encrypt)

Pros: Blocks application-specific attacks
Cons: High CPU (decrypt, inspect), privacy concerns (TLS inspection)

4. Web Application Firewall (WAF)

Specialized for HTTP/HTTPS:

OWASP Top 10 protection (SQL injection, XSS, CSRF)
Rate limiting per URL
Bot detection
Geo-blocking

Examples: Cloudflare WAF, AWS WAF, ModSecurity

Access Control Lists (ACLs)

Standard ACL (Source IP Only):

access-list 10 permit 10.0.0.0 0.255.255.255
access-list 10 deny any

Applied to interface:
  interface GigabitEthernet0/0
  ip access-group 10 in

Extended ACL (Source/Dest IP, Ports, Protocol):

access-list 100 permit tcp 10.0.0.0 0.255.255.255 any eq 80
access-list 100 permit tcp 10.0.0.0 0.255.255.255 any eq 443
access-list 100 deny ip any any log

More granular control, can specify:
  - Protocol (TCP, UDP, ICMP)
  - Source/dest IP and subnet
  - Source/dest ports
  - Flags (SYN, ACK, etc.)

Firewall Best Practices

Default deny: Block everything, explicitly allow what's needed
Least privilege: Only allow minimum required access
Egress filtering: Control outbound traffic (prevents data exfiltration)
Log denied traffic: Monitor for attacks, misconfigurations
Regular reviews: Remove obsolete rules
Zone-based: DMZ (public-facing), Internal, Management (separate networks)

NAT vs Firewall

Common misconception: NAT provides security

Reality: NAT hides internal IPs but isn't a firewall. Once a connection is established (port forwarding, UPnP), NAT allows all traffic through. Need firewall rules for actual security.

Why SSL/TLS Was Invented

The Problem (Early Internet - 1990s)

HTTP was plaintext: Anyone on the network path could intercept usernames, passwords, credit cards, emails - everything.

Evolution:

1994: Netscape creates SSL 1.0 (never released due to security flaws)
1995: SSL 2.0 released (flawed, quickly deprecated)
1996: SSL 3.0 (widely adopted, but vulnerable to POODLE attack)
1999: TLS 1.0 (upgrade of SSL 3.0, standardized by IETF)
2006: TLS 1.1 (fixes CBC attacks)
2008: TLS 1.2 (modern standard, SHA-256, better cipher suites)
2018: TLS 1.3 (current standard, faster handshake, removed weak ciphers)

Today: TLS 1.2 and 1.3 are standard. SSL is deprecated but name stuck ("SSL certificate" really means TLS).

What SSL/TLS Provides

Security Goal	How TLS Achieves It
Encryption	Symmetric encryption (AES-256) for data transfer. Keys exchanged via asymmetric crypto (RSA, ECDHE).
Authentication	Server proves identity with certificate signed by trusted CA (Certificate Authority).
Integrity	HMAC (Hash-based Message Authentication Code) prevents tampering. Each message authenticated.

TLS Handshake (TLS 1.2) - How It Works

sequenceDiagram participant Client as Browser participant Server as Web Server Note over Client,Server: TLS 1.2 Handshake (4 round trips) Client->>Server: 1. ClientHello
(Supported cipher suites, TLS version, random) Server->>Client: 2. ServerHello
(Chosen cipher, server random, certificate) Note over Client: Verify certificate
(Check CA signature, expiry, hostname) Client->>Server: 3. ClientKeyExchange
(Pre-master secret, encrypted with server's public key) Note over Client,Server: Both derive session key from:
client random + server random + pre-master secret Client->>Server: 4. ChangeCipherSpec + Finished
(Switch to encrypted, verify handshake) Server->>Client: 5. ChangeCipherSpec + Finished Note over Client,Server: Encrypted Application Data Client->>Server: HTTP Request (encrypted) Server->>Client: HTTP Response (encrypted)

Step-by-Step Explanation:

ClientHello: Browser says "I support TLS 1.2, TLS 1.3, cipher suites X, Y, Z" + sends random nonce
ServerHello: Server picks TLS 1.2, cipher suite (e.g., TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384), sends certificate (contains server's public key) + random nonce
Certificate Verification: Browser checks:
- Is certificate signed by trusted CA? (Chain of trust)
- Is certificate expired?
- Does certificate match hostname? (example.com in cert vs URL)
- Has certificate been revoked? (OCSP/CRL check)
Key Exchange: Browser generates "pre-master secret", encrypts with server's public key (from certificate), sends to server. Only server's private key can decrypt it.
Session Key Derivation: Both sides derive symmetric encryption key from: client random + server random + pre-master secret (same on both sides, but never sent over network!)
Finished: Both send encrypted "Finished" message with hash of all handshake messages (proves nothing was tampered with)
Encrypted Communication: All HTTP data now encrypted with AES-256 using session key

TLS 1.3 Handshake - Faster (1-RTT)

TLS 1.3 improvements:
  - 1 round trip instead of 2 (faster)
  - Removed weak ciphers (RC4, SHA-1, MD5)
  - Forward secrecy required (ECDHE)
  - 0-RTT resumption for returning clients (instant)

Handshake:
  Client → Server: ClientHello + KeyShare (send public key immediately)
  Server → Client: ServerHello + Certificate + KeyShare + Finished
  (Encrypted application data can start immediately)

Why it's faster: Client sends key material in first message (speculative)
instead of waiting for server's certificate first

Certificate Chain of Trust

How browsers trust certificates:

Root CA (e.g., DigiCert Global Root)
  ↓ Signs
Intermediate CA (e.g., DigiCert TLS RSA SHA256 2020 CA1)
  ↓ Signs
Leaf Certificate (www.example.com)

Browser has ~100 Root CAs built-in (hardcoded trust store)
Server sends: Leaf + Intermediate certificates
Browser verifies:
  1. Leaf signed by Intermediate? ✓
  2. Intermediate signed by Root? ✓
  3. Root in browser's trust store? ✓
  → Chain validated, connection trusted

Why intermediates? Root CA private keys are kept offline (air-gapped, HSMs). Compromising root = disaster (every cert issued by that root becomes untrustworthy). Intermediates handle day-to-day signing.

Certificate Components

X.509 Certificate contains:
  - Subject: CN=www.example.com (who owns it)
  - Issuer: CN=DigiCert TLS RSA SHA256 2020 CA1 (who signed it)
  - Public Key: RSA 2048-bit or ECDSA P-256
  - Validity: Not Before / Not After (expiry date)
  - Serial Number: Unique identifier
  - Signature: Issuer's signature (proves certificate hasn't been tampered)
  - SAN (Subject Alternative Names): www.example.com, example.com, api.example.com
  - Key Usage: Digital Signature, Key Encipherment
  - Extended Validation: Organization details (for EV certs, green bar in old browsers)

View certificate:
  $ openssl s_client -connect example.com:443 -showcerts
  $ openssl x509 -in cert.pem -text -noout

Certificate Validation - OCSP & CRL

Problem: What if a certificate is compromised before expiry?

Solution 1: CRL (Certificate Revocation List):

CA publishes list of revoked certificates (serial numbers)
Browser downloads CRL periodically
Problem: Lists get huge, slow to download, privacy leak (who you're connecting to)

Solution 2: OCSP (Online Certificate Status Protocol):

Browser asks CA: "Is certificate serial# XYZ still valid?"
CA responds: Good / Revoked / Unknown
Problem: Privacy leak (CA sees every site you visit), latency, OCSP server downtime

Solution 3: OCSP Stapling:

Server queries OCSP, caches signed response
Server "staples" OCSP response to TLS handshake
Browser validates stapled response (signed by CA, recent timestamp)
Benefits: No client→CA query (privacy!), faster, CA downtime doesn't break sites

Cipher Suites Explained

Example: TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384

Component	Meaning
TLS	Protocol
ECDHE	Key Exchange: Elliptic Curve Diffie-Hellman Ephemeral (forward secrecy - session keys not recoverable even if private key compromised)
RSA	Authentication: Server's certificate uses RSA signature
AES_256_GCM	Encryption: AES 256-bit in Galois/Counter Mode (authenticated encryption)
SHA384	Hash: SHA-384 for PRF (Pseudo-Random Function) and HMAC

Weak ciphers to avoid: RC4, DES, 3DES, MD5, SHA-1, Export ciphers (512-bit keys)

Modern strong ciphers: AES-256-GCM, ChaCha20-Poly1305, ECDHE/DHE for forward secrecy

Common SSL/TLS Issues

Certificate Expired: Not renewed before expiry. Use Let's Encrypt (free, auto-renews every 90 days).
Hostname Mismatch: Certificate for example.com but accessing www.example.com. Use SAN (Subject Alternative Names) to cover multiple domains.
Self-Signed Certificate: Not signed by trusted CA. OK for internal/dev, browsers show warning in prod.
Incomplete Certificate Chain: Server only sends leaf cert, not intermediates. Browser can't verify chain.
Mixed Content: HTTPS page loading HTTP resources (images, scripts). Browser blocks for security.
Protocol Downgrade: Attacker forces client to use old TLS 1.0/SSL 3.0. Mitigated by disabling old protocols server-side.

Getting a Certificate

Free: Let's Encrypt (automated, 90-day certs, auto-renewal via certbot)

$ certbot certonly --webroot -w /var/www/html -d example.com -d www.example.com
Certificate saved: /etc/letsencrypt/live/example.com/fullchain.pem
Private key: /etc/letsencrypt/live/example.com/privkey.pem

Auto-renewal: certbot renew (cron job every 12 hours)

Paid: DigiCert, Sectigo, GlobalSign (EV certs, wildcard certs, support, insurance)

Testing TLS Configuration

SSL Labs: https://www.ssllabs.com/ssltest/ - Comprehensive TLS analysis (A+ rating)
testssl.sh: Command-line tool for testing TLS configuration
OpenSSL: openssl s_client -connect example.com:443 -tls1_2

Why JWT Was Invented

The Problem (Pre-2010s)

Session-based authentication: Server stores session data (user ID, roles, etc.) in memory or database. Client gets session ID cookie.

Issues with sessions in distributed systems:

State on server: Doesn't scale horizontally (sticky sessions or shared session store required)
Mobile apps: Cookies don't work well with native apps
Microservices: Every service needs access to session store (tight coupling)
Cross-domain: Sessions don't work across different domains (api.example.com vs app.example.com)

JWT Solution (RFC 7519, 2015): Self-contained tokens. All user info in token itself. Stateless - server doesn't store anything.

What is JWT?

JWT = JSON Web Token: A compact, URL-safe token format for securely transmitting information between parties. Digitally signed to prevent tampering.

JWT Structure: Three Parts (Header.Payload.Signature)

Example JWT:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c

Decoded:

Part 1: HEADER (Base64URL encoded)
{
  "alg": "HS256",    // Algorithm: HMAC-SHA256
  "typ": "JWT"       // Type: JWT
}

Part 2: PAYLOAD (Base64URL encoded)
{
  "sub": "1234567890",           // Subject (user ID)
  "name": "John Doe",            // Custom claim
  "email": "john@example.com",   // Custom claim
  "role": "admin",               // Custom claim
  "iat": 1516239022,             // Issued At (Unix timestamp)
  "exp": 1516242622              // Expiration (1 hour later)
}

Part 3: SIGNATURE
HMACSHA256(
  base64UrlEncode(header) + "." + base64UrlEncode(payload),
  secret  // Server's secret key
)

The signature proves:
  1. Token hasn't been tampered with
  2. Token was issued by someone who knows the secret

Why Signing Exists - The Core Concept

Critical Understanding: JWT is NOT Encrypted

Anyone can decode and read a JWT (it's just Base64, not encryption). The signature doesn't hide the contents - it prevents tampering.

What signing achieves:

Integrity: If attacker changes payload (e.g., change role: "user" to role: "admin"), signature becomes invalid. Server detects tampering.
Authentication: Only someone with the secret key can create valid signatures. Proves token was issued by your server, not an attacker.

What signing does NOT do:

❌ Confidentiality: Payload is readable by anyone. Don't put passwords, SSNs, credit cards in JWT.
✓ Solution: Use JWE (JSON Web Encryption) if you need encrypted tokens, or just don't put sensitive data in JWT.

Analogy: JWT signature is like a tamper-evident seal on a glass bottle. You can see what's inside (it's not hidden), but if someone opens it and changes the contents, the seal breaks.

How JWT Signing Works

Symmetric Signing (HS256 - HMAC with SHA-256):

Server has secret key: "my-super-secret-key-12345"

Creating JWT:
1. Create header: {"alg":"HS256","typ":"JWT"}
2. Create payload: {"sub":"123","role":"admin","exp":1700000000}
3. Encode both as Base64URL
4. Compute signature:
   signature = HMAC-SHA256(header + "." + payload, secret)
5. Concatenate: header.payload.signature

Verifying JWT:
1. Split token into header, payload, signature
2. Recompute signature using header + payload + secret
3. Compare recomputed signature with token's signature
4. If match: Token is valid ✓
5. If different: Token was tampered ✗

Only someone with the secret can create valid signatures.

Asymmetric Signing (RS256 - RSA with SHA-256):

Server has:
  - Private key (signs tokens, kept secret)
  - Public key (verifies tokens, can be shared)

Creating JWT:
signature = RSA-Sign(header + "." + payload, privateKey)

Verifying JWT:
valid = RSA-Verify(header + "." + payload, signature, publicKey)

Advantage: API servers can verify tokens without knowing signing key
  Auth server: Signs with private key
  API servers: Verify with public key (can't create tokens, only verify)

Use case: Microservices - only auth service has private key

Standard JWT Claims (Payload)

Claim	Meaning	Example
iss	Issuer (who created token)	"https://auth.example.com"
sub	Subject (user ID)	"user123"
aud	Audience (who should accept token)	"https://api.example.com"
exp	Expiration (Unix timestamp)	1700000000 (Nov 14, 2023)
nbf	Not Before (token not valid until)	1699999000
iat	Issued At (when token created)	1699999000
jti	JWT ID (unique identifier)	"abc-123-def"

Custom claims: Add anything you need (role, permissions, email, etc.)

The Complete Flow

Architecture: Frontend (React/Vue) → Backend API (Node/Python/Go)

sequenceDiagram participant User as Browser participant Frontend as Web Server
(Frontend App) participant Auth as Auth API
(Login Service) participant API as Backend API
(Protected Resource) Note over User,API: 1. Login Flow User->>Frontend: Navigate to /login Frontend->>User: Display login form User->>Frontend: Submit credentials
(username, password) Frontend->>Auth: POST /api/auth/login
{username, password} Note over Auth: Verify credentials
(check database) Auth->>Auth: Generate JWT
(sign with secret) Auth->>Frontend: 200 OK
{token: "eyJhbG...", user: {...}} Frontend->>Frontend: Store token
(localStorage or httpOnly cookie) Frontend->>User: Redirect to dashboard Note over User,API: 2. Accessing Protected API User->>Frontend: Click "View Profile" Frontend->>API: GET /api/user/profile
Authorization: Bearer eyJhbG... Note over API: Extract token from header Note over API: Verify signature
(using secret key) Note over API: Check expiration Note over API: Extract user ID from payload API->>API: Fetch user data from DB API->>Frontend: 200 OK
{user: {...}} Frontend->>User: Display profile Note over User,API: 3. Token Expired User->>Frontend: Request after 1 hour Frontend->>API: GET /api/data
Authorization: Bearer eyJhbG... Note over API: Verify token
(exp claim expired) API->>Frontend: 401 Unauthorized
{error: "Token expired"} Frontend->>Auth: POST /api/auth/refresh
{refreshToken} Auth->>Frontend: 200 OK
{token: "new-token"} Frontend->>API: Retry GET /api/data
(with new token) API->>Frontend: 200 OK
{data: [...]}

Implementation Example

1. Login Endpoint (Auth Service)

# Python (Flask) - Auth Service
from flask import Flask, request, jsonify
import jwt
import datetime
from werkzeug.security import check_password_hash

app = Flask(__name__)
SECRET_KEY = "your-secret-key-keep-this-safe"  # Store in env var!

@app.route('/api/auth/login', methods=['POST'])
def login():
    data = request.get_json()
    username = data.get('username')
    password = data.get('password')

    # Verify credentials (pseudo-code)
    user = db.query("SELECT * FROM users WHERE username = ?", username)
    if not user or not check_password_hash(user.password_hash, password):
        return jsonify({"error": "Invalid credentials"}), 401

    # Generate JWT
    payload = {
        "sub": str(user.id),           # User ID
        "email": user.email,
        "role": user.role,             # "admin" or "user"
        "iat": datetime.datetime.utcnow(),
        "exp": datetime.datetime.utcnow() + datetime.timedelta(hours=1)
    }

    token = jwt.encode(payload, SECRET_KEY, algorithm="HS256")

    return jsonify({
        "token": token,
        "user": {
            "id": user.id,
            "email": user.email,
            "role": user.role
        }
    }), 200

2. Protected API Endpoint (Backend Service)

# Python (Flask) - Backend API
from flask import Flask, request, jsonify
from functools import wraps
import jwt

app = Flask(__name__)
SECRET_KEY = "your-secret-key-keep-this-safe"  # Same secret!

def require_jwt(f):
    """Decorator to protect routes with JWT"""
    @wraps(f)
    def decorated(*args, **kwargs):
        # Extract token from Authorization header
        auth_header = request.headers.get('Authorization')
        if not auth_header:
            return jsonify({"error": "Missing token"}), 401

        # Expected format: "Bearer eyJhbGciOiJ..."
        try:
            token = auth_header.split(" ")[1]  # Get token after "Bearer "
        except IndexError:
            return jsonify({"error": "Invalid token format"}), 401

        # Verify JWT
        try:
            payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])

            # Token is valid, attach user info to request
            request.user_id = payload['sub']
            request.user_role = payload.get('role')

        except jwt.ExpiredSignatureError:
            return jsonify({"error": "Token expired"}), 401
        except jwt.InvalidTokenError:
            return jsonify({"error": "Invalid token"}), 401

        return f(*args, **kwargs)

    return decorated

@app.route('/api/user/profile', methods=['GET'])
@require_jwt  # This route requires valid JWT
def get_profile():
    user_id = request.user_id  # From JWT payload

    # Fetch user from database
    user = db.query("SELECT * FROM users WHERE id = ?", user_id)

    return jsonify({
        "id": user.id,
        "email": user.email,
        "name": user.name,
        "role": user.role
    }), 200

@app.route('/api/admin/users', methods=['GET'])
@require_jwt
def get_all_users():
    # Check if user has admin role
    if request.user_role != 'admin':
        return jsonify({"error": "Forbidden - admin only"}), 403

    users = db.query("SELECT * FROM users")
    return jsonify({"users": users}), 200

3. Frontend Implementation (JavaScript)

// React/Vue/Vanilla JS - Frontend

// Login function
async function login(username, password) {
  const response = await fetch('https://auth.example.com/api/auth/login', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ username, password })
  });

  if (response.ok) {
    const data = await response.json();

    // Store token (Option 1: localStorage)
    localStorage.setItem('token', data.token);

    // Store token (Option 2: httpOnly cookie - more secure)
    // Server sets: Set-Cookie: token=...; HttpOnly; Secure; SameSite=Strict

    return data;
  } else {
    throw new Error('Login failed');
  }
}

// Make authenticated API request
async function fetchUserProfile() {
  const token = localStorage.getItem('token');

  const response = await fetch('https://api.example.com/api/user/profile', {
    method: 'GET',
    headers: {
      'Authorization': `Bearer ${token}`  // Send JWT in header
    }
  });

  if (response.status === 401) {
    // Token expired or invalid, redirect to login
    window.location.href = '/login';
  }

  if (response.ok) {
    const profile = await response.json();
    return profile;
  }
}

// Axios interceptor (automatic token attachment)
axios.interceptors.request.use(config => {
  const token = localStorage.getItem('token');
  if (token) {
    config.headers.Authorization = `Bearer ${token}`;
  }
  return config;
});

// Handle 401 responses globally
axios.interceptors.response.use(
  response => response,
  error => {
    if (error.response?.status === 401) {
      localStorage.removeItem('token');
      window.location.href = '/login';
    }
    return Promise.reject(error);
  }
);

JWT Security Best Practices

Critical Security Considerations

1. Short Expiration Times

Access tokens: 15 minutes - 1 hour
Refresh tokens: 7 days - 30 days (stored securely, revocable)
Why: Stolen JWT can be used until expiration. Short expiry limits damage.

2. Never Store Sensitive Data in JWT

❌ BAD: {"password": "secret123", "ssn": "123-45-6789"}
✓ GOOD: {"sub": "user123", "role": "admin"}

JWT is Base64-encoded, not encrypted. Anyone can decode and read it.

3. Use Strong Secret Keys

❌ BAD: SECRET_KEY = "secret"
✓ GOOD: SECRET_KEY = "a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6"

Use: openssl rand -base64 32
Store in environment variables, never hardcode

4. Validate Everything

Signature (always)
Expiration (exp claim)
Issuer (iss claim - prevent token from dev env used in prod)
Audience (aud claim - prevent token for API A used for API B)

5. Storage Location

Storage	Pros	Cons
localStorage	Easy to use, persists across tabs	Vulnerable to XSS (JavaScript can read it)
sessionStorage	Cleared on tab close	Still vulnerable to XSS
httpOnly Cookie	Not accessible to JavaScript (XSS protection)	Vulnerable to CSRF (need CSRF tokens), can't access from different domain
Memory (Redux/Vuex)	Cleared on page refresh, XSS resistant	User logs out on refresh (bad UX)

Recommendation: httpOnly cookie with SameSite=Strict for web apps. localStorage for mobile/SPA if you trust your XSS protection.

6. HTTPS Only

Always use HTTPS. JWT in HTTP = plaintext password.

7. Token Revocation

Problem: JWT is stateless, can't be revoked before expiry.

Solutions:

Short expiry + refresh tokens: Refresh token stored in DB, can be revoked
Blacklist: Store revoked token IDs (jti claim) in Redis, check on verify
Whitelist: Store active sessions in Redis (defeats stateless purpose, but gives control)

Refresh Token Flow

Why refresh tokens?
  Access token: Short-lived (15 min), sent with every request
  Refresh token: Long-lived (30 days), only sent to refresh endpoint

If access token stolen: Expires in 15 min (limited damage)
If refresh token stolen: Can revoke in database (kill all sessions)

Flow:
1. Login: Get access token (15 min) + refresh token (30 days)
2. API requests: Send access token
3. Access token expires: Frontend gets 401
4. Frontend sends refresh token to /api/auth/refresh
5. Server checks refresh token in database (not revoked?)
6. Server issues new access token
7. Frontend retries request with new access token

Refresh token storage: Database with user_id, token_hash, expires_at
Revoke on logout: DELETE FROM refresh_tokens WHERE user_id = ?

Debugging JWTs

jwt.io: Paste JWT, see decoded header/payload, verify signature
Browser DevTools: Application tab → Local Storage / Cookies
Network tab: Check Authorization header in requests
Python: import jwt; jwt.decode(token, verify=False) (inspect without verifying)

Recommended YouTube Channels & Videos

DNS & Internet Infrastructure

ByteByteGo: ByteByteGo Channel - Search for "How does DNS work" on their channel for system design perspective
- Blog: How does the Domain Name System (DNS) lookup work?
- Blog: A Crash Course in DNS
NetworkChuck: What is DNS and How it Works - Beginner-friendly explanation with security aspects (DNS over HTTPS)
PowerCert Animated Videos: DNS Explained - Clear animated explanation

Email (SMTP, POP3, IMAP)

PowerCert: SMTP Explained
Networking Tutorials: POP3 vs IMAP Explained

Caching Strategies

ByteByteGo Blog: A Crash Course in Caching - Part 1
ByteByteGo Blog: Top Caching Strategies
ByteByteGo Blog: The 6 Most Impactful Ways Redis is Used in Production
Gaurav Sen: Search Gaurav Sen's Channel for "caching" and "distributed systems" - 25+ video system design playlist

Rate Limiting

ByteByteGo Blog: Rate Limiting Fundamentals
ByteByteGo Blog: A Guide to Rate Limiting Strategies
Hussein Nasser: Search his channel for "rate limiting" discussions in backend engineering context

System Design & Distributed Systems

Gaurav Sen - System Design: GKCS Channel - 25+ videos covering distributed caching, CDNs, message queues, event-driven architecture
ByteByteGo: ByteByteGo System Design - Visual system design explanations