☸️ Kubernetes Architecture - Complete Guide
Kubernetes (K8s) is an open-source container orchestration platform that automates deployment, scaling, and management of containerized applications.
Key Benefits:
- Automated rollouts and rollbacks
- Self-healing (restarts failed containers)
- Horizontal scaling
- Service discovery and load balancing
- Secret and configuration management
- Storage orchestration
Kubernetes Cluster Architecture
graph LR
KUBECTL["kubectl"] -.->|"Commands"| API
subgraph CP["CONTROL PLANE (Master)"]
API["API Server
(kube-apiserver)"]
SCHED["Scheduler
(kube-scheduler)"]
CTRL["Controller Manager
(kube-controller-manager)"]
ETCD["etcd
(Distributed key-value store)"]
API --> ETCD
SCHED --> API
CTRL --> API
end
subgraph WN["WORKER NODES"]
subgraph N1["Node 1"]
KUB1["kubelet"]
PROXY1["kube-proxy"]
RT1["Container Runtime
(containerd)"]
subgraph PODS1["Pods"]
POD1["Pod1"]
POD2["Pod2"]
POD3["Pod3"]
POD4["Pod4"]
end
KUB1 --> PODS1
PROXY1 --> PODS1
RT1 --> PODS1
end
N2["Node 2, Node 3, ...
(same structure)"]
end
API -.->|"Manages"| KUB1
API -.->|"Manages"| PROXY1
style CP fill:#e3f2fd,stroke:#1976d2,stroke-width:3px,color:#2e3440
style WN fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px,color:#2e3440
style N1 fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#2e3440
style PODS1 fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style API fill:#bbdefb,stroke:#1976d2,stroke-width:2px,color:#2e3440
style SCHED fill:#bbdefb,stroke:#1976d2,stroke-width:2px,color:#2e3440
style CTRL fill:#bbdefb,stroke:#1976d2,stroke-width:2px,color:#2e3440
style ETCD fill:#ffccbc,stroke:#e64a19,stroke-width:2px,color:#2e3440
style KUBECTL fill:#fff9c4,stroke:#f57f17,stroke-width:2px,color:#2e3440
Kubernetes Resource Relationships
1. Nodes and Pods Relationship
Concept: Nodes are physical/virtual machines. Pods run on Nodes.
graph TB
subgraph CLUSTER["Kubernetes Cluster"]
subgraph NODE1["Node 1 (Worker Machine)
IP: 10.0.1.5"]
POD1A["Pod: web-app-1
IP: 192.168.1.10
Containers: nginx"]
POD1B["Pod: web-app-2
IP: 192.168.1.11
Containers: nginx"]
POD1C["Pod: cache-1
IP: 192.168.1.12
Containers: redis"]
end
subgraph NODE2["Node 2 (Worker Machine)
IP: 10.0.1.6"]
POD2A["Pod: web-app-3
IP: 192.168.2.10
Containers: nginx"]
POD2B["Pod: db-1
IP: 192.168.2.11
Containers: postgres"]
end
subgraph NODE3["Node 3 (Worker Machine)
IP: 10.0.1.7"]
POD3A["Pod: web-app-4
IP: 192.168.3.10
Containers: nginx"]
POD3B["Pod: worker-1
IP: 192.168.3.11
Containers: python"]
end
end
style NODE1 fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#2e3440
style NODE2 fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#2e3440
style NODE3 fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#2e3440
style POD1A fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style POD1B fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style POD1C fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style POD2A fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style POD2B fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style POD3A fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style POD3B fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
Key Points:
- Each Node can run multiple Pods
- Pods get unique IP addresses (pod network)
- Scheduler decides which Node runs which Pod
- Nodes have resources (CPU, Memory) that Pods consume
2. ReplicaSet: Managing Pod Replicas
Concept: ReplicaSet ensures N identical Pods are always running.
graph TB
RS["ReplicaSet: web-app
Desired: 3 replicas
Selector: app=web"]
RS -->|"Creates & Manages"| POD1["Pod: web-app-abc123
Labels: app=web
Status: Running"]
RS -->|"Creates & Manages"| POD2["Pod: web-app-def456
Labels: app=web
Status: Running"]
RS -->|"Creates & Manages"| POD3["Pod: web-app-ghi789
Labels: app=web
Status: Running"]
DEAD["Pod: web-app-xyz
Status: Failed ❌"]
RS -.->|"Detects failure
Creates replacement"| POD3
DEAD -.->|"Was managed by"| RS
NODE1["Node 1"] -.->|"Runs"| POD1
NODE2["Node 2"] -.->|"Runs"| POD2
NODE3["Node 3"] -.->|"Runs"| POD3
style RS fill:#bbdefb,stroke:#1976d2,stroke-width:3px,color:#2e3440
style POD1 fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style POD2 fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style POD3 fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style DEAD fill:#ffcdd2,stroke:#c62828,stroke-width:2px,color:#2e3440
style NODE1 fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#2e3440
style NODE2 fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#2e3440
style NODE3 fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#2e3440
How it works:
- ReplicaSet continuously monitors Pod count
- If Pod crashes → ReplicaSet creates replacement
- If you delete a Pod → ReplicaSet creates new one
- Pods are matched by labels (app=web)
- All Pods are identical (same container image/config)
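The reconciliation described above can be sketched as a single control-loop pass (hypothetical function and Pod-dict shape, not the real controller code):

```python
import uuid

def reconcile_replicaset(desired: int, pods: list) -> list:
    """One reconcile pass: return the Pods that should exist afterwards.

    A Pod counts toward the replica total only if it matches the selector
    (label app=web here) and is not Failed.
    """
    healthy = [p for p in pods
               if p["labels"].get("app") == "web" and p["status"] != "Failed"]
    # Scale up: create replacements until the desired count is met.
    while len(healthy) < desired:
        healthy.append({"name": "web-app-" + uuid.uuid4().hex[:6],
                        "labels": {"app": "web"}, "status": "Running"})
    # Scale down: any surplus Pods would be deleted.
    return healthy[:desired]

pods = [
    {"name": "web-app-abc123", "labels": {"app": "web"}, "status": "Running"},
    {"name": "web-app-xyz", "labels": {"app": "web"}, "status": "Failed"},
]
result = reconcile_replicaset(3, pods)
print(len(result))  # 3: the failed Pod is dropped and two replacements created
```

The real ReplicaSet controller does the same comparison continuously, triggered by watch events from the API server rather than by polling.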
3. DaemonSet: One Pod Per Node
Concept: A DaemonSet ensures exactly ONE copy of a Pod runs on every Node (or on every Node matching its node selector).
graph TB
DS["DaemonSet: log-collector
Runs on: ALL nodes"]
subgraph CLUSTER["Cluster"]
subgraph NODE1["Node 1"]
POD1["Pod: log-collector-node1
Collects logs from Node 1"]
APP1A["App Pod 1"]
APP1B["App Pod 2"]
end
subgraph NODE2["Node 2"]
POD2["Pod: log-collector-node2
Collects logs from Node 2"]
APP2A["App Pod 3"]
end
subgraph NODE3["Node 3"]
POD3["Pod: log-collector-node3
Collects logs from Node 3"]
APP3A["App Pod 4"]
APP3B["App Pod 5"]
end
end
DS -->|"Ensures 1 Pod on"| NODE1
DS -->|"Ensures 1 Pod on"| NODE2
DS -->|"Ensures 1 Pod on"| NODE3
POD1 -.->|"Monitors"| APP1A
POD1 -.->|"Monitors"| APP1B
POD2 -.->|"Monitors"| APP2A
POD3 -.->|"Monitors"| APP3A
POD3 -.->|"Monitors"| APP3B
style DS fill:#ce93d8,stroke:#8e24aa,stroke-width:3px,color:#2e3440
style NODE1 fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#2e3440
style NODE2 fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#2e3440
style NODE3 fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#2e3440
style POD1 fill:#e1bee7,stroke:#8e24aa,stroke-width:2px,color:#2e3440
style POD2 fill:#e1bee7,stroke:#8e24aa,stroke-width:2px,color:#2e3440
style POD3 fill:#e1bee7,stroke:#8e24aa,stroke-width:2px,color:#2e3440
style APP1A fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style APP1B fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style APP2A fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style APP3A fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style APP3B fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
Common DaemonSet Use Cases:
- Log Collection: Fluentd, Logstash on every node
- Monitoring: Node exporters, monitoring agents
- Storage: Storage drivers (Ceph, GlusterFS)
- Networking: Network plugins (kube-proxy, Calico)
4. Sidecar Pattern: Multiple Containers in One Pod
Concept: Pods can have multiple containers that share resources.
graph TB
subgraph POD["Pod: web-app-with-sidecar
IP: 192.168.1.10"]
subgraph SHARED["Shared Resources"]
NETWORK["Shared Network
(localhost)"]
VOLUME["Shared Volume
(/var/log)"]
end
MAIN["Main Container
nginx:1.21
Port: 80
Writes logs to /var/log/nginx/"]
SIDECAR["Sidecar Container
fluentd
Reads logs from /var/log/nginx/
Sends to Elasticsearch"]
MAIN -->|"Shares"| NETWORK
MAIN -->|"Writes to"| VOLUME
SIDECAR -->|"Shares"| NETWORK
SIDECAR -->|"Reads from"| VOLUME
end
EXTERNAL["External Log Storage
(Elasticsearch)"]
SIDECAR -.->|"Forwards logs"| EXTERNAL
style POD fill:#e8f5e9,stroke:#388e3c,stroke-width:3px,color:#2e3440
style MAIN fill:#bbdefb,stroke:#1976d2,stroke-width:2px,color:#2e3440
style SIDECAR fill:#fff9c4,stroke:#f57f17,stroke-width:2px,color:#2e3440
style NETWORK fill:#f3e5f5,stroke:#9c27b0,stroke-width:2px,color:#2e3440
style VOLUME fill:#ffe0b2,stroke:#ff6f00,stroke-width:2px,color:#2e3440
style SHARED fill:#fafafa,stroke:#757575,stroke-width:2px,color:#2e3440
style EXTERNAL fill:#c5e1a5,stroke:#558b2f,stroke-width:2px,color:#2e3440
Sidecar Benefits:
- Shared Network: Containers communicate via localhost
- Shared Storage: Containers can share volumes
- Same Lifecycle: Started/stopped together
- Co-located: Always on same Node
Common Sidecar Patterns:
- Logging: Sidecar collects/forwards logs
- Proxying: Envoy/Istio sidecar for service mesh
- Monitoring: Metrics collection sidecar
- Security: Authentication/authorization proxy
5. Complete Hierarchy: Deployment → ReplicaSet → Pods
Concept: Deployments manage ReplicaSets, which manage Pods.
graph TB
DEP["Deployment: web-app
Replicas: 3
Image: nginx:1.21"]
RS_NEW["ReplicaSet: web-app-v2
Replicas: 3
Current"]
RS_OLD["ReplicaSet: web-app-v1
Replicas: 0
Kept for rollback"]
DEP -->|"Creates/Manages"| RS_NEW
DEP -.->|"Keeps for rollback"| RS_OLD
RS_NEW -->|"Manages"| POD1["Pod: web-app-v2-abc
nginx:1.21"]
RS_NEW -->|"Manages"| POD2["Pod: web-app-v2-def
nginx:1.21"]
RS_NEW -->|"Manages"| POD3["Pod: web-app-v2-ghi
nginx:1.21"]
RS_OLD -.->|"Previously managed"| POD_OLD["Pod: web-app-v1-xyz
nginx:1.20
(terminated)"]
subgraph NODES["Distributed Across Nodes"]
NODE1["Node 1"] -.-> POD1
NODE2["Node 2"] -.-> POD2
NODE3["Node 3"] -.-> POD3
end
style DEP fill:#90caf9,stroke:#0277bd,stroke-width:3px,color:#2e3440
style RS_NEW fill:#bbdefb,stroke:#1976d2,stroke-width:3px,color:#2e3440
style RS_OLD fill:#e0e0e0,stroke:#616161,stroke-width:2px,stroke-dasharray: 5 5,color:#2e3440
style POD1 fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style POD2 fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style POD3 fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#2e3440
style POD_OLD fill:#ffcdd2,stroke:#c62828,stroke-width:2px,stroke-dasharray: 5 5,color:#2e3440
style NODE1 fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#2e3440
style NODE2 fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#2e3440
style NODE3 fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#2e3440
style NODES fill:#fafafa,stroke:#757575,stroke-width:2px,color:#2e3440
Why this hierarchy?
- Deployment: Declarative updates, rollbacks, versioning
- ReplicaSet: Ensures Pod count, self-healing
- Pod: Runs actual containers
Update Process:
- You update Deployment (change image nginx:1.20 → nginx:1.21)
- Deployment creates NEW ReplicaSet (web-app-v2)
- New ReplicaSet scales UP (creates 3 new Pods)
- Old ReplicaSet scales DOWN (terminates old Pods)
- Old ReplicaSet kept with 0 replicas (for rollback)
Control Plane Components
1. API Server (kube-apiserver)
Purpose: The front-end of the Kubernetes control plane. All components communicate through the API server.
How it works:
- Exposes Kubernetes API (RESTful)
- Validates and processes API requests
- Updates etcd with cluster state
- Only component that talks directly to etcd
- Authenticates and authorizes requests (RBAC)
- Serves as the gateway for kubectl, controllers, scheduler
Request Flow:
- kubectl sends request to API server
- API server authenticates and authorizes
- API server validates the request
- API server writes to etcd
- API server returns response
Interview Tip: The API server is stateless and horizontally scalable. It's the only component that directly accesses etcd. All cluster state changes go through the API server.
2. etcd
Purpose: Distributed, consistent key-value store that holds the entire cluster state.
How it works:
- Stores all cluster configuration and state
- Uses Raft consensus algorithm for consistency
- Provides watch mechanism for state changes
- Highly available (typically 3 or 5 instances)
- Strong consistency (CP in CAP theorem)
Stored data includes:
- Cluster configuration
- Resource definitions (Pods, Services, etc.)
- Secrets and ConfigMaps
- Node status
- Current state vs desired state
Interview Tip: etcd is critical - if etcd fails, the cluster can't function. Regular backups are essential. Uses Raft consensus (see raft_consensus.py).
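Raft requires a majority (quorum) of members to acknowledge each write, which is why etcd clusters use odd sizes. A quick illustration of the arithmetic:

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster."""
    return members // 2 + 1

def fault_tolerance(members: int) -> int:
    """Members that can fail while the cluster stays writable."""
    return members - quorum(members)

for n in (1, 3, 4, 5):
    print(n, quorum(n), fault_tolerance(n))
# A 3-member cluster tolerates 1 failure; 5 members tolerate 2.
# A 4-member cluster still tolerates only 1 - even sizes add cost, not safety.
```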
3. Scheduler (kube-scheduler)
Purpose: Assigns Pods to Nodes based on resource requirements and constraints.
How it works:
- Watches for newly created Pods with no assigned Node
- Filters nodes (eliminates unsuitable nodes)
- Scores nodes (ranks remaining nodes)
- Selects best node and binds Pod to it
Scheduling Process:
- Filtering: Remove nodes that don't meet requirements
- Insufficient CPU/memory
- Node selectors don't match
- Taints/tolerations conflicts
- Volume constraints
- Scoring: Rank remaining nodes
- Resource availability
- Pod spreading (balance across nodes)
- Affinity/anti-affinity rules
- Binding: Assign Pod to highest-scoring node
Interview Tip: Scheduler only assigns Pods to Nodes. kubelet actually runs the Pod. You can write custom schedulers if needed.
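The filter-then-score flow can be sketched like this (simplified and hypothetical; the real scheduler runs many filter and score plugins):

```python
def schedule(pod, nodes):
    """Pick a node for a Pod: filter out unsuitable nodes, score the rest."""
    # Filtering: drop nodes without enough free CPU/memory.
    feasible = [n for n in nodes
                if n["free_cpu"] >= pod["cpu"] and n["free_mem"] >= pod["mem"]]
    if not feasible:
        return None  # Pod stays Pending until a node frees up
    # Scoring: prefer the node with the most free resources (spreads load).
    best = max(feasible, key=lambda n: n["free_cpu"] + n["free_mem"])
    return best["name"]

nodes = [
    {"name": "node-1", "free_cpu": 2.0, "free_mem": 4.0},
    {"name": "node-2", "free_cpu": 0.2, "free_mem": 1.0},
    {"name": "node-3", "free_cpu": 3.0, "free_mem": 8.0},
]
print(schedule({"cpu": 0.5, "mem": 1.0}, nodes))  # node-3
```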
4. Controller Manager (kube-controller-manager)
Purpose: Runs controller processes that regulate the cluster state.
How it works:
- Watches the cluster state via API server
- Makes changes to move current state → desired state
- Runs many controllers in a single process
Key Controllers:
- Node Controller: Monitors node health, marks unavailable
- Replication Controller: Maintains correct number of Pods
- Endpoints Controller: Populates Endpoints (Services + Pods)
- Service Account Controller: Creates default service accounts
- Namespace Controller: Manages namespace lifecycle
- Deployment Controller: Manages ReplicaSets for Deployments
- StatefulSet Controller: Manages StatefulSets
- Job Controller: Manages Jobs and CronJobs
Control Loop (Reconciliation):
- Read desired state from API server
- Read current state from API server
- Compare desired vs current
- Take action to reconcile (create, update, delete resources)
- Update status in API server
- Repeat
Interview Tip: Controllers implement the "reconciliation loop" - continuously working to make actual state match desired state. This is Kubernetes' core operating principle.
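Reduced to its essence, the reconciliation loop is a diff between desired and current state (a sketch with hypothetical state maps, not real controller-manager code):

```python
def reconcile(desired: dict, current: dict) -> dict:
    """Diff desired vs current state and return the actions to take."""
    return {
        "create": sorted(set(desired) - set(current)),
        "delete": sorted(set(current) - set(desired)),
        "update": sorted(k for k in desired.keys() & current.keys()
                         if desired[k] != current[k]),
    }

# Desired state (from etcd, via the API server) vs observed state:
desired = {"web": "nginx:1.21", "cache": "redis:7"}
current = {"web": "nginx:1.20", "old-job": "perl"}
print(reconcile(desired, current))
# {'create': ['cache'], 'delete': ['old-job'], 'update': ['web']}
```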
Node (Worker) Components
5. kubelet
Purpose: Agent that runs on each worker node, ensuring containers are running in Pods.
How it works:
- Registers node with the API server
- Watches for Pod assignments to its node
- Pulls container images
- Starts/stops containers via container runtime
- Reports Pod and node status to API server
- Runs liveness/readiness probes
- Mounts volumes
kubelet Workflow:
- API server assigns Pod to node
- kubelet receives Pod spec
- kubelet tells container runtime to pull images
- kubelet creates volumes if needed
- kubelet tells container runtime to start containers
- kubelet monitors container health
- kubelet reports status back to API server
Interview Tip: kubelet is the "node agent". It doesn't manage containers that weren't created by Kubernetes. It communicates with the container runtime via CRI (Container Runtime Interface).
6. kube-proxy
Purpose: Network proxy that maintains network rules for Pod communication.
How it works:
- Runs on every node
- Watches API server for Service and Endpoint changes
- Maintains network rules (iptables or IPVS)
- Enables communication to Services (load balancing)
- Performs connection forwarding
Modes:
- iptables mode (default): Uses iptables rules for load balancing
- IPVS mode: Uses IPVS (Linux Virtual Server) for better performance
- userspace mode (legacy): Proxies connections in userspace
Service Access Flow:
- Client Pod sends request to Service IP (ClusterIP)
- kube-proxy intercepts via iptables/IPVS rules
- kube-proxy load balances to backend Pod
- Traffic forwarded to selected Pod
Interview Tip: kube-proxy doesn't actually proxy traffic in most modes. It programs iptables/IPVS rules, and the kernel handles the actual routing.
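In iptables mode, each new connection to a Service IP is DNAT-ed to a randomly selected ready backend. Functionally it behaves like this sketch (hypothetical ClusterIP and endpoint addresses):

```python
import random

# Simplified view of what kube-proxy programs into iptables:
# Service ClusterIP:port -> set of ready backend Pod addresses (Endpoints).
endpoints = {
    "10.96.0.10:80": ["192.168.1.10:8080",
                      "192.168.2.10:8080",
                      "192.168.3.10:8080"],
}

def route(service_addr: str) -> str:
    """Pick one backend Pod for a new connection (random, like iptables mode)."""
    return random.choice(endpoints[service_addr])

# Every new connection may land on a different Pod:
picks = {route("10.96.0.10:80") for _ in range(100)}
print(picks <= set(endpoints["10.96.0.10:80"]))  # True
```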
7. Container Runtime
Purpose: Software responsible for running containers.
How it works:
- Pulls container images from registries
- Unpacks images
- Runs containers
- Implements CRI (Container Runtime Interface)
Supported Runtimes:
- containerd (most common): Industry standard; Docker itself runs containers through it
- CRI-O: Lightweight, OCI-compliant
- Docker (removed): Dockershim was deprecated in 1.20 and removed in K8s 1.24
Interview Tip: Docker support was dropped because Kubernetes talks to CRI-compliant runtimes (containerd, CRI-O) directly, and Docker runs containers via containerd internally anyway. Your Docker images still work!
8. CNI (Container Network Interface)
Purpose: Plugin interface for configuring network interfaces in containers.
How it works:
- Assigns IP addresses to Pods
- Sets up network routing between Pods
- Enables Pod-to-Pod communication across nodes
- Implements network policies (firewalling)
Popular CNI Plugins:
- Calico: L3 networking, network policies, BGP routing
- Flannel: Simple overlay network (VXLAN)
- Weave Net: Simple setup, encrypts traffic
- Cilium: eBPF-based, advanced observability
- AWS VPC CNI: Native AWS networking
Pod Networking:
- kubelet calls CNI plugin when Pod starts
- CNI assigns IP from Pod CIDR range
- CNI sets up virtual network interface
- CNI configures routes for Pod communication
- Pod can now communicate with other Pods
Interview Tip: Kubernetes network model requires: 1) All Pods can communicate without NAT, 2) All nodes can communicate with all Pods, 3) Each Pod has its own IP.
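IP assignment from a per-node Pod CIDR can be sketched with the standard ipaddress module (a toy host-local IPAM with hypothetical CIDRs, not a real CNI plugin):

```python
import ipaddress

class NodeIPAM:
    """Hand out Pod IPs from a node's Pod CIDR (toy host-local IPAM)."""
    def __init__(self, pod_cidr: str):
        self.pool = ipaddress.ip_network(pod_cidr).hosts()
        next(self.pool)  # reserve the first host IP (.1) for the node's bridge

    def allocate(self) -> str:
        """Return the next free Pod IP in this node's range."""
        return str(next(self.pool))

# Each node gets a disjoint slice of the cluster's Pod network,
# so Pod IPs are unique cluster-wide without NAT.
node1 = NodeIPAM("192.168.1.0/24")
node2 = NodeIPAM("192.168.2.0/24")
print(node1.allocate())  # 192.168.1.2
print(node1.allocate())  # 192.168.1.3
print(node2.allocate())  # 192.168.2.2
```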
kubectl - The Kubernetes CLI
kubectl
Purpose: Command-line tool for interacting with Kubernetes clusters.
Common Commands:
# Get resources
kubectl get pods
kubectl get nodes
kubectl get services
kubectl get deployments
# Describe (detailed info)
kubectl describe pod my-pod
kubectl describe node node-1
# Create resources
kubectl create -f deployment.yaml
kubectl apply -f service.yaml
# Update resources
kubectl edit deployment my-app
kubectl scale deployment my-app --replicas=5
# Delete resources
kubectl delete pod my-pod
kubectl delete -f deployment.yaml
# Logs and debugging
kubectl logs my-pod
kubectl logs -f my-pod # follow
kubectl exec -it my-pod -- /bin/bash
# Port forwarding
kubectl port-forward pod/my-pod 8080:80
# Labels and selectors
kubectl get pods -l app=nginx
kubectl label pods my-pod env=prod
Interview Tip: kubectl talks to the API server. It reads config from ~/.kube/config which contains cluster info, credentials, and context.
Kubernetes Resource Types
Pod
Smallest deployable unit. One or more containers that share network and storage.
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
spec:
  containers:
  - name: nginx
    image: nginx:1.21
    ports:
    - containerPort: 80
Use case: Basic unit, but usually managed by higher-level resources.
ReplicaSet
Maintains a stable set of replica Pods. Ensures specified number of Pods are running.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    # Pod template here
Use case: Rarely used directly; Deployments manage ReplicaSets.
Deployment
Manages ReplicaSets and provides declarative updates. Most common way to run stateless apps.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
Features: Rolling updates, rollback, scaling, self-healing.
StatefulSet
For stateful applications. Provides stable network identity and persistent storage.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    # Pod template
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
Use case: Databases, distributed systems (Kafka, Cassandra).
Features: Ordered deployment/scaling, stable network IDs (pod-0, pod-1), persistent volumes.
DaemonSet
Runs a copy of a Pod on every node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    # Pod template
Use case: Logging agents (Fluentd), monitoring (Prometheus node exporter), CNI plugins.
Job
Runs a task to completion. For batch processing.
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-calculation
spec:
  completions: 5
  parallelism: 2
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
Use case: Data processing, migrations, batch jobs.
CronJob
Runs Jobs on a schedule.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-job
spec:
  schedule: "0 2 * * *"  # 2 AM daily
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup:latest
          restartPolicy: OnFailure
Use case: Backups, report generation, cleanup tasks.
Service Types
| Service Type | Description | Use Case | Access Method |
| --- | --- | --- | --- |
| ClusterIP (default) | Exposes Service on cluster-internal IP. Only reachable from within cluster. | Internal microservices communication | ClusterIP:Port (e.g., 10.96.0.1:80) |
| NodePort | Exposes Service on each Node's IP at a static port (30000-32767). | Development, testing, quick external access | NodeIP:NodePort (e.g., 192.168.1.10:30080) |
| LoadBalancer | Creates external load balancer (cloud provider). Assigns external IP. | Production external access on cloud platforms | External IP provided by cloud (e.g., AWS ELB) |
| ExternalName | Maps Service to external DNS name (CNAME). | Access external services (RDS, external APIs) | DNS name (e.g., database.example.com) |
Service Examples
ClusterIP Service
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: ClusterIP  # default
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80          # Service port
    targetPort: 8080  # Container port
LoadBalancer Service
apiVersion: v1
kind: Service
metadata:
  name: my-lb-service
spec:
  type: LoadBalancer  # Cloud provider provisions external LB
  selector:
    app: web
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
Headless Service (for StatefulSet)
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None  # Headless!
  selector:
    app: mysql
  ports:
  - port: 3306
# Provides DNS for each Pod: mysql-0.mysql, mysql-1.mysql, etc.
Additional Important Resources
ConfigMap
Store non-sensitive configuration data.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  database_url: "postgres://db:5432"
  log_level: "info"
Usage: Environment variables or mounted as files.
Secret
Store sensitive data (passwords, tokens).
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  password: cGFzc3dvcmQ=  # base64
Note: Base64 encoded, not encrypted. Use external secret managers for production.
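You can verify that base64 is only an encoding, not encryption:

```python
import base64

# What the Secret above stores for password: cGFzc3dvcmQ=
encoded = base64.b64encode(b"password").decode()
print(encoded)  # cGFzc3dvcmQ=

# Anyone who can read the Secret can trivially decode it:
print(base64.b64decode("cGFzc3dvcmQ=").decode())  # password
```

This is why RBAC on Secrets, encryption at rest for etcd, or an external secret manager matters in production.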
PersistentVolume (PV)
Cluster resource representing storage.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-1
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: slow
  hostPath:
    path: /mnt/data
PersistentVolumeClaim (PVC)
Request for storage by a user.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-1
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: slow
Workflow: User creates PVC → K8s binds to matching PV → Pod uses PVC.
Ingress
HTTP/HTTPS routing to Services.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
Requires: Ingress Controller (Nginx, Traefik, HAProxy).
Namespace
Virtual cluster for resource isolation.
apiVersion: v1
kind: Namespace
metadata:
  name: production
Use case: Separate dev/staging/prod, multi-tenancy, resource quotas.
Default namespaces: default, kube-system, kube-public, kube-node-lease.
Node Labels and Selectors
Node Labels
Key-value pairs attached to nodes for organization and scheduling.
# Label a node
kubectl label nodes node-1 disktype=ssd
kubectl label nodes node-2 environment=production
# View labels
kubectl get nodes --show-labels
Node Selector (Simple)
Schedule Pods only on nodes with specific labels.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    disktype: ssd  # Only schedule on nodes with this label
Node Affinity (Advanced)
More expressive than nodeSelector, with soft/hard requirements.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
            - nvme
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: environment
            operator: In
            values:
            - production
  containers:
  - name: nginx
    image: nginx
Taints and Tolerations
Prevent Pods from scheduling on nodes unless they tolerate the taint.
# Taint a node (repel Pods)
kubectl taint nodes node-1 key=value:NoSchedule

# Pod with toleration (allows scheduling on tainted node)
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"
  containers:
  - name: nginx
    image: nginx
Use cases: Dedicated nodes (GPU, high-memory), node maintenance, workload isolation.
Key Interview Concepts
How does Kubernetes achieve high availability?
- Control plane: Multiple API servers, schedulers, controllers (leader election)
- etcd: Clustered (3 or 5 instances) with Raft consensus
- Worker nodes: Multiple nodes, Pods distributed across nodes
- Self-healing: Controllers restart failed Pods, reschedule from failed nodes
How does a Pod get created? (End-to-end flow)
- User runs kubectl create -f pod.yaml
- kubectl sends request to API server
- API server validates, authenticates, authorizes
- API server writes Pod spec to etcd
- Scheduler watches for unassigned Pods
- Scheduler selects a node and binds Pod to it (updates etcd)
- kubelet on that node watches for new Pod assignments
- kubelet tells container runtime to pull image and start containers
- Container runtime starts containers
- kubelet reports Pod status to API server
- kube-proxy updates network rules for Service discovery
How does Service discovery work?
- DNS: CoreDNS provides DNS resolution (my-service.namespace.svc.cluster.local)
- Environment variables: K8s injects Service IPs as env vars
- ClusterIP: Virtual IP for Services, load balanced by kube-proxy
Deployment vs StatefulSet vs DaemonSet
| Aspect | Deployment | StatefulSet | DaemonSet |
| --- | --- | --- | --- |
| Use case | Stateless apps (web servers, APIs) | Stateful apps (databases, Kafka) | Node-level services (logging, monitoring) |
| Pod identity | Interchangeable, random names | Stable, ordered (pod-0, pod-1) | One per node |
| Scaling | Unordered, parallel | Ordered (pod-0 before pod-1) | Auto-scales with cluster |
| Storage | Ephemeral or shared volumes | Persistent, per-Pod storage | Usually host volumes |
Rolling Update Process
- User updates Deployment (new image version)
- Deployment controller creates new ReplicaSet
- New ReplicaSet scales up (creates new Pods)
- Old ReplicaSet scales down (terminates old Pods)
- Process continues until all Pods are new version
- Old ReplicaSet kept for rollback (history)
Parameters: maxSurge (extra Pods during update), maxUnavailable (Pods down during update)
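A toy simulation of how those two bounds shape the rollout (a sketch with hypothetical function names; the real controller drives this through the API server, gated on Pod readiness):

```python
def rolling_update_steps(replicas: int, max_surge: int, max_unavailable: int):
    """Return (old, new) Pod counts as a rolling update progresses.

    Invariants the Deployment controller maintains:
      old + new <= replicas + max_surge          (never too many Pods)
      old + new >= replicas - max_unavailable    (never too few available)
    Note: maxSurge and maxUnavailable cannot both be 0 in Kubernetes.
    """
    old, new = replicas, 0
    steps = [(old, new)]
    while old > 0:
        # Scale up the new ReplicaSet as far as the surge budget allows.
        new = min(replicas, replicas + max_surge - old)
        # Scale down the old ReplicaSet as far as availability allows.
        old = max(0, replicas - max_unavailable - new) if new < replicas else 0
        steps.append((old, new))
    return steps

# maxSurge=1, maxUnavailable=0: one extra Pod at a time, zero downtime.
for old, new in rolling_update_steps(replicas=3, max_surge=1, max_unavailable=0):
    print("old=%d new=%d" % (old, new))
# old=3 new=0 -> old=2 new=1 -> old=1 new=2 -> old=0 new=3
```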
Summary
Key Takeaways for Interviews
- Architecture: Control plane (API server, etcd, scheduler, controller manager) + Worker nodes (kubelet, kube-proxy, container runtime)
- API Server: Central hub, all communication goes through it, only component that talks to etcd
- etcd: Stores cluster state, uses Raft consensus, critical for cluster operation
- Scheduler: Assigns Pods to Nodes based on resources and constraints
- Controllers: Reconciliation loops that make actual state match desired state
- kubelet: Node agent that runs Pods, reports status
- kube-proxy: Network proxy for Service load balancing
- CNI: Network plugin for Pod networking
- Pods: Smallest unit, usually managed by Deployments/StatefulSets
- Services: Load balancing and service discovery (ClusterIP, NodePort, LoadBalancer)
- Deployments: Manage stateless apps with rolling updates
- StatefulSets: Manage stateful apps with stable identities
- Self-healing: Controllers restart failed Pods automatically