Software Testing - Complete Guide


Why Testing Matters

Benefits of Testing

Catch bugs early: Cheaper to fix bugs during development than in production
Enable refactoring: Confidence to change code without breaking functionality
Living documentation: Tests show how code should be used
Faster debugging: Failing tests pinpoint exact problem
Better design: Testable code tends to be well-designed (loose coupling, single responsibility)
Regression prevention: Ensure old bugs don't resurface
Continuous deployment: Automated tests enable safe, frequent releases

Relative cost of fixing a bug, by the stage where it is found (illustrative orders of magnitude, not measured figures):

During development: $1
During QA: $10
In production: $100
After customer impact: $1000+

Testing is an investment that pays dividends.

The Testing Pyramid

          ▲
         / \
        /   \
       / E2E \          Slow, Expensive, Brittle
      /-------\         Few tests
     /         \
    /Integration\       Medium Speed, Medium Cost
   /-------------\      Moderate tests
  /               \
 /   Unit Tests    \    Fast, Cheap, Stable
/-------------------\   Many tests

Pyramid Principles

Test Type	Scope	Speed	Cost	Quantity
Unit	Single function/class	Milliseconds	Very low	70-80%
Integration	Multiple components	Seconds	Medium	15-25%
E2E	Entire system	Minutes	High	5-10%

Why pyramid shape?

Base (Unit): Fast feedback, easy to maintain, pinpoint failures
Middle (Integration): Test interactions, realistic scenarios
Top (E2E): Critical user journeys, full system validation

An inverted pyramid (mostly E2E tests) produces a slow, flaky, expensive test suite.
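One way to keep an eye on the shape is to count tests per level and compare the shares against the table above. The sketch below is illustrative only: `pyramid_shares`, `is_inverted`, and the sample counts are made-up names and numbers, not part of any testing tool.

```python
def pyramid_shares(counts):
    """Return each level's share of the total test count as a fraction."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tests counted")
    return {level: n / total for level, n in counts.items()}


def is_inverted(counts):
    """True when E2E tests outnumber unit tests (the inverted shape)."""
    return counts.get("e2e", 0) > counts.get("unit", 0)


# Hypothetical suite: 400 unit, 80 integration, 20 E2E tests.
healthy = {"unit": 400, "integration": 80, "e2e": 20}
shares = pyramid_shares(healthy)
```

A check like this could run in CI as a warning; the percentages in the table are guidelines, so a hard failure is probably too strict.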

Unit Testing

What is Unit Testing?

Test a single "unit" of code (function, method, class) in isolation from dependencies.

Characteristics

Fast: Run in milliseconds
Isolated: No database, network, file system
Repeatable: Same result every time
Independent: Can run in any order
Self-validating: Pass/fail, no manual inspection
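In practice, "Isolated" and "Repeatable" often mean turning hidden inputs, such as the current time, into parameters. A minimal sketch, assuming a hypothetical `greeting` function (not one of this guide's examples) whose result depends on the hour of day:

```python
from datetime import datetime


def greeting(now=None):
    """Return a greeting for the given time; defaults to the real clock."""
    current = now if now is not None else datetime.now()
    return "Good morning" if current.hour < 12 else "Good afternoon"


def test_greeting_before_noon():
    # A fixed timestamp makes the result identical on every run.
    assert greeting(datetime(2024, 1, 1, 9, 0)) == "Good morning"


def test_greeting_after_noon():
    assert greeting(datetime(2024, 1, 1, 15, 0)) == "Good afternoon"
```

Production code calls `greeting()` with no argument and gets the real clock; tests pass a fixed `datetime`, so they never depend on when they run.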

Python (pytest):

# calculator.py
def add(a, b):
    return a + b

def divide(a, b):
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

# test_calculator.py
import pytest
from calculator import add, divide

def test_add_positive_numbers():
    assert add(2, 3) == 5

def test_add_negative_numbers():
    assert add(-1, -1) == -2

def test_divide_normal():
    assert divide(10, 2) == 5

def test_divide_by_zero_raises_error():
    with pytest.raises(ValueError, match="Cannot divide by zero"):
        divide(10, 0)

# Run: pytest test_calculator.py
# Output:
# test_calculator.py ....  [100%]
# 4 passed in 0.02s

JavaScript (Jest):

// calculator.js
function add(a, b) {
    return a + b;
}

function divide(a, b) {
    if (b === 0) {
        throw new Error("Cannot divide by zero");
    }
    return a / b;
}

module.exports = { add, divide };

// calculator.test.js
const { add, divide } = require('./calculator');

describe('Calculator', () => {
    describe('add', () => {
        it('should add two positive numbers', () => {
            expect(add(2, 3)).toBe(5);
        });

        it('should add negative numbers', () => {
            expect(add(-1, -1)).toBe(-2);
        });
    });

    describe('divide', () => {
        it('should divide numbers correctly', () => {
            expect(divide(10, 2)).toBe(5);
        });

        it('should throw error when dividing by zero', () => {
            expect(() => divide(10, 0)).toThrow('Cannot divide by zero');
        });
    });
});

// Run: npm test
// Output:
// PASS  calculator.test.js
//   Calculator
//     add
//       ✓ should add two positive numbers (2ms)
//       ✓ should add negative numbers (1ms)
//     divide
//       ✓ should divide numbers correctly (1ms)
//       ✓ should throw error when dividing by zero (1ms)

Test Doubles: Mocks, Stubs, Fakes, Spies

Replace dependencies to isolate the unit under test.

1. Stub

Provides canned responses to calls. No behavior verification.

# Stub: Returns predetermined value
class EmailServiceStub:
    def send_email(self, to, subject, body):
        return True  # Always succeeds

def test_user_registration():
    email_service = EmailServiceStub()
    user_service = UserService(email_service)

    result = user_service.register("alice@example.com")

    assert result.success

2. Mock

Records interactions and allows verification of how it was called.

from unittest.mock import ANY, Mock

def test_user_registration_sends_welcome_email():
    email_service = Mock()
    user_service = UserService(email_service)

    user_service.register("alice@example.com")

    # Verify email service was called with correct arguments.
    # ANY is a module-level helper in unittest.mock (not an attribute of Mock)
    # that compares equal to any value.
    email_service.send_email.assert_called_once_with(
        to="alice@example.com",
        subject="Welcome!",
        body=ANY
    )
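The stub and mock examples rely on a `UserService` class that this guide does not define. The sketch below fills that gap with a hypothetical `UserService` and `RegistrationResult` (both are assumptions made for illustration), so the mock-based test can run on its own. It uses `unittest.mock.ANY` as the wildcard for the email body.

```python
from dataclasses import dataclass
from unittest.mock import ANY, Mock


@dataclass
class RegistrationResult:
    success: bool


class UserService:
    """Hypothetical service: registers a user, then sends a welcome email."""

    def __init__(self, email_service):
        self.email_service = email_service

    def register(self, email):
        sent = self.email_service.send_email(
            to=email, subject="Welcome!", body=f"Hello {email}, welcome aboard."
        )
        return RegistrationResult(success=bool(sent))


def test_registration_sends_welcome_email():
    email_service = Mock()
    email_service.send_email.return_value = True  # stubbed response

    result = UserService(email_service).register("alice@example.com")

    assert result.success
    # ANY matches whatever body text the service chose to send.
    email_service.send_email.assert_called_once_with(
        to="alice@example.com", subject="Welcome!", body=ANY
    )
```

Note that a single `Mock` can act as a stub (via `return_value`) and as a mock (via `assert_called_once_with`) at the same time; the difference lies in how the test uses it.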

3. Fake

Working implementation, but simplified (e.g., in-memory database).

class FakeDatabase:
    def __init__(self):
        self.users = {}

    def save(self, user):
        self.users[user.id] = user

    def find_by_id(self, user_id):
        return self.users.get(user_id)

def test_user_repository():
    db = FakeDatabase()  # Fake, not real database
    repo = UserRepository(db)

    user = User(id=1, name="Alice")
    repo.save(user)

    found = repo.find_by_id(1)
    assert found.name == "Alice"

4. Spy

Records information about how it was called, but uses real implementation.

from unittest.mock import MagicMock

def test_cache_usage():
    cache = MagicMock(wraps=RealCache())  # Spy: real behavior + tracking

    service = DataService(cache)
    service.get_data("key123")
    service.get_data("key123")  # Second call should use cache

    assert cache.get.call_count == 2
    assert cache.set.call_count == 1  # Only set once
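The spy example assumes `RealCache` and `DataService` classes that are not shown. A self-contained sketch follows, with both classes written as hypothetical stand-ins. It uses `Mock(wraps=...)`, which passes each call through to the wrapped object while recording it.

```python
from unittest.mock import Mock


class RealCache:
    """Hypothetical in-memory cache."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class DataService:
    """Hypothetical service that loads a value once, then reads it from the cache."""

    def __init__(self, cache):
        self.cache = cache

    def get_data(self, key):
        value = self.cache.get(key)
        if value is None:
            value = f"loaded:{key}"  # stand-in for an expensive lookup
            self.cache.set(key, value)
        return value


def test_second_call_is_served_from_cache():
    cache = Mock(wraps=RealCache())  # calls pass through to the real object
    service = DataService(cache)

    first = service.get_data("key123")
    second = service.get_data("key123")

    assert first == second == "loaded:key123"
    assert cache.get.call_count == 2
    assert cache.set.call_count == 1  # only the first call populated the cache
```

Because the real cache still does the work, the test checks both the returned values and the interaction counts.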

Type	Purpose	Verifies Behavior
Stub	Provide predetermined responses	No
Mock	Verify interactions	Yes
Fake	Simplified working implementation	No
Spy	Track calls on real object	Yes

Integration Testing

What is Integration Testing?

Test interactions between multiple components/modules/services. Verify they work together correctly.

What to Test

Database queries and transactions
API endpoints
Message queue producers/consumers
File system operations
External service integrations

Database Integration Test:

import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from myapp.models import User, Base
from myapp.repositories import UserRepository

@pytest.fixture
def db_session():
    # Setup: Create in-memory test database
    engine = create_engine('sqlite:///:memory:')
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)
    session = Session()

    yield session  # Provide session to test

    # Teardown: Close session
    session.close()

def test_user_repository_create_and_find(db_session):
    repo = UserRepository(db_session)

    # Create user
    user = User(name="Alice", email="alice@example.com")
    repo.save(user)

    # Find user
    found = repo.find_by_email("alice@example.com")

    assert found is not None
    assert found.name == "Alice"
    assert found.email == "alice@example.com"

def test_user_repository_update(db_session):
    repo = UserRepository(db_session)

    user = User(name="Bob", email="bob@example.com")
    repo.save(user)

    # Update user
    user.name = "Robert"
    repo.save(user)

    # Verify update
    found = repo.find_by_email("bob@example.com")
    assert found.name == "Robert"

API Integration Test:

import pytest
from fastapi.testclient import TestClient
from myapp.main import app

@pytest.fixture
def client():
    return TestClient(app)

def test_create_user_endpoint(client):
    response = client.post("/users", json={
        "name": "Alice",
        "email": "alice@example.com"
    })

    assert response.status_code == 201
    data = response.json()
    assert data["name"] == "Alice"
    assert data["email"] == "alice@example.com"
    assert "id" in data

def test_get_user_endpoint(client):
    # Setup: Create user
    create_response = client.post("/users", json={
        "name": "Bob",
        "email": "bob@example.com"
    })
    user_id = create_response.json()["id"]

    # Test: Get user
    response = client.get(f"/users/{user_id}")

    assert response.status_code == 200
    data = response.json()
    assert data["name"] == "Bob"

def test_get_nonexistent_user_returns_404(client):
    response = client.get("/users/99999")
    assert response.status_code == 404

Challenges

Slower: Real database, network calls
Setup complexity: Need test databases, containers
Data cleanup: Reset state between tests
Flakiness: Network timeouts, race conditions

Best Practices

Use test databases (separate from development/production)
Use transactions and rollback for cleanup
Use Docker containers for consistent environments
Test critical integration points, not every combination
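The "transactions and rollback" practice can be sketched with the standard-library `sqlite3` module. The `users` table and the helper names below are assumptions made for illustration; a real suite would more likely wrap this in a fixture of its test framework.

```python
import sqlite3


def make_connection():
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    conn.commit()
    return conn


def count_users(conn):
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


def run_in_rollback(conn, test_body):
    """Run test_body(conn), then roll back whatever it wrote."""
    try:
        test_body(conn)
    finally:
        conn.rollback()


def insert_alice(conn):
    conn.execute("INSERT INTO users (email) VALUES (?)", ("alice@example.com",))
    assert count_users(conn) == 1  # visible inside the open transaction


conn = make_connection()
run_in_rollback(conn, insert_alice)
run_in_rollback(conn, insert_alice)  # no UNIQUE violation: the first insert was undone
```

Rolling back is usually faster than deleting rows or recreating the schema, but it only works when the code under test does not commit on its own.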

End-to-End (E2E) Testing

What is E2E Testing?

Test complete user workflows through the entire system, from UI to database. Simulate real user behavior.

Tools

Selenium: Browser automation (oldest, most mature)
Cypress: Modern, fast, developer-friendly (JavaScript)
Playwright: Cross-browser, reliable (Microsoft)
Puppeteer: Chrome/Chromium automation (Google)

Cypress Example:

// cypress/integration/user_registration.spec.js
// (Cypress 10 and later default to cypress/e2e/ with a .cy.js suffix)
describe('User Registration Flow', () => {
    beforeEach(() => {
        // Setup: Visit registration page
        cy.visit('http://localhost:3000/register');
    });

    it('should successfully register new user', () => {
        // Fill out form
        cy.get('input[name="name"]').type('Alice Smith');
        cy.get('input[name="email"]').type('alice@example.com');
        cy.get('input[name="password"]').type('SecurePassword123');
        cy.get('input[name="confirmPassword"]').type('SecurePassword123');

        // Submit form
        cy.get('button[type="submit"]').click();

        // Verify success
        cy.url().should('include', '/dashboard');
        cy.contains('Welcome, Alice Smith').should('be.visible');

        // Verify user can see dashboard content
        cy.get('.dashboard-menu').should('be.visible');
    });

    it('should show error for invalid email', () => {
        cy.get('input[name="email"]').type('invalid-email');
        cy.get('input[name="password"]').type('password123');
        cy.get('button[type="submit"]').click();

        cy.contains('Invalid email address').should('be.visible');
        cy.url().should('include', '/register'); // Still on registration page
    });

    it('should show error when passwords do not match', () => {
        cy.get('input[name="password"]').type('password123');
        cy.get('input[name="confirmPassword"]').type('different');
        cy.get('button[type="submit"]').click();

        cy.contains('Passwords do not match').should('be.visible');
    });
});

Playwright Example:

// tests/e2e/checkout.spec.ts
import { test, expect } from '@playwright/test';

test.describe('E-commerce Checkout Flow', () => {
    test('complete purchase flow', async ({ page }) => {
        // 1. Browse products
        await page.goto('http://localhost:3000');
        await expect(page.locator('h1')).toContainText('Products');

        // 2. Add item to cart
        await page.click('text=Add to Cart >> nth=0');
        await expect(page.locator('.cart-count')).toHaveText('1');

        // 3. Go to cart
        await page.click('text=Cart');
        await expect(page.locator('.cart-item')).toHaveCount(1);

        // 4. Proceed to checkout
        await page.click('text=Checkout');

        // 5. Fill shipping info
        await page.fill('input[name="address"]', '123 Main St');
        await page.fill('input[name="city"]', 'San Francisco');
        await page.fill('input[name="zip"]', '94102');

        // 6. Enter payment
        await page.fill('input[name="cardNumber"]', '4111111111111111');
        await page.fill('input[name="expiry"]', '12/25');
        await page.fill('input[name="cvv"]', '123');

        // 7. Complete order
        await page.click('text=Place Order');

        // 8. Verify confirmation
        await expect(page.locator('.order-confirmation')).toBeVisible();
        await expect(page.locator('.order-number')).toContainText(/ORD-\d+/);
    });
});

Trade-offs

Pros:

Test real user workflows
Catch integration issues unit tests miss
Confidence in production readiness

Cons:

Slow: Minutes instead of milliseconds
Flaky: Network issues, timing problems, UI changes
Expensive: Hard to write and maintain
Late feedback: Find bugs late in development

Best Practices

Test critical user journeys only (login, checkout, signup)
Use data-testid attributes instead of brittle CSS class or structural selectors
Avoid testing implementation details
Use explicit waits, not arbitrary sleeps
Run in CI/CD pipeline but not on every commit
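The idea behind "explicit waits, not arbitrary sleeps" is to poll for a condition and stop as soon as it holds, with a hard timeout as a safety net. Tools such as Cypress and Playwright build this retry behavior into their assertions. The sketch below shows the underlying pattern in plain Python; `wait_until` and `FakePage` are hypothetical names, not part of any browser-automation library.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll condition() until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)


class FakePage:
    """Stand-in for a page whose banner appears after a few polls."""

    def __init__(self, ready_after):
        self.polls = 0
        self.ready_after = ready_after

    def banner_visible(self):
        self.polls += 1
        return self.polls >= self.ready_after


page = FakePage(ready_after=3)
wait_until(page.banner_visible, timeout=2.0, interval=0.01)
```

Compared with a fixed `time.sleep(2)`, the test finishes as soon as the condition holds and fails with a clear error when it never does.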

Test-Driven Development (TDD)

The Red-Green-Refactor Cycle

RED: Write a failing test

- Write minimal test that fails

- Run test, verify it fails for right reason

GREEN: Make it pass

- Write minimal code to pass test

- Don't worry about perfection yet

REFACTOR: Improve the code

- Clean up code while tests still pass

- Remove duplication, improve design

REPEAT

TDD Example: Building a Stack

# Iteration 1: RED
def test_new_stack_is_empty():
    stack = Stack()
    assert stack.is_empty()
# Run test → FAILS (Stack doesn't exist)

# GREEN: Minimal implementation
class Stack:
    def is_empty(self):
        return True
# Run test → PASSES

# REFACTOR: (nothing to refactor yet)


# Iteration 2: RED
def test_push_adds_item_to_stack():
    stack = Stack()
    stack.push(1)
    assert not stack.is_empty()
# Run test → FAILS

# GREEN: Make it pass
class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def is_empty(self):
        return len(self.items) == 0
# Run test → PASSES

# REFACTOR: (looks good)


# Iteration 3: RED
def test_pop_returns_last_item_pushed():
    stack = Stack()
    stack.push(1)
    stack.push(2)
    assert stack.pop() == 2
# Run test → FAILS

# GREEN
class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()

    def is_empty(self):
        return len(self.items) == 0
# Run test → PASSES


# Iteration 4: RED
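def test_pop_on_empty_stack_raises_error():
    stack = Stack()
    with pytest.raises(IndexError, match="Pop from empty stack"):
        stack.pop()
# Run test → FAILS (list.pop() already raises IndexError, but with the
# message "pop from empty list", which leaks the internal list)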

# GREEN
class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        if self.is_empty():
            raise IndexError("Pop from empty stack")
        return self.items.pop()

    def is_empty(self):
        return len(self.items) == 0
# Run test → PASSES

Benefits of TDD

Better design: Forces you to think about interface before implementation
Confidence: Every line of code has a test
Less debugging: Catch bugs immediately
Documentation: Tests show how code should be used
Prevents over-engineering: Write only what's needed

Challenges

Slower initial development (faster overall with less debugging)
Requires discipline and practice
Can be difficult with legacy code
Not suitable for all situations (exploratory coding, prototypes)

Behavior-Driven Development (BDD)

What is BDD?

An extension of TDD that focuses on behavior from the user's perspective. It uses natural language (Given-When-Then) to describe tests.

Given-When-Then Format

Given: Initial context/setup
When: Action/event
Then: Expected outcome

Gherkin (Cucumber) Syntax:

Feature: User Login
    As a registered user
    I want to log in to my account
    So that I can access my dashboard

    Scenario: Successful login with valid credentials
        Given I am on the login page
        And I have a registered account with email "alice@example.com"
        When I enter email "alice@example.com"
        And I enter password "SecurePassword123"
        And I click the "Login" button
        Then I should be redirected to the dashboard
        And I should see "Welcome, Alice"

    Scenario: Failed login with invalid password
        Given I am on the login page
        When I enter email "alice@example.com"
        And I enter password "WrongPassword"
        And I click the "Login" button
        Then I should see an error message "Invalid credentials"
        And I should remain on the login page

    Scenario: Failed login with unregistered email
        Given I am on the login page
        When I enter email "nonexistent@example.com"
        And I enter password "SomePassword"
        And I click the "Login" button
        Then I should see an error message "User not found"

Python BDD (pytest-bdd):

# features/login.feature (same as above)

# tests/step_defs/test_login.py
from pytest_bdd import scenarios, given, when, then, parsers

scenarios('../features/login.feature')

@given('I am on the login page')
def on_login_page(browser):
    browser.visit('/login')

@given(parsers.parse('I have a registered account with email "{email}"'))
def registered_user(email, database):
    database.create_user(email=email, password="SecurePassword123")

@when(parsers.parse('I enter email "{email}"'))
def enter_email(email, browser):
    browser.fill('email', email)

@when(parsers.parse('I enter password "{password}"'))
def enter_password(password, browser):
    browser.fill('password', password)

@when(parsers.parse('I click the "{button}" button'))
def click_button(button, browser):
    browser.find_by_text(button).click()

@then('I should be redirected to the dashboard')
def on_dashboard(browser):
    assert browser.url.endswith('/dashboard')

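@then(parsers.parse('I should see "{text}"'))
def see_text(text, browser):
    assert browser.is_text_present(text)

# The two failure scenarios need these steps as well; without them
# pytest-bdd cannot bind every step in the feature file.
@then(parsers.parse('I should see an error message "{message}"'))
def see_error_message(message, browser):
    assert browser.is_text_present(message)

@then('I should remain on the login page')
def still_on_login_page(browser):
    assert browser.url.endswith('/login')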

TDD vs BDD

Aspect	TDD	BDD
Focus	Implementation correctness	User behavior and business value
Language	Technical (code)	Natural language (Given-When-Then)
Audience	Developers	Developers, QA, Product Owners, Business
Scope	Usually unit level	Usually feature/integration level
Goal	Drive design, ensure correctness	Shared understanding, living documentation

Test Coverage

What is Code Coverage?

Percentage of code executed during tests. Measured by:

Line coverage: % of lines executed
Branch coverage: % of decision branches taken
Function coverage: % of functions called
Statement coverage: % of statements executed
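Line and branch coverage can diverge when an `if` has no `else`. In the sketch below, `apply_discount` is a hypothetical function made up for illustration. The first test alone executes every line, yet the path where the condition is false is never taken. Branch measurement (for example pytest-cov's `--cov-branch` option) would be expected to flag that gap.

```python
def apply_discount(price, is_member):
    """Hypothetical pricing rule: members get a flat 10 off."""
    if is_member:
        price = price - 10
    return price


def test_member_gets_discount():
    # Executes every line of apply_discount, so line coverage is 100%...
    assert apply_discount(100, True) == 90


def test_non_member_pays_full_price():
    # ...but only this test takes the "condition is false" branch.
    assert apply_discount(100, False) == 100
```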

Python Coverage Example:

# Install: pip install pytest-cov

# Run tests with coverage
pytest --cov=myapp tests/

# Output:
# Name                Stmts   Miss  Cover
# ---------------------------------------
# myapp/__init__.py       2      0   100%
# myapp/calculator.py    15      2    87%
# myapp/user.py          42      8    81%
# ---------------------------------------
# TOTAL                  59     10    83%

# Generate HTML report
pytest --cov=myapp --cov-report=html tests/

# View in browser: htmlcov/index.html
# Shows exactly which lines aren't covered

Coverage Misconceptions

Misconception: 100% coverage = bug-free code

Coverage measures which lines were executed, not whether the results were checked or correct. You can have 100% coverage with terrible tests.

Example of 100% coverage with bad tests:

def divide(a, b):
    return a / b  # Bug: No zero check!

def test_divide():
    divide(10, 2)  # 100% coverage, but doesn't test edge cases!
    # Doesn't test: divide by zero, negative numbers, floats, etc.

Better approach:

Aim for 70-90% coverage (sweet spot)
Focus on critical paths, edge cases
Coverage is an indicator, not a goal
100% coverage on new code is reasonable
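For contrast with the assertion-free test above, the sketch below checks results and edge cases for a `divide` that includes the zero check from the unit testing section. It uses plain `try`/`except` instead of `pytest.raises` so that it runs without any third-party package; the test names are illustrative.

```python
import math


def divide(a, b):
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b


def test_divide_returns_quotient():
    assert divide(10, 2) == 5


def test_divide_handles_negative_numbers():
    assert divide(-9, 3) == -3


def test_divide_handles_floats():
    # Compare floats with a tolerance rather than exact equality.
    assert math.isclose(divide(1, 3), 0.3333333333, rel_tol=1e-9)


def test_divide_by_zero_raises_value_error():
    try:
        divide(10, 0)
    except ValueError as exc:
        assert "Cannot divide by zero" in str(exc)
    else:
        raise AssertionError("expected ValueError")
```

Coverage for this file is the same as for the bad example, but each test now fails when the behavior is wrong.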

Testing Best Practices

FIRST Principles

Principle	Meaning
Fast	Tests should run quickly (milliseconds for unit tests)
Isolated	Tests don't depend on each other, can run in any order
Repeatable	Same result every time (no randomness, no external dependencies)
Self-validating	Pass/fail automatically, no manual checking
Timely	Written before or with code, not after

Arrange-Act-Assert (AAA Pattern)

def test_user_can_update_profile():
    # Arrange: Set up test data and dependencies
    user = User(name="Alice", email="alice@example.com")
    user_service = UserService(database=FakeDatabase())

    # Act: Execute the behavior being tested
    result = user_service.update_profile(user, name="Alicia")

    # Assert: Verify expected outcome
    assert result.success
    assert user.name == "Alicia"
    assert user.email == "alice@example.com"  # Unchanged

Test Naming

Good test names describe what's being tested and expected behavior.

# Bad names (bodies elided with ... so the snippet is valid Python)
def test1(): ...
def test_user(): ...
def test_error(): ...

# Good names (descriptive)
def test_user_registration_with_valid_email_succeeds(): ...
def test_withdraw_more_than_balance_raises_insufficient_funds_error(): ...
def test_expired_auth_token_returns_401_unauthorized(): ...

# Alternative format: test_[method]_[scenario]_[expected_result]
def test_divide_by_zero_raises_value_error(): ...
def test_sort_empty_list_returns_empty_list(): ...

# BDD style
def test_given_invalid_email_when_registering_then_validation_error_raised(): ...

What to Test

DO test:

Business logic and algorithms
Edge cases (null, empty, boundary values)
Error handling
Critical user paths
Complex conditionals

DON'T test:

Framework code (e.g., Django ORM, React)
Third-party libraries
Getters/setters with no logic
Simple constructors
Private methods (test through public interface)
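Edge cases from the DO list (null, empty, boundary values) lend themselves to a table of inputs and expected results. A minimal sketch, assuming a hypothetical `is_valid_username` rule invented for this example:

```python
def is_valid_username(name):
    """Hypothetical rule: 3 to 20 characters, letters and digits only."""
    if name is None:
        return False
    return 3 <= len(name) <= 20 and name.isalnum()


# (input, expected) pairs chosen at and around each boundary.
CASES = [
    (None, False),        # null
    ("", False),          # empty
    ("ab", False),        # just below the minimum length
    ("abc", True),        # exactly the minimum
    ("a" * 20, True),     # exactly the maximum
    ("a" * 21, False),    # just above the maximum
    ("bad name", False),  # disallowed character
]


def test_username_boundaries():
    for value, expected in CASES:
        # repr(value) in the message shows which input failed.
        assert is_valid_username(value) is expected, repr(value)
```

Testing just below, exactly at, and just above each limit tends to catch off-by-one mistakes that a single "happy path" input misses.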

One Assert Per Test?

Guideline: One logical concept per test, not necessarily one assert.

# Good: Multiple asserts, one logical concept
def test_user_creation_sets_all_fields_correctly():
    user = create_user(name="Alice", email="alice@example.com", age=30)

    assert user.name == "Alice"
    assert user.email == "alice@example.com"
    assert user.age == 30
    # All asserts verify the same concept: user creation

# Bad: Multiple concepts in one test
def test_user_operations():
    user = create_user("Alice")  # Concept 1
    user.update_email("new@example.com")  # Concept 2
    user.delete()  # Concept 3
    # Split into 3 separate tests

Common Anti-Patterns

1. Testing Implementation Details

# Bad: Tests internal implementation
def test_user_service_calls_repository_save():
    mock_repo = Mock()
    service = UserService(mock_repo)

    service.create_user("Alice")

    mock_repo.save.assert_called_once()  # Testing HOW, not WHAT

# Good: Test behavior/outcome
def test_user_service_creates_user_successfully():
    repo = FakeUserRepository()
    service = UserService(repo)

    user = service.create_user("Alice")

    assert user.name == "Alice"
    assert repo.find_by_name("Alice") is not None  # Verify outcome

2. Fragile Tests (Tightly Coupled)

// Bad: Fragile CSS selectors
cy.get('.css-14a8v3k > div:nth-child(2) > button').click()

// Good: Semantic selectors
cy.get('[data-testid="submit-button"]').click()

3. Test Interdependence

# Bad: Tests depend on execution order
shared_user = None

def test_create_user():
    global shared_user
    shared_user = create_user("Alice")

def test_update_user():  # Depends on test_create_user!
    shared_user.update_email("new@example.com")

# Good: Each test independent
def test_create_user():
    user = create_user("Alice")
    assert user.name == "Alice"

def test_update_user():
    user = create_user("Bob")  # Create fresh user
    user.update_email("new@example.com")
    assert user.email == "new@example.com"

4. Ignoring Test Failures

# Bad: Commenting out failing tests
# def test_payment_processing():
#     # TODO: Fix this test later
#     pass

# Good: Fix or delete the test, don't ignore

Key Takeaways

Test pyramid: Many unit tests, some integration, few E2E
Unit tests: Fast, isolated, test single units
Integration tests: Test component interactions
E2E tests: Test critical user journeys through UI
TDD: Write test first (Red-Green-Refactor)
BDD: Focus on behavior using natural language
Coverage: Aim for 70-90%, not 100%
FIRST principles: Fast, Isolated, Repeatable, Self-validating, Timely
AAA pattern: Arrange, Act, Assert
Test behavior, not implementation

Interview Tips

When discussing testing:

Explain the testing pyramid and why it matters
Give examples of tests you've written
Discuss trade-offs (speed vs confidence, cost vs coverage)
Mention specific tools you've used (pytest, Jest, Cypress, etc.)
Talk about TDD if you practice it
Be honest about coverage targets (70-90% is realistic)
Discuss how testing influenced design
Mention CI/CD integration
