Mocking and Stubbing: Building Test Doubles with Real-World Analogies

When writing automated tests, one of the trickiest challenges is isolating the piece of code you want to test from its dependencies—databases, web services, file systems, or other classes. Without isolation, a test might fail because of a network outage, not because your code has a bug. That's where test doubles come in. This guide explains mocking and stubbing through relatable real-world analogies, helping you understand not just how to build them, but why they work and when to use each type.

We'll define the core concepts, walk through a practical workflow, compare popular tools, and highlight common mistakes. By the end, you'll have a solid framework for writing tests that are fast, reliable, and maintainable.

Why Test Doubles Matter: The Problem of Dependencies

The Real-World Analogy: Movie Stunt Doubles

Imagine you're filming a scene where the hero jumps from a moving car. You wouldn't ask the lead actor to perform that dangerous stunt—you'd hire a stunt double. The stunt double looks similar enough to fool the camera, but they're specialized for the risky action. In testing, your code under test is the lead actor, and its dependencies (like a database or API) are the dangerous stunts. A test double steps in to perform the dependency's role safely and predictably.

Why Dependencies Make Testing Hard

Dependencies introduce uncertainty. A test that calls a real database might fail because the database is down, data has changed, or network latency causes a timeout. These failures are not about your code's correctness—they're environmental. Moreover, dependencies often make tests slow. A test that hits a real API might take seconds, while a well-isolated test runs in milliseconds. In a large suite, that speed difference adds up, discouraging developers from running tests frequently.

What Test Doubles Achieve

Test doubles replace real dependencies with lightweight substitutes that you control. They let you:

Isolate failures: A test fails only if your code has a bug, not because of external factors.
Control scenarios: You can simulate edge cases like network errors, empty responses, or timeouts that are hard to trigger with real components.
Speed up tests: Doubles eliminate I/O and network calls, making tests fast enough to run on every commit.
Simplify setup: You don't need to configure a database or mock server for every test.

Without test doubles, every test is effectively an integration test. Integration tests are valuable, but they belong in a separate layer rather than serving as your primary unit-testing approach.

Core Concepts: Stubs, Mocks, Fakes, and Spies

Analogy: The Flight Simulator

A flight simulator is a fake airplane. It mimics the controls and instruments so a pilot can practice without leaving the ground. But not all simulators are the same: some only provide instrument readings (a stub), while others verify that the pilot flips the right switches in order (a mock). This analogy helps distinguish the different types of test doubles.

Stubs: Providing Predefined Answers

A stub is a test double that returns fixed values when its methods are called. You use a stub when you need your code to receive a specific response from a dependency, but you don't care how many times it's called or in what order. For example, if your function reads a user's name from a database, a stub might return 'Alice' every time. Stubs are the simplest form of test double.
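As a minimal sketch of the 'Alice' example, here is one way a stub might look in Python using the standard library's unittest.mock. The names greeting_for and get_user_name are hypothetical and chosen only for illustration.

```python
from unittest.mock import Mock


def greeting_for(user_id, user_repo):
    """Code under test: builds a greeting from a name looked up in a repository."""
    name = user_repo.get_user_name(user_id)
    return f"Hello, {name}!"


# Stub: returns the same canned value on every call; nothing is verified on it.
# `get_user_name` is a hypothetical method name chosen for this example.
stub_repo = Mock()
stub_repo.get_user_name.return_value = "Alice"

result = greeting_for(42, stub_repo)
```

The test would then assert only on result; how often the stub was called is deliberately left unchecked.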

Mocks: Verifying Interactions

A mock is a test double that records which methods were called and with what arguments, then asserts that the expected interactions happened. You use a mock when you need to verify that your code does something—for instance, that it sends an email or writes a log entry. Mocks go beyond stubs by checking behavior, not just providing data.
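The log-entry case could be sketched like this in Python. The function deactivate_account and the audit_log.write method are assumptions made for the example, not part of any real API.

```python
from unittest.mock import Mock


def deactivate_account(user_id, audit_log):
    """Code under test: the audit entry is the side effect we care about."""
    # (The actual deactivation is omitted in this sketch.)
    audit_log.write(f"deactivated user {user_id}")


# `audit_log` and its `write` method are hypothetical names for this example.
mock_log = Mock()
deactivate_account(7, mock_log)

# Verification step: raises AssertionError unless write() was called
# exactly once with exactly this argument.
mock_log.write.assert_called_once_with("deactivated user 7")
```

Here the assertion is made on the double itself rather than on a return value, which is the defining trait of a mock.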

Fakes: Lightweight Implementations

A fake is a working but simplified implementation of a dependency. An in-memory database is a classic example: it supports the same operations as a real database but stores data in memory, so it's fast and doesn't require external infrastructure. Fakes are more complex to build but can be reused across many tests.
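A hand-written fake might look roughly like the following. The class name, its methods, and the convention of returning None for unknown ids are all assumptions for this sketch; a real fake would mirror whatever interface your actual repository exposes.

```python
class InMemoryUserRepository:
    """Fake: a working repository that keeps rows in a dict instead of a database."""

    def __init__(self):
        self._rows = {}
        self._next_id = 1

    def add(self, name):
        # Assign ids sequentially, as an auto-increment column might.
        user_id = self._next_id
        self._rows[user_id] = name
        self._next_id += 1
        return user_id

    def get(self, user_id):
        # Returns None for unknown ids, an assumed convention for this sketch.
        return self._rows.get(user_id)

    def delete(self, user_id):
        self._rows.pop(user_id, None)


repo = InMemoryUserRepository()
alice_id = repo.add("Alice")
bob_id = repo.add("Bob")
repo.delete(alice_id)
```

Because the fake has real behavior, tests can assert on its state after the code under test has run, for example that a deleted user can no longer be fetched.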

Spies: Observing Without Stubbing

A spy wraps a real object and records information about calls made to it. Unlike a mock, a spy doesn't replace the real behavior unless you explicitly stub it. Spies are useful when you want to test that a method was called on a real object without altering its behavior.
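In Python, one way to approximate a spy is the wraps argument of unittest.mock.Mock, which forwards calls to the real object while recording them. PriceCalculator is a hypothetical class used only for illustration.

```python
from unittest.mock import Mock


class PriceCalculator:
    """A real collaborator whose behavior we want to keep."""

    def total(self, unit_price, quantity):
        return unit_price * quantity


real = PriceCalculator()
# wraps= forwards each call to the real object while recording it.
spy = Mock(wraps=real)

amount = spy.total(5, 3)

# The real method ran (amount is the real product) and the call was recorded.
spy.total.assert_called_once_with(5, 3)
```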

Comparison Table

Type	Purpose	Verification	Example
Stub	Provide canned responses	None (data only)	Database query returns fixed user
Mock	Verify interactions	Assert method called with args	Email service's send was called with the expected recipient

Fake	Lightweight implementation	Optional (state-based)	In-memory repository
Spy	Record calls on real object	Assert call count or args	Logger recorded exactly one error

Building Test Doubles: A Step-by-Step Workflow

Step 1: Identify the Dependency Boundary

Before you build a double, decide which dependency to replace. Good candidates are external services (APIs, databases), I/O operations (file system, network), and non-deterministic components (random number generators, system clocks). Internal logic that is purely computational usually doesn't need doubling.
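The system clock is a good illustration of a non-deterministic boundary. A minimal sketch, assuming a hypothetical is_business_hours function, passes the clock in as a parameter so a test can substitute a fixed time:

```python
from datetime import datetime


def is_business_hours(now=datetime.now):
    """Code under test: takes the clock as a parameter so tests can replace it."""
    current = now()
    return 9 <= current.hour < 17


# Stand-ins for the system clock that always report a known time.
def fixed_morning():
    return datetime(2024, 1, 15, 10, 30)


def fixed_night():
    return datetime(2024, 1, 15, 23, 0)
```

Calling is_business_hours(fixed_morning) or is_business_hours(fixed_night) then gives the same answer on every run, regardless of when the test suite executes.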

Step 2: Choose the Right Type of Double

Ask yourself: What am I testing? If you're testing that your code correctly processes data from a dependency, use a stub. If you're testing that your code triggers a side effect (like sending a notification), use a mock. If you need a reusable replacement for an entire subsystem, build a fake. If you want to monitor calls on a real object, use a spy.

Step 3: Define the Interface

Most test double frameworks work with interfaces or abstract classes. In statically typed languages (Java, C#), you typically mock an interface. In dynamically typed languages (Python, JavaScript), you can mock any object or function. Ensure your code depends on abstractions, not concrete implementations—this is the Dependency Inversion Principle at work.
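Even in a dynamically typed language, tying a double to an explicit interface can catch drift between the double and the real thing. The sketch below uses unittest.mock.create_autospec; UserRepository and display_name are hypothetical names for this example.

```python
from unittest.mock import create_autospec


class UserRepository:
    """The abstraction the code under test depends on (hypothetical)."""

    def get_user(self, user_id):
        raise NotImplementedError


def display_name(repo, user_id):
    return repo.get_user(user_id)["name"].title()


# create_autospec builds a mock whose methods share the interface's signatures.
repo = create_autospec(UserRepository, instance=True)
repo.get_user.return_value = {"name": "alice"}

name = display_name(repo, 1)

# Calling with the wrong number of arguments raises TypeError
# instead of passing silently, as a plain Mock would allow.
try:
    repo.get_user()
    signature_enforced = False
except TypeError:
    signature_enforced = True
```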

Step 4: Configure the Double

Set up the double's behavior: what values to return, what exceptions to throw, and what arguments to expect. For mocks, also define the calls you expect. In Java, Mockito exposes this through a fluent API:

when(mockDatabase.getUser(1)).thenReturn(user);

In Python, a hand-rolled equivalent replaces the method with a function that returns the canned value:

mockDatabase.getUser = lambda user_id: user
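With Python's unittest.mock, the same configuration step is usually expressed through the return_value and side_effect attributes. The method names on database below are hypothetical; only the Mock attributes are part of the library.

```python
from unittest.mock import Mock

database = Mock()

# Fixed return value for every call.
database.get_user.return_value = {"id": 1, "name": "Alice"}

# Raise an exception when called, to simulate an outage.
database.save.side_effect = ConnectionError("database unavailable")

# A different result on each successive call: fail once, then succeed.
# Exception instances in the list are raised rather than returned.
database.ping.side_effect = [TimeoutError("slow"), "pong"]
```

The list form of side_effect is handy for exercising retry logic, since the first call fails and the second succeeds.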

Step 5: Inject the Double

Replace the real dependency with your double. This is often done via constructor injection, setter injection, or a dependency injection container. Avoid having the code under test instantiate its own dependencies with new; a hard-coded dependency leaves no seam through which a test can substitute a double.
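Constructor injection could be sketched as follows. SmtpMailer and WelcomeService are hypothetical classes, and the address is a placeholder on a reserved test domain.

```python
from unittest.mock import Mock


class SmtpMailer:
    """Stand-in for the real dependency (hypothetical); never used by the test."""

    def send(self, to, body):
        raise RuntimeError("would talk to a real mail server")


class WelcomeService:
    # The dependency arrives through the constructor, which gives tests a seam.
    def __init__(self, mailer):
        self._mailer = mailer

    def welcome(self, address):
        self._mailer.send(address, "Welcome aboard!")
        return True


# Production wiring would pass SmtpMailer(); the test passes a double instead.
mailer_double = Mock()
service = WelcomeService(mailer_double)
sent = service.welcome("new.user@example.test")
```

Because WelcomeService never constructs its own mailer, swapping the real one for a double requires no patching.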

Step 6: Exercise and Verify

Run the code under test, then verify the results. For stubs, check the return value or state changes. For mocks, call the verification method (e.g., verify(mockEmailService).send(email)). For fakes, assert on the fake's internal state.

Tools and Frameworks: Choosing What Works for You

Popular Mocking Libraries

Every major language has at least one dominant mocking framework. Here's a quick comparison:

Language	Library	Strengths	Weaknesses
Java	Mockito	Simple API, readable syntax, integrates with JUnit	Static-method mocking needs the inline mock maker (default since Mockito 5; older projects often used PowerMock)
Python	unittest.mock	Built-in, powerful patching, supports spies	Can be verbose for complex scenarios
JavaScript	Jest	All-in-one (assertions, mocks, spies), zero config	Module mocking can be tricky with ESM
C#	Moq	Strong typing, LINQ-like setup, good for .NET	Limited to interfaces and virtual methods

When to Use a Framework vs. Hand-Rolled Doubles

Mocking frameworks save time and reduce boilerplate, but they can make tests harder to read when overused. For simple cases, a hand-rolled stub class might be clearer. For example, instead of mocking a repository interface in every test, you could create an in-memory FakeRepository that implements the same interface. This is especially useful when the same fake can be reused across many tests.

Maintenance Considerations

Test doubles add a maintenance burden: when the real dependency's interface changes, all doubles must be updated. To minimize this, keep your interfaces small and stable. Also, avoid mocking types you don't own (e.g., third-party libraries)—wrap them in your own abstraction and mock that instead. This reduces coupling to external changes.
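A thin wrapper of this kind might look like the sketch below. The standard json module stands in for a third-party library here, and JsonCodec and export_settings are hypothetical names.

```python
import json
from unittest.mock import Mock


class JsonCodec:
    """Thin wrapper we own around a library we don't.

    The json module stands in for any third-party package in this sketch.
    """

    def encode(self, payload):
        return json.dumps(payload, sort_keys=True)


def export_settings(settings, codec, sink):
    """Code under test: depends only on our wrapper's small surface."""
    sink.write(codec.encode(settings))


# spec= restricts the double to attributes that JsonCodec really has.
codec_double = Mock(spec=JsonCodec)
codec_double.encode.return_value = "{}"
sink = Mock()

export_settings({"theme": "dark"}, codec_double, sink)
```

If the underlying library changes, only JsonCodec and its own small set of tests need to change; tests that use the double are unaffected.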

Real-World Scenarios: When and How to Apply Test Doubles

Scenario 1: Testing an Order Service That Calls a Payment Gateway

You have an OrderService that processes payments via an external PaymentGateway. Testing with a real gateway would require a credit card and risk charging real money. Instead, you mock the gateway. The test verifies that when the order amount is valid, the service calls charge(amount) exactly once. If the gateway throws an exception, the service should catch it and mark the order as failed.
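One possible shape for these two tests is sketched below. The PaymentError type, the place_order method, and the status strings are assumptions made for the example; only OrderService, PaymentGateway, and charge(amount) come from the scenario.

```python
from unittest.mock import Mock


class PaymentError(Exception):
    """Hypothetical error type raised by the gateway."""


class OrderService:
    def __init__(self, gateway):
        self._gateway = gateway

    def place_order(self, amount):
        if amount <= 0:
            return "rejected"
        try:
            self._gateway.charge(amount)
        except PaymentError:
            return "failed"
        return "paid"


def test_valid_order_charges_once():
    gateway = Mock()
    status = OrderService(gateway).place_order(25)
    assert status == "paid"
    gateway.charge.assert_called_once_with(25)


def test_gateway_failure_marks_order_failed():
    gateway = Mock()
    # Simulate a declined card without touching a real gateway.
    gateway.charge.side_effect = PaymentError("card declined")
    assert OrderService(gateway).place_order(25) == "failed"
```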

Scenario 2: Testing a Report Generator That Reads from a Database

Your ReportGenerator queries a database to produce a CSV file. Instead of setting up a test database with sample data, you stub the database repository to return a known list of records. The test then checks that the CSV output contains the correct headers and rows. This test runs in milliseconds and doesn't depend on database availability.
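A sketch of this scenario, assuming a hypothetical fetch_records method on the repository and an id/name/total column layout:

```python
import csv
import io
from unittest.mock import Mock


class ReportGenerator:
    def __init__(self, repository):
        self._repository = repository

    def to_csv(self):
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["id", "name", "total"])
        for record in self._repository.fetch_records():
            writer.writerow([record["id"], record["name"], record["total"]])
        return buffer.getvalue()


# Stub: a known list of records instead of a live database query.
stub_repository = Mock()
stub_repository.fetch_records.return_value = [
    {"id": 1, "name": "Alice", "total": 30},
    {"id": 2, "name": "Bob", "total": 12},
]

report = ReportGenerator(stub_repository).to_csv()
lines = report.splitlines()
```

The test can then compare lines against the expected header and rows without asserting anything about how the repository was called.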

Scenario 3: Testing a Notification System That Sends Emails

A NotificationService sends emails when certain events occur. You want to verify that the correct email is sent to the right recipient. Use a mock for the email sender. After calling the service, assert that send(to, subject, body) was called with the expected arguments. This test ensures the notification logic works without actually sending emails.
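The sketch below assumes a hypothetical event dictionary and an "order_shipped" event type; only NotificationService and send(to, subject, body) come from the scenario. It also checks the negative case, that unrelated events send nothing.

```python
from unittest.mock import Mock


class NotificationService:
    def __init__(self, email_sender):
        self._email_sender = email_sender

    def on_event(self, event):
        # Only shipped orders trigger an email in this sketch.
        if event["type"] != "order_shipped":
            return
        self._email_sender.send(
            event["customer_email"],
            "Your order has shipped",
            f"Order {event['order_id']} is on its way.",
        )


sender = Mock()
NotificationService(sender).on_event(
    {"type": "order_shipped", "order_id": 17, "customer_email": "pat@example.test"}
)
sender.send.assert_called_once_with(
    "pat@example.test", "Your order has shipped", "Order 17 is on its way."
)

quiet_sender = Mock()
NotificationService(quiet_sender).on_event(
    {"type": "order_created", "order_id": 18, "customer_email": "pat@example.test"}
)
quiet_sender.send.assert_not_called()
```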

Common Anti-Pattern: Mocking Everything

A frequent mistake is to mock every dependency, including simple data structures or value objects. This leads to brittle tests that break when you refactor internal implementation details. A good rule of thumb: mock only dependencies that cross a boundary (network, I/O, external system). For internal collaborators, let the real code run—it makes tests more robust.

Pitfalls and Mistakes: How to Avoid Brittle Tests

Over-Specifying Interactions

When you use mocks, it's tempting to verify every method call, including internal helper calls. This makes tests tightly coupled to the implementation. If you later refactor the code to use a different internal approach, the test breaks even though the external behavior is unchanged. Mitigation: only verify public interactions that represent side effects or outputs. Use stubs for intermediate data retrieval.
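To make the contrast concrete, here is a small sketch with a hypothetical total_with_tax function. The robust test asserts on the result; the brittle assertion is shown commented out for comparison.

```python
from unittest.mock import Mock


def total_with_tax(order_id, repo, tax_rate=0.25):
    """Code under test: fetches prices, then applies tax."""
    prices = repo.prices_for(order_id)
    return sum(prices) * (1 + tax_rate)


repo = Mock()
repo.prices_for.return_value = [10, 30]

# Robust: assert on the observable result only.
total = total_with_tax(1, repo)

# Brittle (left commented out): this would break if the function later
# fetched prices in two batches or through a cache, even though the
# total stayed the same.
# repo.prices_for.assert_called_once_with(1)
```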

Mocking Types You Don't Own

Mocking a third-party library's concrete class can cause tests to fail when the library updates. Instead, create a thin wrapper interface around the library and mock that. This also makes it easier to swap libraries later.

Ignoring Test Double Lifecycle

If you create a mock in one test and reuse it in another without resetting it, recorded calls and configured behavior can leak between tests. Some frameworks can reset or recreate mocks automatically, but this is often opt-in (for example, Jest's clearMocks and resetMocks settings), and hand-rolled fakes always need explicit care. Use setup methods (e.g., @Before in JUnit 4, @BeforeEach in JUnit 5, setUp in unittest) to create new doubles for each test.
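With unittest, a per-test setup might look like this sketch. FakeCounterStore is a hypothetical hand-rolled fake, and run_counter_tests is a small helper added only so the example can be run directly.

```python
import unittest


class FakeCounterStore:
    """Hand-rolled fake with internal state that could leak if shared."""

    def __init__(self):
        self.counts = {}

    def increment(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


class CounterTests(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so each test gets a fresh fake.
        self.store = FakeCounterStore()

    def test_first_increment_is_one(self):
        self.assertEqual(self.store.increment("visits"), 1)

    def test_second_increment_is_two(self):
        self.store.increment("visits")
        self.assertEqual(self.store.increment("visits"), 2)


def run_counter_tests():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CounterTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

If the fake were instead created once at module level, the first test's increment would still be in the store when the other test ran, and the outcome would depend on execution order.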

Not Testing the Real Integration

Test doubles are great for unit tests, but they can give false confidence if you never test the real integration. Always have a separate layer of integration tests that exercise the actual dependencies (with test databases or sandbox APIs). This catches mismatches between your double's behavior and the real system.

Decision Checklist: When to Use Each Type of Double

Quick Reference

Use a stub when you need to provide data to the code under test and you don't care how many times it's called.
Use a mock when you need to verify that a specific interaction occurred (e.g., a method was called with certain arguments).
Use a fake when you need a lightweight, reusable implementation of a dependency that multiple tests can share.
Use a spy when you want to observe calls on a real object without changing its behavior.
Avoid doubles altogether when the dependency is a pure function or a simple data structure—let the real code run.

Common Questions

Q: Should I use mocks or stubs for database queries? A: Usually stubs, because you're testing how your code processes the returned data, not that the database was called. If you need to verify that a specific query was executed (e.g., for auditing), use a mock.

Q: Can I combine mocks and stubs in the same test? A: Yes, but be careful. A test that uses both is often testing too much. Consider splitting into separate tests: one for data processing (stub) and one for side effects (mock).

Q: How do I handle exceptions from dependencies? A: Stub the method to throw an exception, then test that your code handles it gracefully. For example, when(mockRepo.find(any())).thenThrow(new DatabaseException()).

Synthesis and Next Steps

Key Takeaways

Test doubles are indispensable for writing fast, reliable unit tests. By understanding the differences between stubs, mocks, fakes, and spies, and applying the right type to each situation, you can create a test suite that gives you confidence without slowing you down. Remember to mock at the boundaries, prefer stubs for data, and always back up your unit tests with integration tests.

Actionable Next Steps

Audit your current tests: Identify tests that are slow or flaky due to real dependencies. Replace those dependencies with test doubles.
Introduce a mocking framework if you haven't already. Start with simple stubs to replace external services.
Write a test for a legacy module that has no tests. Use mocks to isolate it from its dependencies, even if the code isn't designed for testability. This will expose areas where you can refactor to interfaces.
Set a team standard: Agree on when to use mocks vs. stubs vs. fakes. Consistency reduces confusion and makes tests easier to read.

Test doubles are a tool, not a goal. Use them judiciously, and always keep the reader of your tests—your future self and your teammates—in mind. A well-written test with appropriate doubles is a joy to maintain.

About the Author

This article was prepared by the editorial team for this publication. We focus on practical explanations and update articles when major practices change.

Last reviewed: May 2026
