Building Your Test Doubles: A Practical Guide to Mocking and Stubbing for Beginners

Every developer who writes automated tests eventually hits the same wall: the code you want to test talks to a database, calls an external API, or depends on a random number generator. Your test becomes slow, unpredictable, or impossible to run without a network connection. That's where test doubles come in. They let you replace real dependencies with controlled stand-ins, so you can test the logic itself without worrying about the environment.

In this guide, we'll walk through the four main types of test doubles—stubs, mocks, fakes, and spies—using concrete analogies and a realistic project scenario. We'll focus on the decisions beginners face: when to use each type, how to avoid brittle tests, and whether to use a mocking framework or write doubles by hand. By the end, you'll have a practical framework you can apply to your own codebase.

Why You Need Test Doubles and When to Reach for Them

Imagine you're testing a function that calculates a discount based on a user's loyalty points. The function calls a getLoyaltyPoints(userId) method that queries a database. In a unit test, you don't want to set up a real database and seed it with test data—that's slow, fragile, and makes your test a slow integration test. Instead, you replace that database call with a double that returns a fixed number of points. Now your test runs in milliseconds and never fails because the database was down.

The core idea is simple: isolate the unit under test from its dependencies. But the details matter. If you replace too much, your test might pass even when the real code would fail. If you replace too little, your test is still slow and brittle. The key is to double only the dependencies that make the test slow, non-deterministic, or hard to set up. That usually includes external services, file systems, clocks, random number generators, and databases.

Common Scenarios That Call for Doubles

Here are three everyday situations where test doubles save the day:

External API calls – Your code sends a request to a payment gateway. In a unit test, you replace the HTTP client with a stub that returns a fake response. This keeps your test fast and avoids charging real credit cards.
Randomness – A function that generates a random coupon code. Replace the random number generator with a stub that always returns the same value, so your test can assert the exact output.
Time-dependent logic – A function that checks if a subscription is expired. Replace the system clock with a fake that returns a fixed date, so you can test edge cases like the exact expiration moment.

Notice the pattern: you double anything that makes the test non-deterministic or slow. The goal is to make your tests reliable and fast, not to mock everything in sight.

The Four Types of Test Doubles: Stubs, Mocks, Fakes, and Spies

Test doubles come in four main flavors, each with a different job. Think of them like roles in a movie production. A stunt double (stub) replaces the actor for a dangerous scene and just follows instructions—it returns a fixed value when asked. A spy watches what happens and reports back—it records which methods were called and with what arguments. A fake is a simplified but working version of the real thing, like a lightweight in-memory database. A mock is both a stunt double and a spy with expectations—it knows ahead of time which calls should happen and fails the test if they don't.

Stubs: The Simple Stand-In

A stub is the simplest double. It returns a canned answer when a specific method is called. You use a stub when you need the dependency to just return a value so the code under test can proceed. For example, you stub getUserEmail(userId) to return [email protected] so you can test the email-sending logic. Stubs don't care how many times they're called or in what order—they just answer.

Mocks: Expectations and Verification

A mock is a stub with attitude. It comes with pre-programmed expectations about which methods should be called, how many times, and with what arguments. If the real code deviates from those expectations, the mock fails the test. This is useful when you want to verify that your code interacts with a dependency correctly—for instance, that it calls sendEmail() exactly once after a user signs up. Overusing mocks can make tests brittle, because they break whenever you change the internal implementation, even if the external behavior stays the same.

Fakes: Lightweight Replacements

A fake is a working implementation that is simpler or faster than the real thing. An in-memory database that implements the same interface as your production database is a classic fake. Fakes are great for integration-level tests where you need real behavior (like saving and querying data) but don't want the overhead of a real database. The risk is that your fake might not behave exactly like the real system, leading to false confidence.

Spies: Recording What Happened

A spy wraps a real object and records every call made to it. After the test runs, you can inspect the spy to see which methods were called, how many times, and with what arguments. Spies are less intrusive than mocks because they don't enforce expectations during the test—they just observe. You use them when you want to verify interactions without committing to a specific call pattern ahead of time.

How to Choose Between Stubs, Mocks, Fakes, and Spies

The choice depends on what you're testing and how much control you need. Here's a decision framework that works for most projects.

Start with Stubs

If you only need the dependency to return a value so your code can continue, use a stub. It's the simplest option and keeps your tests flexible. For example, if you're testing a function that formats a user's full name, stub the getFirstName() and getLastName() methods to return known strings. You don't care how many times they're called—you just need the data.

Use Mocks When Interaction Matters

Reach for a mock when the test must verify that your code called a specific method with the right arguments—for example, that after placing an order, the code called inventoryService.deductStock(itemId, quantity) exactly once. Mocks are also useful for testing that error handling works: you can mock a dependency to throw an exception and verify that your code handles it gracefully.

Prefer Fakes for Complex Logic

If the dependency has rich behavior (like a repository with CRUD operations), a fake often works better than a stub. Stubbing every method on a repository would be tedious and error-prone. A fake in-memory repository lets you test queries, updates, and deletions naturally. The trade-off is that you need to maintain the fake as the real interface evolves.

Spies for Observation Without Enforcement

Spies are ideal when you want to log interactions for later assertion but don't want to lock in expectations during the test. They're also handy for testing that a method was called at least once, without specifying an exact count. Many mocking frameworks offer spies as a built-in feature.

Trade-Offs Between Hand-Written Doubles and Mocking Frameworks

Beginners often wonder whether to write doubles by hand or use a mocking library like Mockito (Java), unittest.mock (Python), or Sinon.js (JavaScript). Both approaches have strengths and weaknesses.

Approach	Pros	Cons
Hand-written doubles	Full control; no magic; easy to debug; no library dependency	Verbose; tedious for large interfaces; must update manually when interface changes
Mocking framework	Fast to write; auto-generates stubs; built-in verification; less boilerplate	Can produce brittle tests; magic syntax confuses beginners; harder to debug
Hybrid (hand-written fakes + framework for mocks)	Best of both worlds: fakes for stable interfaces, mocks for external calls	Requires discipline; team must agree on conventions

For small projects or teams new to testing, hand-written doubles can be a good start. They make the mechanics explicit. As the codebase grows, a mocking framework reduces boilerplate and makes tests easier to read. The key is to avoid over-mocking—using a framework to mock everything, including simple data objects, which leads to tests that break on every refactor.

When Hand-Written Doubles Make Sense

Consider a composite scenario: a team is building a notification service that sends emails and SMS. The interface has two methods: sendEmail(to, subject, body) and sendSms(to, message). Writing a hand-written stub for this interface takes five minutes and is crystal clear—any developer can look at the stub and see what it does. The team decides to keep it hand-written because the interface is small and stable.

When a Mocking Framework Helps

Now imagine the same team adds a third-party analytics client with a dozen methods, each taking complex configuration objects. Writing hand-written stubs for all those methods would be a chore. The team switches to a mocking framework to generate stubs on the fly. They still write a hand-written fake for the database repository, because that interface is central and changes often.

Common Pitfalls That Make Tests Brittle

Even with the best intentions, beginners often fall into traps that turn test doubles from a help into a hindrance. Here are the most common ones and how to avoid them.

Over-Mocking

The biggest mistake is mocking everything, including simple data structures and value objects. If you mock a User class that only has getters, you're adding complexity for no benefit. Use real objects for data, and mock only dependencies with behavior or side effects.

Mocking Implementation Details

Tests that mock internal calls (like private methods or specific argument orders) break when you refactor the code, even if the external behavior stays the same. Instead, mock at the boundary of your system—the methods that cross a network boundary or touch a database. For example, mock the httpClient.post() call, not the internal buildRequestBody() method.

Strict Mocks That Fail on Unordered Calls

Some mocking frameworks default to strict mocks, where the order of method calls matters. This often leads to tests that fail because of a call order that doesn't affect correctness. Use loose mocks or verification phases to avoid this. Only enforce order when it's part of the specification (e.g., you must authenticate before sending a request).

Ignoring Teardown

If your test creates a fake database or mocks a global object, you must clean up after the test. Otherwise, state leaks between tests, causing mysterious failures. Use setup and teardown methods (like @Before and @After in JUnit) to reset doubles between tests.

Mini-FAQ: Answers to Common Beginner Questions

Q: Should I always use a mocking framework?
A: No. For simple interfaces, hand-written doubles are clearer and easier to debug. Use a framework when the interface is large or changes often, or when you need advanced features like argument matchers.

Q: How do I test that an exception was thrown by a dependency?
A: Use a mock or stub that throws the exception when the method is called. Then assert that your code handles it (e.g., catches it and returns a fallback value).

Q: My test passes with a mock but fails in production. What went wrong?
A: Your mock likely doesn't match the real dependency's behavior. Common issues: the mock returns a value that the real system would never return, or the mock doesn't simulate timing or network failures. Fakes can help here because they run real logic.

Q: Can I mix stubs and mocks in the same test?
A: Yes, but be careful. Use stubs for dependencies that just provide data, and mocks only for the one interaction you want to verify. Too many mocks in one test make it hard to understand.

Q: How do I test code that uses a random number generator?
A: Replace the random generator with a stub that returns a fixed sequence of numbers. Many languages have a seedable random class that you can inject.

Putting It All Together: A Practical Workflow

Here's a concrete plan you can apply to your next testing session.

Identify the unit under test – Choose a single function or method to test. Draw a boundary around it.
List its direct dependencies – Which objects or services does it call? Mark each as fast/deterministic (no double needed) or slow/non-deterministic (needs a double).
Choose the double type for each dependency – Use a stub for simple return values, a mock for interaction verification, a fake for complex behavior, and a spy for observation.
Write the double – Decide between hand-written or framework. Start simple; you can always refactor later.
Write the test – Arrange (set up doubles), Act (call the method), Assert (check results and/or interactions).
Run and refactor – If the test is brittle (breaks on unrelated changes), loosen the mock or switch to a stub.

This workflow keeps you focused on what matters: testing the logic, not the infrastructure. Over time, you'll develop an intuition for which dependencies need doubles and which can stay real.

Test doubles are a tool, not a goal. Use them to make your tests fast, reliable, and easy to write. When in doubt, start with the simplest double that works, and only add complexity when you need it. Your future self—and your teammates—will thank you.

Building Your Test Doubles: A Practical Guide to Mocking and Stubbing for Beginners

Table of Contents

Why You Need Test Doubles and When to Reach for Them

Common Scenarios That Call for Doubles

The Four Types of Test Doubles: Stubs, Mocks, Fakes, and Spies

Stubs: The Simple Stand-In

Mocks: Expectations and Verification

Fakes: Lightweight Replacements

Spies: Recording What Happened

How to Choose Between Stubs, Mocks, Fakes, and Spies

Start with Stubs

Use Mocks When Interaction Matters

Prefer Fakes for Complex Logic

Spies for Observation Without Enforcement

Trade-Offs Between Hand-Written Doubles and Mocking Frameworks

When Hand-Written Doubles Make Sense

When a Mocking Framework Helps

Common Pitfalls That Make Tests Brittle

Over-Mocking

Mocking Implementation Details

Strict Mocks That Fail on Unordered Calls

Ignoring Teardown

Mini-FAQ: Answers to Common Beginner Questions

Putting It All Together: A Practical Workflow

Comments (0)

Table of Contents

Why You Need Test Doubles and When to Reach for Them

Common Scenarios That Call for Doubles

The Four Types of Test Doubles: Stubs, Mocks, Fakes, and Spies

Stubs: The Simple Stand-In

Mocks: Expectations and Verification

Fakes: Lightweight Replacements

Spies: Recording What Happened

How to Choose Between Stubs, Mocks, Fakes, and Spies

Start with Stubs

Use Mocks When Interaction Matters

Prefer Fakes for Complex Logic

Spies for Observation Without Enforcement

Trade-Offs Between Hand-Written Doubles and Mocking Frameworks

When Hand-Written Doubles Make Sense

When a Mocking Framework Helps

Common Pitfalls That Make Tests Brittle

Over-Mocking

Mocking Implementation Details

Strict Mocks That Fail on Unordered Calls

Ignoring Teardown

Mini-FAQ: Answers to Common Beginner Questions

Putting It All Together: A Practical Workflow

Share this article:

Comments (0)

Related Articles

Mocking & Stubbing with Real-World Craft Tools: A Zencraft Guide

Mocking & Stubbing with Real-World Analogies: A Beginner’s Craft Guide

Mocking and Stubbing: Building Test Doubles with Real-World Analogies