HTTP Recording and Replay: VCR, Polly, and Nock

The Record/Replay Pattern

HTTP recording and replay solves a real problem: your code makes HTTP requests to external APIs (Stripe, Twilio, GitHub, etc.), but those APIs are expensive, slow, rate-limited, or unavailable in CI. You can mock them manually, but that requires maintaining fake responses that may drift from reality.

The record/replay pattern takes a different approach:

First run (recording): Tests run against the real API. All HTTP interactions are captured to a cassette file (YAML or JSON).
Subsequent runs (replay): Tests use the cassette instead of the network. Responses are deterministic, instant, and free.

This gives you realistic test data without the cost, and your cassettes live in version control so everyone on the team uses the same fixtures.
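
The two phases above can be sketched in a few lines of plain Python. This is a minimal illustration of the pattern, not how vcrpy or any other library is implemented; `replay_or_record`, `fake_fetch`, and the JSON layout are hypothetical names chosen for the example.

```python
import json
import tempfile
from pathlib import Path


def replay_or_record(cassette_path, key, fetch):
    """Return the stored response for `key`, calling `fetch` only on a miss."""
    path = Path(cassette_path)
    # Load previously recorded interactions, if a cassette already exists.
    cassette = json.loads(path.read_text()) if path.exists() else {}
    if key in cassette:
        return cassette[key]  # replay: the real service is not called
    response = fetch()        # record: call the real service once
    cassette[key] = response
    path.write_text(json.dumps(cassette, indent=2, sort_keys=True))
    return response


# Hypothetical stand-in for a real HTTP call; it counts how often it runs.
calls = []


def fake_fetch():
    calls.append(1)
    return {"status": 200, "body": {"login": "octocat"}}


cassette_file = Path(tempfile.mkdtemp()) / "github_user.json"
first = replay_or_record(cassette_file, "GET /users/octocat", fake_fetch)
second = replay_or_record(cassette_file, "GET /users/octocat", fake_fetch)
```

Here `fake_fetch` runs only for the first call; the second call is served from the JSON file on disk.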

vcrpy (Python)

vcrpy is the Python implementation of the VCR pattern. It intercepts HTTP requests made by requests, httpx, urllib3, and other common libraries.

pip install vcrpy

Basic Usage

import vcr
import requests

def test_github_user_api():
    with vcr.VCR().use_cassette('tests/cassettes/github_user.yaml'):
        response = requests.get('https://api.github.com/users/octocat')
        assert response.status_code == 200
        assert response.json()['login'] == 'octocat'

The first time this test runs, vcrpy makes the real HTTP request and saves the interaction to github_user.yaml. On subsequent runs, it reads from the cassette without hitting the network.

pytest-recording

pytest-recording is a pytest plugin that wraps vcrpy, adding a vcr marker and a --record-mode command-line option:

pip install pytest-recording

import pytest
import requests

@pytest.mark.vcr  # Uses cassette named after the test function
def test_stripe_charge():
    response = requests.post(
        'https://api.stripe.com/v1/charges',
        headers={'Authorization': 'Bearer sk_test_xxx'},
        data={'amount': 1000, 'currency': 'usd', 'source': 'tok_visa'}
    )
    assert response.status_code == 200
    assert response.json()['status'] == 'succeeded'

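# Record cassettes for the first time (existing recordings are replayed, new requests are added)
pytest --record-mode=new_episodes tests/test_payments.py

# Re-record all cassettes from scratch
pytest --record-mode=rewrite tests/test_payments.py

# Normal test run (pytest-recording defaults to --record-mode=none: cassettes only, no network)
pytest tests/test_payments.py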
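
Because the test above sends an Authorization header, it is worth filtering that header out of anything that gets recorded. pytest-recording reads vcrpy options from a `vcr_config` fixture; the sketch below assumes it lives in a `conftest.py` next to the tests, and `VCR_SETTINGS` is just an illustrative name for the options dictionary.

```python
# conftest.py (assumed location)
import pytest

# Options passed through to vcrpy for every test marked with @pytest.mark.vcr.
VCR_SETTINGS = {
    # Strip credentials from recorded requests so they never reach the cassette.
    "filter_headers": ["authorization"],
}


@pytest.fixture(scope="module")
def vcr_config():
    return VCR_SETTINGS
```

Filtering only affects what is written to the cassette; the real request made during recording still carries the header.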

Request Matching

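By default vcrpy matches on method and the URI components (scheme, host, port, path, query); request bodies and headers are not compared unless you add them. You can customize this:

import vcr

my_vcr = vcr.VCR(
    record_mode='new_episodes',
    # These are vcrpy's default matchers, spelled out. Append 'body' or
    # 'headers' to compare those too; leaving 'body' out means dynamic
    # values such as timestamps in the payload do not break matching.
    match_on=['method', 'scheme', 'host', 'port', 'path', 'query'],
    # Hooks that can modify an interaction before it is saved.
    # These identity functions are placeholders.
    before_record_request=lambda r: r,
    before_record_response=lambda r: r,
)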
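
To make the effect of match_on concrete, here is a standalone sketch of attribute-based matching using only the standard library. It illustrates the idea rather than vcrpy's actual code; `request_key` and `DEFAULT_PORTS` are hypothetical names, and the URLs are made up.

```python
from urllib.parse import parse_qsl, urlsplit

# Ports implied by the scheme when the URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}
DEFAULT_MATCH_ON = ("method", "scheme", "host", "port", "path", "query")


def request_key(method, url, body=None, match_on=DEFAULT_MATCH_ON):
    """Build a comparison key from only the attributes named in match_on."""
    parts = urlsplit(url)
    attributes = {
        "method": method.upper(),
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        # Sort query pairs so parameter order does not affect matching.
        "query": tuple(sorted(parse_qsl(parts.query))),
        "body": body,
    }
    return tuple(attributes[name] for name in match_on)


url = "https://api.example.com/v1/events?b=2&a=1"
recorded = request_key("POST", url, body='{"ts": 1}')
replayed = request_key("post", "https://api.example.com:443/v1/events?a=1&b=2", body='{"ts": 2}')

# Adding "body" to the matchers makes the differing payloads significant.
strict = DEFAULT_MATCH_ON + ("body",)
strict_recorded = request_key("POST", url, body='{"ts": 1}', match_on=strict)
strict_replayed = request_key("POST", url, body='{"ts": 2}', match_on=strict)
```

With the default attributes the two requests produce the same key even though their bodies differ; once "body" is included, they no longer match.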

Nock (Node.js)

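Nock intercepts HTTP requests made through Node's http and https modules. Unlike VCR, Nock stubs are usually defined in code rather than recorded from real responses, although Nock also ships recording helpers (nock.recorder and nock.back) for a record/replay workflow.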

const nock = require('nock');
const { fetchUser } = require('../lib/api');

beforeEach(() => {
  nock.cleanAll();
});

test('fetchUser returns user data on 200', async () => {
  nock('https://api.example.com')
    .get('/users/42')
    .reply(200, { id: 42, name: 'Alice', email: 'alice@example.com' });

  const user = await fetchUser(42);
  expect(user.name).toBe('Alice');
});

test('fetchUser throws on 404', async () => {
  nock('https://api.example.com')
    .get('/users/999')
    .reply(404, { error: 'User not found' });

  await expect(fetchUser(999)).rejects.toThrow('User not found');
});

test('fetchUser retries on 503', async () => {
  nock('https://api.example.com')
    .get('/users/42')
    .reply(503)  // first call fails
    .get('/users/42')
    .reply(200, { id: 42, name: 'Alice' });  // retry succeeds

  const user = await fetchUser(42);
  expect(user.id).toBe(42);
});

Polly.js

Polly.js is a higher-level record/replay library for JavaScript that works in Node.js (via the node-http adapter) and in browsers (via the fetch and XHR adapters).

const { Polly } = require('@pollyjs/core');
const NodeHttpAdapter = require('@pollyjs/adapter-node-http');
const FSPersister = require('@pollyjs/persister-fs');

Polly.register(NodeHttpAdapter);
Polly.register(FSPersister);

describe('GitHub API client', () => {
  let polly;

  beforeEach(() => {
    polly = new Polly('GitHub API', {
      adapters: ['node-http'],
      persister: 'fs',
      persisterOptions: { fs: { recordingsDir: '__recordings__' } },
    });
  });

  afterEach(async () => { await polly.stop(); });

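  test('fetches repository info', async () => {
    // With no custom handlers, Polly records this request on the first run
    // and replays it afterwards. Avoid server.get(url).passthrough() here:
    // passthrough sends the request to the real API and skips recording.
    // The node-http adapter hooks Node's http/https modules, so `fetch` is
    // assumed to be an http-based client such as node-fetch.
    const response = await fetch('https://api.github.com/repos/octocat/hello-world');
    expect(response.status).toBe(200);
  });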
});

Cassette Management

Committing Cassettes

Commit cassettes to version control so all developers and CI use the same recorded data. Add cassette directories to your git repo but ensure sensitive values are scrubbed:

# Scrub sensitive values before a cassette is saved
import re

import vcr

def scrub_auth_headers(request):
    # Drop credential headers from the recorded request
    request.headers.pop('Authorization', None)
    request.headers.pop('X-API-Key', None)
    return request

def scrub_response_secrets(response):
    # Redact secret keys in the recorded response body.
    # Assumes the body is uncompressed UTF-8 text.
    body = response['body']['string']
    if isinstance(body, bytes):
        body = body.decode()
    body = re.sub(r'sk_live_[a-zA-Z0-9]+', 'sk_live_REDACTED', body)
    response['body']['string'] = body.encode()
    return response

my_vcr = vcr.VCR(
    before_record_request=scrub_auth_headers,
    before_record_response=scrub_response_secrets,
)
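
Scrubbing hooks are easy to forget on a new test, so a cheap safety net is a check that scans committed cassettes for anything that still looks like a secret. The following is a sketch under assumptions: the patterns, the `find_leaks` helper, and the throwaway cassette files are all illustrative and should be adapted to the credentials your project actually uses.

```python
import re
import tempfile
from pathlib import Path

# Illustrative patterns only; extend them for your own credential formats.
SECRET_PATTERNS = [
    re.compile(r"sk_live_(?!REDACTED)[A-Za-z0-9]+"),
    re.compile(r"Bearer\s+[A-Za-z0-9._\-]{8,}"),
]


def find_leaks(cassette_dir):
    """Return (file name, matched text) pairs for anything resembling a secret."""
    leaks = []
    for path in sorted(Path(cassette_dir).rglob("*.yaml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for pattern in SECRET_PATTERNS:
            leaks.extend((path.name, match.group(0)) for match in pattern.finditer(text))
    return leaks


# Demo with two throwaway cassettes: one scrubbed, one leaking a key.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "clean.yaml").write_text("body: {string: 'key=sk_live_REDACTED'}\n")
(demo_dir / "leaky.yaml").write_text("body: {string: 'key=sk_live_abc123'}\n")
leaks = find_leaks(demo_dir)
```

Run as a test or pre-commit step, a non-empty result from `find_leaks` would fail the build before the cassette is pushed.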

Cassette Expiration

Cassettes can drift from the real API over time. Establish a refresh policy:

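# Re-record cassettes that are older than 30 days (CI cron job).
# Caveat: a fresh git checkout resets file modification times, so -mtime is
# only meaningful on a persistent workspace. On ephemeral CI runners, derive
# the age from git history instead (e.g. git log -1 --format=%ct -- <file>).
find tests/cassettes -name '*.yaml' -mtime +30 -exec rm {} \;
pytest --record-mode=new_episodes tests/
git add tests/cassettes && git commit -m 'chore: refresh cassettes'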
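
The same age check can be written in Python, which makes it easier to report stale cassettes instead of deleting them outright. This is a sketch that, like the find command, relies on file modification times; `stale_cassettes` is a hypothetical helper and the demo files are created only to exercise it.

```python
import os
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def stale_cassettes(cassette_dir, max_age_days=30, now=None):
    """Return cassette paths whose modification time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        path for path in Path(cassette_dir).rglob("*.yaml")
        if path.stat().st_mtime < cutoff
    )


# Demo: create one fresh cassette and one artificially aged by 40 days.
demo_dir = Path(tempfile.mkdtemp())
fresh = demo_dir / "fresh.yaml"
old = demo_dir / "old.yaml"
fresh.write_text("interactions: []\n")
old.write_text("interactions: []\n")
forty_days_ago = time.time() - 40 * SECONDS_PER_DAY
os.utime(old, (forty_days_ago, forty_days_ago))

stale = stale_cassettes(demo_dir)
```

A scheduled job could delete the returned paths and re-run the suite in a recording mode, or simply print them as a reminder.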

Pitfalls

Cassette drift: The real API changes but the cassette is stale. vcrpy does not expire cassettes on its own, so re-record them on a schedule as described under Cassette Expiration.
Over-mocking: If you mock too much, you are testing your mock rather than real behavior.
Cassette bloat: Recording large responses (file downloads, paginated lists) can make cassette files unwieldy. Filter or paginate deliberately.
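Non-deterministic requests: If your code puts timestamps or UUIDs in request bodies, matching fails whenever the body is part of match_on. vcrpy does not compare bodies by default, so either leave body out of match_on or normalize the volatile fields before comparison.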
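
For the last pitfall, normalizing the body could look like the sketch below. It assumes JSON request bodies and a hypothetical set of timestamp field names (`VOLATILE_KEYS`); the resulting comparison could be wrapped in a custom matcher and registered with vcrpy's register_matcher, but the code here is standalone and uses only the standard library.

```python
import json
import re

UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)
# Hypothetical field names that carry per-request timestamps.
VOLATILE_KEYS = {"timestamp", "created_at"}


def normalize(value, key=None):
    """Recursively replace volatile values with fixed placeholders."""
    if isinstance(value, dict):
        return {k: normalize(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize(item) for item in value]
    if key in VOLATILE_KEYS:
        return "<TIMESTAMP>"
    if isinstance(value, str) and UUID_RE.match(value):
        return "<UUID>"
    return value


def normalized_body(raw_body):
    """Parse a JSON body and return a canonical string with volatile values masked."""
    return json.dumps(normalize(json.loads(raw_body)), sort_keys=True)


first = '{"id": "3f2b8c1e-9d4a-4c6b-8e1f-0a1b2c3d4e5f", "timestamp": 1700000000, "amount": 1000}'
second = '{"amount": 1000, "timestamp": 1700000999, "id": "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d"}'
same = normalized_body(first) == normalized_body(second)
```

The two bodies differ only in their UUID, timestamp, and key order, so they compare equal after normalization, while a change to a stable field such as amount would still be detected.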