Load Testing APIs: k6, Locust, and Artillery

How to load test your APIs and validate status codes under pressure — test script design, ramp-up patterns, threshold assertions, and interpreting results.

Load Testing Goals

Load testing answers questions that no unit or integration test can:

  • What is the maximum throughput before error rates climb?
  • Does the API return 503s or just slow down under overload?
  • Are there memory leaks that surface only after 10 minutes of sustained load?
  • Does the 99th percentile latency stay within SLO under normal traffic?

Load testing should be a regular part of your CI pipeline, not a one-off exercise before a major launch. Regressions in performance are just as real as regressions in correctness.

k6

k6 is a Go-based load testing tool with a JavaScript scripting API. It is fast, has low overhead per virtual user, and integrates well with CI pipelines.

Basic Script

// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  vus: 50,          // 50 virtual users
  duration: '2m',   // run for 2 minutes
  thresholds: {
    http_req_failed: ['rate<0.01'],          // <1% error rate
    http_req_duration: ['p(95)<500'],         // 95th percentile < 500ms
    'http_req_duration{status:200}': ['p(99)<1000'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/users');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 200ms': (r) => r.timings.duration < 200,
    'has results array': (r) => r.json('results') !== undefined,
  });
  sleep(1);
}

k6 run load-test.js

Ramp-Up Patterns

Real traffic doesn't start at full load. Use stages to ramp up gradually:

export const options = {
  stages: [
    { duration: '30s', target: 10 },   // ramp up to 10 VUs
    { duration: '1m',  target: 50 },   // ramp up to 50 VUs
    { duration: '2m',  target: 50 },   // hold at 50 VUs
    { duration: '30s', target: 0  },   // ramp down
  ],
};
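
k6 ramps the VU count linearly between stage targets. As a rough mental model of the schedule above (a simplified sketch in Python, not k6 itself; the `target_vus` helper and its stage tuples are illustrative names):

```python
def target_vus(elapsed, stages, start_vus=0):
    """Interpolate the VU count at `elapsed` seconds, assuming a linear
    ramp between stage targets (a simplified model of k6 stages)."""
    t0, prev = 0.0, start_vus
    for duration, target in stages:
        if elapsed <= t0 + duration:
            frac = (elapsed - t0) / duration
            return prev + frac * (target - prev)
        t0, prev = t0 + duration, target
    return prev  # past the last stage, hold the final target

# The stages above: ramp to 10, ramp to 50, hold at 50, ramp down
stages = [(30, 10), (60, 50), (120, 50), (30, 0)]
print(target_vus(15, stages))   # halfway through the first ramp -> 5.0
print(target_vus(120, stages))  # inside the hold phase -> 50.0
```

Walking through the schedule like this is a quick way to sanity-check total test duration and peak load before burning CI minutes on a real run.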

Checking Status Codes

import http from 'k6/http';
import { check } from 'k6';
import { Rate } from 'k6/metrics';

const errorRate = new Rate('errors');

export default function () {
  const res = http.post('https://api.example.com/orders', JSON.stringify({
    item_id: 1, qty: 2
  }), { headers: { 'Content-Type': 'application/json' } });

  const ok = check(res, {
    'status is 201': (r) => r.status === 201,
    'not a 429': (r) => r.status !== 429,
    'not a 500': (r) => r.status !== 500,
  });
  errorRate.add(!ok);
}

Running k6 in CI

# GitHub Actions
- name: Load test
  uses: grafana/k6-action@v0.3.1
  with:
    filename: tests/load-test.js
  env:
    BASE_URL: https://staging.example.com

Locust

Locust is a Python load testing tool. Its test scripts are plain Python classes, making it easy to integrate with your existing Python codebase and test fixtures.

# locustfile.py
from locust import HttpUser, task, between

class APIUser(HttpUser):
    wait_time = between(1, 3)  # think time between requests

    def on_start(self):
        """Called once per user when they start."""
        response = self.client.post('/auth/token', json={
            'username': '[email protected]',
            'password': 'testpassword'
        })
        self.token = response.json()['access_token']

    @task(3)  # weight: called 3x more often than other tasks
    def list_users(self):
        with self.client.get(
            '/api/users',
            headers={'Authorization': f'Bearer {self.token}'},
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f'Expected 200, got {response.status_code}')

    @task(1)
    def create_order(self):
        with self.client.post(
            '/api/orders',
            json={'item_id': 1, 'qty': 1},
            headers={'Authorization': f'Bearer {self.token}'},
            catch_response=True
        ) as response:
            if response.status_code not in (200, 201):
                response.failure(f'Unexpected status: {response.status_code}')

# Headless mode (CI)
locust -f locustfile.py --headless -u 100 -r 10 --run-time 2m \
  --host https://staging.example.com \
  --only-summary

# With web UI
locust -f locustfile.py
# open http://localhost:8089

Distributed Locust

For very high load, run Locust in distributed mode:

# Master node
locust -f locustfile.py --master --expect-workers 4

# Worker nodes (each on a separate machine)
locust -f locustfile.py --worker --master-host=<master-ip>

Artillery

Artillery uses YAML configuration for test scenarios, making it easy to review in pull requests and share with non-developers.

# load-test.yml
config:
  target: https://api.example.com
  phases:
    - duration: 60
      arrivalRate: 10
      name: Warm up
    - duration: 120
      arrivalRate: 50
      name: Sustained load
  plugins:
    expect: {}

scenarios:
  - name: API smoke test
    flow:
      - post:
          url: /auth/token
          json:
            username: [email protected]
            password: testpassword
          capture:
            json: $.access_token
            as: token
          expect:
            - statusCode: 200
      - get:
          url: /api/users
          headers:
            Authorization: Bearer {{ token }}
          expect:
            - statusCode: 200
            - hasProperty: results

npx artillery run load-test.yml --output report.json
npx artillery report report.json

Interpreting Results

Key Metrics

  • Requests/sec (RPS): throughput ceiling
  • Error rate by status: where failures occur (4xx vs 5xx)
  • p50 latency: typical user experience
  • p95/p99 latency: worst-case experience for 5%/1% of users
  • Max concurrency: the point at which the system starts to degrade
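
Percentiles are rank statistics over the observed latency samples. A minimal nearest-rank implementation (one common convention; k6, Locust, and Artillery each use their own estimators) makes the metric concrete:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p% of all samples are less than or equal to it."""
    xs = sorted(samples)
    k = math.ceil(p / 100 * len(xs))  # 1-based rank
    return xs[max(k, 1) - 1]

latencies_ms = [95, 100, 105, 110, 115, 120, 125, 130, 400, 980]
print(percentile(latencies_ms, 50))  # -> 115
print(percentile(latencies_ms, 95))  # -> 980
```

Note how a single 980 ms outlier sets the p95 here: tail percentiles are exactly where overload shows up first, long before the median moves.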

Reading Error Patterns

  • Climbing 429s: You are hitting rate limits — adjust load or request throttling config
  • Sudden 502/503 spike: A downstream service is falling over
  • Gradual 500 increase: Memory leak or connection pool exhaustion — check heap/pool metrics
  • Latency spikes without errors: GC pauses, lock contention, or disk I/O saturation
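
These patterns can be checked mechanically from per-window status-code counts, which all three tools can export. A rough sketch (the `spot_patterns` name and the thresholds are arbitrary choices for illustration, not part of any tool; it assumes at least two windows):

```python
def spot_patterns(windows):
    """Flag the error patterns above, given a time-ordered list of
    per-window {status_code: count} dicts."""
    flags = []
    c429 = [w.get(429, 0) for w in windows]
    if c429[-1] > 0 and all(b > a for a, b in zip(c429, c429[1:])):
        flags.append("climbing 429s")
    c5xx = [w.get(502, 0) + w.get(503, 0) for w in windows]
    if c5xx[-1] >= 5 * (max(c5xx[:-1]) + 1):  # sudden jump vs. prior windows
        flags.append("sudden 502/503 spike")
    c500 = [w.get(500, 0) for w in windows]
    if c500[-1] > c500[0] and all(b >= a for a, b in zip(c500, c500[1:])):
        flags.append("gradual 500 increase")
    return flags

windows = [{200: 990, 429: 2}, {200: 960, 429: 18}, {200: 900, 429: 61}]
print(spot_patterns(windows))  # -> ['climbing 429s']
```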

Bottleneck Identification

Correlate load test results with server metrics:

# During a load test, watch these in parallel:
# CPU: top, htop
# Connections: ss -s
# DB: pg_stat_activity (max active connections)
# Memory: free -h
# Logs: tail -f /var/log/app/error.log | grep -E '50[0-9]'

A load test that produces no 5xx errors but shows 90% CPU saturation tells you to scale horizontally. One that shows low CPU but 503s tells you connection limits or queues are full.
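
That rule of thumb can be written down as a tiny decision helper; the cutoffs below are illustrative assumptions, not universal limits:

```python
def diagnose(cpu_utilization, rate_5xx):
    """Map the heuristic above onto two measurements: peak CPU
    utilization (0..1) during the test and the observed 5xx rate (0..1)."""
    if cpu_utilization >= 0.9 and rate_5xx < 0.01:
        return "CPU-bound: scale horizontally"
    if cpu_utilization < 0.5 and rate_5xx >= 0.01:
        return "capacity limits: check connection pools and queue depths"
    return "inconclusive: correlate more metrics"

print(diagnose(0.92, 0.001))  # hot CPU, clean responses
print(diagnose(0.30, 0.05))   # idle CPU, lots of 5xx
```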
