k6 Is Easy to Run — and Easy to Misuse

Modern engineering teams talk a lot about performance. Fewer teams actually own it.

Tools like k6 make it deceptively simple to spin up a load test, generate traffic, and produce charts that look reassuring. A single command, a JavaScript file, and suddenly you have “performance coverage.”

That ease is both k6’s strength — and its biggest trap.

This post isn’t about how to write a k6 script. It’s about what it signals when you choose k6, how you use it, and what maturity looks like beyond the hello-world test.

The Simplicity of k6

k6 lowers the barrier to entry in all the right ways:

  • JavaScript-based scripting most engineers can read
  • Clear local execution model
  • First-class CI integration
  • Honest, readable metrics output

This makes k6 an excellent entry point into performance testing.

But here’s the uncomfortable truth:

Most teams stop exactly where k6 becomes dangerous.

They write a single script, run it in CI, set a few thresholds, and declare performance “handled.”

It isn’t.

Performance Testing Is Not Traffic Generation

Running k6 without a performance strategy usually leads to three anti-patterns.

1. Synthetic Confidence

A green k6 report is not proof your system will survive production traffic.

  • Real traffic is bursty, not linear
  • Real users don’t wait politely between requests
  • Real failures cascade across systems

If your k6 test doesn’t model behavior, it’s just a traffic generator.
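
What does modeling behavior look like? As a rough sketch: a journey that browses, adds to cart, and only sometimes checks out, with randomized think time instead of a metronome. The endpoints and the 30% conversion rate here are illustrative assumptions, not a reference implementation:

    import http from 'k6/http';
    import { group, check, sleep } from 'k6';

    export default function () {
      group('browse', () => {
        const res = http.get('https://api.example.com/products');
        check(res, { 'browse ok': (r) => r.status === 200 });
        sleep(Math.random() * 3 + 1); // think time: 1-4s, not a polite fixed beat
      });

      group('add_to_cart', () => {
        const res = http.post(
          'https://api.example.com/cart',
          JSON.stringify({ itemId: 123 }),
          { headers: { 'Content-Type': 'application/json' } }
        );
        check(res, { 'cart ok': (r) => r.status === 200 });
        sleep(Math.random() * 2);
      });

      // Assumed conversion rate: roughly 30% of sessions reach checkout
      if (Math.random() < 0.3) {
        group('checkout', () => {
          const res = http.post('https://api.example.com/checkout');
          check(res, { 'checkout ok': (r) => r.status === 200 });
        });
      }
    }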

2. Threshold Worship

    http_req_duration: ['p(95)<500']

Thresholds only matter when they’re derived from:

  • User experience expectations
  • Business SLAs
  • Historical system baselines

Otherwise, they’re just numbers that reward mediocrity.

3. The CI-as-a-Performance-Gate Fallacy

Putting k6 in CI feels mature.

In practice, performance tests are:

  • Environment-sensitive
  • Naturally noisy
  • Expensive to run frequently

Failing builds on every minor latency fluctuation trains teams to disable tests, not improve systems.
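
One alternative to hard-gating every build: run k6 on a schedule, keep the thresholds, and persist each run's summary so trends, not single noisy runs, drive decisions. k6's handleSummary hook handles the persistence part. A minimal sketch, assuming default summary statistics and a hypothetical output path:

    // Runs once at the end of the test; whatever it returns gets written out.
    export function handleSummary(data) {
      const p95 = data.metrics.http_req_duration.values['p(95)'];
      console.log(`p95 this run: ${p95.toFixed(1)}ms`);

      return {
        // Hypothetical path: a nightly job diffs this file against history
        'results/run-summary.json': JSON.stringify(data, null, 2),
      };
    }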

From “Hello World” to Intentional Load

Most teams start — and stop — here:

    import http from 'k6/http';
    import { sleep } from 'k6';

    export default function () {
        http.get('https://api.example.com/health');
        sleep(1);
    }

This proves one thing only:

Your API responds when politely asked once per second.

It sends no signal about real system behavior.

Scenario-Driven Load (The First Maturity Step)

Contrast that with a scenario-based model:

    import http from 'k6/http';
    import { check, sleep } from 'k6';

    export const options = {
      scenarios: {
        checkout_peak: {
          executor: 'ramping-arrival-rate',
          startRate: 10,
          timeUnit: '1s',
          preAllocatedVUs: 50,
          maxVUs: 200,
          stages: [
            { target: 50, duration: '2m' },
            { target: 50, duration: '5m' },
            { target: 0, duration: '1m' },
          ],
        },
      },
    };

    export default function () {
      const res = http.post(
        'https://api.example.com/checkout',
        JSON.stringify({ itemId: 123 }),
        { headers: { 'Content-Type': 'application/json' } }
      );

      check(res, {
        'status is 200': (r) => r.status === 200,
        'response < 800ms': (r) => r.timings.duration < 800,
      });

      sleep(Math.random() * 2);
    }

This signals:

  • You understand arrival rate vs virtual users (see the contrast sketched below)
  • You model business events, not endpoints
  • Latency expectations are explicit and contextual

This is where k6 stops being a demo tool and becomes engineering feedback.
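
To make that first point concrete: a closed model (a fixed pool of looping VUs) slows down exactly when the system slows down, while an open model keeps applying pressure on schedule. A minimal sketch contrasting the two executors against a hypothetical endpoint:

    import http from 'k6/http';

    export const options = {
      scenarios: {
        // Closed model: 50 looping users. If latency doubles, the request
        // rate silently halves, easing off just as the system struggles.
        closed_model: {
          executor: 'constant-vus',
          vus: 50,
          duration: '5m',
        },
        // Open model: 50 new iterations per second no matter how slow
        // responses get, so degradation shows up instead of hiding.
        open_model: {
          executor: 'constant-arrival-rate',
          rate: 50,
          timeUnit: '1s',
          duration: '5m',
          preAllocatedVUs: 100,
          maxVUs: 500,
          startTime: '5m', // run after the closed model finishes
        },
      },
    };

    export default function () {
      http.get('https://api.example.com/health');
    }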

Thresholds as Contracts, Not Wishes

Weak Signal

    thresholds: {
      http_req_duration: ['p(95)<500'],
    }

Stronger Signal

    export const options = {
      thresholds: {
        http_req_duration: [
          'p(95)<800',
          'p(99)<1500',
        ],
        http_req_failed: ['rate<0.01'],
      },
    };

Even Better: Thresholds With Rationale

    // p95 < 800ms aligns with checkout UX expectations
    // p99 < 1500ms avoids upstream payment timeouts

Numbers without reasons are just guesses.
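
Putting the two together, a sketch of what that looks like in practice. The numbers and the reasoning attached to them are illustrative, not universal:

    export const options = {
      thresholds: {
        http_req_duration: [
          'p(95)<800',  // checkout UX budget: users perceive >800ms as sluggish
          'p(99)<1500', // upstream payment gateway times out shortly after this
        ],
        http_req_failed: ['rate<0.01'], // error budget borrowed from the checkout SLO
      },
    };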

Separating Load From Observation

Senior teams don’t debug performance inside k6. They build correlation: k6 emits the load-side metrics, and the observability stack explains what the system did under them.

    import http from 'k6/http';
    import { Trend } from 'k6/metrics';

    // A custom metric k6 aggregates and reports alongside the built-ins
    const checkoutLatency = new Trend('checkout_latency');

    export default function () {
      const res = http.post('https://api.example.com/checkout');
      checkoutLatency.add(res.timings.duration);
    }

k6 measures pressure. APM tools explain behavior.

Treating k6 as an observability platform is a category error.
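
One practical way to build that correlation is to propagate trace context from the load script into your APM. A minimal sketch, assuming your services honor inbound W3C trace context; the endpoint is hypothetical:

    import http from 'k6/http';

    // Random lowercase-hex string of length n, for trace and span IDs
    function randomHex(n) {
      let s = '';
      for (let i = 0; i < n; i++) {
        s += Math.floor(Math.random() * 16).toString(16);
      }
      return s;
    }

    export default function () {
      // W3C trace context header: version-traceId-parentId-flags
      const traceparent = `00-${randomHex(32)}-${randomHex(16)}-01`;

      http.post(
        'https://api.example.com/checkout',
        JSON.stringify({ itemId: 123 }),
        {
          headers: {
            'Content-Type': 'application/json',
            traceparent: traceparent, // lets the APM tie this request to a trace
          },
        }
      );
    }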

Where k6 Fits (and Where It Doesn’t)

✔ Use k6 for:
  • Nightly performance runs
  • Pre-release validation
  • Architecture comparisons
  • Capacity learning

✘ Avoid using k6:
  • On every pull request
  • As a hard CI gate
  • Against unstable environments

The Signal k6 Sends (When Used Well)

  • You value feedback loops over vanity metrics
  • You understand the difference between traffic and behavior
  • You treat performance as a design concern
  • You know tools are leverage — not guarantees

Ironically, the strongest signal isn’t having k6 tests. It’s knowing when not to rely on them.

k6 is a sharp tool. Sharp tools reward discipline and punish shortcuts.

If your performance strategy begins and ends with a k6 script, you’re optimizing for comfort — not resilience.

Used intentionally, k6 becomes exactly what it should be:

A fast, honest way to learn how your system behaves under pressure.