pybench — Lightweight Python Benchmarking

Why pybench?

Everything you need to measure Python performance. Nothing you don't.

Zero Dependencies

Pure stdlib. Uses time.perf_counter_ns() and statistics. Optional rich for pretty output.

Two APIs

@benchmark decorator for functions. bench.measure() context manager for inline blocks. Use both together.

Auto‑Calibrating

Automatically picks the iteration count needed to fill the target measurement time. Disables GC during measurement. Records timings in integer nanoseconds.

CLI Included

pybench run discovers and executes benchmark files. Table or JSON output. Save results to disk.

Run Comparison

Compare benchmark runs over time. Detect regressions in CI. Color-coded percentage changes.

JSON Export

Machine-readable output with platform metadata. Perfect for CI/CD pipelines and tracking performance over time.

Installation

pybench requires Python 3.10 or later. Install from PyPI:

$ pip install pybench

For pretty terminal output with colored tables, install with the rich extra:

$ pip install "pybench[rich]"

API Guide

pybench provides two ways to define benchmarks: decorators for functions and context managers for inline blocks.

Decorator API

Decorate functions with @bench.benchmark to register them on a Bench instance. Each registered function is called repeatedly, with automatic warmup and iteration calibration.

from pybench import Bench

bench = Bench()

@bench.benchmark
def sort_list():
    sorted([3, 1, 4, 1, 5, 9, 2, 6] * 1000)

@bench.benchmark(warmup=10, iterations=200)
def hash_string():
    hash("hello" * 1000)

bench.run()
bench.report()

Use @bench.benchmark without parentheses for default settings, or pass warmup and iterations to override.

Context Manager API

Use bench.measure() to time arbitrary blocks of code inline. Ideal for comparing different approaches side by side.

from pybench import Bench

bench = Bench()

with bench.measure("list comprehension"):
    [x ** 2 for x in range(10_000)]

with bench.measure("map + lambda"):
    list(map(lambda x: x ** 2, range(10_000)))

bench.report()

Mixing Both

Decorators and context managers work together on the same Bench instance. All results are collected and reported together.

from pybench import Bench

bench = Bench(warmup=5, iterations=100)

@bench.benchmark
def dict_merge():
    {**{"a": 1}, **{"b": 2}}

with bench.measure("dict union"):
    {"a": 1} | {"b": 2}

results = bench.run()

# Print a formatted table
bench.report()

# Or get JSON for CI
json_str = bench.to_json()

Sample Output

bench.report() prints a formatted table to stdout:

pybench results
────────────────────────────────────────────────────────────────────────
Name                   Mean   Median   StdDev      Min      Max  Ops/sec
────────────────────────────────────────────────────────────────────────
sort_list          142.3 µs 141.8 µs   2.1 µs 138.9 µs 149.2 µs    7,027
hash_string          8.4 µs   8.3 µs   0.2 µs   8.1 µs   9.0 µs  119,048
list comprehension 412.7 µs 411.2 µs   3.8 µs 406.1 µs 422.5 µs    2,423
────────────────────────────────────────────────────────────────────────

Command Line Interface

pybench includes a CLI that discovers and runs benchmark files automatically. Files matching bench_*.py or *_bench.py are discovered.

$ pybench run

Discover and run benchmarks in the current directory or a specific path.

# Run all benchmarks in current directory
$ pybench run

# Run benchmarks in a specific directory
$ pybench run benchmarks/

# Run a single benchmark file
$ pybench run bench_sorting.py

# Output JSON instead of table
$ pybench run --json

# Save results to a file for later comparison
$ pybench run --save results.json

# Control warmup and iteration count
$ pybench run --warmup 10 --iterations 500

Benchmark files use the module-level @pybench.benchmark decorator. The CLI imports each file and collects decorated functions.

bench_sorting.py

import pybench

data = list(range(1000, 0, -1))

@pybench.benchmark
def builtin_sort():
    sorted(data)

@pybench.benchmark
def reverse_then_sort():
    d = data[:]
    d.reverse()
    d.sort()
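
The discovery rule described above (files matching bench_*.py or *_bench.py, imported so that decorated functions register themselves) could be sketched with the standard library as follows. This is not pybench's actual implementation: the helper names discover and load_module are hypothetical, and searching subdirectories recursively is an assumption the text does not confirm.

```python
import importlib.util
from pathlib import Path

# File name patterns taken from the CLI description.
PATTERNS = ("bench_*.py", "*_bench.py")


def discover(path):
    """Return benchmark files for path, which may be a file or a directory."""
    root = Path(path)
    if root.is_file():
        return [root]
    found = set()  # a set, so a file matching both patterns is listed once
    for pattern in PATTERNS:
        # Assumption: subdirectories are searched recursively.
        found.update(root.rglob(pattern))
    return sorted(found)


def load_module(file_path):
    """Import a file by path so its module-level decorators execute."""
    file_path = Path(file_path)
    spec = importlib.util.spec_from_file_location(file_path.stem, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

Importing by path rather than by module name means benchmark files do not need to live inside an installed package.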

$ pybench compare

Compare two saved benchmark runs to detect performance regressions or improvements.

# Save a baseline
$ pybench run --save baseline.json

# ... make changes ...

# Save current run and compare
$ pybench run --save current.json
$ pybench compare baseline.json current.json

pybench comparison
────────────────────────────────────────────────────────────────
Name                  Baseline    Current  Change
────────────────────────────────────────────────────────────────
builtin_sort           45.2 µs    44.8 µs  -0.9% (faster)
reverse_then_sort      38.1 µs    42.3 µs  +11.0% (slower)
────────────────────────────────────────────────────────────────
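
The percentage changes in a comparison like this one can be reproduced by hand. The sketch below assumes that saved files follow the JSON output schema documented at the end of this page and that the comparison is based on mean_ns; neither is confirmed by the text. The helper names and the 5% regression threshold are hypothetical.

```python
import json


def load_means(path):
    """Map benchmark name -> mean_ns from a saved results file."""
    with open(path, encoding="utf-8") as f:
        doc = json.load(f)
    return {r["name"]: r["mean_ns"] for r in doc["results"]}


def percent_changes(baseline, current):
    """Return {name: percent change} for names present in both runs."""
    return {
        name: (current[name] - baseline[name]) / baseline[name] * 100.0
        for name in baseline
        if name in current
    }


def regressions(changes, threshold_pct=5.0):
    """Names whose mean got slower by more than threshold_pct (hypothetical cutoff)."""
    return sorted(name for name, pct in changes.items() if pct > threshold_pct)


# Means in nanoseconds, matching the table above.
baseline = {"builtin_sort": 45_200.0, "reverse_then_sort": 38_100.0}
current = {"builtin_sort": 44_800.0, "reverse_then_sort": 42_300.0}
changes = percent_changes(baseline, current)
slower = regressions(changes)
```

In a CI job, a non-empty regressions list could be turned into a non-zero exit code to fail the build.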

API Reference

Complete reference for all public classes, methods, and functions.

class

Bench

Bench(warmup=5, iterations=None, target_time_ns=1_000_000_000)

Central orchestrator for collecting and running benchmarks.

Parameter	Type	Default	Description
warmup	int	5	Warmup iterations before measurement
iterations	int | None	None	Fixed iteration count, or None for auto-calibration
target_time_ns	int	1_000_000_000	Target total time for auto-calibration, in nanoseconds (1 second)
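
To make the interplay of warmup, iterations, and target_time_ns concrete, here is a minimal sketch of one way auto-calibration could work. It is an illustration under stated assumptions, not pybench's internal algorithm: the functions calibrate and measure, the doubling probe, the 1% probe cutoff, and the max_iterations cap are all hypothetical.

```python
import gc
import time


def calibrate(fn, target_time_ns=1_000_000_000, max_iterations=1_000_000):
    """Estimate how many calls of fn fit into target_time_ns (hypothetical helper)."""
    probe = 1
    while True:
        start = time.perf_counter_ns()
        for _ in range(probe):
            fn()
        elapsed = time.perf_counter_ns() - start
        # Stop probing once the batch takes at least 1% of the target, or the cap is hit.
        if elapsed * 100 >= target_time_ns or probe >= max_iterations:
            break
        probe *= 2
    per_call_ns = max(elapsed / probe, 1.0)  # guard against a zero reading
    return max(1, min(max_iterations, int(target_time_ns / per_call_ns)))


def measure(fn, iterations, warmup=5):
    """Return one timing in nanoseconds per measured call, with GC paused."""
    for _ in range(warmup):
        fn()
    gc_was_enabled = gc.isenabled()
    gc.disable()
    try:
        times_ns = []
        for _ in range(iterations):
            start = time.perf_counter_ns()
            fn()
            times_ns.append(time.perf_counter_ns() - start)
    finally:
        # Restore the collector only if it was on before measurement.
        if gc_was_enabled:
            gc.enable()
    return times_ns


# A small target (10 ms) keeps this demonstration quick.
n = calibrate(lambda: sorted(range(1000)), target_time_ns=10_000_000)
times_ns = measure(lambda: sorted(range(1000)), n)
```

Passing a fixed iterations value would simply skip the calibrate step and call measure directly.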

method

bench.benchmark(fn=None, *, warmup=None, iterations=None)

Decorator to register a function as a benchmark. Can be used with or without parentheses. Per-benchmark warmup and iterations override the Bench-level defaults.
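
The fn=None signature is what allows both the bare and the parenthesized form. The sketch below shows the general pattern with a hypothetical Registry class standing in for Bench; it only records registrations and is not pybench's source.

```python
class Registry:
    """Hypothetical stand-in for Bench that only records registrations."""

    def __init__(self, warmup=5, iterations=None):
        self.warmup = warmup
        self.iterations = iterations
        self.registered = []

    def benchmark(self, fn=None, *, warmup=None, iterations=None):
        def register(func):
            self.registered.append({
                "name": func.__name__,
                "fn": func,
                # Per-benchmark values win; None falls back to the instance default.
                "warmup": self.warmup if warmup is None else warmup,
                "iterations": self.iterations if iterations is None else iterations,
            })
            return func  # the function stays callable as usual

        if fn is not None:
            # Bare form: @bench.benchmark passes the function directly.
            return register(fn)
        # Called form: @bench.benchmark(...) must return the real decorator.
        return register


bench = Registry()


@bench.benchmark
def sort_list():
    sorted([3, 1, 4, 1, 5, 9, 2, 6] * 1000)


@bench.benchmark(warmup=10, iterations=200)
def hash_string():
    hash("hello" * 1000)
```

Making warmup and iterations keyword-only keeps the first positional slot unambiguous: it is either the decorated function or nothing.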

method

bench.measure(name: str)

Context manager that times the enclosed block and records a single-sample result. The block runs exactly once, and the elapsed time is recorded when the context exits.
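
A single-sample timer of this kind can be written in a few lines with contextlib. The sketch below is an approximation for illustration only; the module-level samples dictionary is a hypothetical replacement for whatever storage Bench uses internally.

```python
import time
from contextlib import contextmanager

samples = {}  # name -> elapsed nanoseconds (hypothetical storage)


@contextmanager
def measure(name):
    start = time.perf_counter_ns()
    try:
        yield
    finally:
        # Runs when the with-block exits, including when it raises.
        samples[name] = time.perf_counter_ns() - start


with measure("list comprehension"):
    [x ** 2 for x in range(10_000)]
```

Because the block executes only once, the result has no warmup and no spread, so it is noisier than a decorated benchmark of the same code.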

method

bench.run() → list[BenchmarkResult]

Execute all registered benchmarks and return a list of results. Context manager results are included automatically.

method

bench.report(json_output: bool = False)

Print results to stdout as a formatted table (default) or JSON. Automatically calls run() if benchmarks haven't been executed yet.

method

bench.to_json() → str

Return results as a JSON string with platform metadata (Python version, OS, timestamp). Automatically calls run() if benchmarks haven't been executed yet.

dataclass

BenchmarkResult

Immutable result object containing timing data and computed statistics. Created by Bench.run(); you typically don't construct these directly.

Field	Type	Description
name	str	Benchmark name
times_ns	list[int]	Raw timing data in nanoseconds
iterations	int	Number of measured iterations
mean_ns	float	Arithmetic mean of timings
median_ns	float	Median timing
stddev_ns	float	Standard deviation
min_ns	int	Fastest timing
max_ns	int	Slowest timing
ops_per_sec	float	Operations per second (1_000_000_000 / mean_ns)
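
The derived fields can all be computed from times_ns with the statistics module. The following sketch uses a hypothetical ResultSketch dataclass and summarize helper that mirror the field names above. Whether pybench reports the sample or the population standard deviation is not stated here; the sketch assumes the sample form.

```python
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultSketch:
    """Hypothetical mirror of the BenchmarkResult fields."""
    name: str
    times_ns: list[int]
    iterations: int
    mean_ns: float
    median_ns: float
    stddev_ns: float
    min_ns: int
    max_ns: int
    ops_per_sec: float


def summarize(name, times_ns):
    mean_ns = statistics.fmean(times_ns)
    return ResultSketch(
        name=name,
        times_ns=list(times_ns),
        iterations=len(times_ns),
        mean_ns=mean_ns,
        median_ns=float(statistics.median(times_ns)),
        # Sample stddev needs two or more points (assumption: report 0.0 otherwise).
        stddev_ns=statistics.stdev(times_ns) if len(times_ns) > 1 else 0.0,
        min_ns=min(times_ns),
        max_ns=max(times_ns),
        # One second is 1_000_000_000 ns; avoid dividing by a zero mean.
        ops_per_sec=1_000_000_000 / mean_ns if mean_ns else float("inf"),
    )


result = summarize("example", [100, 200, 300])
```

With frozen=True, assigning to a field after construction raises dataclasses.FrozenInstanceError, which matches the "immutable" description.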

function

benchmark(fn=None, *, warmup=None, iterations=None)

Module-level decorator for use with the CLI. Functions decorated with @pybench.benchmark are discovered and run by pybench run.

import pybench

@pybench.benchmark
def my_benchmark():
    # This function will be discovered by `pybench run`
    sorted(range(1000, 0, -1))

JSON Output Schema

bench.to_json() and pybench run --json produce this structure:

{
  "metadata": {
    "python_version": "3.12.1",
    "platform": "Linux",
    "timestamp": "2026-02-13T12:00:00+00:00"
  },
  "results": [
    {
      "name": "sort_list",
      "iterations": 1000,
      "mean_ns": 142300.0,
      "median_ns": 141800.0,
      "stddev_ns": 2100.0,
      "min_ns": 138900,
      "max_ns": 149200,
      "ops_per_sec": 7027.4
    }
  ]
}
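
A document of this shape can be assembled from the standard library alone. The sketch below is a guess at how the metadata block might be populated, using platform and datetime. The build_document helper is hypothetical, and the exact calls pybench makes are not documented here.

```python
import json
import platform
from datetime import datetime, timezone


def build_document(results):
    """Wrap per-benchmark dicts in a metadata envelope shaped like the schema above."""
    return {
        "metadata": {
            "python_version": platform.python_version(),  # e.g. "3.12.1"
            "platform": platform.system(),                # e.g. "Linux"
            # UTC timestamp in ISO 8601 form with an explicit +00:00 offset.
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        },
        "results": results,
    }


doc = build_document([{"name": "sort_list", "iterations": 1000, "mean_ns": 142300.0}])
json_str = json.dumps(doc, indent=2)
```

Recording the interpreter version and platform next to the timings makes it possible to tell a real regression from a change of machine or Python release.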