Scalene Development Guide

Project Overview

Scalene is a high-performance CPU, GPU, and memory profiler for Python with AI-powered optimization proposals. It runs significantly faster than other Python profilers while providing detailed performance information. See the paper docs/osdi23-berger.pdf for technical details on Scalene's design.

Key features:

CPU, GPU (NVIDIA/Apple), and memory profiling
AI-powered optimization suggestions (OpenAI, Anthropic, Azure, Amazon Bedrock, Gemini, Ollama)
Web-based GUI and CLI interfaces
Jupyter notebook support via magic commands (%scrun, %%scalene)
Line-by-line profiling with low overhead
Separates Python time from native/C time

Platform support: Linux, macOS, WSL 2 (full support); Windows (partial support)

Build & Test Commands

# Install in development mode
pip install -e .

# Run all tests
python3 -m pytest tests/

# Run tests for a specific Python version
python3.X -m pytest tests/

# Run linters
mypy scalene
ruff check scalene

# Run a single test file
python3 -m pytest tests/test_coverup_83.py -v

Project Structure

Core Profiler Components (`scalene/`)

scalene_profiler.py - Main profiler class (Scalene). Entry point for profiling. Uses signal-based sampling for CPU profiling. Coordinates all profiling subsystems.
scalene_statistics.py - ScaleneStatistics class. Collects and aggregates profiling data. Key types: ProfilingSample, MemcpyProfilingSample. Uses RunningStats for statistical aggregation.
scalene_output.py - Profile output formatting for CLI/HTML
scalene_json.py - ScaleneJSON class for JSON output format
scalene_analysis.py - Profile analysis logic

Entry Points

__main__.py - Entry point for python -m scalene
profile.py - Entry point for --on/--off control of background profiling

Configuration & Arguments

scalene_config.py - Version info (scalene_version, scalene_date) and constants:
- SCALENE_PORT = 11235 - Default port for web UI
- NEWLINE_TRIGGER_LENGTH - Must match src/include/sampleheap.hpp
scalene_arguments.py - ScaleneArguments class (extends argparse.Namespace) with all profiler options and their defaults defined in ScaleneArgumentsDict
scalene_parseargs.py - ScaleneParseArgs.parse_args() builds the argument parser. RichArgParser provides colored help output (uses Rich on Python < 3.14, native argparse colors on 3.14+)

Signal Handling

scalene_signals.py - Signal definitions for CPU sampling
scalene_signal_manager.py - Manages signal handlers
scalene_sigqueue.py - Signal queue management
scalene_client_timer.py - Timer for periodic profiling

GPU Support

scalene_nvidia_gpu.py - NVIDIA GPU profiling via pynvml
scalene_apple_gpu.py - Apple GPU profiling (Metal)
scalene_accelerator.py - Generic accelerator interface
scalene_neuron.py - AWS Neuron support

Memory Profiling

scalene_memory_profiler.py - Memory profiling logic
scalene_leak_analysis.py - Memory leak detection (experimental, --memory-leak-detector)
scalene_mapfile.py - ScaleneMapFile for memory-mapped communication with native extension
scalene_preload.py - Sets up LD_PRELOAD/DYLD_INSERT_LIBRARIES for native memory tracking

Jupyter Integration

scalene_magics.py - Jupyter magic commands (%scrun for line mode, %%scalene for cell mode)
scalene_jupyter.py - Jupyter notebook support utilities

Replacement Modules (`replacement_*.py`)

These modules monkey-patch standard library functions to capture profiling data during blocking operations:

replacement_fork.py - Tracks os.fork()
replacement_exit.py - Tracks sys.exit()
replacement_lock.py, replacement_mp_lock.py, replacement_sem_lock.py - Lock acquisition timing
replacement_thread_join.py, replacement_pjoin.py - Thread/process join timing
replacement_signal_fns.py - Signal function replacements
replacement_poll_selector.py - I/O polling timing
replacement_get_context.py - Multiprocessing context

Utilities

runningstats.py - RunningStats class for online statistical calculations (mean, variance)
scalene_funcutils.py - Function utilities
scalene_utility.py - General utilities
sparkline.py - Sparkline generation for memory visualization
syntaxline.py - Syntax-highlighted source code lines
adaptive.py - Adaptive sampling logic
time_info.py - Time measurement utilities
sorted_reservoir.py - Reservoir sampling for bounded-size sample collection
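
The "online statistical calculations" mentioned for runningstats.py can be illustrated with Welford's algorithm, which updates the mean and variance one sample at a time without storing the samples. This is a minimal sketch only: the class name OnlineStats and its methods are hypothetical and are not claimed to match Scalene's RunningStats API.

```python
import math


class OnlineStats:
    """Illustrative online mean/variance accumulator (Welford's algorithm)."""

    def __init__(self) -> None:
        self.n = 0       # number of samples seen so far
        self.mean = 0.0  # running mean
        self._m2 = 0.0   # running sum of squared deviations from the mean

    def push(self, x: float) -> None:
        """Fold one sample into the running statistics in O(1) time and space."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the updated mean, which keeps the update numerically stable.
        self._m2 += delta * (x - self.mean)

    def variance(self) -> float:
        """Sample variance (n - 1 denominator); 0.0 with fewer than two samples."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0

    def std(self) -> float:
        """Sample standard deviation."""
        return math.sqrt(self.variance())
```

A profiler benefits from this shape because memory use stays constant no matter how many samples arrive.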

GUI (`scalene/scalene-gui/`)

Web-based GUI built with TypeScript, bundled with esbuild.

Core Files:

index.html.template - Jinja2 template for main GUI page (rendered by scalene_utility.py)
scalene-gui.ts - Main TypeScript entry point, UI event handlers, initialization
scalene-gui-bundle.js - Bundled JavaScript output (generated, do not edit directly)

AI Provider Modules:

openai.ts - OpenAI API integration (sendPromptToOpenAI, fetchOpenAIModels)
anthropic.ts - Anthropic Claude API integration
gemini.ts - Google Gemini API integration (sendPromptToGemini, fetchGeminiModels)
optimizations.ts - Provider dispatch logic, prompt generation
persistence.ts - localStorage persistence with environment variable fallbacks

Support Files:

launchbrowser.py - Opens browser to GUI (default port 11235)
find_browser.py - Cross-platform browser detection

Vendored Assets (for offline support):

jquery-3.6.0.slim.min.js - jQuery (vendored locally, not loaded from CDN)
bootstrap.min.css - Bootstrap 5.1.3 CSS
bootstrap.bundle.min.js - Bootstrap 5.1.3 JS with Popper
prism.css - Syntax highlighting styles
favicon.ico - Scalene favicon
scalene-image.png - Scalene logo

These assets are copied to a temp directory when serving via HTTP, enabling the GUI to work in air-gapped/offline environments.
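
As a rough illustration of the copy-to-temp-directory step, the sketch below stages a list of asset files into a fresh temporary directory. The function name stage_assets and the directory prefix are assumptions made for this example; Scalene's own serving code may be organized differently.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable


def stage_assets(src_dir: Path, names: Iterable[str]) -> Path:
    """Copy the named asset files from src_dir into a new temp directory.

    Hypothetical helper: returns the directory that an HTTP server
    could then use as its document root.
    """
    dest = Path(tempfile.mkdtemp(prefix="scalene-gui-"))
    for name in names:
        shutil.copy(src_dir / name, dest / name)
    return dest
```

Because every file is served from the staged directory, the page never needs to reach a CDN.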

Building the GUI:

cd scalene/scalene-gui
npx esbuild scalene-gui.ts --bundle --outfile=scalene-gui-bundle.js --format=iife --global-name=ScaleneGUI

Native Extensions (`src/`)

C++ code for low-overhead memory allocation tracking:

Headers (src/include/):

sampleheap.hpp - Sampling heap allocator. Key constant NEWLINE must match NEWLINE_TRIGGER_LENGTH in scalene_config.py.
memcpysampler.hpp - Intercepts memcpy to track copy volume
pywhere.hpp - Tracks Python file/line info for allocations
samplefile.hpp - File-based communication with Python
sampler.hpp, poissonsampler.hpp, thresholdsampler.hpp - Sampling strategies
scaleneheader.hpp - Common header definitions

Sources (src/source/):

libscalene.cpp - Main native library (loaded via LD_PRELOAD)
pywhere.cpp - Python location tracking implementation
get_line_atomic.cpp - Atomic line number access
traceconfig.cpp - Trace configuration

Vendor Libraries (`vendor/`)

Heap-Layers/ - Memory allocator infrastructure (by Emery Berger)
printf/ - Async-signal-safe printf implementation

Key Patterns

Python Version Compatibility

The codebase supports Python 3.8-3.14. Version-specific code uses:

import sys

if sys.version_info >= (3, 14):
    ...  # Python 3.14+ specific code
else:
    ...  # Older Python versions

Type Annotation Compatibility (Python 3.8/3.9):

Do NOT use X | Y union syntax in runtime-evaluated annotations (PEP 604 requires Python 3.10+). Use Optional[X] or Union[X, Y] from typing instead.
Do NOT use list[X], dict[K, V], tuple[X, ...] in runtime-evaluated annotations (PEP 585 lowercase generics require Python 3.9+). Use List, Dict, Tuple from typing for 3.8 support.
Adding from __future__ import annotations makes all annotations strings (not evaluated at runtime), which allows modern syntax on older Python. However, this can break code that inspects annotations at runtime (e.g., dataclasses, pydantic).
The safest approach for this codebase: use typing.Optional, typing.Union, typing.List, typing.Tuple, typing.Dict in all annotation positions that are evaluated at runtime (function signatures, variable annotations outside if TYPE_CHECKING blocks).
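
A short example of the recommended style follows. The function line_totals is made up for illustration and is not part of Scalene; the point is only that these annotations evaluate at runtime on every supported version, including 3.8.

```python
from typing import Dict, List, Optional, Tuple


def line_totals(
    samples: List[Tuple[int, float]],
    only_line: Optional[int] = None,
) -> Dict[int, float]:
    """Sum sampled seconds per line number, optionally for a single line."""
    totals: Dict[int, float] = {}
    for lineno, seconds in samples:
        if only_line is not None and lineno != only_line:
            continue
        totals[lineno] = totals.get(lineno, 0.0) + seconds
    return totals
```

Written as list[tuple[int, float]] or int | None, the same signature would raise a TypeError at definition time on Python 3.8 unless annotations were deferred.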

Python 3.13 Changes (dis module):

dis.Instruction.starts_line changed from int | None (line number) to bool
New dis.Instruction.line_number attribute (int | None) added for the actual line number
On Python < 3.13, starts_line is only set on the first instruction of each source line; use a line-tracking loop to propagate line numbers to subsequent instructions
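
The line-tracking loop described above might look like the following sketch. The helper name instructions_with_lines is hypothetical; it relies only on dis.get_instructions and the starts_line / line_number attributes discussed in this section.

```python
import dis
import sys
from types import CodeType
from typing import Iterator, Optional, Tuple


def instructions_with_lines(
    code: CodeType,
) -> Iterator[Tuple[dis.Instruction, Optional[int]]]:
    """Yield (instruction, line number) pairs on any supported Python version."""
    current_line: Optional[int] = None
    for instr in dis.get_instructions(code):
        if sys.version_info >= (3, 13):
            # 3.13+: every instruction carries its own line number.
            line = instr.line_number
        else:
            # < 3.13: starts_line is set only on the first instruction of a
            # source line, so remember it and reuse it for the ones that follow.
            if instr.starts_line is not None:
                current_line = instr.starts_line
            line = current_line
        yield instr, line
```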

Bytecode/Opcode Compatibility (dis module):

Never match specific opcode names (e.g., JUMP_BACKWARD, JUMP_ABSOLUTE, POP_JUMP_IF_TRUE). Opcode names change across Python versions: for example, Python 3.10 while loops use POP_JUMP_IF_TRUE for backward jumps, Python 3.11+ uses JUMP_BACKWARD, and JUMP_ABSOLUTE was removed in 3.11.
Always use abstract dis module categories when possible: dis.hasjabs (absolute jump opcodes), dis.hasjrel (relative jump opcodes), dis.hasconst, dis.hasname, etc. These are maintained by CPython and work across all versions.
For call detection, matching opname.startswith("CALL") is acceptable since that prefix has been stable, but prefer opcode integer sets over name strings for hot paths.
When checking jump direction (forward vs backward), use instr.argval (which dis resolves to an absolute offset) and compare against instr.offset, rather than relying on opcode names to imply direction.
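
To make the jump-direction rule concrete, here is a small sketch that finds backward jumps (typically loop back-edges) without naming any opcode. The function name backward_jump_offsets is an assumption for this example; dis.hasjabs, dis.hasjrel, argval, and offset are the standard dis facilities named above.

```python
import dis
from types import CodeType
from typing import List

# Every opcode that dis classifies as a jump, absolute or relative.
_JUMP_OPCODES = frozenset(dis.hasjabs) | frozenset(dis.hasjrel)


def backward_jump_offsets(code: CodeType) -> List[int]:
    """Return the offsets of jump instructions whose target precedes them."""
    return [
        instr.offset
        for instr in dis.get_instructions(code)
        if instr.opcode in _JUMP_OPCODES
        and isinstance(instr.argval, int)
        # For jumps, argval is the resolved absolute target offset.
        and instr.argval < instr.offset
    ]
```

A function containing a while or for loop should yield at least one offset, while straight-line code yields an empty list.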

Python 3.14 Changes:

argparse now has built-in colored help output (color=True parameter)
RichArgParser uses Rich for colors on Python < 3.14, native argparse colors on 3.14+

Argument Parsing (`scalene_parseargs.py`)

import argparse
import sys


class RichArgParser(argparse.ArgumentParser):
    """ArgumentParser that uses Rich for colored output on Python < 3.14."""

    def __init__(self, *args, **kwargs):
        if sys.version_info < (3, 14):
            # Rich is only needed where argparse has no native color support.
            from rich.console import Console
            self._console = Console()
        else:
            self._console = None
        super().__init__(*args, **kwargs)

The _colorize_help_for_rich() function applies Python 3.14-style colors using Rich markup:

usage: and options: → bold blue
Program name → bold magenta
Long options (--foo) → bold cyan
Short options (-h) → bold green
Metavars (FOO) → bold yellow

GUI Patterns

Preventing Browser Password Prompts: Use autocomplete="one-time-code" on password/API key inputs to prevent browsers from offering to save them:

<input type="password" id="api-key" autocomplete="one-time-code">

Show/Hide Password Toggle:

function togglePassword(inputId: string, button: HTMLButtonElement): void {
  const input = document.getElementById(inputId) as HTMLInputElement;
  if (input.type === "password") {
    input.type = "text";
    button.textContent = "Hide";
  } else {
    input.type = "password";
    button.textContent = "Show";
  }
}

Provider Field Visibility: Use CSS classes to show/hide provider-specific fields:

function toggleServiceFields(): void {
  const service = (document.getElementById("service") as HTMLSelectElement).value;
  // Hide all provider sections
  document.querySelectorAll(".provider-section").forEach((el) => {
    (el as HTMLElement).style.display = "none";
  });
  // Show selected provider section
  const section = document.querySelector(`.${service}-fields`);
  if (section) (section as HTMLElement).style.display = "block";
}

Persistent Form Elements: Add class persistent to inputs that should be saved/restored from localStorage:

<input type="text" id="api-key" class="persistent">

The persistence.ts module handles save/restore automatically.

Standalone HTML Generation: The generate_html() function in scalene_utility.py supports a standalone parameter:

When standalone=False (default): Assets are referenced as local files (e.g., <script src="jquery-3.6.0.slim.min.js">)
When standalone=True: All assets are embedded inline (JS/CSS as text, images as base64)

The Jinja2 template uses conditionals:

{% if standalone %}
<script>{{ jquery_js }}</script>
<style>{{ bootstrap_css }}</style>
{% else %}
<script src="jquery-3.6.0.slim.min.js"></script>
<link href="bootstrap.min.css" rel="stylesheet">
{% endif %}
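
On the Python side, preparing the values for such a template could look roughly like this. The sketch uses only the standard library; asset_context, to_data_uri, and the logo_uri key are hypothetical names, while jquery_js and bootstrap_css are the template variables shown above. It is not the actual body of generate_html().

```python
import base64
from pathlib import Path
from typing import Dict


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode binary data (for example, an image) as a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def asset_context(asset_dir: Path, standalone: bool) -> Dict[str, object]:
    """Build template variables; read asset files only in standalone mode."""
    context: Dict[str, object] = {"standalone": standalone}
    if standalone:
        jquery = asset_dir / "jquery-3.6.0.slim.min.js"
        bootstrap = asset_dir / "bootstrap.min.css"
        logo = asset_dir / "scalene-image.png"
        context["jquery_js"] = jquery.read_text(encoding="utf-8")
        context["bootstrap_css"] = bootstrap.read_text(encoding="utf-8")
        # Hypothetical key: images are embedded as base64 data URIs.
        context["logo_uri"] = to_data_uri(logo.read_bytes(), "image/png")
    return context
```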

Module Imports

When importing submodules, be explicit:

# Correct - mypy can verify this
import importlib.util
importlib.util.find_spec(mod_name)

# Wrong - mypy error: Module has no attribute "util"
import importlib
importlib.util.find_spec(mod_name)

Testing

Test Files (`tests/`)

test_coverup_*.py - Auto-generated coverage tests
test_runningstats.py - Statistics tests (requires hypothesis)
test_scalene_json.py - JSON output tests (requires hypothesis)
test_nested_package_relative_import.py - Import handling tests

Test Dependencies

pip install pytest pytest-asyncio hypothesis

Running Tests Across Python Versions

for v in 3.9 3.10 3.11 3.12 3.13 3.14; do
    python$v -m pytest tests/test_coverup_83.py -v
done

Flaky Smoketests

The smoketests in test/ can be flaky due to timing/sampling issues inherent to profiling:

"No non-zero lines in X" - The profiler didn't collect enough samples. This happens when the test runs too quickly or signal delivery timing varies.
"Expected function 'X' not returned" - A function wasn't sampled. Common with short-running functions.

These failures are usually timing-related and pass on re-run. They're more common on CI due to variable machine load.

Port Binding in Tests

When testing port availability, never use hardcoded ports - they may already be in use on CI runners:

# Bad - port 49200 might be in use
port = 49200
sock.bind(("", port))

# Good - find an available port first
port = find_available_port(49200, 49300)
if port is None:
    return  # Skip test if no ports available
sock.bind(("", port))
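
For reference, a helper with the signature used above could be written as follows. This is one possible implementation sketched for illustration; the helper Scalene's tests actually use may differ.

```python
import socket
from typing import Optional


def find_available_port(start: int, end: int) -> Optional[int]:
    """Return the first port in [start, end] that can be bound, else None."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("", port))
            except OSError:
                continue  # port is in use or not permitted; try the next one
            return port
    return None
```

The probe socket is closed before the port is returned, so another process could still claim the port before the caller binds it. Tests should treat a failed bind as a reason to skip or retry rather than as a hard failure.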

CI/CD (`.github/workflows/`)

run-linters.yml - Runs mypy and ruff on Python 3.9-3.14
tests.yml - Runs pytest on Python 3.9-3.14
build-and-upload.yml - Build and publish to PyPI

Common Tasks

Adding a New CLI Option

Add default value in scalene_arguments.py:

class ScaleneArgumentsDict(TypedDict, total=False):
    my_option: bool

Add argument in scalene_parseargs.py:

parser.add_argument(
    "--my-option",
    dest="my_option",
    action="store_true",
    default=defaults.my_option,
    help="Description of option",
)
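
The pattern can be tried in isolation with the sketch below. Here defaults is a plain argparse.Namespace standing in for Scalene's defaults object, and build_parser is a hypothetical name, so this shows the argparse mechanics rather than Scalene's real parser.

```python
import argparse
from typing import List, Optional

# Stand-in for the defaults object; in Scalene the defaults come from
# scalene_arguments.py.
defaults = argparse.Namespace(my_option=False)


def build_parser() -> argparse.ArgumentParser:
    """Build a tiny parser with a single boolean flag."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument(
        "--my-option",
        dest="my_option",
        action="store_true",
        default=defaults.my_option,
        help="Description of option",
    )
    return parser


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv, or sys.argv[1:] when argv is None."""
    return build_parser().parse_args(argv)
```

For example, parse([]).my_option is False and parse(["--my-option"]).my_option is True.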

Adding a New AI Provider

Create provider module (scalene/scalene-gui/newprovider.ts):

export async function sendPromptToNewProvider(
  prompt: string,
  apiKey: string
): Promise<string> {
  // API call implementation
}

export async function fetchNewProviderModels(apiKey: string): Promise<string[]> {
  // Optional: fetch available models from API
}

Update optimizations.ts:
- Import the new module
- Add case in sendPromptToService() switch statement
Update index.html.template:
- Add option to #service select dropdown
- Add provider section with API key input, model selector, etc.
- Add CSS for .newprovider-fields visibility
Update scalene-gui.ts:
- Add provider to toggleServiceFields() function
- Add refresh handler if dynamic model fetching is supported
- Update getDefaultProvider() if env var support is needed
Update persistence.ts (for env var support):
- Add mapping in envKeyMap for new fields
Update scalene_utility.py:
- Read environment variable in api_keys dict
- Pass to template rendering

Rebuild the bundle:

cd scalene/scalene-gui
npx esbuild scalene-gui.ts --bundle --outfile=scalene-gui-bundle.js --format=iife --global-name=ScaleneGUI

Environment Variable API Keys

The GUI supports prepopulating API keys from environment variables:

Element ID	Environment Variable	Provider
`api-key`	`OPENAI_API_KEY`	OpenAI
`anthropic-api-key`	`ANTHROPIC_API_KEY`	Anthropic
`gemini-api-key`	`GEMINI_API_KEY` or `GOOGLE_API_KEY`	Gemini
`azure-api-key`	`AZURE_OPENAI_API_KEY`	Azure OpenAI
`azure-api-url`	`AZURE_OPENAI_ENDPOINT`	Azure OpenAI
`aws-access-key`	`AWS_ACCESS_KEY_ID`	Amazon Bedrock
`aws-secret-key`	`AWS_SECRET_ACCESS_KEY`	Amazon Bedrock
`aws-region`	`AWS_DEFAULT_REGION` or `AWS_REGION`	Amazon Bedrock

Flow:

scalene_utility.py reads env vars and passes to Jinja2 template
Template injects envApiKeys JavaScript object into page
persistence.ts uses env vars as fallbacks when localStorage is empty
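
The flow can be sketched in Python as below. The names ENV_KEY_CANDIDATES, collect_env_api_keys, and resolve_field are hypothetical. The sketch also assumes that when two variables are listed for one field, the first one listed wins; the table above does not state the precedence.

```python
import os
from typing import Dict, Mapping, Optional, Tuple

# Element ID -> candidate environment variables, in the order tabulated above.
ENV_KEY_CANDIDATES: Dict[str, Tuple[str, ...]] = {
    "api-key": ("OPENAI_API_KEY",),
    "anthropic-api-key": ("ANTHROPIC_API_KEY",),
    "gemini-api-key": ("GEMINI_API_KEY", "GOOGLE_API_KEY"),
    "azure-api-key": ("AZURE_OPENAI_API_KEY",),
    "azure-api-url": ("AZURE_OPENAI_ENDPOINT",),
    "aws-access-key": ("AWS_ACCESS_KEY_ID",),
    "aws-secret-key": ("AWS_SECRET_ACCESS_KEY",),
    "aws-region": ("AWS_DEFAULT_REGION", "AWS_REGION"),
}


def collect_env_api_keys(
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return {element_id: value} for every field with a non-empty env var."""
    env = os.environ if environ is None else environ
    found: Dict[str, str] = {}
    for element_id, names in ENV_KEY_CANDIDATES.items():
        for name in names:
            value = env.get(name, "")
            if value:
                found[element_id] = value
                break  # assumption: the first listed variable takes precedence
    return found


def resolve_field(
    element_id: str, stored: Mapping[str, str], env_keys: Mapping[str, str]
) -> str:
    """Prefer the stored value; fall back to the environment when it is empty."""
    return stored.get(element_id) or env_keys.get(element_id, "")
```

Here stored plays the role of localStorage and env_keys the role of the injected envApiKeys object.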

Updating Version

Edit scalene/scalene_config.py:

scalene_version = "X.Y.Z"
scalene_date = "YYYY.MM.DD"

Dependencies

Key runtime dependencies:

rich - Terminal formatting and colors
cloudpickle - Serialization
pynvml - NVIDIA GPU support (optional)

See requirements.txt for full list.

CLI Structure

Scalene uses a verb-based CLI with two main subcommands:

# Profile a program (saves to scalene-profile.json by default)
scalene run [options] yourprogram.py

# View an existing profile
scalene view [options] [profile.json]

Run Subcommand Options

scalene run prog.py                      # profile, save to scalene-profile.json
scalene run -o my.json prog.py           # save to custom file
scalene run --cpu-only prog.py           # profile CPU only (faster)
scalene run -c config.yaml prog.py       # load options from config file
scalene run prog.py --- --arg            # pass args to program

View Subcommand Options

scalene view                             # open in browser
scalene view --cli                       # view in terminal
scalene view --html                      # save to scalene-profile.html
scalene view --standalone                # save as self-contained HTML (all assets embedded)
scalene view myprofile.json              # open specific profile

Profile Completion Message

After profiling completes, Scalene prints instructions for viewing the profile:

Scalene: profile saved to scalene-profile.json
  To view in browser:  scalene view
  To view in terminal: scalene view --cli

The filename is only included in the command if a non-default output file was used.
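
The rule can be expressed as a small function. This is an illustrative sketch, not Scalene's implementation: completion_message and DEFAULT_PROFILE are hypothetical names, and the exact position of the filename in the non-default case is an assumption.

```python
DEFAULT_PROFILE = "scalene-profile.json"


def completion_message(outfile: str = DEFAULT_PROFILE) -> str:
    """Build the post-profiling hint, naming the file only when non-default."""
    # Assumed format: the filename is appended to each suggested command.
    suffix = "" if outfile == DEFAULT_PROFILE else f" {outfile}"
    return (
        f"Scalene: profile saved to {outfile}\n"
        f"  To view in browser:  scalene view{suffix}\n"
        f"  To view in terminal: scalene view --cli{suffix}"
    )
```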

YAML Configuration

Create a scalene.yaml file with options:

outfile: my-profile.json
cpu-only: true
profile-only: "mypackage,utils"
cpu-percent-threshold: 5

Load with: scalene run -c scalene.yaml prog.py

Advanced Options

Use scalene run --help-advanced to see all options including:

--profile-all - profile all code, not just the target program
--profile-only PATH - only profile files containing these strings
--profile-exclude PATH - exclude files containing these strings
--profile-system-libraries - profile Python stdlib and installed packages (skipped by default)
--gpu - profile GPU time and memory
--memory - profile memory usage
--stacks - collect stack traces
--profile-interval N - output profiles every N seconds

Smoke Tests

Smoke tests in test/ use the new CLI syntax:

# test/smoketest.py
cmd = [sys.executable, "-m", "scalene", "run", "-o", str(outfile), *rest, fname]

GitHub Workflows

Workflows in .github/workflows/ use the new CLI:

# Profile with interval, then view
- run: python -m scalene run --profile-interval=2 test/testme.py && python -m scalene view --cli

# Profile with module invocation
- run: python -m scalene run --- -m import_stress_test && python -m scalene view --cli

Signal Handling

Scalene uses several Unix signals for profiling. The signal assignments are in scalene_signals.py:

Signal	Purpose	Platform
`SIGVTALRM`	CPU profiling timer (default)	Unix
`SIGALRM`	CPU profiling timer (real time mode)	Unix
`SIGILL`	Start profiling (`--on`)	Unix
`SIGBUS`	Stop profiling (`--off`)	Unix
`SIGPROF`	memcpy tracking	Unix
`SIGXCPU`	malloc tracking	Unix
`SIGXFSZ`	free tracking	Unix

Signal Conflicts with Libraries

Libraries like PyTorch Lightning may also use these signals. The replacement_signal_fns.py module handles conflicts:

On Linux: Uses real-time signals (SIGRTMIN+1 to SIGRTMIN+5) for redirection. When user code sets a handler for a Scalene signal, their handler is redirected to a real-time signal. Calls to raise_signal() and kill() are also redirected transparently.

On macOS/other platforms: Uses handler chaining. Both Scalene's handler and the user's handler are called when the signal fires.

# Platform-specific signal handling
_use_rt_signals = sys.platform == "linux" and hasattr(signal, "SIGRTMIN")

if _use_rt_signals:
    # Linux: redirect to real-time signals
    rt_base = signal.SIGRTMIN + 1
    _signal_redirects[signal.SIGILL] = rt_base
else:
    # macOS: chain handlers
    def chained_handler(sig, frame):
        scalene_handler(sig, frame)
        user_handler(sig, frame)
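
The chaining branch can be isolated into a standalone helper, sketched below under the assumption that the profiler's handler should always run first. The name chain_handlers is hypothetical. The callable() guard covers a user "handler" that is signal.SIG_DFL or signal.SIG_IGN, since neither can be called.

```python
import signal
from types import FrameType
from typing import Callable, Optional, Union

Handler = Callable[[int, Optional[FrameType]], object]


def chain_handlers(
    first: Handler,
    second: Union[Handler, int, None],
) -> Handler:
    """Return a handler that runs first, then second if second is callable."""

    def chained(sig: int, frame: Optional[FrameType]) -> None:
        first(sig, frame)
        if callable(second):  # skips SIG_DFL, SIG_IGN, and None
            second(sig, frame)

    return chained
```

The result could then be installed with signal.signal(signum, chain_handlers(scalene_handler, user_handler)), where both handler names are the placeholders from the snippet above.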

Frame Line Number Can Be None (Python 3.11+)

In Python 3.11+, frame.f_lineno can be None in edge cases (e.g., during multiprocessing cleanup). Always use a fallback:

lineno = frame.f_lineno if frame.f_lineno is not None else frame.f_code.co_firstlineno
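
Wrapping the fallback in a helper keeps call sites short and makes the edge case testable. The function name safe_lineno is an assumption for this sketch.

```python
from types import FrameType


def safe_lineno(frame: FrameType) -> int:
    """Return the frame's current line, or the code object's first line
    when f_lineno is None."""
    lineno = frame.f_lineno
    if lineno is not None:
        return lineno
    return frame.f_code.co_firstlineno
```

The fallback is an approximation: it attributes the sample to the start of the enclosing function rather than dropping it.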

Native Extension Build Issues

C++ Standard Library Conflicts with vendor/printf

The vendor/printf/printf.h header defines macros that conflict with C++ standard library:

#define vsnprintf vsnprintf_
#define snprintf  snprintf_

This breaks std::vsnprintf in <string> and other headers. Fix: Include C++ standard headers BEFORE vendor headers in src/source/libscalene.cpp:

// Include C++ standard headers FIRST
#include <cstddef>
#include <string>

// Then vendor headers that define conflicting macros
#include <heaplayers.h>  // Eventually includes printf.h

Profiling Guide

See Scalene-Agents.md for detailed information about interpreting Scalene's profiling output, including Python vs C time, memory metrics, and optimization strategies.

Debugging Guide

See Scalene-Debugging.md for signal handler debugging, async profiling debugging, the profile output pipeline (three separate renderers!), and unbounded growth prevention patterns.

GUI Development Guide

See Scalene-GUI.md for adding new columns, Vega-Lite chart types, pie chart best practices (two-wedge rendering, rotating pies), and the chart rendering flow.
