A trace is one top-level generateText / streamText call. The Traces view lists them; opening one shows the full execution as a waterfall you can inspect span by span and replay in real time. See the data model for how traces and spans are defined.

Traces list

A paginated table (newest first) over the selected date range. Columns: trace id, name (traceName, falling back to agentName, else Untitled trace), span count, tokens, duration, cost, and when it ran. Errored traces carry a red badge. Click a row to open the trace.
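The name column's fallback order can be sketched as a small helper. This is an illustration only: the `TraceNames` shape and the `displayName` function are hypothetical, and treating blank strings as missing is an assumption rather than documented behaviour.

```typescript
// Hypothetical shape: only the traceName and agentName field names
// come from the description above.
interface TraceNames {
  traceName?: string | null;
  agentName?: string | null;
}

// Resolve the label for a row: traceName, then agentName, then a fixed fallback.
function displayName(trace: TraceNames): string {
  // Assumption: empty or whitespace-only strings count as "not set".
  const pick = (s?: string | null): string | undefined =>
    s && s.trim() !== "" ? s : undefined;
  return pick(trace.traceName) ?? pick(trace.agentName) ?? "Untitled trace";
}
```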

Waterfall

Each span is a row, indented by its parent (parentSpanId) so nested tool calls and steps sit under their parent. Span types are colour-coded:
| Type | Colour | Meaning |
| --- | --- | --- |
| agent | amber | the root span, the whole call |
| llm | violet | one model step |
| tool | blue | one tool execution |
| other | grey | anything else |
Each bar's horizontal offset is proportional to when the span started, and its width to how long it ran, so you can see serial vs. overlapping work at a glance.
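The layout described above (indentation from parentSpanId, bar position from start time and duration) could be computed roughly as follows. This is a minimal sketch, not the dashboard's implementation: the `Span` and `Row` shapes, the `spanId` and `startMs` field names, and the `layoutWaterfall` function are assumptions made for the example.

```typescript
// Hypothetical span shape; parentSpanId and durationMs are named in the docs,
// spanId and startMs are assumed for illustration.
interface Span {
  spanId: string;
  parentSpanId?: string | null;
  startMs: number; // start time in milliseconds
  durationMs: number;
}

interface Row {
  spanId: string;
  depth: number; // indentation level: 0 for the root span
  offsetPct: number; // left edge of the bar, as % of the trace duration
  widthPct: number; // bar width, as % of the trace duration
}

function layoutWaterfall(spans: Span[]): Row[] {
  if (spans.length === 0) return [];
  const byId = new Map(spans.map((s) => [s.spanId, s] as const));
  const traceStart = Math.min(...spans.map((s) => s.startMs));
  const traceEnd = Math.max(...spans.map((s) => s.startMs + s.durationMs));
  const total = Math.max(traceEnd - traceStart, 1); // guard against zero-length traces

  // Depth = number of known ancestors reached by following parentSpanId.
  const depthOf = (span: Span): number => {
    let depth = 0;
    let current: Span | undefined = span;
    const seen = new Set<string>(); // stops the walk if the data contains a cycle
    while (
      current?.parentSpanId &&
      byId.has(current.parentSpanId) &&
      !seen.has(current.spanId)
    ) {
      seen.add(current.spanId);
      current = byId.get(current.parentSpanId);
      depth++;
    }
    return depth;
  };

  return spans.map((s) => ({
    spanId: s.spanId,
    depth: depthOf(s),
    offsetPct: ((s.startMs - traceStart) * 100) / total,
    widthPct: (s.durationMs * 100) / total,
  }));
}
```

Two bars whose offset ranges overlap ran concurrently; bars that sit end to end ran serially.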

Span detail

Click any span to open the inspector panel beside the waterfall. For a single span it shows, as applicable:
  • Timing — start time, duration, and TTFT (time to first token). On reasoning models the TTFT splits into thinking time plus the time to first visible text.
  • Model — provider and model id.
  • Tokens & cost — input/output tokens, computed cost, the pricing source that produced it (resolved price vs. a custom rule), and a per-dimension cost breakdown (prompt, completion, cache read/write, reasoning, image, web search, request) when more than one dimension applies.
  • Model call — on v7 spans, the model-only wall-clock with the remaining tool time called out (durationMs − modelCallMs).
  • Throughput — for streaming LLM spans, a tokens/second headline.
  • Provider signals — when captured: normalized rate-limit headroom, the model build fingerprint, safety ratings, and grounding sources.
  • Tools available — the catalog of tools the model was offered for the call.
  • Payloads — the captured input and output, pretty-printed (JSON is syntax-highlighted) in a scrollable block (subject to recordInputs/recordOutputs).
  • Metadata & errors — any span metadata and the error message, if the span failed.
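Two of the derived figures in this list are simple arithmetic. The sketch below shows the remaining tool time (durationMs − modelCallMs) and one plausible way to get a tokens/second figure. The function names and the `outputTokens` parameter are hypothetical, and dividing output tokens by the streaming duration is an assumption about how the headline is produced, not a documented formula.

```typescript
// Hypothetical timing shape; durationMs and modelCallMs are named in the docs.
interface SpanTiming {
  durationMs: number;
  modelCallMs?: number; // only present on v7 spans, per the description above
}

// Remaining tool time: total span duration minus the model-only wall-clock time.
function remainingToolMs(span: SpanTiming): number | undefined {
  if (span.modelCallMs === undefined) return undefined; // not a v7 span
  return Math.max(span.durationMs - span.modelCallMs, 0);
}

// Assumed throughput formula: output tokens per second of streaming time.
function tokensPerSecond(outputTokens: number, streamMs: number): number | undefined {
  if (streamMs <= 0) return undefined; // avoid dividing by zero
  return outputTokens / (streamMs / 1000);
}
```

For example, a span with a durationMs of 5000 and a modelCallMs of 3200 leaves 1800 ms attributed to tools.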
Selecting the whole trace (rather than a single span) swaps the inspector for a run-level rollup: duration, cost, tokens, span and LLM-call counts, and errors.
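A run-level rollup of this kind can be approximated by folding over the trace's spans. The sketch below is illustrative only: the `SpanSummary` and `Rollup` shapes and every field name other than durationMs are assumptions, as is the choice to measure duration from the earliest start to the latest end.

```typescript
// Hypothetical per-span summary used only for this example.
interface SpanSummary {
  type: "agent" | "llm" | "tool" | "other";
  startMs: number;
  durationMs: number;
  inputTokens?: number;
  outputTokens?: number;
  cost?: number;
  error?: string;
}

interface Rollup {
  durationMs: number;
  cost: number;
  tokens: number;
  spanCount: number;
  llmCalls: number;
  errors: string[];
}

function rollupTrace(spans: SpanSummary[]): Rollup {
  const result: Rollup = { durationMs: 0, cost: 0, tokens: 0, spanCount: spans.length, llmCalls: 0, errors: [] };
  if (spans.length === 0) return result;

  let start = Infinity;
  let end = -Infinity;
  for (const s of spans) {
    start = Math.min(start, s.startMs);
    end = Math.max(end, s.startMs + s.durationMs);
    result.cost += s.cost ?? 0;
    result.tokens += (s.inputTokens ?? 0) + (s.outputTokens ?? 0);
    if (s.type === "llm") result.llmCalls++;
    if (s.error) result.errors.push(s.error);
  }
  // Assumption: run duration spans the earliest start to the latest end.
  result.durationMs = end - start;
  return result;
}
```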

Scores

If a trace or span has been scored by an eval, its scores appear in the inspector under Evals — one row per result, tinted green for pass and red for fail (or showing the numeric score), with the judge’s reason shown inline and a link through to that run on the eval page.

Replay

The waterfall doubles as a replay surface: press play to watch the trace reconstruct on its real timeline, switch between 1× / 2× / 4× speed, and drag the ruler to seek. A faint throughput backdrop and a peak … tok/s readout ride behind the bars, and the TTFT moment is marked on each LLM bar. Replay is powered by the intra-stream token samples the SDK records — no extra instrumentation needed. Opening a trace with ?replay=1 auto-plays it.
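To link straight into an auto-playing trace, the replay flag can be appended with the standard URL API. The `withReplay` helper and the example address are hypothetical; only the ?replay=1 parameter comes from the text above.

```typescript
// Add replay=1 to a trace URL, preserving any query parameters already present.
function withReplay(traceUrl: string): string {
  const url = new URL(traceUrl);
  url.searchParams.set("replay", "1");
  return url.toString();
}

// Hypothetical trace address, used only to show the resulting shape.
const link = withReplay("https://example.com/traces/abc?range=7d");
```

Using `searchParams.set` rather than string concatenation keeps the link valid whether or not the URL already has a query string.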