generateText / streamText call. The Traces
view lists them; opening one shows the full execution as a waterfall you can
inspect span by span and replay in real time.
See the data model for how traces and spans are defined.
Traces list
A paginated table (newest first) over the selected date range. Columns: trace id, name (traceName, falling back to agentName, else Untitled trace),
span count, tokens, duration, cost, and when it ran. Errored traces carry a red
badge. Click a row to open the trace.
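The name column's fallback order can be sketched as a small helper. This is an illustration only: the `TraceSummary` shape and the `displayName` function are hypothetical, and treating blank strings as missing is an assumption. Only `traceName`, `agentName`, and the "Untitled trace" label come from the description above.

```typescript
// Hypothetical shape: only traceName and agentName are named in the docs.
interface TraceSummary {
  traceName?: string | null;
  agentName?: string | null;
}

// Resolve the label for the "name" column:
// traceName first, then agentName, then a fixed placeholder.
function displayName(trace: TraceSummary): string {
  // Assumption: empty or whitespace-only strings count as missing.
  const pick = (s?: string | null): string | undefined =>
    s && s.trim() !== "" ? s : undefined;
  return pick(trace.traceName) ?? pick(trace.agentName) ?? "Untitled trace";
}
```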
Waterfall
Each span is a row, indented by its parent (parentSpanId) so nested tool calls
and steps sit under their parent. Span types are colour-coded:
| Type | Colour | Meaning |
|---|---|---|
| agent | amber | the root span — the whole call |
| llm | violet | one model step |
| tool | blue | one tool execution |
| other | grey | anything else |
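As a rough sketch of how a flat span list could become indented waterfall rows, the function below groups spans by `parentSpanId` and walks the tree depth-first. The `Span` shape (including `spanId` and `startMs`), the `toWaterfallRows` name, and the ordering by start time are assumptions made for illustration, not the product's actual implementation.

```typescript
type SpanType = "agent" | "llm" | "tool" | "other";

// Colours from the table above.
const SPAN_COLOURS: Record<SpanType, string> = {
  agent: "amber",
  llm: "violet",
  tool: "blue",
  other: "grey",
};

// Hypothetical span shape: only parentSpanId is named in the docs.
interface Span {
  spanId: string;
  parentSpanId?: string | null;
  type: SpanType;
  startMs: number;
}

interface Row {
  span: Span;
  depth: number; // indentation level in the waterfall
}

function toWaterfallRows(spans: Span[]): Row[] {
  const ids = new Set(spans.map((s) => s.spanId));
  const children = new Map<string, Span[]>();
  const roots: Span[] = [];

  for (const s of spans) {
    const parent = s.parentSpanId;
    if (parent != null && ids.has(parent)) {
      const list = children.get(parent) ?? [];
      list.push(s);
      children.set(parent, list);
    } else {
      // No parent, or the parent is not in this trace: treat as a root.
      roots.push(s);
    }
  }

  const byStart = (a: Span, b: Span): number => a.startMs - b.startMs;
  const rows: Row[] = [];
  const visit = (s: Span, depth: number): void => {
    rows.push({ span: s, depth });
    for (const c of (children.get(s.spanId) ?? []).sort(byStart)) {
      visit(c, depth + 1);
    }
  };
  for (const r of roots.sort(byStart)) visit(r, 0);
  return rows;
}
```

Each row's `depth` can then drive the indentation, and `SPAN_COLOURS[row.span.type]` the bar colour.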
Span detail
Click any span to open the inspector panel beside the waterfall. For a single span it shows, as applicable:
- Timing — start time, duration, and TTFT (time to first token). On reasoning models the TTFT splits into thinking time plus the time to first visible text.
- Model — provider and model id.
- Tokens & cost — input/output tokens, computed cost, the pricing source that produced it (resolved price vs. a custom rule), and a per-dimension cost breakdown (prompt, completion, cache read/write, reasoning, image, web search, request) when more than one dimension applies.
- Model call — on v7 spans, the model-only wall-clock time, with the remaining tool time called out (durationMs − modelCallMs).
- Throughput — for streaming LLM spans, a tokens/second headline.
- Provider signals — when captured: normalized rate-limit headroom, the model build fingerprint, safety ratings, and grounding sources.
- Tools available — the catalog of tools the model was offered for the call.
- Payloads — the captured input and output, pretty-printed (JSON is syntax-highlighted) in a scrollable block (subject to recordInputs/recordOutputs).
- Metadata & errors — any span metadata and the error message, if the span failed.
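The two derived figures in the list above (remaining tool time and throughput) come down to simple arithmetic. The sketch below assumes a hypothetical `SpanTiming` shape. `durationMs` and `modelCallMs` are the names used above; the function names are illustrative. The exact throughput formula the inspector uses is not stated here, so dividing token count by elapsed time is an assumption.

```typescript
// Hypothetical shape: durationMs and modelCallMs are the names used in the docs.
interface SpanTiming {
  durationMs: number;
  modelCallMs?: number; // only present on v7 spans, per the docs
}

// Remaining tool time: durationMs − modelCallMs.
// Returns undefined when the span carries no model-call timing.
function toolTimeMs(span: SpanTiming): number | undefined {
  if (span.modelCallMs === undefined) return undefined;
  // Clamp at zero in case of clock skew (an assumption).
  return Math.max(0, span.durationMs - span.modelCallMs);
}

// Assumed throughput definition: tokens divided by elapsed seconds.
function tokensPerSecond(tokens: number, elapsedMs: number): number | undefined {
  if (elapsedMs <= 0) return undefined;
  return tokens / (elapsedMs / 1000);
}
```

For example, a span with `durationMs: 5000` and `modelCallMs: 3200` leaves 1800 ms of tool time, and 600 tokens streamed over 3000 ms works out to 200 tokens/second.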
Scores
If a trace or span has been scored by an eval, its scores appear in the inspector under Evals — one row per result, tinted green for pass and red for fail (or showing the numeric score), with the judge’s reason shown inline and a link through to that run on the eval page.
Replay
The waterfall doubles as a replay surface: press play to watch the trace reconstruct on its real timeline, switch between 1× / 2× / 4× speed, and drag the ruler to seek. A faint throughput backdrop and a peak … tok/s readout ride
behind the bars, and the TTFT moment is marked on each LLM bar. Replay is powered
by the intra-stream token samples the SDK records — no extra instrumentation
needed. Opening a trace with ?replay=1 auto-plays it.
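One way to picture the speed and seek controls is as a mapping from wall-clock time to a position on the trace timeline. The helper below is a minimal sketch of that idea under stated assumptions: `replayPositionMs` and its parameters are hypothetical names, and the actual replay logic is not documented here.

```typescript
// The three replay speeds offered in the UI.
type ReplaySpeed = 1 | 2 | 4;

// Map wall-clock time since "play" was pressed to a position on the trace
// timeline. seekOffsetMs is where the ruler was when playback started.
function replayPositionMs(
  wallElapsedMs: number,
  speed: ReplaySpeed,
  traceDurationMs: number,
  seekOffsetMs = 0,
): number {
  const position = seekOffsetMs + wallElapsedMs * speed;
  // Clamp so the playhead never leaves the trace.
  return Math.min(Math.max(position, 0), traceDurationMs);
}
```

At 2× speed, one second of wall-clock time advances the playhead two seconds into the trace; once the position reaches the trace duration it stays there.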
