nodetool gettraceprobability

Displays the current probability of tracing CQL requests on the node.

Synopsis

nodetool [connection_options] gettraceprobability

For the parameters accepted in place of connection_options, see connection options.

Description

nodetool gettraceprobability shows the current probability (0.0 to 1.0) that any given CQL request will be traced. This setting controls Cassandra's probabilistic tracing feature, which samples a percentage of requests for detailed performance analysis.

What is Request Tracing?

Request tracing in Cassandra records detailed timing information about how a CQL query is processed across the cluster. When a request is traced, Cassandra captures:

Coordinator activity - Time spent parsing, planning, and coordinating the query
Replica communication - Time to send requests to and receive responses from replica nodes
Per-replica processing - What each replica did (memtable reads, SSTable reads, bloom filter checks)
Latency breakdown - Microsecond-level timing for each operation phase

Trace data is written to the system_traces keyspace, which contains two tables:

Table	Contents
`system_traces.sessions`	One row per traced request with summary information
`system_traces.events`	Detailed events for each traced request

Why Probabilistic Tracing?

Tracing has significant overhead: each traced request generates multiple writes to system_traces. Enabling tracing for all requests (probability 1.0) would:

Increase write amplification substantially
Consume significant disk space
Impact cluster performance

Probabilistic tracing samples a small percentage of requests to gather representative performance data without overwhelming the cluster. For example, with a probability of 0.001 (0.1%) on a cluster handling 100,000 requests/second, approximately 100 requests/second would be traced, which is enough for analysis without significant overhead.
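
The arithmetic in the example above can be reproduced with a short shell helper. This is only a sketch: the function name expected_traces_per_sec is hypothetical and not part of nodetool, and the result is an average, since sampling is random.

```shell
#!/bin/bash
# Hypothetical helper (not part of nodetool): estimate how many requests
# per second would be traced for a given request rate and trace probability.
expected_traces_per_sec() {
    local rate="$1" prob="$2"
    # awk performs the floating-point multiplication that plain bash cannot.
    awk -v rate="$rate" -v prob="$prob" 'BEGIN { printf "%.1f\n", rate * prob }'
}

# The figures from the example: 100,000 requests/second at probability 0.001
expected_traces_per_sec 100000 0.001
```

Plugging in the observed request rate before raising the probability gives a rough idea of how many trace sessions per second will be written to system_traces.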

Examples

Basic Usage

nodetool gettraceprobability

Sample Output

Current trace probability: 0.0

A value of 0.0 means no requests are being traced (the default).
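
For scripting, the value can be pulled out of that output line. The sketch below assumes the output format matches the sample above; the function names parse_trace_probability and is_tracing_active are hypothetical, and the echo stands in for a real nodetool call so the snippet runs without a cluster.

```shell
#!/bin/bash
# Hypothetical helper: print whatever follows the "Current trace probability: "
# label. Taking the whole remainder keeps exponent forms such as 1.0E-4 intact.
parse_trace_probability() {
    sed -n 's/^Current trace probability: *//p'
}

# Hypothetical helper: exit status 0 when the given probability is above zero.
is_tracing_active() {
    awk -v p="$1" 'BEGIN { exit !(p + 0 > 0) }'
}

# Sample input stands in for: nodetool gettraceprobability
prob=$(echo "Current trace probability: 0.0" | parse_trace_probability)
if is_tracing_active "$prob"; then
    echo "tracing active: $prob"
else
    echo "tracing disabled"
fi
```

Comparing numerically in awk rather than as a string avoids treating values such as 0.00 or 0 as different from 0.0.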

Understanding Probability Values

Value	Percentage	Meaning	Use Case
0.0	0%	No tracing (default)	Normal production operation
0.0001	0.01%	1 in 10,000 requests	High-traffic production sampling
0.001	0.1%	1 in 1,000 requests	Production performance monitoring
0.01	1%	1 in 100 requests	Active troubleshooting
0.1	10%	1 in 10 requests	Development/testing
1.0	100%	All requests	Brief debugging only

Performance Impact

Values above 0.01 (1%) can noticeably impact performance on busy clusters. Values of 0.1 or higher should only be used briefly during active debugging sessions or in non-production environments.

Viewing Trace Data

Once tracing is enabled and requests are sampled, trace data can be queried from system_traces:

View Recent Trace Sessions

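-- started_at is not part of the primary key, so this filter needs ALLOW FILTERING.
-- Timestamp arithmetic with durations (- 1h) requires Cassandra 4.0 or later.
SELECT * FROM system_traces.sessions
WHERE started_at > toTimestamp(now()) - 1h
LIMIT 10
ALLOW FILTERING;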

View Events for a Specific Trace

-- First, get a session_id from sessions table
SELECT session_id, coordinator, request, started_at, duration
FROM system_traces.sessions LIMIT 5;

-- Then query events for that session
SELECT activity, source, source_elapsed, thread
FROM system_traces.events
WHERE session_id = <session_id_from_above>;

Example Trace Output

 activity                                          | source        | source_elapsed
---------------------------------------------------+---------------+----------------
 Parsing SELECT * FROM users WHERE id = ?          | 192.168.1.101 |             52
 Preparing statement                               | 192.168.1.101 |            118
 Determining replicas for query                    | 192.168.1.101 |            156
 Sending READ message to /192.168.1.102            | 192.168.1.101 |            203
 READ message received from /192.168.1.101         | 192.168.1.102 |             45
 Executing single-partition query on users         | 192.168.1.102 |            112
 Acquiring sstable references                      | 192.168.1.102 |            158
 Bloom filter allows skipping sstable 1            | 192.168.1.102 |            201
 Partition index with 1 entries found              | 192.168.1.102 |            289
 Seeking to partition indexed section              | 192.168.1.102 |            334
 Merging memtable contents                         | 192.168.1.102 |            412
 Read 1 live rows and 0 tombstone cells            | 192.168.1.102 |            498
 Enqueuing response to /192.168.1.101              | 192.168.1.102 |            534
 Processing response from /192.168.1.102           | 192.168.1.101 |           2341
 Request complete                                  | 192.168.1.101 |           2456
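
One way to read output like this is to look for the largest jump in source_elapsed on each node. The sketch below is illustrative only: largest_trace_gap is a hypothetical helper, it assumes the pipe-delimited cqlsh layout shown above, and it compares gaps only within a single source because each node reports its own elapsed time. The embedded rows are a subset of the sample.

```shell
#!/bin/bash
# Hypothetical helper: for each source node, report the event that followed
# the longest gap in source_elapsed (microseconds). Reads cqlsh rows on stdin
# and prints "source gap activity", one line per source.
largest_trace_gap() {
    awk -F'|' '
        # Data rows only: three fields with a numeric third field.
        NF == 3 && $3 ~ /^ *[0-9]+ *$/ {
            gsub(/^ +| +$/, "", $1)
            gsub(/ /, "", $2)
            src = $2
            elapsed = $3 + 0
            if (src in prev) {
                gap = elapsed - prev[src]
                if (gap > max[src]) { max[src] = gap; what[src] = $1 }
            }
            prev[src] = elapsed
        }
        END { for (s in max) printf "%s %d %s\n", s, max[s], what[s] }
    ' | sort
}

largest_trace_gap <<'EOF'
 activity                                  | source        | source_elapsed
-------------------------------------------+---------------+----------------
 Sending READ message to /192.168.1.102    | 192.168.1.101 |            203
 READ message received from /192.168.1.101 | 192.168.1.102 |             45
 Executing single-partition query on users | 192.168.1.102 |            112
 Processing response from /192.168.1.102   | 192.168.1.101 |           2341
 Request complete                          | 192.168.1.101 |           2456
EOF
```

On the coordinator, the gap before "Processing response" includes the time spent waiting for the replica, so a large value there points at the network or the replica rather than at coordinator work.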

Use Cases

Verify Tracing is Disabled

Before performance testing, ensure tracing isn't adding overhead:

nodetool gettraceprobability
# Expected output: Current trace probability: 0.0

Check if Debugging Session is Active

Check whether someone has enabled tracing for troubleshooting:

nodetool gettraceprobability
# If > 0.0, tracing is active

Audit Cluster Configuration

Include in cluster health checks:

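#!/bin/bash
# Check trace probability on every node reported as Up/Normal (UN)

for node in $(nodetool status | grep "^UN" | awk '{print $2}'); do
    # Take everything after the label so exponent forms such as 1.0E-4 survive
    prob=$(ssh "$node" 'nodetool gettraceprobability 2>/dev/null' \
        | sed -n 's/^Current trace probability: *//p')
    if [ -z "$prob" ]; then
        # ssh or nodetool failed, or the output format was not recognized
        echo "WARNING: could not read trace probability from $node"
    elif [ "$prob" != "0.0" ]; then
        echo "WARNING: $node has trace probability $prob"
    fi
done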

Trace Probability and Performance

The relationship between trace probability and overhead:

Probability	Overhead	`system_traces` Growth	Recommended Duration
0.0	None	None	Indefinite (default)
0.0001-0.001	Minimal	Slow	Days to weeks
0.001-0.01	Low	Moderate	Hours to days
0.01-0.1	Moderate	Fast	Minutes to hours
0.1-1.0	High	Very fast	Minutes only
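
The table can be turned into a lookup for use in scripts. The sketch below is one possible encoding, not an official rule: recommended_trace_duration is a hypothetical helper, boundary values (which appear in two rows of the table) are assigned to the lower band, and non-zero values below 0.0001 are treated like the lowest non-zero band.

```shell
#!/bin/bash
# Hypothetical helper: map a trace probability to the "Recommended Duration"
# column of the table above. Assumption: shared boundaries go to the lower band.
recommended_trace_duration() {
    awk -v p="$1" 'BEGIN {
        p += 0
        if (p < 0 || p > 1) { print "invalid"; exit 1 }
        if (p == 0)          print "Indefinite (default)"
        else if (p <= 0.001) print "Days to weeks"
        else if (p <= 0.01)  print "Hours to days"
        else if (p <= 0.1)   print "Minutes to hours"
        else                 print "Minutes only"
    }'
}

recommended_trace_duration 0.05
```

A health check could combine this with the current probability to flag tracing that has been left on longer than its band suggests.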

Cleaning Up Trace Data

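Trace data accumulates in system_traces and expires after 24 hours by default. Cassandra writes trace rows with an explicit TTL taken from tracetype_query_ttl in cassandra.yaml (trace_type_query_ttl in Cassandra 4.1 and later), so changing default_time_to_live on the system_traces tables does not shorten it. For extended tracing sessions, consider:

Lowering the TTL: set tracetype_query_ttl in cassandra.yaml to a smaller value, for example one hour
Manually truncating: TRUNCATE system_traces.sessions; TRUNCATE system_traces.events;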

Comparing with CQL TRACING

Cassandra offers two tracing mechanisms:

Feature	Probabilistic Tracing	CQL `TRACING ON`
Scope	Requests coordinated by each node where it is set	Single cqlsh session
Control	`nodetool settraceprobability`	`TRACING ON/OFF` in cqlsh
Sampling	Percentage-based	All queries in session
Use case	Production monitoring	Interactive debugging
Persistence	`system_traces` tables	`system_traces` tables

-- CQL session-level tracing (alternative to probabilistic)
TRACING ON;
SELECT * FROM my_keyspace.my_table WHERE id = 123;
TRACING OFF;

Related Commands

Command	Relationship
settraceprobability	Set the trace probability
proxyhistograms	View latency histograms
tablehistograms	View per-table latency histograms