nodetool toppartitions¶
Samples and displays the most active partitions.
Deprecated in Cassandra 4.0+
toppartitions has been deprecated since Cassandra 4.0 and is now an alias for profileload. Prefer nodetool profileload in new scripts and automation.
Synopsis¶
nodetool [connection_options] toppartitions [options] [keyspace] [table] [duration]
Description¶
nodetool toppartitions samples partition access over a specified duration and reports the most frequently accessed partitions. This helps identify hot partitions that may be causing performance issues.
Since Cassandra 4.0, this command is an alias for profileload and shares its functionality.
Arguments¶
All arguments are optional. When omitted, defaults are used.
| Argument | Description | Default |
|---|---|---|
| keyspace | The keyspace to sample | All keyspaces |
| table | The table to sample | All tables |
| duration | Sampling duration in milliseconds | 10000 (10 seconds) |
Options¶
| Option | Description |
|---|---|
| -s, --capacity <count> | Capacity of the sampler reservoir (default: 256) |
| -k, --top-count <count> | Number of top partitions to return (default: 10) |
| -a, --samplers <samplers> | Comma-separated sampler types: READS, WRITES, CAS_CONTENTIONS, LOCAL_READ_TIME, WRITE_SIZE (default: all) |
| -i, --interval <ms> | Sampling interval in milliseconds |
| -t, --stop | Stop ongoing sampling |
| -l, --list | List active sampling sessions |
Output Format¶
WRITES Sampler:
Cardinality: ~1000
Top 10 partitions:
Partition Count +/-
user_12345 150 10
user_67890 120 8
user_11111 95 7
...
READS Sampler:
Cardinality: ~800
Top 10 partitions:
Partition Count +/-
product_abc 200 15
product_xyz 180 12
...
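For scripting, the report above can be flattened into one row per partition. The following is a minimal sketch that assumes the layout shown in this section (a `<NAME> Sampler:` line followed by whitespace-separated `Partition Count +/-` rows). `parse_toppartitions` is a hypothetical helper name, not part of nodetool, and the real output of your Cassandra version may differ, for example when partition keys contain spaces.

```shell
# parse_toppartitions: read toppartitions-style output on stdin and print
# one "SAMPLER PARTITION COUNT ERROR" line per partition row.
# Assumption: the layout matches the sample output in this document.
parse_toppartitions() {
  awk '
    # Remember the current sampler, e.g. "WRITES Sampler:"
    NF == 2 && $2 == "Sampler:" { sampler = $1; next }
    # Data rows have exactly three fields with numeric count and error;
    # header, Cardinality, "Top N partitions:" and "..." lines do not match.
    NF == 3 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
      print sampler, $1, $2, $3
    }
  '
}
```

It could be used as `nodetool toppartitions my_keyspace my_table 10000 | parse_toppartitions`, which makes the rows easy to filter with standard text tools.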
Examples¶
Sample for 10 Seconds (10000 ms)¶
nodetool toppartitions my_keyspace my_table 10000
Sample with More Results¶
nodetool toppartitions -k 20 my_keyspace my_table 30000
Sample Reads Only¶
nodetool toppartitions -a READS my_keyspace my_table 10000
Sample Writes Only¶
nodetool toppartitions -a WRITES my_keyspace my_table 10000
Sample CAS Contentions¶
nodetool toppartitions -a CAS_CONTENTIONS my_keyspace my_table 10000
List Active Sampling Sessions¶
nodetool toppartitions -l
Stop Ongoing Sampling¶
nodetool toppartitions -t
Understanding Results¶
Cardinality¶
Cardinality: ~1000
Estimated number of unique partitions accessed during sampling.
Count¶
user_12345 150 10
- 150: Number of times this partition was accessed
- 10: Statistical margin of error
Hot Partition Indicators¶
| Metric | Warning Sign |
|---|---|
| Single partition >> others | Potential hot partition |
| High count + high error | Variable access pattern |
| Low cardinality + high count | Few partitions handling all traffic |
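The "single partition >> others" sign can be turned into a number by comparing the busiest partition with the runner-up. This is a rough sketch only: `dominance_ratio` is a hypothetical helper, it expects rows of the form `PARTITION COUNT [ERROR]` on stdin, and what ratio counts as "hot" is a judgment call for your workload.

```shell
# dominance_ratio: read "PARTITION COUNT [ERROR]" rows on stdin and print
# how many times larger the busiest partition is than the second busiest
# (integer division). Prints nothing when fewer than two rows are given.
dominance_ratio() {
  awk '
    $2 ~ /^[0-9]+$/ {
      c = $2 + 0
      if (c > first)       { second = first; first = c }
      else if (c > second) { second = c }
      rows++
    }
    END { if (rows >= 2 && second > 0) print int(first / second) }
  '
}
```

A result close to 1 suggests the leading partitions see similar traffic, while a large value points at a single dominant partition.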
Use Cases¶
Identify Hot Partitions¶
# Sample during peak traffic
nodetool toppartitions my_keyspace my_table 60000
Hot partitions may indicate:
- Data model issues (poor partition key choice)
- Application bugs (always accessing same key)
- Natural access patterns (celebrity problem)
Performance Troubleshooting¶
# When seeing high latency
nodetool toppartitions -k 20 my_keyspace slow_table 30000
If one partition dominates, investigate that partition.
Capacity Planning¶
# Understand access distribution
nodetool toppartitions -k 50 my_keyspace my_table 300000
- Even distribution = good
- Skewed distribution = potential scaling issue
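One way to judge "even" versus "skewed" is the share of sampled accesses that goes to the busiest partition. The sketch below assumes rows of the form `PARTITION COUNT [ERROR]` on stdin; `top_share_percent` is a hypothetical helper, and the percentage is relative to the listed top partitions only, not to all traffic on the table.

```shell
# top_share_percent: read "PARTITION COUNT [ERROR]" rows on stdin and print
# the busiest partition's share of the summed counts, as a whole percentage
# (rounded down). Only the rows supplied are considered.
top_share_percent() {
  awk '
    $2 ~ /^[0-9]+$/ {
      total += $2
      if ($2 + 0 > max) max = $2 + 0
    }
    END { if (total > 0) print int(max * 100 / total) }
  '
}
```

With four partitions at 100, 95, 92, and 88 accesses this prints 26, close to an even quarter each; with counts of 5000, 50, and 45 it prints 98.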
Sampling Strategies¶
Short Sample (Quick Check)¶
# 10 second sample
nodetool toppartitions my_keyspace my_table 10000
Good for: Quick identification of obvious hot spots
Medium Sample (Typical Analysis)¶
# 1 minute sample
nodetool toppartitions my_keyspace my_table 60000
Good for: Normal troubleshooting
Long Sample (Thorough Analysis)¶
# 5 minute sample
nodetool toppartitions my_keyspace my_table 300000
Good for: Capturing intermittent patterns
Show Top 50 Partitions¶
nodetool toppartitions -k 50 my_keyspace my_table 300000
Interpreting Access Patterns¶
Healthy Distribution¶
Partition Count
part_1 100
part_2 95
part_3 92
part_4 88
...
Traffic distributed relatively evenly.
Hot Partition¶
Partition Count
hot_key 5000
part_2 50
part_3 45
...
One partition receiving 100x more traffic than others.
Write-Heavy Partition¶
WRITES:
hot_key 1000
other 10
READS:
hot_key 50
other 45
The partition is write-heavy and may need a data model review.
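The write-heavy check above can be sketched as a ratio of write count to read count for one key. This assumes the sampler output has already been reduced to `SAMPLER PARTITION COUNT` rows; `write_read_ratio` is a hypothetical helper and not a nodetool feature.

```shell
# write_read_ratio KEY: read "SAMPLER PARTITION COUNT" rows on stdin and
# print the WRITES count divided by the READS count for KEY (integer
# division). Prints nothing when KEY has no recorded reads.
write_read_ratio() {
  awk -v key="$1" '
    $2 == key && $1 == "WRITES" { w = $3 }
    $2 == key && $1 == "READS"  { r = $3 }
    END { if (r > 0) print int(w / r) }
  '
}
```

For the figures shown above (1000 writes and 50 reads for `hot_key`), the function prints 20.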
Addressing Hot Partitions¶
Data Model Solutions¶
- Add randomization to partition key
-- Instead of
CREATE TABLE events (date DATE, event_id UUID, ...);
-- Use bucketing
CREATE TABLE events (date DATE, bucket INT, event_id UUID, ...);
- Composite partition key
PRIMARY KEY ((user_id, bucket), timestamp)
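How the `bucket` value gets chosen is up to the application. As an illustration only, the sketch below derives a stable bucket from an identifier using the POSIX `cksum` CRC; `bucket_for` and the bucket count of 16 in the usage note are assumptions made for this example, and a real client would normally compute the bucket in its own driver code.

```shell
# bucket_for KEY BUCKETS: map KEY to a stable bucket number in the range
# 0 .. BUCKETS-1. The CRC from cksum is used purely for illustration.
bucket_for() {
  bf_crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  echo $(( bf_crc % $2 ))
}
```

For example, `bucket_for user_12345 16` always returns the same value between 0 and 15 for that key. Reads that need every row for a given `user_id` then have to query all buckets, which is the trade-off of this approach.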
Application Solutions¶
- Client-side caching: reduce read frequency
- Write batching: reduce write frequency
- Load spreading: distribute across multiple keys
Automation Example¶
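#!/bin/bash
# monitor_hot_partitions.sh
# Usage: monitor_hot_partitions.sh <keyspace> <table>
KEYSPACE=$1
TABLE=$2
DURATION=60000 # 1 minute
THRESHOLD=100 # Alert if count > 100
if [ -z "$KEYSPACE" ] || [ -z "$TABLE" ]; then
  echo "Usage: $0 <keyspace> <table>" >&2
  exit 1
fi
# Sample the table and keep the top 5 partitions per sampler
result=$(nodetool toppartitions -k 5 "$KEYSPACE" "$TABLE" "$DURATION" 2>/dev/null)
# Parse the count from the first data row after the last "Top N partitions:" line
# (Top line, column header, first row); assumes the output format shown above
top_count=$(echo "$result" | grep -A2 "Top" | tail -1 | awk '{print $2}')
# Compare only when the parsed value is a plain integer
if [[ "$top_count" =~ ^[0-9]+$ ]] && [ "$top_count" -gt "$THRESHOLD" ]; then
  echo "ALERT: Hot partition detected in $KEYSPACE.$TABLE"
  echo "$result"
fi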
Limitations¶
Sampling Limitations
- Results are statistical samples, not exact counts
- Short samples may miss intermittent patterns
- High-traffic tables need longer sampling
- Sampling adds minimal overhead
Related Commands¶
| Command | Relationship |
|---|---|
| profileload | Primary command (toppartitions is an alias) |
| tablestats | Overall table statistics |
| tablehistograms | Latency distributions |
| proxyhistograms | Coordinator latencies |
| tpstats | Thread pool statistics |