Maintained by AxonOps — production-grade documentation from engineers who operate distributed databases at scale

High Memory Usage

High memory usage can lead to OOM kills, long GC pauses, and degraded performance. This playbook covers diagnosis and resolution of memory-related issues.


Symptoms

  • OOM (OutOfMemoryError) in logs
  • Process killed by Linux OOM killer (dmesg | grep -i killed)
  • Long GC pauses (see GC Pause Issues)
  • Swap usage increasing
  • Cassandra process consuming more memory than expected
  • Slow queries during high memory periods

Diagnosis

Step 1: Check Current Memory Usage

# Heap usage
nodetool info | grep -i heap

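# Process memory (RSS and VSZ in KiB); selecting by PID avoids matching the grep process itself
ps -o pid,rss,vsz,comm -p "$(pgrep -f CassandraDaemon)"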

# System memory
free -h
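
The heap figures reported by nodetool info can be reduced to a single utilization percentage, which is easier to compare against an alert threshold. The following is a minimal sketch, assuming the output contains a line of the form "Heap Memory (MB) : used / max"; the function name heap_pct is illustrative and not part of nodetool.

```shell
# heap_pct: read `nodetool info` output on stdin and print heap utilization
# as a whole-number percentage (truncated, not rounded).
# Assumes a line such as:  Heap Memory (MB)       : 4096.00 / 8192.00
heap_pct() {
  awk -F: '/^Heap Memory/ {
    split($2, parts, "/")                     # parts[1] = used, parts[2] = max
    printf "%d\n", (parts[1] / parts[2]) * 100
  }'
}

# Usage on a live node:
#   nodetool info | heap_pct
```

The pattern is anchored to the start of the line so that the "Off Heap Memory" line is not picked up by mistake.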

Step 2: Check for OOM Events

# Linux OOM killer
dmesg | grep -i "killed process\|oom"

# Cassandra OOM errors
grep -i "outofmemory\|heap space" /var/log/cassandra/system.log
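
The two checks point at different problems: a kernel OOM kill means the host ran out of RAM (heap plus off-heap plus everything else), while a Java OutOfMemoryError means the JVM itself ran out of space. To confirm that the kernel's victim was actually the Cassandra JVM, the killed PID and process name can be pulled out of the kernel log. This is a sketch under the assumption that the kernel lines contain "Killed process <pid> (<name>)"; oom_victims is a hypothetical helper name.

```shell
# oom_victims: read kernel log text on stdin and print "PID NAME" for every
# process the Linux OOM killer terminated.
# Assumes kernel lines containing "Killed process <pid> (<name>)".
oom_victims() {
  grep -i 'killed process' |
    sed -E 's/.*[Kk]illed process ([0-9]+) \(([^)]*)\).*/\1 \2/'
}

# Usage on a live host:
#   dmesg | oom_victims
```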

Step 3: Analyze Memory Breakdown

# GC stats
nodetool gcstats

# Check off-heap usage (node-wide total)
nodetool info | grep -i "off.heap"

# Per-table off-heap breakdown (bloom filters, compression metadata)
nodetool tablestats | grep -i "off heap"
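
nodetool tablestats reports off-heap usage per table, so a node-wide figure requires adding the per-table values together. A minimal sketch, assuming tablestats was run without -H so that values are plain byte counts; offheap_total_mb is an illustrative name.

```shell
# offheap_total_mb: sum "Off heap memory used (total)" across all tables in
# `nodetool tablestats` output (read from stdin) and print the total in MiB.
# Assumes byte values, i.e. tablestats was run without -H.
offheap_total_mb() {
  awk '/Off heap memory used \(total\):/ { sum += $NF }
       END { printf "%.1f\n", sum / 1048576 }'
}

# Usage on a live node:
#   nodetool tablestats | offheap_total_mb
```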

Step 4: Check for Memory-Intensive Operations

# Large partitions being read
grep -i "large partition" /var/log/cassandra/system.log | tail -20

# Compaction activity
nodetool compactionstats

# Streaming activity
nodetool netstats

Step 5: Analyze Heap Dump (if available)

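# Generate heap dump (pauses the JVM while it writes; the target
# filesystem needs free space up to the size of the heap)
jmap -dump:format=b,file=/tmp/heap.hprof $(pgrep -f CassandraDaemon)

# Analyze with a tool such as Eclipse MAT or VisualVM (jhat was removed in JDK 9)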

Resolution

Immediate: Reduce Memory Pressure

# Clear caches
nodetool invalidatekeycache
nodetool invalidaterowcache

# Flush memtables to disk
nodetool flush

# Check GC activity
nodetool gcstats

Concurrent Reads/Writes

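Concurrent reads and writes are configured via concurrent_reads and concurrent_writes in cassandra.yaml, and changes to that file take effect only after a restart. Before Cassandra 4.0 there is no nodetool command to adjust these values at runtime. Cassandra 4.0 and later provide nodetool getconcurrency and nodetool setconcurrency for runtime changes, which are not persisted across restarts.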

Short-term: Adjust Memory Settings

Right-size heap:

# In jvm.options (jvm-server.options in Cassandra 4.0+)
# 8 GB suits most workloads; heaps above 16 GB rarely help
-Xms8G
-Xmx8G

Tune GC:

# For G1GC
-XX:+UseG1GC
-XX:MaxGCPauseMillis=300
-XX:G1HeapRegionSize=16m
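
G1 divides the heap into equal-sized regions and treats any object of at least half a region as "humongous", which it allocates and collects less efficiently. The arithmetic below shows what a given heap and region size imply; it is a sketch for reasoning about the settings, and g1_layout is a hypothetical helper name.

```shell
# g1_layout: given heap size and G1 region size (both in MB), print the
# number of regions and the humongous-object threshold in MB.
# G1 treats objects of at least half a region as humongous.
g1_layout() {
  echo "$(( $1 / $2 )) regions, humongous from $(( $2 / 2 )) MB"
}

g1_layout 8192 16   # 8 GB heap with 16 MB regions
```

A larger region size raises the humongous threshold, which can help when large partitions or big mutations produce many multi-megabyte allocations.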

Medium-term: Address Root Causes

Cause 1: Large partitions

See Large Partition Issues.

# Find large partitions
grep "large partition" /var/log/cassandra/system.log
nodetool tablestats my_keyspace | grep -i partition

Cause 2: Too many SSTables

# Check SSTable counts
nodetool tablestats my_keyspace | grep -E "Table:|SSTable count"

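# Run a major compaction if needed (with size-tiered compaction this
# merges the table into one large SSTable, so use it sparingly)
nodetool compact my_keyspace my_table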
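
On a keyspace with many tables it helps to list only the tables whose SSTable count is above some limit. A minimal sketch, assuming the tablestats output uses "Table: <name>" and "SSTable count: <n>" lines; the default limit of 20 is an arbitrary illustration, not a Cassandra threshold, and sstable_hotspots is a hypothetical name.

```shell
# sstable_hotspots: read `nodetool tablestats` output on stdin and print
# "table count" for every table whose SSTable count exceeds a limit.
# The default limit of 20 is an illustrative assumption, not a Cassandra limit.
sstable_hotspots() {
  awk -v limit="${1:-20}" '
    $1 == "Table:"                     { table = $2 }
    $1 == "SSTable" && $2 == "count:"  { if ($3 + 0 > limit + 0) print table, $3 }
  '
}

# Usage on a live node:
#   nodetool tablestats my_keyspace | sstable_hotspots 50
```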

Cause 3: Row cache enabled

# Check row cache
nodetool info | grep -i "row cache"

# Disable if causing issues (CQL statement, run via cqlsh)
cqlsh -e "ALTER TABLE my_keyspace.my_table WITH caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'};"

Cause 4: Bloom filter memory

# Check bloom filter size
nodetool tablestats my_keyspace | grep -i "bloom"

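# Adjust bloom filter FP chance (higher = less memory); CQL statement, run via cqlsh.
# Applies to newly written SSTables; existing ones keep their filters until rewritten.
cqlsh -e "ALTER TABLE my_keyspace.my_table WITH bloom_filter_fp_chance = 0.1;"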
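
To judge how much memory a higher false-positive chance would save, the textbook Bloom filter formula gives a rough estimate: bits per key = -ln(p) / (ln 2)^2. The sketch below applies that formula; Cassandra's own sizing may differ somewhat, so treat the result as an approximation. The function name bloom_mb is illustrative.

```shell
# bloom_mb: estimate Bloom filter size in MiB for a given key count and
# false-positive chance, using the textbook formula
#   bits per key = -ln(p) / (ln 2)^2
# Cassandra's own sizing may differ; treat the result as a rough guide.
bloom_mb() {
  awk -v keys="$1" -v p="$2" 'BEGIN {
    bits_per_key = -log(p) / (log(2) ^ 2)
    printf "%.0f\n", keys * bits_per_key / 8 / 1048576
  }'
}

bloom_mb 1000000000 0.01   # one billion keys at a 1% false-positive chance
bloom_mb 1000000000 0.1    # same keys at a 10% false-positive chance
```

Because the size scales with ln(p), moving from 0.01 to 0.1 halves the estimated filter size at the cost of more unnecessary SSTable reads.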

Cause 5: Concurrent repairs/streaming

# Check active streams
nodetool netstats

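# List incremental repair sessions, then cancel one by ID
# (Cassandra 4.0+ syntax; <session_id> is a placeholder)
nodetool repair_admin list
nodetool repair_admin cancel --session <session_id>   # add --force when not on the coordinator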

Long-term: Capacity Planning

Calculate required memory:

Total memory needed =
  JVM Heap (8-16GB)
  + Off-heap structures (~1-4GB depending on data size)
  + OS page cache (remaining available RAM)
  + OS overhead (~1GB)
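
The breakdown can be read in reverse to see how much RAM is left for the page cache on a given node. A worked example in shell arithmetic, with all values in GB; page_cache_gb is a hypothetical helper and the inputs are illustrative.

```shell
# page_cache_gb: RAM left for the OS page cache after the other components.
# Arguments (GB): total RAM, JVM heap, off-heap structures, OS overhead (default 1).
page_cache_gb() {
  echo $(( $1 - $2 - $3 - ${4:-1} ))
}

page_cache_gb 64 16 4   # 64 GB node, 16 GB heap, 4 GB off-heap, 1 GB OS overhead
```

If the result is small or negative, reads will go to disk far more often, and the node is a candidate for a smaller heap, more RAM, or less data.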

Right-size the node:

Data per node     Recommended RAM
< 500 GB          16 GB
500 GB - 1 TB     32 GB
1 TB - 2 TB       64 GB
> 2 TB            64 GB + add nodes

Recovery

After OOM

# Check if node is running
systemctl status cassandra

# If down, start it
sudo systemctl start cassandra

# Monitor startup
tail -f /var/log/cassandra/system.log

# Verify node rejoined cluster
nodetool status
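
On larger clusters the status output is long, and filtering it down to nodes that are not Up/Normal makes a failed rejoin easier to spot. A minimal sketch, assuming the usual two-letter state codes in the first column (U or D, followed by N, L, J, or M); unhealthy_nodes is an illustrative name.

```shell
# unhealthy_nodes: read `nodetool status` output on stdin and print the
# state and address of every node that is not Up/Normal (UN).
# State codes are two letters: U/D (up/down) followed by N/L/J/M.
unhealthy_nodes() {
  awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { print $1, $2 }'
}

# Usage on a live node:
#   nodetool status | unhealthy_nodes
```

Empty output means every listed node is in the UN state.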

Verify Memory Stability

# Monitor heap usage
watch -n 10 'nodetool info | grep -i heap'

# Watch for GC issues
watch -n 30 'nodetool gcstats'

Memory Configuration Reference

JVM Heap Settings

# jvm.options
-Xms8G                    # Initial heap
-Xmx8G                    # Maximum heap (should equal -Xms)
-XX:+AlwaysPreTouch       # Pre-touch heap pages

Off-Heap Settings

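# cassandra.yaml
# Memtable space (Cassandra 4.1+ renames these to memtable_heap_space and
# memtable_offheap_space, with unit suffixes such as 2048MiB)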
memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 2048

# Native transport
native_transport_max_concurrent_connections: 128

Memory Guidelines

Component              Typical Size     Notes
Heap                   8 GB             Rarely benefits from more than 16 GB
Memtables              2-4 GB           Configured in cassandra.yaml
Bloom filters          Varies           ~1.25 bytes per key
Compression metadata   Varies           ~60 bytes per 64KB chunk
Page cache             Remaining RAM    OS managed

Prevention

  1. Monitor heap usage - Alert at 75% utilization
  2. Set heap limits - Don't let JVM grow unbounded
  3. Avoid large partitions - Design for bounded partition sizes
  4. Disable row cache - Unless specific use case requires it
  5. Regular compaction - Reduce SSTable overhead
  6. Capacity planning - Add nodes before memory becomes critical

Command                       Purpose
nodetool info                 Memory usage overview
nodetool gcstats              GC statistics
nodetool tablestats           Per-table memory usage
nodetool invalidatekeycache   Clear key cache
nodetool invalidaterowcache   Clear row cache